Skip to content
Advertisement

How to remove certain results from my search count?

I received a few irrelevant results to my count that I would like to remove from my output.

When using only the first “where lower (person) not like ‘%c%'” worked but when adding person b, c etc. as shown it gave me the original irrelevant results.

What should I change?

SELECT
    person,
    COUNT(1) AS count_mentions,
    COUNT(DISTINCT url) AS count_distinct_urls
FROM
    project.dataset.keyword
WHERE
    LOWER(person) NOT LIKE '%b%' 
    OR LOWER(person) NOT LIKE '%c%' 
    OR LOWER(person) NOT LIKE '%e%' 
    OR LOWER(person) NOT LIKE '%f%' 
GROUP BY
    person
ORDER BY
    count_mentions DESC
LIMIT
    5;

Expected results:

Row person  count_mentions  count_distinct_urls 
-----------------------------------------------
1     a          ---               ----
2     d 
3     g
4     h
5     i

Advertisement

Answer

You want AND, not OR:

where lower(person) not like '%b%' and
      lower(person) not like '%c%' and
      lower(person) not like '%e%' and
      lower(person) not like '%f%'  

This is logically equivalent to:

where not (lower(person) like '%b%' or
           lower(person) not like '%c%' or
           lower(person) not like '%e%' or
           lower(person) not like '%f%' 
          ) 

Or more concisely:

 where not regexp_contains(person, '[BbCcEeFf]')

Or if you want to be a bit inscrutable:

where not regexp_contains(person, '(?i)[bcef]')

The (?i) makes the pattern matching case-insensitive.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement