SQL Window Function to get addresses with more tha…

I have a Snowflake table which includes addresses, state, first names and last names. I would like to get a query that shows me only the addresses where more than 1 individual with a different last name is present.

So for example, assume that I have

address     | fname | lname    |State
10 lake road| John  | Smith    |FL
10 lake road| Julie | Gallagher|FL
3 gator cove| Jack  | Hoyt     |FL
3 gator cove| Debra | Hoyt     |FL

​x
 
address     | fname | lname    |State10 lake road| John  | Smith    |FL10 lake road| Julie | Gallagher|FL3 gator cove| Jack  | Hoyt     |FL3 gator cove| Debra | Hoyt     |FL​

I would like the query to return only 1 row in that example: 10 lake road. Because it’s the only house where there is more than 1 unique last name present.

I am currently using

SELECT  distinct a.address, a.fname, a.lname, a.state
FROM clients_addresses a
WHERE a.state = 'FL'
qualify count(1) over( partition by a.lname) > 1 
order by a.address

 
SELECT  distinct a.address, a.fname, a.lname, a.stateFROM clients_addresses aWHERE a.state = 'FL'qualify count(1) over( partition by a.lname) > 1 order by a.address​

However, this is just returning the addresses where there is more than 1 person, it doesn’t care if the last name is repeated. That’s what I’m trying to avoid.

I can’t quite understand where the query is going wrong. Snowflake doesn’t like using any distinct keyword after the initial select, and even if I use it, it only returns 1 occurrence of each address, but it’s still just addresses with more than 1 person, even if there was only 1 last name in the address.

It doesn’t need to involve the keyword “qualify”, I know Snowflake also accepts other things such as subselects that might help with this problem.

Answer

I would like the query to return only 1 row in that example: 10 lake road.

This sounds like aggregation:

SELECT a.address, count(*)
FROM clients_addresses a
WHERE a.state = 'FL'
GROUP BY a.address
HAVING COUNT(DISTINCT a.lname) > 1;

 
SELECT a.address, count(*)FROM clients_addresses aWHERE a.state = 'FL'GROUP BY a.addressHAVING COUNT(DISTINCT a.lname) > 1;​

If you want the original rows (which is not what your question asks for), you can use:

SELECT a.*
FROM clients_addresses a
WHERE a.state = 'FL'
QUALITY COUNT(DISTINCT a.lname) OVER (PARTITION BY a.address) > 1;

 
SELECT a.*FROM clients_addresses aWHERE a.state = 'FL'QUALITY COUNT(DISTINCT a.lname) OVER (PARTITION BY a.address) > 1;​

SQL Window Function to get addresses with more than 1 unique last name present (Snowflake)

Advertisement

Answer