
How to Select IDs in SQL (Databricks) for which at least 2 items from a list are present

I’m working with patient-level data in Azure Databricks and I’m trying to build out a cohort of patients that have at least 2 diagnoses from a list of specific diagnosis codes. This is essentially what the table looks like:

The list of ICD_CD codes of interest is something like [2500, 3850, 8888]. In this case, I would want to return a total count of unique PTNT_ID values equal to 2. These would be PTNT_ID = (101, 222), as these are the only two patients that have at least 2 ICD_CD codes of interest.

When I use something like this, I’m able to return all of the relevant PTNT_ID values, but I’m not able to get the total count of these PTNT_ID:

When I try to add a COUNT statement, it returns an error.


Answer

Just select from the query:
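A minimal sketch of that idea: wrap the grouped query in a subquery and count its rows. The table name `patient_dx` and the sample rows are hypothetical, and SQLite (via Python's standard library) stands in for Databricks only so the example can be run anywhere; the SQL itself uses only standard clauses. It also assumes that "at least 2" means two different codes from the list, hence `COUNT(DISTINCT ICD_CD)`; use `COUNT(*)` instead if repeated rows of the same code should count.

```python
import sqlite3

# Hypothetical sample rows (PTNT_ID, ICD_CD); not the asker's real data.
rows = [
    (101, 2500), (101, 3850),               # two codes of interest
    (222, 2500), (222, 8888), (222, 2500),  # two distinct codes of interest
    (333, 2500), (333, 1234),               # only one code of interest
    (444, 9999),                            # none
]

conn = sqlite3.connect(":memory:")
# Assumed table name: patient_dx.
conn.execute("CREATE TABLE patient_dx (PTNT_ID INTEGER, ICD_CD INTEGER)")
conn.executemany("INSERT INTO patient_dx VALUES (?, ?)", rows)

# Inner query: one row per patient with at least 2 distinct codes from the list.
# Outer query: count those rows to get the total number of patients.
query = """
SELECT COUNT(*) AS total_patients
FROM (
    SELECT PTNT_ID
    FROM patient_dx
    WHERE ICD_CD IN (2500, 3850, 8888)
    GROUP BY PTNT_ID
    HAVING COUNT(DISTINCT ICD_CD) >= 2
) AS qualifying
"""

total_patients = conn.execute(query).fetchone()[0]
print(total_patients)
```

The inner `SELECT` on its own returns the list of qualifying `PTNT_ID` values; the aggregate belongs in the outer query because the inner `GROUP BY` would otherwise produce one count per patient rather than one overall total.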

User contributions licensed under: CC BY-SA