Skip to content
Advertisement

Aggregated row count differences between tables

I have two MySQL tables A and B both with this schema

ID entity_id asset asset_type
0 12345 x 1
.. ……… ….. ……….

I would like to get an aggregated top 10/50/whatever entity_ids with the largest row count difference between the two tables. I think I could do this manually by just getting the highest row count by entity_id like so

select count(*), entity_id
-> from A
-> group by entity_id
-> order by count(*) desc;

and just manually comparing to the same query for table B but I’m wondering if there’s a way to do this in just one query, that compares row counts for each distinct entity_id and aggregates the differences between row counts. A few notes

  • There is an index on entity_id for both tables
  • Table B will always have an equivalent or greater number of rows for each entity_id

Sample output

entity_id difference
12345 100
3232 75
5992 40
and so on for top 10/50

Advertisement

Answer

Aggregate in each table and join the results to get the difference:

SELECT a.entity_id, b.counter - a.counter diff
FROM (SELECT entity_id, COUNT(*) counter FROM A GROUP BY entity_id) a
INNER JOIN (SELECT entity_id, COUNT(*) counter FROM B GROUP BY entity_id) b
ON a.entity_id = b.entity_id
ORDER BY diff DESC LIMIT 10
User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement