How to get the size of the duplicates in this query?

Question

Let's say I have the following data:

I want to find out how much space my duplicates take up. That is, if we kept only one copy of each file (unique by md5), how much space would we save? The answer should be:

Here is the base query I have thus far:

I think the simplest way to

Accepted Answer

You could use:

    SELECT
        md5,
        SUM(file_size) - MIN(file_size) AS size_saved
    FROM yourTable
    GROUP BY
        md5;

Note that this answer assumes all records sharing a given md5 have the same file_size whenever there is more than one such record. If that does not hold, the query would not work, but the logic would have to be redefined in that case anyway.
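To check the query's behaviour, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The table and column names (yourTable, md5, file_size) come from the answer; the sample rows are invented for illustration and are not the asker's data. The outer SUM over the per-md5 results is an assumption about what the question wants (a single total), not part of the original answer.

```python
import sqlite3

# Hypothetical sample rows (md5, file_size); invented for illustration only.
rows = [
    ("aaa", 100),
    ("aaa", 100),
    ("aaa", 100),
    ("bbb", 250),
    ("ccc", 40),
    ("ccc", 40),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (md5 TEXT, file_size INTEGER)")
conn.executemany("INSERT INTO yourTable (md5, file_size) VALUES (?, ?)", rows)

# The answer's query: space saved per md5 if only one copy is kept.
per_hash = dict(conn.execute("""
    SELECT md5, SUM(file_size) - MIN(file_size) AS size_saved
    FROM yourTable
    GROUP BY md5
""").fetchall())

# Assumed extension: wrap the query to get one overall total.
total_saved = conn.execute("""
    SELECT SUM(size_saved) FROM (
        SELECT SUM(file_size) - MIN(file_size) AS size_saved
        FROM yourTable
        GROUP BY md5
    ) AS t
""").fetchone()[0]

print(per_hash)
print(total_saved)
```

An md5 that appears only once contributes 0, because its SUM and MIN are equal, so unique files do not inflate the total.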
