Skip to content
Advertisement

How can I select an ID from the row with the maximum value for each of several columns?

I want to extract the values of an ID column from the rows containing the maximum values of each of several other columns, and then collate these in a new table which has the column headers in one column and the row IDs of the maximum values in another.

To clarify:

My data is structured like this, extending into a roughly 20×20 grid:

And I want a query that returns the name which contains the highest value for each column A, B, C etc., like so:

In this case I know the maximum values will all be the same so I can search for values = 4, but it might not be in future so finding the max would be a more robust method.

The best I’ve come up with so far is:

And so on, repeating for each column header.

I hoped this would generate the top_scorer column above, which I could then join to another table containing the column headers of my original table. However it does not return any name more than once, so I only get as many rows are there are distinct entries in the top_scorer column. The resulting values are also sorted alphabetically rather than in the order they’re first found.

So rather than

the code above returns

I also tried:

But this gave an error: “misuse of aggregate function MAX()”

Can anyone suggest a query that would return the top_scorers table above?

Similar questions I’ve found are either looking to return maximums from one column using GROUP BY, or the maximum value across a range of columns for each row (equivalent to the highest score achieved by each name in this case), which aren’t what I’m looking for.

Is there maybe something clever that can be done with PIVOT or some other transposing function? I’m a SQL beginner and haven’t tried using anything like that before.

This is running on DB Browser for SQLite.

Advertisement

Answer

The following approach takes advantage of one of sqlite3’s documented but non-standard behaviors when mixing aggregate and non-aggregate results in a query – when using max(), all non-aggregate values are taken from one of the rows with the maximum value (Ties broken at random):

which gives


However, a database design that uses a one-to-many relationship with a second table instead of one column per thing is going to be a better approach.

Consider this schema:

populated with your test data:

you can use this query:

which avoids having to hard code each id being tracked like the current one-per-column design you’re using. Much, much cleaner and flexible.

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement