I have a table like this (unsorted):
risk | category |
---|---|
Low | A |
Medium | B |
High | C |
Medium | A |
Low | B |
High | A |
Low | C |
Low | E |
Low | D |
High | B |
I need to sort rows by category
, but first based on the value of risk
. The desired result should look like this (sorted):
risk | category |
---|---|
Low | A |
Low | B |
Low | C |
Low | D |
Low | E |
Medium | A |
Medium | B |
High | A |
High | B |
High | C |
I’ve come up with below query but wonder if it is correct:
SELECT * FROM some_table ORDER BY CASE WHEN risk = 'Low' THEN 0 WHEN risk = 'Medium' THEN 1 WHEN risk = 'High' THEN 2 ELSE 3 END, category;
Just want to understand whether the query is correct or not. The actual data set is huge and there are many other values for risk and categories and hence I can’t figure out if the results are correct or not. I’ve just simplified it here.
Advertisement
Answer
Basically correct, but you can simplify:
SELECT * FROM some_table ORDER BY CASE risk WHEN 'Low' THEN 0 WHEN 'Medium' THEN 1 WHEN 'High' THEN 2 -- rest defaults to NULL and sorts last END , category;
A “switched” CASE
is shorter and slightly cheaper.
In the absence of an ELSE
branch, all remaining cases default to NULL, and NULL sorts last in default ascending sort order. So you don’t need to do anything extra.
Many other values
… there are many other values for risk
While all other values are lumped together at the bottom of the sort order, this seems ok.
If all of those many values get their individual ranking, I would suggest an additional table to handle ranks of risk values. Like:
CREATE TABLE riskrank ( risk text PRIMARY KEY , riskrank real ); INSERT INTO riskrank VALUES ('Low' , 0) , ('Medium', 1) , ('High' , 2) -- many more? ;
Data type real
, so it’s easy to squeeze in rows with fractional digits in different positions (like enum
values do it internally).
Then your query is:
SELECT s.* FROM some_table s LEFT JOIN risk_rank rr USING (risk) ORDER BY rr.riskrank, s.category;
LEFT JOIN
, so missing entries in riskrank
don’t eliminate rows.
enum
?
I already mentioned the data type enum
. That’s a possible alternative as enum values are sorted in the order they are defined (not how they are spelled). They only occupy 4 bytes on disk (real
internally), are fast and enforce valid values implicitly. See:
However, I would only even consider an enum if the sort order of your values is immutable. Changing sort order and adding / removing allowed values is cumbersome. The manual:
Although enum types are primarily intended for static sets of values, there is support for adding new values to an existing enum type, and for renaming values (see
ALTER TYPE
). Existing values cannot be removed from an enum type, nor can the sort ordering of such values be changed, short of dropping and re-creating the enum type.