This might be a novice question – I’m still learning. I’m on PostgreSQL 9.6 with the following query:
SELECT locales, count(locales) FROM ( SELECT lower((regexp_matches(locale, '([a-z]{2,3}(-[a-z]{2,3})?)', 'i'))[1]) AS locales FROM users) AS _ GROUP BY locales
My query returns the following dynamic rows:
locales | count |
---|---|
en | 10 |
fr | 7 |
de | 3 |
n additional locales (~300)… | n-count |
I’m trying to rotate it so that locale values end up as columns with a single row, like this:
en | fr | de | n additional locales (~300)… |
---|---|---|---|
10 | 7 | 3 | n-count |
I’m having to do this to play nice with a time-series db/app
I’ve tried using crosstab()
, but all the examples show better defined tables with 3 or more columns.
I’ve looked at examples using join
, but I can’t figure out how to do it dynamically.
Advertisement
Answer
Base query
In Postgres 10 or later you could use the simpler and faster regexp_match()
instead of regexp_matches()
. (Since you only take the first match per row anyway.) But don’t bother and use the even simpler substring()
instead:
SELECT lower(substring(locale, '(?i)[a-z]{2,3}(?:-[a-z]{2,3})?')) AS locale , count(*)::int AS ct FROM users WHERE locale ~* '[a-z]{2,3}' -- eliminate NULL, allow index support GROUP BY 1 ORDER BY 2 DESC, 1
Simpler and faster than your original base query.
About those ordinal numbers in GROUP BY
and ORDER BY
:
Subtle difference: regexp_matches()
returns no row for no match, while substring()
returns null
. I added a WHERE
clause to eliminate non-matches a-priori – and allow index support if applicable, but I don’t expect indexes to help here.
Note the prefixed (?i)
, that’s a so-called “embedded option” to use case-insensitive matching.
Added a deterministic ORDER BY
clause. You’d need that for a simple crosstab()
.
Aside: you might need _
in the pattern instead of -
for locales like “en_US”.
Pivot
Try as you might, SQL does not allow dynamic result columns in a single query. You need two round trips to the server. See;
You can use a dynamically generated crosstab()
query. Basics:
Dynamic query:
But since you generate a single row of plain integer values, I suggest a simple approach:
SELECT 'SELECT ' || string_agg(ct || ' AS ' || quote_ident(locale), ', ') FROM ( SELECT lower(substring(locale, '(?i)[a-z]{2,3}(?:-[a-z]{2,3})?')) AS locale , count(*)::int AS ct FROM users WHERE locale ~* '[a-z]{2,3}' GROUP BY 1 ORDER BY 2 DESC, 1 ) t
Generates a query of the form:
SELECT 10 AS en, 7 AS fr, 3 AS de, 3 AS "de-at"
Execute it to produce your desired result.
In psql you can append gexec
to the generating query to feed the generated SQL string back to the server immediately. See: