This might be a novice question ā I’m still learning. I’m on PostgreSQL 9.6 with the following query:
SELECT locales, count(locales) FROM (
SELECT lower((regexp_matches(locale, '([a-z]{2,3}(-[a-z]{2,3})?)', 'i'))[1])
AS locales FROM users)
AS _ GROUP BY locales
My query returns the following dynamic rows:
| locales | count |
|---|---|
| en | 10 |
| fr | 7 |
| de | 3 |
| n additional locales (~300)… | n-count |
I’m trying to rotate it so that locale values end up as columns with a single row, like this:
| en | fr | de | n additional locales (~300)… |
|---|---|---|---|
| 10 | 7 | 3 | n-count |
I’m having to do this to play nice with a time-series db/app
I’ve tried using crosstab(), but all the examples show better defined tables with 3 or more columns.
I’ve looked at examples using join, but I can’t figure out how to do it dynamically.
Advertisement
Answer
Base query
In Postgres 10 or later you could use the simpler and faster regexp_match() instead of regexp_matches(). (Since you only take the first match per row anyway.) But don’t bother and use the even simpler substring() instead:
SELECT lower(substring(locale, '(?i)[a-z]{2,3}(?:-[a-z]{2,3})?')) AS locale
, count(*)::int AS ct
FROM users
WHERE locale ~* '[a-z]{2,3}' -- eliminate NULL, allow index support
GROUP BY 1
ORDER BY 2 DESC, 1
Simpler and faster than your original base query.
About those ordinal numbers in GROUP BY and ORDER BY:
Subtle difference: regexp_matches() returns no row for no match, while substring() returns null. I added a WHERE clause to eliminate non-matches a-priori – and allow index support if applicable, but I don’t expect indexes to help here.
Note the prefixed (?i), that’s a so-called “embedded option” to use case-insensitive matching.
Added a deterministic ORDER BY clause. You’d need that for a simple crosstab().
Aside: you might need _ in the pattern instead of - for locales like “en_US”.
Pivot
Try as you might, SQL does not allow dynamic result columns in a single query. You need two round trips to the server. See;
You can use a dynamically generated crosstab() query. Basics:
Dynamic query:
But since you generate a single row of plain integer values, I suggest a simple approach:
SELECT 'SELECT ' || string_agg(ct || ' AS ' || quote_ident(locale), ', ')
FROM (
SELECT lower(substring(locale, '(?i)[a-z]{2,3}(?:-[a-z]{2,3})?')) AS locale
, count(*)::int AS ct
FROM users
WHERE locale ~* '[a-z]{2,3}'
GROUP BY 1
ORDER BY 2 DESC, 1
) t
Generates a query of the form:
SELECT 10 AS en, 7 AS fr, 3 AS de, 3 AS "de-at"
Execute it to produce your desired result.
In psql you can append gexec to the generating query to feed the generated SQL string back to the server immediately. See: