Skip to content
Advertisement

How to rotate a two-column table?

This might be a novice question – I’m still learning. I’m on PostgreSQL 9.6 with the following query:

SELECT locales, count(locales) FROM (
    SELECT lower((regexp_matches(locale, '([a-z]{2,3}(-[a-z]{2,3})?)', 'i'))[1]) 
    AS locales FROM users) 
AS _ GROUP BY locales

My query returns the following dynamic rows:

locales count
en 10
fr 7
de 3
n additional locales (~300)… n-count

I’m trying to rotate it so that locale values end up as columns with a single row, like this:

en fr de n additional locales (~300)…
10 7 3 n-count

I’m having to do this to play nice with a time-series db/app

I’ve tried using crosstab(), but all the examples show better defined tables with 3 or more columns.

I’ve looked at examples using join, but I can’t figure out how to do it dynamically.

Answer

Base query

In Postgres 10 or later you could use the simpler and faster regexp_match() instead of regexp_matches(). (Since you only take the first match per row anyway.) But don’t bother and use the even simpler substring() instead:

SELECT lower(substring(locale, '(?i)[a-z]{2,3}(?:-[a-z]{2,3})?')) AS locale
     , count(*)::int AS ct
FROM   users
WHERE  locale ~* '[a-z]{2,3}'  -- eliminate NULL, allow index support
GROUP  BY 1
ORDER  BY 2 DESC, 1

Simpler and faster than your original base query.

About those ordinal numbers in GROUP BY and ORDER BY:

Subtle difference: regexp_matches() returns no row for no match, while substring() returns null. I added a WHERE clause to eliminate non-matches a-priori – and allow index support if applicable, but I don’t expect indexes to help here.

Note the prefixed (?i), that’s a so-called “embedded option” to use case-insensitive matching.

Added a deterministic ORDER BY clause. You’d need that for a simple crosstab().

Aside: you might need _ in the pattern instead of - for locales like “en_US”.

Pivot

Try as you might, SQL does not allow dynamic result columns in a single query. You need two round trips to the server. See;

You can use a dynamically generated crosstab() query. Basics:

Dynamic query:

But since you generate a single row of plain integer values, I suggest a simple approach:

SELECT 'SELECT ' || string_agg(ct || ' AS ' || quote_ident(locale), ', ')
FROM  (
   SELECT lower(substring(locale, '(?i)[a-z]{2,3}(?:-[a-z]{2,3})?')) AS locale
        , count(*)::int AS ct
   FROM   users
   WHERE  locale ~* '[a-z]{2,3}'
   GROUP  BY 1
   ORDER  BY 2 DESC, 1
   ) t

Generates a query of the form:

SELECT 10 AS en, 7 AS fr, 3 AS de, 3 AS "de-at"

Execute it to produce your desired result.

In psql you can append gexec to the generating query to feed the generated SQL string back to the server immediately. See: