Skip to content
Advertisement

PostgreSQL dynamic query: find most recent timestamp of several unrelated tables

I have a number of tables, and many of them have a timestamp column. I can get a list of every table with a timestamp column:

SELECT table_name 
FROM information_schema.columns 
WHERE table_schema='my_schema' AND column_name='timestamp';

--------------
 table_name               
--------------
 apples
 bananas
 sharks
 lemons

I can get the hightest timestamp in the sharks table as follows

SELECT MAX(timestamp) FROM sharks;

-------------------------
           max           
-------------------------
 2021-11-24 00:00:00.000

I would like to get a table like

 table_name  |       last_updated               
-------------+-------------------------
 apples      | 2021-11-23 00:02:00.000
 bananas     | 2019-10-16 00:04:00.000
 sharks      | 2021-11-24 00:00:00.000
 lemons      | 1970-01-03 10:00:00.000

I’m suspecting this requires dynamic SQL, so I’m trying something like

SELECT (
    EXECUTE 'SELECT MAX(timestamp) FROM my_schema.' || table_name
) FROM (
    SELECT table_name 
    FROM information_schema.columns 
    WHERE table_schema='my_schema' AND column_name='timestamp'
);

But it seems like EXECUTE doesn’t work in subqueries.

Performance is not particularly a concern, just producing the desired results.

Advertisement

Answer

Dynamic Queries cannot function outside of PL/pgSQL blocks, so you need to wrap your code in one.

I set up test tables similar to yours, with only the “timestamp” column shared between them:

drop table if exists public.sharks_70099803 cascade; 
create table public.sharks_70099803 
as  select  1::int integer_column, 
            now()::timestamp as "timestamp";
drop table if exists public.bananas_70099803 cascade; 
create table public.bananas_70099803
as select   'some text'::text text_column, 
            now()::timestamp as "timestamp";

Wrap a dynamic query in PL/pgSQL function. Inside I build a query to pull the max(timestamp) from each table with the column, then aggregate those into one query with a union all in between, that I later execute.

CREATE OR REPLACE FUNCTION public.test_70099803()
RETURNS SETOF RECORD
LANGUAGE 'plpgsql'
AS $BODY$
BEGIN 
    return query 
    execute (
        select string_agg(select_per_table,' union all ') 
        from (  
            select 'select '''||
                    table_name||
                    ''' as table_name, max(timestamp) from public.'||
                    table_name "select_per_table" 
            from information_schema.columns 
            where table_schema='public' 
            and column_name='timestamp'
        ) a
    );
END 
$BODY$;

select * from public.test_70099803() as t(table_name text, max_timestamp timestamp);
--    table_name    |       max_timestamp
--------------------+----------------------------
-- sharks_70099803  | 2021-11-24 17:12:03.24951
-- bananas_70099803 | 2021-11-24 17:12:03.253614
--(2 rows)

You can parametrise your function to be applicable to more groups of tables, or to have a predefined output table structure that’ll let you just select * from test_70099803();

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement