Context
When a function returns a TABLE
or a SETOF composite-type
, like this one:
CREATE FUNCTION func(n int) returns table(i int, j bigint) as $$ BEGIN RETURN QUERY select 1,n::bigint union all select 2,n*n::bigint union all select 3,n*n*n::bigint; END $$ language plpgsql;
the results can be accessed by various methods:
select * from func(3)
will produce these output columns :
i | j ---+--- 1 | 3 2 | 9 3 | 27
select func(3)
will produce only one output column of ROW type.
func ------- (1,3) (2,9) (3,27)
select (func(3)).*
will produce like #1:
i | j ---+--- 1 | 3 2 | 9 3 | 27
When the function argument comes from a table or a subquery, the syntax #3 is the only possible one, as in:
select N, (func(N)).* from (select 2 as N union select 3 as N) s;
or as in this related answer. If we had LATERAL JOIN
we could use that, but until PostgreSQL 9.3 is out, it’s not supported, and the previous versions will still be used for years anyway.
Problem
The problem with syntax #3 is that the function is called as many times as there are columns in the result. There’s no apparent reason for that, but it happens.
We can see it in version 9.2 by adding a RAISE NOTICE 'called for %', n
in the function. With the query above, it outputs:
NOTICE: called for 2 NOTICE: called for 2 NOTICE: called for 3 NOTICE: called for 3
Now, if the function is changed to return 4 columns, like this:
CREATE FUNCTION func(n int) returns table(i int, j bigint,k int, l int) as $$ BEGIN raise notice 'called for %', n; RETURN QUERY select 1,n::bigint,1,1 union all select 2,n*n::bigint,1,1 union all select 3,n*n*n::bigint,1,1; END $$ language plpgsql stable;
then the same query outputs:
NOTICE: called for 2 NOTICE: called for 2 NOTICE: called for 2 NOTICE: called for 2 NOTICE: called for 3 NOTICE: called for 3 NOTICE: called for 3 NOTICE: called for 3
2 function calls were needed, 8 were actually made. The ratio is the number of output columns.
With syntax #2 that produces the same result except for the output columns layout, these multiple calls don’t happen:
select N, func(N) from (select 2 as N union select 3 as N) s;
gives:
NOTICE: called for 2 NOTICE: called for 3
followed by the 6 resulting rows:
n | func ---+------------ 2 | (1,2,1,1) 2 | (2,4,1,1) 2 | (3,8,1,1) 3 | (1,3,1,1) 3 | (2,9,1,1) 3 | (3,27,1,1)
Questions
Is there a syntax or a construct with 9.2 that would achieve the expected result by doing only the minimum required function calls?
Bonus question: why do the multiple evaluations happen at all?
Advertisement
Answer
You can wrap it up in a subquery but that’s not guaranteed safe without the OFFSET 0
hack. In 9.3, use LATERAL
. The problem is caused by the parser effectively macro-expanding *
into a column list.
Workaround
Where:
SELECT (my_func(x)).* FROM some_table;
will evaluate my_func
n
times for n
result columns from the function, this formulation:
SELECT (mf).* FROM ( SELECT my_func(x) AS mf FROM some_table ) sub;
generally will not, and tends not to add an additional scan at runtime. To guarantee that multiple evaluation won’t be performed you can use the OFFSET 0
hack or abuse PostgreSQL’s failure to optimise across CTE boundaries:
SELECT (mf).* FROM ( SELECT my_func(x) AS mf FROM some_table OFFSET 0 ) sub;
or:
WITH tmp(mf) AS ( SELECT my_func(x) FROM some_table ) SELECT (mf).* FROM tmp;
In PostgreSQL 9.3 you can use LATERAL
to get a saner behaviour:
SELECT mf.* FROM some_table LEFT JOIN LATERAL my_func(some_table.x) AS mf ON true;
LEFT JOIN LATERAL ... ON true
retains all rows like the original query, even if the function call returns no row.
Demo
Create a function that isn’t inlineable as a demonstration:
CREATE OR REPLACE FUNCTION my_func(integer) RETURNS TABLE(a integer, b integer, c integer) AS $$ BEGIN RAISE NOTICE 'my_func(%)',$1; RETURN QUERY SELECT $1, $1, $1; END; $$ LANGUAGE plpgsql;
and a table of dummy data:
CREATE TABLE some_table AS SELECT x FROM generate_series(1,10) x;
then try the above versions. You’ll see that the first raises three notices per invocation; the latter only raise one.
Why?
Good question. It’s horrible.
It looks like:
(func(x)).*
is expanded as:
(my_func(x)).i, (func(x)).j, (func(x)).k, (func(x)).l
in parsing, according to a look at debug_print_parse
, debug_print_rewritten
and debug_print_plan
. The (trimmed) parse tree looks like this:
:targetList ( {TARGETENTRY :expr {FIELDSELECT :arg {FUNCEXPR :funcid 57168 ... } :fieldnum 1 :resulttype 23 :resulttypmod -1 :resultcollid 0 } :resno 1 :resname i ... } {TARGETENTRY :expr {FIELDSELECT :arg {FUNCEXPR :funcid 57168 ... } :fieldnum 2 :resulttype 20 :resulttypmod -1 :resultcollid 0 } :resno 2 :resname j ... } {TARGETENTRY :expr {FIELDSELECT :arg {FUNCEXPR :funcid 57168 ... } :fieldnum 3 :... } :resno 3 :resname k ... } {TARGETENTRY :expr {FIELDSELECT :arg {FUNCEXPR :funcid 57168 ... } :fieldnum 4 ... } :resno 4 :resname l ... } )
So basically, we’re using a dumb parser hack to expand wildcards by cloning nodes.