I wrote a function that outputs a well-formed PostgreSQL SELECT query as text. Now I no longer want to output text, but actually run the generated SELECT statement against the database and return the result, just like the query itself would.
What I have so far:
CREATE OR REPLACE FUNCTION data_of(integer)
  RETURNS text AS
$BODY$
DECLARE
   sensors varchar(100);   -- holds list of column names
   type    varchar(100);   -- holds name of table
   result  text;           -- holds SQL query
   -- declare more variables
BEGIN
   -- do some crazy stuff
   result := 'SELECT\r\nDatahora,' || sensors ||
             '\r\n\r\nFROM\r\n' || type ||
             '\r\n\r\nWHERE\r\nid=' || $1 || '\r\n\r\nORDER BY Datahora;';
   RETURN result;
END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;
ALTER FUNCTION data_of(integer) OWNER TO postgres;
sensors holds the list of column names for the table type. Those are declared and filled in the course of the function. Eventually, they hold values like:
- sensors: 'column1, column2, column3'
  Except for Datahora (timestamp) all columns are of type double precision.
- type: 'myTable'
  Can be the name of one of four tables. Each has different columns, except for the common column Datahora.
Definition of the underlying tables.
The variable sensors will hold all columns displayed here for the corresponding table in type. For example: if type is pcdmet, then sensors will be 'datahora,dirvento,precipitacao,pressaoatm,radsolacum,tempar,umidrel,velvento'.
The variables are used to build a SELECT statement that is stored in result. Like:
SELECT Datahora, column1, column2, column3 FROM myTable WHERE id=20 ORDER BY Datahora;
Right now, my function returns this statement as text. I copy-paste and execute it in pgAdmin or via psql. I want to automate this, run the query automatically and return the result. How can I do that?
Answer
Dynamic SQL and RETURN type
(I saved the best for last, keep reading!)
You want to execute dynamic SQL. In principle, that’s simple in plpgsql with the help of EXECUTE. You don’t need a cursor. In fact, most of the time you are better off without explicit cursors.
Find examples on SO with a search.
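To make the idea concrete, here is a minimal sketch of EXECUTE in an anonymous code block. The table demo_tbl and its contents are hypothetical and only serve the illustration. It assumes PostgreSQL 9.1 or later for format().

```sql
-- Hypothetical demo table, not part of the question
CREATE TEMP TABLE demo_tbl (id int, datahora timestamp, val float8);
INSERT INTO demo_tbl VALUES
   (1, '2024-01-01 00:00', 1.5)
 , (1, '2024-01-01 01:00', 2.5);

DO
$do$
DECLARE
   _tbl text := 'demo_tbl';   -- table name only known at run time
   _ct  bigint;
BEGIN
   -- The identifier is interpolated with %I, the value is passed with USING
   EXECUTE format('SELECT count(*) FROM %I WHERE id = $1', _tbl)
   INTO  _ct
   USING 1;

   RAISE NOTICE 'rows found: %', _ct;
END
$do$;
```

The query string is assembled at run time, the table name goes in as an escaped identifier, and the filter value never becomes part of the SQL text.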
The problem you run into: you want to return records of yet undefined type. A function needs to declare the return type with the RETURNS clause (or with OUT or INOUT parameters). In your case you would have to fall back to anonymous records, because number, names and types of returned columns vary. Like:
CREATE FUNCTION data_of(integer) RETURNS SETOF record AS ...
However, this is not particularly useful. This way you’d have to provide a column definition list with every call of the function. Like:
SELECT * FROM data_of(17) AS foo (column_name1 integer, column_name2 text, column_name3 real);
But how would you even do this, when you don’t know the columns beforehand?
You could resort to less structured document data types like json, jsonb, hstore or xml.
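As a rough sketch of the document-type route, each row can be returned as one jsonb value, so the return type stays fixed regardless of the columns. The function name data_of_json is hypothetical, and the sketch assumes PostgreSQL 9.5 or later (for to_jsonb()) and that every target table has the columns id and datahora.

```sql
-- Hypothetical variant: one jsonb document per row
CREATE OR REPLACE FUNCTION data_of_json(_tbl regclass, _id integer)
  RETURNS SETOF jsonb
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- A regclass value is rendered as a safely quoted table name by %s
   RETURN QUERY EXECUTE format(
      'SELECT to_jsonb(t) FROM %s t WHERE id = $1 ORDER BY datahora', _tbl)
   USING _id;
END
$func$;

-- Call, assuming the table pcdmet from the question exists
SELECT * FROM data_of_json('pcdmet', 17);
```

The caller gets column names and values, but has to extract and cast individual fields from each document.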
But for the purpose of this question let’s assume you want to return individual, correctly typed and named columns as much as possible.
Simple solution with fixed return type
- The column datahora seems to be a given, I’ll assume data type timestamp and that there are always two more columns with varying name and data type.
- Names we’ll abandon in favor of generic names in the return type.
- Types we’ll abandon, too, and cast all to text since every data type can be cast to text.
CREATE OR REPLACE FUNCTION data_of(_id integer)
  RETURNS TABLE (datahora timestamp, col2 text, col3 text)
  LANGUAGE plpgsql AS
$func$
DECLARE
   _sensors text := 'col1::text, col2::text';  -- cast each col to text
   _type    text := 'foo';
BEGIN
   RETURN QUERY EXECUTE '
      SELECT datahora, ' || _sensors || '
      FROM   ' || quote_ident(_type) || '
      WHERE  id = $1
      ORDER  BY datahora'
   USING  _id;
END
$func$;
How does this work?
- The variables _sensors and _type could be input parameters instead.
- Note the RETURNS TABLE clause.
- Note the use of RETURN QUERY EXECUTE. That is one of the more elegant ways to return rows from a dynamic query.
- I use a name for the function parameter, just to make the USING clause of RETURN QUERY EXECUTE less confusing. $1 in the SQL string does not refer to the function parameter but to the value passed with the USING clause. (Both happen to be $1 in their respective scope in this simple example.)
- Note the example value for _sensors: each column is cast to type text.
- This kind of code is very vulnerable to SQL injection. I use quote_ident() to protect against it. Lumping together a couple of column names in the variable _sensors prevents the use of quote_ident() (and is typically a bad idea!). Ensure that no bad stuff can be in there some other way, for instance by individually running the column names through quote_ident() instead. A VARIADIC parameter comes to mind …
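One way the VARIADIC idea could look, as a sketch only: a helper that takes column names one by one, escapes each with quote_ident() and appends the cast to text. The function name quoted_cols_as_text is hypothetical, and WITH ORDINALITY assumes PostgreSQL 9.4 or later.

```sql
-- Hypothetical helper: builds a safely quoted column list
CREATE OR REPLACE FUNCTION quoted_cols_as_text(VARIADIC _cols text[])
  RETURNS text
  LANGUAGE sql IMMUTABLE AS
$func$
SELECT string_agg(quote_ident(c) || '::text', ', ' ORDER BY ord)
FROM   unnest(_cols) WITH ORDINALITY AS u(c, ord);  -- ord keeps the input order
$func$;

SELECT quoted_cols_as_text('tempar', 'umidrel');
-- returns: tempar::text, umidrel::text
```

The result could be assigned to _sensors before building the query string. Names that need it (mixed case, spaces, embedded quotes) come back double-quoted, so they can no longer break out of the identifier position.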
Simpler with PostgreSQL 9.1+
With version 9.1 or later you can use format() to further simplify:
RETURN QUERY EXECUTE format('
   SELECT datahora, %s  -- identifier passed as unescaped string
   FROM   %I            -- assuming the name is provided by user
   WHERE  id = $1
   ORDER  BY datahora'
 , _sensors, _type)
USING  _id;
Again, escaping the individual column names properly would be the clean way.
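For orientation, a small standalone illustration of the three format() specifiers may help: %s inserts the string as is, %I escapes it as an identifier, and %L escapes it as a literal. The column name note and the sample values are made up for this sketch.

```sql
SELECT format('SELECT %I FROM %I WHERE note = %L'
            , 'Datahora', 'my table', 'it''s');
-- returns: SELECT "Datahora" FROM "my table" WHERE note = 'it''s'
```

Identifiers are only double-quoted where required (here because of the capital letter and the space), which is why %I is the right choice for user-supplied table or column names.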
Variable number of columns sharing the same type
After your question updates it looks like your return type has
- a variable number of columns
- but all columns of the same type double precision (alias float8)
As we have to define the RETURN type of a function I resort to an ARRAY type in this case, which can hold a variable number of values. Additionally, I return an array with column names, so you could parse the names out of the result, too:
CREATE OR REPLACE FUNCTION data_of(_id integer)
  RETURNS TABLE (datahora timestamp, names text[], values float8[])
  LANGUAGE plpgsql AS
$func$
DECLARE
   _sensors text := 'col1, col2, col3';  -- plain list of column names
   _type    text := 'foo';
BEGIN
   RETURN QUERY EXECUTE format('
      SELECT datahora
           , string_to_array($1, '', '')  -- AS names; split on the list separator
           , ARRAY[%s]                    -- AS values
      FROM   %s
      WHERE  id = $2
      ORDER  BY datahora'
    , _sensors, _type)
   USING  _sensors, _id;
END
$func$;
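To get from the two arrays back to one row per column name and value, the arrays can be unnested in parallel. This is only a sketch of a possible call. It assumes the array-returning data_of(integer) above is the version currently installed and PostgreSQL 9.4 or later for unnest() with multiple arguments. The aliases col_name and val are made up.

```sql
SELECT d.datahora, x.col_name, x.val
FROM   data_of(17) AS d
     , unnest(d.names, d."values") AS x(col_name, val)  -- pairs elements by position
ORDER  BY d.datahora, x.col_name;
```

Each input row is expanded into as many rows as there are sensor columns, which works no matter how many columns the underlying table has.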
Various complete table types
If you are actually trying to return all columns of a table (for instance one of the tables at the linked page), then use this simple, very powerful solution with a polymorphic type:
CREATE OR REPLACE FUNCTION data_of(_tbl_type anyelement, _id int)
  RETURNS SETOF anyelement
  LANGUAGE plpgsql AS
$func$
BEGIN
   RETURN QUERY EXECUTE format('
      SELECT *
      FROM   %s  -- pg_typeof returns regtype, quoted automatically
      WHERE  id = $1
      ORDER  BY datahora'
    , pg_typeof(_tbl_type))
   USING  _id;
END
$func$;
Call (important!):
SELECT * FROM data_of(NULL::pcdmet, 17);
Replace pcdmet in the call with any other table name.
How does this work?
- anyelement is a pseudo data type, a polymorphic type, a placeholder for any non-array data type. All occurrences of anyelement in the function evaluate to the same type provided at run time. By supplying a value of a defined type as argument to the function, we implicitly define the return type.
- PostgreSQL automatically defines a row type (a composite data type) for every table created, so there is a well-defined type for every table. This includes temporary tables, which is convenient for ad-hoc use.
- Any type can be NULL. So we hand in a NULL value, cast to the table type: NULL::pcdmet.
- Now the function returns a well-defined row type and we can use SELECT * FROM data_of(...) to decompose the row and get individual columns.
- pg_typeof(_tbl_type) returns the name of the table as object identifier type regtype. When automatically converted to text, identifiers are automatically double-quoted and schema-qualified if needed. Therefore, SQL injection is not possible. This can even deal with schema-qualified table names where quote_ident() would fail.
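A quick way to try this out, sketched with a temporary table. The table demo_station and its single data row are hypothetical, and the polymorphic data_of(anyelement, int) from above is assumed to be installed.

```sql
-- Hypothetical temporary table with the two columns the function relies on
CREATE TEMP TABLE demo_station (id int, datahora timestamp, tempar float8);
INSERT INTO demo_station VALUES (17, '2024-01-01 00:00', 21.5);

-- The row type of the table, as the function sees it
SELECT pg_typeof(NULL::demo_station);

-- Returns the columns id, datahora and tempar, correctly named and typed
SELECT * FROM data_of(NULL::demo_station, 17);
```

No column definition list is needed in the call, because the cast on the NULL argument already tells PostgreSQL which row type comes back.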