I’m working on a project in Python that interacts with a PostgreSQL data warehouse, and I’m using the psycopg2 API. I am looking to create dynamically-typed tables.
For example: I would like to be able to execute the following code:
from psycopg2 import connect, sql
connection = connect(host="host", port="port", database="database", user="user", password="pw")
def create_table(tbl_name, col_name, col_type):
query = sql.SQL("CREATE TABLE {} ({} {})".format(sql.Identifier(tbl_name), sql.Identifier(col_name), sql.Identifier(column_type)))
connection.execute(query)
create_table('animals', 'name', 'VARCHAR')
and end up with a table named “animals” that contains a column “name” of type VARCHAR. However, when I attempt to run this, I get an error: ‘type “VARCHAR” does not exist’. I assume psycopg2’s built-in formatter is putting double quotes around the VARCHAR type when there should not be any. Normally, I would just work around this myself, but the documentation is very clear that Python string concatenation should never be used for fear of SQL injection attacks. Security is a concern for this project, so I would like to know if it’s possible to create dynamically-typed tables in this fashion using pyscopg2, and if not, whether there exists another third-party API that can do so securely. Thank you!
Danny
Advertisement
Answer
I was also having so much trouble with this aspect. sql.Identifier
is for double-quoted, well, SQL Identifiers which the datatypes (INTEGER
, TEXT
, etc.) are not. Looks like just making it plain SQL does the trick.
N.B. In your code, you should have pre-defined columns
tuples and not expose their definition to the front-end. This is also why tuples are useful here as they are immutable.
import psycopg2.sql as sql
def create_table( name, columns ):
# name = "mytable"
# columns = (("col1", "TEXT"), ("col2", "INTEGER"), ...)
fields = []
for col in columns:
fields.append( sql.SQL( "{} {}" ).format( sql.Identifier( col[0] ), sql.SQL( col[1] ) ) )
query = sql.SQL( "CREATE TABLE {tbl_name} ( {fields} );" ).format(
tbl_name = sql.Identifier( name ),
fields = sql.SQL( ', ' ).join( fields )
)
print( query.as_string(conn) ) # CREATE TABLE "mytable" ( "col1" TEXT, "col2" INTEGER );
# Get cursor and execute...