
Create Redshift table with new Geometry type through psycopg2

Now that Redshift supports Geometry types and spatial functions, I’d like to create a table holding a polygon for every country. The INSERT fails, and I’d appreciate help.

Here is what I’ve tried:

I downloaded the GeoJSON file and unzipped it (https://datahub.io/core/geo-countries).

Then the following python snippet was used to create the table successfully (I’ve used the type GEOMETRY, not sure if I can optimise and use the sub-type POLYGON):

import psycopg2

conn = psycopg2.connect(...connection params)
cur = conn.cursor()
cur.execute("CREATE TABLE engagement.geospatial_countries (id INTEGER PRIMARY KEY, name VARCHAR(25), code VARCHAR(10), polygon GEOMETRY);")

The following script successfully reads the GeoJSON file; each entry in “countries” holds a GeoJSON feature with a Polygon geometry:

import json

# Read and parse the GeoJSON file; the context manager closes it afterwards
with open("geospatial-data/countries.geojson", "r") as f:
    countries_geojson = json.load(f)
countries = countries_geojson["features"]

For those not familiar with GeoJSON, it’s simply JSON that describes geospatial shapes. Here is an excerpt of the data:

{ "type": "FeatureCollection", "features": [{ "type": "Feature", "properties": { "ADMIN": "Aruba", "ISO_A3": "ABW" }, "geometry": { "type": "Polygon", "coordinates": [ [ [ -69.996937628999916, 12.577582098000036 ], [ -69.936390753999945, 12.531724351000051 ], [ -69.924672003999945, 12.519232489000046 ], [ -69.915760870999918, 12.497015692000076 ], [ -69.880197719999842, 12.453558661000045 ], [ -69.876820441999939, 12.427394924000097 ], [ -69.888091600999928, 12.417669989000046 ], [ -69.908802863999938, 12.417792059000107 ], [ -69.930531378999888, 12.425970770000035 ], [ -69.945139126999919, 12.44037506700009 ], [ -69.924672003999945, 12.44037506700009 ], [ -69.924672003999945, 12.447211005000014 ], [ -69.958566860999923, 12.463202216000099 ], [ -70.027658657999922, 12.522935289000088 ], [ -70.048085089999887, 12.531154690000079 ], [ -70.058094855999883, 12.537176825000088 ], [ -70.062408006999874, 12.546820380000057 ], [ -70.060373501999948, 12.556952216000113 ], [ -70.051096157999893, 12.574042059000064 ], [ -70.048736131999931, 12.583726304000024 ], [ -70.052642381999931, 12.600002346000053 ], [ -70.059641079999921, 12.614243882000054 ], [ -70.061105923999975, 12.625392971000068 ], [ -70.048736131999931, 12.632147528000104 ], [ -70.00715084499987, 12.5855166690001 ], [ -69.996937628999916, 12.577582098000036 ] ] ] } }, ... more countries }]}

Before I insert all countries, I first just want to try and create it for a single country:

country = countries[0]
geometry_to_insert = (
    country["properties"]["ADMIN"],
    country["properties"]["ISO_A3"],
    json.dumps(country["geometry"])  # Have also tried psycopg2.extras.Json(country["geometry"]), as well as just using the dict
)

The following fails:

cur.execute(
  "INSERT INTO engagement.geospatial_countries (name, code, polygon) VALUES %s",
  geometry_to_insert
)

With the following error: TypeError: not all arguments converted during string formatting
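This error is consistent with a placeholder count mismatch: the query contains a single placeholder, while the tuple carries three values. psycopg2's parameter substitution follows Python's % string-formatting rules, so the same message can be reproduced in plain Python. A minimal sketch, with made-up sample values and a hypothetical helper try_format:

```python
# Sample row with three values, standing in for geometry_to_insert (illustrative data).
row = ("Aruba", "ABW", "POLYGON ((0 0, 1 0, 1 1, 0 0))")


def try_format(template, params):
    """Apply %-formatting and return the result, or the TypeError message."""
    try:
        return template % params
    except TypeError as exc:
        return f"TypeError: {exc}"


# One placeholder, three values: mirrors the failing query.
one_placeholder = try_format("VALUES %s", row)

# Three placeholders, three values: the counts match, so formatting succeeds.
three_placeholders = try_format("VALUES (%s, %s, %s)", row)

print(one_placeholder)
print(three_placeholders)
```

This only illustrates the counting rule. Real queries should still pass parameters to cur.execute rather than formatting the SQL string by hand.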

I’ve also tried

cur.execute(
  "INSERT INTO engagement.geospatial_countries (name, code, polygon) VALUES (%s, %s, %s)",
  geometry_to_insert
)

But that gives the following error: psycopg2.errors.InternalError_: Compass I/O exception: Invalid hexadecimal character(s) found

How do I insert a polygon into redshift, using the new Geometry types?


Answer

Here are the steps that worked for inserting the geometry into the database.

First, a minor correction in creating a table for the geometries, using IDENTITY to have an auto-incrementing ID:

conn = psycopg2.connect(...connection params)
cur = conn.cursor()
cur.execute("CREATE TABLE engagement.geospatial_countries (id INTEGER IDENTITY(0,1) PRIMARY KEY, name VARCHAR(25), code VARCHAR(10), polygon GEOMETRY);")

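On to the geometries. The "Invalid hexadecimal character(s)" error suggests that a plain string sent to a GEOMETRY column is parsed as hexadecimal-encoded geometry, so a GeoJSON string is rejected. Instead, convert the geometry to WKT (well-known text):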

import geojson
from shapely.geometry import shape
...
# exact same steps as in question to read file, then
country = countries[0]
geom = shape(country["geometry"])
geometry_to_insert = (
    country["properties"]["ADMIN"],
    country["properties"]["ISO_A3"],
    geom.wkt
)
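To see what the WKT string looks like, or to work without shapely, the conversion can be approximated by hand. The following is a minimal sketch, not shapely's implementation: geojson_to_wkt is a hypothetical helper that handles only Polygon and MultiPolygon geometries, uses only the first two values of each coordinate, and may format numbers differently from geom.wkt.

```python
def _ring(coords):
    # A ring is a list of [lon, lat] positions; WKT writes each as "x y".
    return "(" + ", ".join(f"{pt[0]} {pt[1]}" for pt in coords) + ")"


def _polygon(rings):
    # A polygon is an outer ring followed by zero or more holes.
    return "(" + ", ".join(_ring(ring) for ring in rings) + ")"


def geojson_to_wkt(geometry):
    """Convert a GeoJSON Polygon or MultiPolygon dict to a WKT string."""
    kind = geometry["type"]
    coords = geometry["coordinates"]
    if kind == "Polygon":
        return "POLYGON " + _polygon(coords)
    if kind == "MultiPolygon":
        return "MULTIPOLYGON (" + ", ".join(_polygon(p) for p in coords) + ")"
    raise ValueError(f"unsupported geometry type: {kind}")


# Illustrative geometry, not taken from the countries file.
triangle = {"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}
wkt = geojson_to_wkt(triangle)
print(wkt)
```

The countries file mixes single-part and multi-part shapes, which is why the sketch covers both geometry types.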

Then the following command to insert the value:

cur.execute(
  "INSERT INTO engagement.geospatial_countries (name, code, polygon) VALUES (%s, %s, ST_GeomFromText(%s))",
  geometry_to_insert
)
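The same statement extends to every country in the file. A minimal sketch, assuming a DB-API cursor like the one used above: insert_countries and its to_wkt parameter are hypothetical names, and to_wkt is expected to be a callable such as lambda g: shape(g).wkt. The stand-in cursor exists only so the example runs without a database.

```python
INSERT_SQL = (
    "INSERT INTO engagement.geospatial_countries (name, code, polygon) "
    "VALUES (%s, %s, ST_GeomFromText(%s))"
)


def insert_countries(cur, features, to_wkt):
    """Build one (name, code, wkt) row per GeoJSON feature and insert them all."""
    rows = [
        (
            feature["properties"]["ADMIN"],
            feature["properties"]["ISO_A3"],
            to_wkt(feature["geometry"]),
        )
        for feature in features
    ]
    cur.executemany(INSERT_SQL, rows)
    return len(rows)


class RecordingCursor:
    """Stand-in for a real cursor; it only records what it is asked to run."""

    def __init__(self):
        self.calls = []

    def executemany(self, sql, rows):
        self.calls.append((sql, list(rows)))


# Illustrative feature and a fixed WKT converter, used only for this demo.
features = [
    {
        "properties": {"ADMIN": "Aruba", "ISO_A3": "ABW"},
        "geometry": {"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]},
    }
]
fake_cur = RecordingCursor()
count = insert_countries(fake_cur, features, lambda g: "POLYGON ((0 0, 1 0, 1 1, 0 0))")
print(count, fake_cur.calls[0][1])
```

With a real connection, pass the actual cursor and the shapely-based converter, and call conn.commit() afterwards unless autocommit is enabled.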

Answers from both @Maurice Meyer and @piro guided me to this answer.

User contributions licensed under: CC BY-SA