
Tag: amazon-web-services

Athena geospatial SQL joins never complete

A very basic geospatial join, based on this example, times out every time. The table polygons contains 340K polygons, while points contains 5K rows with latitude/longitude pairs (and an ID). Both are single .csv files in S3. Query: The SQL query above never completes in the default 30-minute Athena query time limit. I’ve found vanilla Athena queries on large-ish data

Search for exact string value in JSON

I have a column stored in JSON that looks like column name: s2s_payload Values: I want to query exact values in the array rather than returning all values for a certain data type. I was using JSON_EXTRACT to get distinct counts. If I want to filter where "eventtype":"search", how can I do this? I tried using CAST(s2s_payload AS CHAR) =

Querying rows by index in S3 Select

With MySQL the following code: would pull the 5th through 10th rows of the table. What is the equivalent for doing this through the SQL engine in S3 Select (PrestoDB, I believe)? Is there a rownumber constructor or operator that works with S3 Select? Answer: The S3 Select documentation is at: SQL Reference for Amazon S3 Select and Amazon Glacier
