A very basic geospatial join, based on this example, times out every time. The table polygons contains 340K polygons, while points contains 5K rows with latitude/longitude pairs (and an ID). Both are single .csv files in S3. The SQL query never completes within the default 30-minute Athena query time limit. I've found vanilla Athena queries on large-ish data …
Tag: amazon-web-services
Search for exact string value in JSON
I have a JSON column named s2s_payload. I want to query exact values in the array rather than returning all values for a certain data type. I was using JSON_EXTRACT to get distinct counts. If I want to filter where "eventtype":"search", how can I do this? I tried using CAST(s2s_payload AS CHAR) =
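As an illustration only, the exact-match filter described in this question can be sketched client-side in Python. The payload shape is an assumption, since the original values are not shown: each s2s_payload is taken to be a JSON array of objects carrying an eventtype key. has_event is a hypothetical helper name, and the sample rows are invented for the sketch.

```python
import json

def has_event(payload_text, key, wanted):
    """Return True if any object in the payload has key equal to wanted."""
    items = json.loads(payload_text)
    # Assumed shape: a JSON array of objects; wrap a lone object in a list.
    if isinstance(items, dict):
        items = [items]
    return any(isinstance(item, dict) and item.get(key) == wanted for item in items)

# Hypothetical sample payloads, not taken from the original question.
rows = [
    '[{"eventtype": "search", "q": "shoes"}]',
    '[{"eventtype": "searchresult"}]',
    '[{"eventtype": "click"}, {"eventtype": "search"}]',
]
matching = [row for row in rows if has_event(row, "eventtype", "search")]
```

Comparing parsed values rather than the serialized text keeps a longer value such as searchresult from counting as a match.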
Format pivot data with multiple conditions
My current query is SELECT COUNT(DISTINCT("json_extract_scalar"("data", '$.user_id'))) AS users, event, date(timestamp) FROM tableName WHERE category='category' GROUP BY event, date(timestamp) ORDER …
S3 Select Invalid Path component
I'm trying to figure out how to use AWS S3 Select. Everything seems pretty straightforward, but the following query just doesn't want to work: select r.value from S3Object[*].outputs.private_subnets …
Date_Trunc function not working as expected
I am trying to use the Date_Trunc function with MONTH in a SQL statement, but somehow it is not working for me. I am trying to pull entries that occur after April 1st, 2019. The raw date format from …
Querying rows by index in S3 Select
With MySQL, the following code: would pull the 5th through 10th rows of the table. What is the equivalent for doing this through the SQL engine in S3 Select (PrestoDB, I believe)? Is there a rownumber constructor or operator that works with S3 Select? Answer: The S3 Select documentation is at: SQL Reference for Amazon S3 Select and Amazon Glacier
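If the engine turns out to have no OFFSET-style clause (an assumption here; the SQL reference named above is the place to confirm it), one workaround is to slice the returned records on the client. A minimal sketch, where rows_between is a hypothetical helper and the generated records stand in for whatever the query streams back:

```python
from itertools import islice

def rows_between(records, first, last):
    """Return an iterator over the 1-based rows first..last, inclusive."""
    # islice takes a 0-based start and an exclusive stop.
    return islice(records, first - 1, last)

# Placeholder data standing in for streamed query results.
records = (f"row-{n}" for n in range(1, 21))
page = list(rows_between(records, 5, 10))
```

Because islice consumes the iterable lazily, the sketch stops reading once the last requested row has been taken, although the rows before the first one are still transferred.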