I am using Presto. I want to assign multiple ‘tags’ to a row using different criteria that are not mutually exclusive. For example, let’s say there’s a table with 4 columns: | food | color |…
Tag: presto
How to get top N rows with some conditions
I have a query something like this: But now, I want to see the top 10 product_id values for each site and category_id based on the clicks. When I add a LIMIT clause, it only shows the top 10 products overall; it does not return the top 10 per category_id and shop_id. How can I do this? Answer Use window functions. You can
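A minimal sketch of the window-function approach the answer points to. The original query is not shown, so the inline data and the column names site, category_id, product_id, and clicks are assumptions made for illustration only.

```sql
-- Hypothetical sample rows standing in for the asker's table.
SELECT site, category_id, product_id, clicks
FROM (
    SELECT
        site,
        category_id,
        product_id,
        clicks,
        -- Number the rows within each (site, category_id) group, highest clicks first.
        row_number() OVER (
            PARTITION BY site, category_id
            ORDER BY clicks DESC
        ) AS rn
    FROM (
        VALUES
            ('s1', 10, 'p1', 120),
            ('s1', 10, 'p2', 80),
            ('s1', 20, 'p3', 45),
            ('s2', 10, 'p4', 300)
    ) AS t (site, category_id, product_id, clicks)
) ranked
-- Keep at most 10 rows per group instead of 10 rows overall.
WHERE rn <= 10
ORDER BY site, category_id, clicks DESC;
```

row_number() breaks ties arbitrarily; rank() or dense_rank() could be used instead if tied products should all be kept.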
SQL statement to query new buyers on rolling basis
I currently have an order table that looks like this: I have been trying to create an SQL statement that returns something like this using count(distinct user_id): Of course, there will be multiple item_ids in the order table. What I’m trying to achieve is to obtain the rolling number of buyers that have never bought that
SQL: Efficient way to count and group results by like value
I have a table that looks like this: What is the most efficient way to query it and return the following? I was thinking of using CASE WHEN statements, but that seems messy. Answer In Presto you can split the delimited list into an array, then unnest the array. This gives you one record per element in each list.
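The split-then-unnest idea from the answer could be sketched as follows. The table contents are not shown in the excerpt, so the inline rows, the column names id and tags, and the comma delimiter are assumptions.

```sql
-- Hypothetical rows: each one holds a comma-delimited list in the tags column.
SELECT tag, count(*) AS occurrences
FROM (
    VALUES
        (1, 'red,green'),
        (2, 'green,blue'),
        (3, 'green')
) AS t (id, tags)
-- split() turns the string into an array; UNNEST emits one row per element.
CROSS JOIN UNNEST(split(tags, ',')) AS u (tag)
GROUP BY tag
ORDER BY occurrences DESC;
```

Once every list element is its own row, a plain GROUP BY does the counting, which avoids one CASE WHEN branch per possible value.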
Unable to run simple presto shell query
I am trying to run the simplest query, but it is not working. -bash-4.2$ prestosql --execute "select 1;" Exception in thread "main" io.airlift.airline.ParseArgumentsUnexpectedException: Found …
How to break a row into multiple rows based on a column value in Athena (Presto)?
I have an Athena table with a column that contains an array of values. I want to create multiple rows from one row, so that the array column contains only one value per row. E.g.: to look like: How can I write my query to achieve this? Answer I think unnest() does what you want:
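A minimal sketch of unnest() applied to an array column. The example tables from the question are not shown, so the inline rows and the names id, vals, and item are hypothetical.

```sql
-- Hypothetical rows: each one carries an array in the vals column.
SELECT id, item
FROM (
    VALUES
        (1, ARRAY['a', 'b', 'c']),
        (2, ARRAY['d'])
) AS t (id, vals)
-- UNNEST expands the array so that each element gets its own output row.
CROSS JOIN UNNEST(vals) AS u (item)
ORDER BY id, item;
```

Under these assumptions, row 1 expands into three rows and row 2 into one. Rows with an empty or NULL array produce no output with CROSS JOIN.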
Presto how to find start date given week
I want to find the start date of a given ISO week (which can range from 1 to 53, with Monday as the starting day) and year using a Presto SQL query, i.e. year 2020 and week 2 should return 06/01/2020. Is there a built-in function for this? Table structure: Answer There’s no direct way for constructing a date from a year + week
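One possible workaround, sketched here as an assumption and not as the answer's own method, relies on the ISO 8601 rule that January 4 always falls in week 1. The column names yr and wk are hypothetical, since the table structure is not shown.

```sql
-- Hypothetical input: one row per (year, ISO week) pair.
SELECT
    yr,
    wk,
    -- date_trunc('week', ...) returns the Monday of the week containing January 4,
    -- which is the start of ISO week 1; then step forward (wk - 1) whole weeks.
    date_add(
        'week',
        wk - 1,
        date_trunc('week', date(cast(yr AS varchar) || '-01-04'))
    ) AS week_start
FROM (
    VALUES (2020, 2)
) AS t (yr, wk);
```

For year 2020 and week 2 this yields 2020-01-06, which matches the 06/01/2020 expected in the question.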
PrestoDB/AWS Athena: Retrieve a large SELECT by chunks
I have to select more than 1.9 billion rows. I am trying to query a table hosted in a DB in the AWS Athena console. The table reads Parquet files from an S3 bucket. When I run this query: My query seems to time out, as there are 1.9 billion rows that are returned when I run a COUNT on
Athena geospatial SQL joins never complete
A very basic geospatial join, based on this example, times out every time. The table polygons contains 340K polygons, while points contains 5K rows with latitude/longitude pairs (and an ID). Both are single .csv files in S3. Query: The SQL query above never completes in the default 30-minute Athena query time limit. I’ve found vanilla Athena queries on large-ish data
How to convert list of comma separated Ids into their name?
I have a table that contains: I have another table that has the names of these tasks: I want to generate the following output. I know this structure isn’t ideal, but this is a legacy table which I will not change now. Is there an easy way to get the output? I’m using Presto but I think this can be solved