The result of my query is used in AWS QuickSight. Even though QuickSight offers percentileCont(), which does the job for us, I want to compute it in the query instead of using a calculated field. Eventually, what I want to do is create a point column that depends on a column that ranges over [a, b]. Right now I f…
Tag: presto
Keep the sum even on days without revenue in cumulative sum when using window function in Presto
My problem is that I have sales data for, say, 3 different products. I will be selling 10 of these a week, and I want to visualize them as a cumulative sum. I have been using the following snippet to get the cumulative sum of the revenue. However, this is not sufficient since
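A minimal sketch of one common way to carry a running total across days without revenue: join a list of all days to the sales rows, treat missing revenue as 0, and sum with a window function. It is written in Python with the built-in sqlite3 module so it can be run locally; it is not Presto itself, and it is not the asker's snippet, which this excerpt does not show. The calendar and sales tables, their columns, and the sample rows are all hypothetical, and the sketch assumes at most one sales row per product per day.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar (day TEXT);
    INSERT INTO calendar VALUES
        ('2024-01-01'), ('2024-01-02'), ('2024-01-03'), ('2024-01-04');

    CREATE TABLE sales (product TEXT, day TEXT, revenue INTEGER);
    INSERT INTO sales VALUES
        ('A', '2024-01-01', 10),
        ('A', '2024-01-03', 5),
        ('B', '2024-01-02', 7);
""")

# Every (product, day) pair is generated first, so days with no sale
# still produce a row; COALESCE turns the missing revenue into 0 and
# the window SUM carries the previous total forward.
cumulative_sql = """
    SELECT
        p.product,
        c.day,
        SUM(COALESCE(s.revenue, 0)) OVER (
            PARTITION BY p.product ORDER BY c.day
        ) AS cumulative_revenue
    FROM calendar c
    CROSS JOIN (SELECT DISTINCT product FROM sales) p
    LEFT JOIN sales s
        ON s.product = p.product AND s.day = c.day
    ORDER BY p.product, c.day
"""
cumulative_rows = conn.execute(cumulative_sql).fetchall()
for row in cumulative_rows:
    print(row)
```

With this sample data, product A keeps its total of 10 on 2024-01-02, a day on which it has no sales row.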
Presto – pivot table
Hi, I have a table like this: I want to convert it into something like this: Answer: For a fixed list of properties, you can do conditional aggregation: This puts the session id in the first column and 0/1 values in each column, depending on whether the given session owns the given property. To generate the exact output …
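The conditional aggregation described in the answer can be sketched as follows. This is an illustration in Python with the built-in sqlite3 module, not the original answer's query: the session_properties table, its column names, and the property values 'a', 'b', 'c' are assumptions made for the example.

```python
import sqlite3

# Hypothetical input: one row per (session, property) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE session_properties (session_id INTEGER, property TEXT);
    INSERT INTO session_properties VALUES
        (1, 'a'), (1, 'b'), (2, 'b'), (3, 'a'), (3, 'c');
""")

# One MAX(CASE ...) per property in the fixed list: the CASE yields 1 on
# rows that match the property and 0 otherwise, and MAX collapses the
# rows of each session into a single 0/1 flag.
pivot_sql = """
    SELECT
        session_id,
        MAX(CASE WHEN property = 'a' THEN 1 ELSE 0 END) AS a,
        MAX(CASE WHEN property = 'b' THEN 1 ELSE 0 END) AS b,
        MAX(CASE WHEN property = 'c' THEN 1 ELSE 0 END) AS c
    FROM session_properties
    GROUP BY session_id
    ORDER BY session_id
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

The list of output columns has to be written out by hand, which is why this approach suits a fixed list of properties.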
SQL query for stratified sampling with dynamic sample size
Let's say we have a table in this format: From this example, we see two strata, s1 and s2. What I want to do is stratified sampling, where the sample size is given in the last column. For example, I want to randomly sample 2 instances from s1 and 1 instance from s2. Any help is appreciated. Please keep in mind
SQL: Create an extra column with last 3 days date as a value
I have a table (users) with this sample data:
user  location  name
111   usa       aaa
222   canada    bbb
333   usa       ccc
444   mexico    ddd
555   japan     eee
…
Date distinct count over week
I'm trying to get a distinct count of user IDs logged per day, with each week as the partition for distinctness. E.g., if one user logs on Friday/Saturday of week 1, and on Monday/Friday of …
How to assign a field name to an SQL Count in AWS Athena SQL
I’m still new to Athena. I think I got my database defined correctly, as shown in Example 1 below. However, when I run a count query, I get results unlike what I would expect. Example 1: Works Fine except count is called “_col3” Result: Example 2: syntax error This query shows a syntax error…
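A minimal sketch of naming a count column with AS, so the result header is not an auto-generated name such as “_col3”. It uses Python's built-in sqlite3 module rather than Athena, and the events table, its column, and the alias event_count are hypothetical. It does not attempt to reproduce the syntax error from Example 2, which is not shown in this excerpt.

```python
import sqlite3

# Hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (event_type TEXT);
    INSERT INTO events VALUES ('click'), ('click'), ('view');
""")

# "AS event_count" gives the aggregate an explicit column name.
cursor = conn.execute("""
    SELECT event_type, COUNT(*) AS event_count
    FROM events
    GROUP BY event_type
    ORDER BY event_type
""")
column_names = [description[0] for description in cursor.description]
rows = cursor.fetchall()
print(column_names)
print(rows)
```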
Need an SQL query that will left join with another table, which will in turn return the latest values based on time, grouped into a single row
I need help with a SELECT SQL query that will left join with another table on the id column and return the latest values based on time, grouped into a single row. Basically, join the two tables in such a way that for each record in the users table that exists in the time_series table, return the latest va…
SQL Data cleaning
I have a data set that I am trying to clean. I want to remove the ** from email-address and phone_number and keep only digits in the phone_number column. How can I do it? Answer: Here is one option using string functions: This removes ‘**’ from email, and all non-digit characters from phone_…
Complex SQL query aggregation and grouping on Athena
I have a table like this: I would like to retrieve the number of chats performed by users for each database (db) and, in the last part, where I fail, also retrieve a list of all mentors by user. The final output should be like this, for example (notice that max appears only once for greg in the admin column)