The result of my query is being used in AWS QuickSight. Even though QuickSight offers percentileCont(), which does the job for us, I want to compute it in the query instead of using a calculated field. Eventually, what I want to do is create a point column where depending on a column that ranges from [a, b]. Right now I find out …
Tag: presto
Keep the sum even on days without revenue in cumulative sum when using window function in Presto
So my problem is that I have sales data from, say for the sake of clarity, 3 different products. I will be selling 10 of these a week and I want to visualize them in a cumulative sum. I have been using the following little snippet to get the cumulative sum of the revenue. However, this is not sufficient since
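The excerpt above is cut off before the asker's snippet, so the following is only a sketch of one common approach, not the asker's query or an accepted answer: build a row for every day and product first, then apply the window function, so the running total carries over days without sales. The table names (sales, calendar), column names, and sample rows are all hypothetical. It runs on Python's built-in sqlite3 so it can be tried anywhere; the SQL sticks to standard constructs (CROSS JOIN, LEFT JOIN, COALESCE, SUM() OVER), but in Presto the calendar would more likely be generated than stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (sale_date TEXT, product TEXT, revenue REAL);  -- hypothetical schema
INSERT INTO sales VALUES
  ('2024-01-01', 'p1', 10.0),
  ('2024-01-03', 'p1', 5.0),
  ('2024-01-02', 'p2', 7.0);
CREATE TABLE calendar (day TEXT);  -- one row per day, including days without sales
INSERT INTO calendar VALUES ('2024-01-01'), ('2024-01-02'), ('2024-01-03');
""")

cumulative_sql = """
WITH products AS (SELECT DISTINCT product FROM sales),
daily AS (
    -- every (day, product) pair gets a row; missing sales become revenue 0
    SELECT c.day, p.product, COALESCE(SUM(s.revenue), 0) AS revenue
    FROM calendar c
    CROSS JOIN products p
    LEFT JOIN sales s ON s.sale_date = c.day AND s.product = p.product
    GROUP BY c.day, p.product
)
SELECT day, product,
       SUM(revenue) OVER (PARTITION BY product ORDER BY day) AS cumulative_revenue
FROM daily
ORDER BY product, day
"""
cumulative_rows = conn.execute(cumulative_sql).fetchall()
for row in cumulative_rows:
    print(row)
```

With this sample data, p1 keeps its total of 10 on 2024-01-02 even though nothing was sold that day.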
Presto – pivot table
Hi, I have a table like this: I want to convert it into something like this: Answer: For a fixed list of properties, you can do conditional aggregation: This puts the session id in the first column and 0/1 values in each column, depending on whether the given session owns the given property. To generate the exact output you showed (which …
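The answer's query did not survive in the excerpt, so here is a minimal sketch of what conditional aggregation for a fixed list of properties can look like. The table name session_properties, its columns, and the property values 'a', 'b', 'c' are assumptions made for illustration. It is run through Python's built-in sqlite3; the MAX(CASE WHEN ... END) pattern itself is standard SQL and should carry over to Presto.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical input: one row per (session, property) pair.
conn.execute("CREATE TABLE session_properties (session_id INTEGER, property TEXT)")
conn.executemany(
    "INSERT INTO session_properties VALUES (?, ?)",
    [(1, "a"), (1, "b"), (2, "b"), (3, "c")],
)

pivot_sql = """
SELECT session_id,
       MAX(CASE WHEN property = 'a' THEN 1 ELSE 0 END) AS a,
       MAX(CASE WHEN property = 'b' THEN 1 ELSE 0 END) AS b,
       MAX(CASE WHEN property = 'c' THEN 1 ELSE 0 END) AS c
FROM session_properties
GROUP BY session_id
ORDER BY session_id
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

Each output row holds the session id followed by one 0/1 flag per property. The column list has to be written out by hand, which is why this only works when the set of properties is known in advance.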
sql query for stratified sampling with dynamic sample size
Let's say we have a table in this format: From this example, we see two strata, s1 and s2. What I want to do is stratified sampling, where the sample size is the last column. For example, I want to randomly sample 2 instances from s1 and 1 instance from s2. Any help is appreciated. Please keep in mind …
SQL: Create an extra column with last 3 days date as a value
I have a table (users) with this sample data:
user  location  name
111   usa       aaa
222   canada    bbb
333   usa       ccc
444   mexico    ddd
555   japan     eee
…
Date distinct count over week
I'm trying to get a distinct count of user id logs per day, with each week as the partition within which ids are counted as distinct. E.g., if one user logs on Friday/Saturday of week 1, and on Monday/Friday of …
How to assign a field name to an SQL Count in AWS Athena SQL
I’m still new to Athena. I think I got my database defined correctly, as shown in Example 1 below. However, when I run a count query, I get results unlike what I would expect. Example 1: Works Fine except count is called “_col3” Result: Example 2: syntax error This query shows a syntax error when I click “Run Query”: Answer
Need an SQL query that will left join with another table, which will in turn return the latest values based on time, grouped into a single row
I need help with a SELECT SQL query that will left join with another table on id column, which will in turn return the latest values based on time, grouped into a single row? Basically, join the two tables in such a way that for each record in users table that exists in time_series table, return the latest values based
SQL Data cleaning
I have a data set that I am trying to clean. I want to remove the ** from email-address and phone_number, and keep only digits in the phone_number column. How can I do it? Answer: Here is one option using string functions: This removes ‘**’ from email, and all non-digit characters from phone_number.
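The answer's query is missing from the excerpt. As a sketch of the behaviour it describes, the SQL below uses replace() to drop the literal '**' from the email and regexp_replace() to strip every non-digit from the phone number. Both functions exist in Presto with these argument orders, but the table name contacts and the sample row are hypothetical, and this is not necessarily the query the answer used. To make it runnable on Python's built-in sqlite3, which has no regexp_replace, a stand-in is registered from Python's re module.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite lacks regexp_replace, so register a stand-in; Presto has it built in.
conn.create_function(
    "regexp_replace", 3, lambda s, pattern, repl: re.sub(pattern, repl, s)
)

# Hypothetical table and sample row.
conn.execute("CREATE TABLE contacts (email TEXT, phone_number TEXT)")
conn.execute(
    "INSERT INTO contacts VALUES ('**jane@example.com**', '**(555) 010-9999**')"
)

clean_sql = """
SELECT replace(email, '**', '') AS email,
       regexp_replace(phone_number, '[^0-9]', '') AS phone_number
FROM contacts
"""
cleaned = conn.execute(clean_sql).fetchall()
print(cleaned)
```

Because the phone number keeps only characters matching [0-9], the '**' markers, parentheses, spaces, and hyphens are all removed in one pass.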
Complex SQL query aggregation and grouping on athena
I have a table like this: I would like to retrieve the number of chats performed by users for each database (db) and, in the last part where I fail, also retrieve a list of all mentors by user. The final output should look like this, for example (notice that max appears only once for greg in the admin column) …