Tag: google-bigquery

Calculate distinct values per day that resets each month (Big Query)

distinct distinct-values google-bigquery sql

I have a table like this, containing the unique id of each store and the date on which it placed an order:

store_id  order_date
A         01-Jun-21
A         02-Jun-21
B         02-Jun-21
C         02-Jun-21
A         03-Jun-21
A         04-Jun-21
D         04-Jun-21
A         01-Jul-21
B         01-Jul-21

I need to aggregate it by day, but each day should only count store_ids that have never appeared before
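The excerpt above is cut off before the expected output, so the following is only a sketch of one reading of the question: a store counts as "new" on the first day it orders within a calendar month, and the count starts over each month. It is written in plain Python rather than BigQuery SQL so that the logic can be checked locally. The function name `new_stores_per_day` is hypothetical, and reporting days with no new stores as 0 is an assumption.

```python
from collections import Counter
from datetime import date

# Sample rows from the question: (store_id, order_date).
orders = [
    ("A", date(2021, 6, 1)),
    ("A", date(2021, 6, 2)),
    ("B", date(2021, 6, 2)),
    ("C", date(2021, 6, 2)),
    ("A", date(2021, 6, 3)),
    ("A", date(2021, 6, 4)),
    ("D", date(2021, 6, 4)),
    ("A", date(2021, 7, 1)),
    ("B", date(2021, 7, 1)),
]


def new_stores_per_day(rows):
    """Count, per order date, the stores seen for the first time that month."""
    first_seen = {}
    for store_id, order_date in rows:
        # The month is part of the key, so the "seen before" state resets monthly.
        key = (store_id, order_date.year, order_date.month)
        if key not in first_seen or order_date < first_seen[key]:
            first_seen[key] = order_date
    counts = Counter(first_seen.values())
    # Assumption: every date that has an order is reported, with 0 if no store is new.
    return {d: counts.get(d, 0) for d in sorted({d for _, d in rows})}


for day, n in new_stores_per_day(orders).items():
    print(day.isoformat(), n)
```

In SQL, the same idea would usually be a two-step query: take the minimum order date per store and month, then count stores per minimum date.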

Google BigQuery Federated Query connection Error

google-bigquery google-cloud-platform postgresql sql

I am trying to use the Federated Query functionality in BigQuery to query some data from my Cloud SQL Postgres DB. Unfortunately I get the following error: It is quite similar to the error discussed here, but it was not solved (the word that appears in the quotes is different though): Connection Error while running federated query in BigQuery to

How to add a Jinja function to .sqlfluff config

dbt google-bigquery jinja2 sql

I’m using the Jinja functions run_query and execute: https://docs.getdbt.com/reference/dbt-jinja-functions/run_query But when I run sqlfluff lint, I get the following error: Undefined jinja template variable: ‘run_query’ I’m trying to add it to the .sqlfluff config, but there doesn’t seem to be any guidance on how to add this to the config file. Any help would be greatly appreciated! Thanks Answer Add templater=dbt

How to use date_diff for two adjacent sessions using BigQuery?

date-difference datediff google-bigquery sql

I’m trying to calculate the average hours between two adjacent sessions using the data from the following table:

user_id  event_timestamp              session_num
A        2021-04-16 10:00:00.000 UTC  1
A        2021-04-16 11:00:00.000 UTC  2
A        2021-04-16 13:00:00.000 UTC  3
A        2021-04-16 16:00:00.000 UTC  4
B        2021-04-16 12:00:00.000 UTC  1
B        2021-04-16 14:00:00.000 UTC  2
B        2021-04-16 19:00:00.000 UTC  3
C        2021-04-16 10:00:00.000 UTC  1
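As a rough illustration of the calculation being asked about, the sketch below pairs each session with the previous one for the same user and averages the gaps in hours. It is plain Python, not BigQuery SQL. The name `avg_gap_hours` is hypothetical, and averaging per user (rather than across all users) is an assumption, since the excerpt does not show the expected output. In BigQuery the pairing step is typically done with the LAG window function and the gap with TIMESTAMP_DIFF.

```python
from datetime import datetime
from itertools import groupby

# Sample rows from the question: (user_id, event_timestamp), all on 2021-04-16 UTC.
sessions = [
    ("A", datetime(2021, 4, 16, 10)),
    ("A", datetime(2021, 4, 16, 11)),
    ("A", datetime(2021, 4, 16, 13)),
    ("A", datetime(2021, 4, 16, 16)),
    ("B", datetime(2021, 4, 16, 12)),
    ("B", datetime(2021, 4, 16, 14)),
    ("B", datetime(2021, 4, 16, 19)),
    ("C", datetime(2021, 4, 16, 10)),
]


def avg_gap_hours(rows):
    """Average hours between consecutive sessions, per user."""
    result = {}
    # Sorting by (user_id, timestamp) puts each user's sessions in time order.
    for user_id, group in groupby(sorted(rows), key=lambda r: r[0]):
        times = [ts for _, ts in group]
        # Pair each session with the next one: the Python analogue of LAG.
        gaps = [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]
        if gaps:  # users with a single session have no gap to average
            result[user_id] = sum(gaps) / len(gaps)
    return result


print(avg_gap_hours(sessions))
```

User C has only one session, so this sketch leaves C out of the result instead of reporting a gap of zero.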

Summing field in other rows conditionally

google-bigquery sql

I have a table in the form below:

Pilot  Leg  Duration  Takeoff
John   1    60        9:00:00
John   2    60        9:00:00
John   3    30        9:00:00
Paul   1    60        12:00:00
Paul   2    30        12:00:00
Paul   3    30        12:00:00
Paul   4    60        12:00:00

What I am trying to figure out is a query to get the following:

Pilot  Leg  Duration  Takeoff  LegStart
John

Is it possible to get active sessions per hour in SQL?

data-analysis google-bigquery sql

start_time       end_time         HostID  gameID
6/14/2021 20:13  6/14/2021 22:22  1       AB1
6/14/2021 20:20  6/14/2021 21:47  2       AB2
6/14/2021 20:22  6/14/2021 22:07  3       AB3
6/14/2021 20:59  6/14/2021 21:15  4       AB4
6/15/2021 21:24  6/15/2021 22:09  1       AB5
6/15/2021 21:24  6/15/2021 21:59  2       AB6
6/15/2021 23:11  6/16/2021 01:22  4       AB7
6/16/2021 20:13  6/16/2021 21:23  3       AB8

I have a table that has a start
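The question text is truncated, so what follows is only a sketch under an assumed definition: a session is "active" in every clock hour it overlaps, including hours it only partly covers. It is plain Python rather than BigQuery SQL, and the function name `active_sessions_per_hour` is hypothetical. In SQL this is usually done by generating one row per hour and joining it to the sessions whose interval overlaps that hour.

```python
from collections import Counter
from datetime import datetime, timedelta

# Sample rows from the question: (start_time, end_time).
sessions = [
    (datetime(2021, 6, 14, 20, 13), datetime(2021, 6, 14, 22, 22)),
    (datetime(2021, 6, 14, 20, 20), datetime(2021, 6, 14, 21, 47)),
    (datetime(2021, 6, 14, 20, 22), datetime(2021, 6, 14, 22, 7)),
    (datetime(2021, 6, 14, 20, 59), datetime(2021, 6, 14, 21, 15)),
    (datetime(2021, 6, 15, 21, 24), datetime(2021, 6, 15, 22, 9)),
    (datetime(2021, 6, 15, 21, 24), datetime(2021, 6, 15, 21, 59)),
    (datetime(2021, 6, 15, 23, 11), datetime(2021, 6, 16, 1, 22)),
    (datetime(2021, 6, 16, 20, 13), datetime(2021, 6, 16, 21, 23)),
]


def active_sessions_per_hour(rows):
    """Count sessions overlapping each clock hour (assumed definition of 'active')."""
    counts = Counter()
    for start, end in rows:
        # Begin at the top of the hour containing the start time.
        hour = start.replace(minute=0, second=0, microsecond=0)
        # Step hour by hour; a session ending exactly on the hour is not
        # counted in the hour that begins at its end time.
        while hour < end:
            counts[hour] += 1
            hour += timedelta(hours=1)
    return dict(counts)


for hour, n in sorted(active_sessions_per_hour(sessions).items()):
    print(hour.strftime("%Y-%m-%d %H:00"), n)
```

The session that runs from 23:11 to 01:22 crosses midnight, which is why the buckets are keyed by full date and hour, not by hour of day alone.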

BigQuery correlated subqueries – transform array to array

google-bigquery sql

I’m trying to join array elements in BigQuery but I am getting the following error message: Correlated subqueries that reference other tables are not supported unless they can be de-correlated, such as by transforming them into an efficient JOIN. In my first table I have something like: field1 | field2 | some_list Elements in some_list have ids and other data

How to do the equivalent of ‘distinct’ on array field in BQ?

arrays google-bigquery sql

Let’s take the following data: It can also be generated in BQ with the following statement: How would I do the following two ‘distinct’ totals on the right of the following: That is, I want to get a “total” that doesn’t double-count, or rather, only gets the distinct items based on the RowID. Answer If I understand correctly, it is

How to use two where conditions in SQL?

google-bigquery sql

Following is the query I have written, and I need two where conditions: Admin_Level_3_palika is not null, and Year = ‘2021’. However, the following query still gives me null values for Admin_Level_3_palika. Please help me with how to make this work. Following is an example of my dataset, Epid_ID being unique for each row. Answer If these are your conditions:

How to get data for missing weeks in Summarised data

google-bigquery sql

I have two tables, stores_data and financial_week, as shown below. The stores data is summarised across multiple attributes. My task is to generate data for all the weeks present in the second table; if data is missing, the quantity should be listed as 0. The expected result set is this – I have done a cross join, but after that I