Tag: google-bigquery

Rolling 90 days active users in BigQuery, improving performance (DAU/MAU/WAU)

bigquery-standard-sql google-bigquery sql

I’m trying to get the number of unique events on a specific date, rolling 90/30/7 days back. I’ve got this working on a limited number of rows with the query below, but for large data sets I get …

Can you create multiple tables with one query in Google BigQuery?

google-bigquery sql

A new CREATE TABLE feature has been released, and I was wondering if it’s possible to create 2 or more tables with one query. I tried, but it returns the error: Error: Syntax error: Unexpected keyword CREATE at [8:1]. Any ideas? Is it even possible? Thanks! Answer: As per the documentation link in your question: Only one CREATE statement is allowed. So,

google bigquery select from a timestamp column between now and n days ago

gcp google-bigquery sql

I have a dataset in BigQuery with a TIMESTAMP column “register_date” (sample value “2017-11-19 22:45:05.000 UTC”). I need to filter records based on an “x days or weeks before today” criterion. Example query: select all records which are 2 weeks old. Currently I have this query (which feels like a kind of hack) that works and returns the correct
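The query from the question is not included in this excerpt. As a minimal sketch of the “between now and n days ago” filter from the title, assuming BigQuery standard SQL (where TIMESTAMP_SUB and CURRENT_TIMESTAMP are available), the helper below builds the query text. The table name `my_dataset.users` and the function `recent_records_query` are hypothetical; only the `register_date` column comes from the question. The helper only produces a string and does not contact BigQuery.

```python
def recent_records_query(table: str, days: int) -> str:
    """Build a standard SQL query for rows registered within the last `days` days."""
    days = int(days)  # keeps arbitrary text out of the INTERVAL clause
    if days < 0:
        raise ValueError("days must be non-negative")
    return (
        f"SELECT * FROM `{table}` "
        f"WHERE register_date BETWEEN "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {days} DAY) "
        f"AND CURRENT_TIMESTAMP()"
    )


# Hypothetical table name; 14 days corresponds to the "2 weeks" in the question.
print(recent_records_query("my_dataset.users", 14))
```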

How to drop multiple tables in BigQuery using Wildcards TABLE_DATE_RANGE()?

google-bigquery google-cloud-platform sql

I was looking at the documentation, but I haven’t found a way to drop multiple tables using wildcards. I was trying to do something like this, but it doesn’t work: Answer: I just used Python to loop this and solved it using Graham’s example:
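The answer’s code is not reproduced in this excerpt. As a rough sketch of the loop-in-Python idea (not the original answer’s code), the function below filters a list of table names with a shell-style wildcard and builds one DROP TABLE statement per match. The dataset name, the table names, and the `drop_statements` helper are all hypothetical, and the sketch assumes the table names have already been listed by some other means. It only builds the statements and does not run them.

```python
from fnmatch import fnmatchcase


def drop_statements(dataset, table_names, pattern):
    """Return one DROP TABLE statement per table name matching `pattern`."""
    return [
        f"DROP TABLE `{dataset}.{name}`"
        for name in sorted(table_names)
        if fnmatchcase(name, pattern)  # case-sensitive wildcard match
    ]


# Hypothetical table names for illustration only.
tables = ["events_20170101", "events_20170102", "users"]
for statement in drop_statements("my_dataset", tables, "events_*"):
    print(statement)
```

Reviewing the generated statements before executing them is a cheap safeguard, since a too-broad pattern would drop more tables than intended.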

Finding the maximum distance between events in a group in gdelt-bq.full dataset, BigQuery

gdelt google-bigquery haversine sql

I need to find the longest distance across all the event points for each country in the gdelt-bq.full:events dataset. To group the events by country, there is a join with gdelt-bq:extra.countryinfo. So now I have this table: The difficulty is that there are around 50k events in total, and the maximum within a group is 15k (for the US) and I need

Using ARRAY_AGG() with DISTINCT and ORDER BY with ORDINAL

google-bigquery sql

I have some data that I am trying to aggregate (greatly simplified here). The raw data uses a schema similar to the following: There are many actual records due to the “MISC” column, but I’m only trying to focus on the first 5 columns shown above. A sample of the raw data is shown below (note that the values shown

BigQuery query nested json

google-bigquery json nested sql

I have JSON data which is saved in BigQuery as a string. { "event": { "action": "prohibitedSoftwareCheckResult", "clientTime": "2017-07-16T12:55:40.828Z", "clientTimeZone": "3", …

Alternative to BigQuery for medium-sized data

amazon-redshift google-bigquery mysql sql

This is a follow-up to the question Why doesn’t BigQuery perform as well on small data sets. Let’s suppose I have a dataset that is ~1M rows. In the current database that we’re using (MySQL), aggregation queries would run quite slowly, perhaps taking ~10s or so on complex aggregations. On BigQuery, the initialization time required might make this query take

WITH in BigQuery

common-table-expression google-bigquery sql

Does BigQuery support the WITH clause? I don’t like formatting too many subqueries. For example: WITH alias_1 AS (SELECT foo1 c FROM bar), alias_2 AS (SELECT foo2 c FROM bar a, alias_1 b WHERE b.c =…

BigQuery SQL for 28-day sliding window aggregate (without writing 28 lines of SQL)

google-bigquery sliding-window sql

I’m trying to compute a 28-day moving sum in BigQuery using the LAG function. The top answer to this question, Bigquery SQL for sliding window aggregate, from Felipe Hoffa, indicates that you can use the LAG function. An example of this would be: Is there a way to do this without having to write out 28 lines of
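To make the target computation concrete, here is a minimal in-memory sketch of a trailing moving sum in Python. It is not the LAG-based SQL from the referenced answer, and it assumes exactly one value per consecutive day with no gaps. The function name `moving_sum` and the sample values are hypothetical.

```python
from collections import deque


def moving_sum(values, window=28):
    """Trailing sum over the last `window` values (one value per day assumed)."""
    recent = deque()  # values currently inside the window
    total = 0
    sums = []
    for value in values:
        recent.append(value)
        total += value
        if len(recent) > window:
            # Drop the value that just fell out of the window.
            total -= recent.popleft()
        sums.append(total)
    return sums


# Small window so the result is easy to check by hand.
print(moving_sum([1, 2, 3, 4, 5], window=3))  # [1, 3, 6, 9, 12]
```

The window size is a parameter, so changing 28 to another length does not require writing one term per day.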