Tag: google-bigquery

Query error: Column name ICUSTAY_ID is ambiguous. Using multiple subqueries in BigQuery

Hi, I receive the following query error, “Query error: Column name ICUSTAY_ID is ambiguous”, which refers to the third-to-last line of the code (see the following code). Can you please help me? Thank you so much! I am an SQL beginner. Answer: Both t1 and t2 have a column called ICUSTAY_ID. When you join them together into a single dataset you …
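The point of the answer, that an unqualified column shared by both joined tables cannot be resolved, can be reproduced locally. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in for BigQuery, so the error wording differs; the columns `los` and `heart_rate` and all sample values are hypothetical and not taken from the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (ICUSTAY_ID INTEGER, los REAL);          -- hypothetical column
    CREATE TABLE t2 (ICUSTAY_ID INTEGER, heart_rate INTEGER); -- hypothetical column
    INSERT INTO t1 VALUES (1, 2.5), (2, 4.0);
    INSERT INTO t2 VALUES (1, 80), (2, 95);
""")

# Unqualified ICUSTAY_ID: both t1 and t2 expose it, so the engine rejects the query.
try:
    conn.execute(
        "SELECT ICUSTAY_ID FROM t1 JOIN t2 ON t1.ICUSTAY_ID = t2.ICUSTAY_ID"
    )
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# Qualifying the column with its table name removes the ambiguity.
rows = conn.execute(
    "SELECT t1.ICUSTAY_ID, t1.los, t2.heart_rate "
    "FROM t1 JOIN t2 ON t1.ICUSTAY_ID = t2.ICUSTAY_ID "
    "ORDER BY t1.ICUSTAY_ID"
).fetchall()

print(ambiguous_error)
print(rows)
```

The same fix, prefixing the column with `t1.` or `t2.` wherever it is selected, should carry over to the BigQuery query in the question.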

sql: Calculate the correlations and convert rows into columns

database google-bigquery sql

My current table has more than 100 fields, and I am trying to calculate the correlations between the input variables and the output variable, and then convert all those columns into rows. For example, my current table looks like this:

input_1  input_2  output
3        6        5
4        7        5
6        4        4
6        9        3
7        10       5
9        9        …

SQL: calculate the sum of each column and convert them into rows

database google-bigquery sql

My current table includes more than 100 columns, and I need to calculate the sum of each column and convert the sums into rows. Since there are more than 100 columns, it is not convenient to use the UNPIVOT clause. Is there any other way to do that? Below is a snapshot of the original table:

col1  col2
23    44
33    …
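One way to avoid typing 100+ column names is to generate the query from the column list. The following is only a sketch under stated assumptions: it runs against SQLite via Python's `sqlite3` as a local stand-in for BigQuery, and the table name `wide`, the column `col3`, and the sample values are hypothetical. In BigQuery the column list would have to come from the table's schema instead of `PRAGMA table_info`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wide (col1 INTEGER, col2 INTEGER, col3 INTEGER);  -- hypothetical table
    INSERT INTO wide VALUES (1, 10, 100), (2, 20, 200), (3, 30, 300);
""")

# Read the column names from the schema instead of listing them by hand.
# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk).
columns = [row[1] for row in conn.execute("PRAGMA table_info(wide)")]

# One SELECT per column, glued together with UNION ALL, yields one row per column.
# Identifiers are interpolated directly, so they must come from the schema,
# never from untrusted input.
query = " UNION ALL ".join(
    f"SELECT '{name}' AS column_name, SUM({name}) AS total FROM wide"
    for name in columns
)

totals = sorted(conn.execute(query).fetchall())
print(totals)
```

Each `SELECT` in the generated query reads the table again, so this trades convenience for repeated scans; whether that is acceptable depends on the table size.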

BigQuery – How to filter records with certain condition

google-bigquery mysql sql

I’m trying to write a SQL query (in BigQuery) that returns a result satisfying the following conditions. For example, my table contains the following data. I want to filter out records that contain only the value “p”. The expected result would be as follows. I have tried the following query, but it returns other …

How to Unpivot a Struct in BigQuery?

google-bigquery google-cloud-platform sql unpivot

I have a table with an ID and then a struct. I want to turn each element of the struct into a new row, with the struct field name becoming a value in the column Period and the field's value becoming the row's value. See the table below. Query that generated the table: Current data: I tried this: But I get this …

BigQuery SQL reshape four columns into one column with name of column as value

google-bigquery sql

I currently have a table that looks like: Is it possible to reshape my data into: where the name of the column equal to 1 becomes the value of code in the new, reshaped dataset? I have already presorted the dataset such that there will be no case where more than one column is equal to 1. Answer: Consider below …

Efficient syntax to update 5K rows in BQ table

bq google-bigquery python-3.x sql

I’m trying to update ~5K rows in BigQuery using the Python client. Here is my current attempt: How can I map the account id list to a string like the following, which seems more efficient(?) UPDATE mytable SET somefield=( CASE WHEN (id=100) THEN 'some value removed' WHEN (id=101) THEN 'some value removed' END ) WHERE id IN (100,101); I’ve tried: Plus …
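A minimal sketch of building that single `UPDATE ... CASE` string from a Python mapping follows. The helper name `build_case_update` and its parameters are hypothetical, the ids are assumed to be integers, and backslash-escaping of quotes is assumed to be adequate for a BigQuery string literal. It only produces the SQL text and does not call the BigQuery client.

```python
def build_case_update(table, field, updates):
    """Build one UPDATE ... CASE statement from an {id: new_value} mapping."""
    if not updates:
        raise ValueError("updates must not be empty")

    def quote(value):
        # Assumption: escaping backslashes and single quotes with a backslash
        # is sufficient for a string literal in the target dialect.
        text = str(value).replace("\\", "\\\\").replace("'", "\\'")
        return "'" + text + "'"

    # int() rejects non-numeric ids, so no arbitrary text reaches the id positions.
    whens = " ".join(
        f"WHEN (id={int(row_id)}) THEN {quote(value)}"
        for row_id, value in updates.items()
    )
    ids = ",".join(str(int(row_id)) for row_id in updates)
    return f"UPDATE {table} SET {field}=( CASE {whens} END ) WHERE id IN ({ids});"


sql = build_case_update(
    "mytable",
    "somefield",
    {100: "some value removed", 101: "some value removed"},
)
print(sql)
```

The table and field names are interpolated as-is, so they should be fixed in code and not taken from user input. With ~5K ids the statement becomes long, so it may be worth checking it against the service's query length limits.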

Count distinct customers who bought in previous period and not in next period in BigQuery

google-bigquery sql

I have a dataset in BigQuery which contains order_date: DATE and customer_id. I am trying to count distinct customer_id values between the months of the previous year and the same months of the current year. For example, from 2019-01-01 to 2020-01-01, then from 2019-02-01 to 2020-02-01, and then those who did not buy in the same period of the next year, 2020-01-01 to 2021-01-01, then …

Generate a random number for each group and assign it to all rows in the group

google-bigquery sql

I have a table in the following form. The goal is to group the table by ID and, for each group, select a random number from the number of groups (in this case, select a random number from [1, 3]) and assign that one number to all rows of the group. One possible configuration is shown below. I was thinking of using ROW_NUMBER() and PARTITION() …

Query table with multiple “duplicates”, getting the most recent

date datetime google-bigquery sql

I have a table which stores predictions from a machine learning model. Each hour (“predicted_at”), the model predicts a value for each of the next 24 hours (“predicted_for”). This means that the table has many different values for each “id” and “predicted_for”. Example of what the table looks like for one ID and one predicted_for timestamp: value id predicted_at …
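Keeping only the most recent prediction per “id” and “predicted_for” can be sketched as a join against the latest “predicted_at” of each group. This example runs on SQLite via Python's `sqlite3` as a local stand-in for BigQuery; the table name `predictions` and all sample rows are hypothetical, and timestamps are stored as ISO-formatted text so that `MAX` orders them chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE predictions (
        value REAL, id INTEGER, predicted_at TEXT, predicted_for TEXT
    );
    -- Hypothetical rows: three predictions for 12:00, one for 13:00.
    INSERT INTO predictions VALUES
        (10.0, 1, '2021-01-01 08:00:00', '2021-01-01 12:00:00'),
        (11.5, 1, '2021-01-01 09:00:00', '2021-01-01 12:00:00'),
        (12.0, 1, '2021-01-01 10:00:00', '2021-01-01 12:00:00'),
        (20.0, 1, '2021-01-01 10:00:00', '2021-01-01 13:00:00');
""")

# The subquery finds the latest predicted_at per (id, predicted_for);
# joining back on all three columns keeps only those latest rows.
latest = conn.execute("""
    SELECT p.id, p.predicted_for, p.predicted_at, p.value
    FROM predictions AS p
    JOIN (
        SELECT id, predicted_for, MAX(predicted_at) AS latest_at
        FROM predictions
        GROUP BY id, predicted_for
    ) AS m
      ON p.id = m.id
     AND p.predicted_for = m.predicted_for
     AND p.predicted_at = m.latest_at
    ORDER BY p.id, p.predicted_for
""").fetchall()

print(latest)
```

If two rows share the same “id”, “predicted_for”, and latest “predicted_at”, this join returns both; a window function such as `ROW_NUMBER()` partitioned by “id” and “predicted_for” is a common alternative when exactly one row per group is required.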