I would like to write another table by partition date the table in bigquery. But I couldn’t find how to do it. I use Python and google cloud library. I want to create a table using standard SQL.But I get an error.
Error : google.api_core.exceptions.BadRequest: 400 GET https://bigquery.googleapis.com/bigquery/v2/projects/astute-baton-272707/queries/f4b9dadb-1390-4260-bb0e-fb525aff662c?maxResults=0&location=US: The number of columns in the column definition list does not match the number of columns produced by the query at [2:72]
Please let me know if there is another solution. Day to day İnsert to table the next stage of the project.
I may have been doing it wrong from the beginning. I am not sure.
Thank You.
client = bigquery.Client()
sql = """
CREATE OR REPLACE TABLE zzzzz.xxxxx.yyyyy (visitStartTime_ts INT64,date TIMESTAMP,hitsTime_ts INT64,appId STRING,fullVisitorId STRING,cUserId STRING,eventCategory STRING,eventLabel STRING,player_type STRING,PLAY_SESSION_ID STRING,CHANNEL_ID STRING,CONTENT_EPG_ID STRING,OFF_SET STRING)
PARTITION BY date
OPTIONS (
description="weather stations with precipitation, partitioned by day"
) AS
select
FORMAT_TIMESTAMP("%Y-%m-%d %H:%M:%S", TIMESTAMP_SECONDS(SAFE_CAST(visitStartTime AS INT64)), "Turkey") AS visitStartTime_ts,
date
,FORMAT_TIMESTAMP("%Y-%m-%d %H:%M:%S", TIMESTAMP_SECONDS(SAFE_CAST(visitStartTime+(h.time/1000) AS INT64)), "Turkey") AS hitsTime_ts
,h.appInfo.appId as appId
,fullVisitorId
,(SELECT value FROM h.customDimensions where index=1) as cUserId
,h.eventInfo.eventCategory as eventCategory
,h.eventInfo.eventAction as eventAction
,h.eventInfo.eventLabel as eventLabel
,REPLACE(SPLIT(h.eventInfo.eventCategory,'/{')[OFFSET(1)],'}','') as player_type
,SPLIT(h.eventInfo.eventLabel,'|')[OFFSET(0)] as PLAY_SESSION_ID
,SPLIT(h.eventInfo.eventLabel,'|')[OFFSET(1)] as CHANNEL_ID
,SPLIT(h.eventInfo.eventLabel,'|')[OFFSET(2)] as CONTENT_EPG_ID
,SPLIT(h.eventInfo.eventLabel,'|')[OFFSET(3)] as OFF_SET
FROM `zzzzz.yyyyyy.xxxxxx*` a,
UNNEST(hits) AS h
where
1=1
and SPLIT(SPLIT(h.eventInfo.eventCategory,'/{')[OFFSET(0)],'/')[OFFSET(0)] like 'player'
and _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
AND (BYTE_LENGTH(h.eventInfo.eventCategory) - BYTE_LENGTH(REPLACE(h.eventInfo.eventCategory,'/{','')))/2 + 1 = 2
AND h.eventInfo.eventAction='heartBeat'
"""
job = client.query(sql) # API request.
job.result()
query_job.result() # Waits for the query to finish
print('Query results loaded to table {}'.format(table_ref.path))
Advertisement
Answer
A quick solution for the problem presented here: When creating a table, you don’t need to declare the schema of it, if there’s a query where data is coming from. Right now there’s a conflict between the data and the declared schema. So remove one.
Instead of starting the query with:
CREATE OR REPLACE TABLE zzzzz.xxxxx.yyyyy (visitStartTime_ts INT64,date TIMESTAMP,hitsTime_ts INT64,appId STRING,fullVisitorId STRING,cUserId STRING,eventCategory STRING,eventLabel STRING,player_type STRING,PLAY_SESSION_ID STRING,CHANNEL_ID STRING,CONTENT_EPG_ID STRING,OFF_SET STRING)
PARTITION BY date
Start the query with:
CREATE OR REPLACE TABLE zzzzz.xxxxx.yyyyy
PARTITION BY date