Skip to content
Advertisement

Use SQL to ensure I have data for each day of a certain time period

I’m looking to only select one data point from each date in my report. I want to ensure each day is accounted for and has at least one row of information, as we had to do a few different things to move a large data file into our data warehouse (import one large Google Sheet for some data, use Python for daily pulls of some of the other data – want to make sure no date was left out), and this data goes from now through last summer. I could do a COUNT DISTINCT clause to just make sure the number of days between the first data point and yesterday (the latest data point), but I want to verify each day is accounted for. Should mention I am in BigQuery. Also, an example of the created_at style is: 2021-02-09 17:05:44.583 UTC

This is what I have so far:

**I know FIRST is probably not the best clause for this case, and it’s currently acting to grab the very first data point in created_at, but just as a jumping-off point.

Advertisement

Answer

You can use aggregation:

Note: This assumes that created_at is a date — or at least only has one value per date. You might need to convert it to a date:

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement