Understanding the tables referred to in a BigQuery array cross join

Question

I know the namespace isn't the correct term for this but it conveys what I'm trying to understand. Take this query: WITH Movies AS ( SELECT 'Titanic' AS title, 1997 AS year, ['Drama',' Romance']...

Accepted Answer

When you need to use an array in a cross join (or another type of join), you have the options below.

The explicit form:

    from tableA cross join unnest(array) as element

The same with a comma, which here is just a shortcut for cross join:

    from tableA, unnest(array) as element

If the array is a column of tableA, you can use yet another shortcut, this time for unnest:

    from tableA, tableA.array as element

The benefit of writing UNNEST explicitly is that you can also ask for the OFFSET, as in the example below:

    from tableA, unnest(array) as element with offset

Having the offset is extremely important in many use cases. In my practice I use all of the above options, depending on the specific case.

Now, as for the difference between the two queries. The first query

    SELECT title, year, Movies.Genres FROM Movies, Movies.Genres where Genres='Drama'

is equivalent to the one below (notice the alias):

    SELECT title, year, Movies.Genres FROM Movies, Movies.Genres as Genres where Genres='Drama'

Genres in where Genres='Drama' refers to an element of the implicitly unnested Movies.Genres, so only the one element out of two that equals Drama passes the filter. The trick is that in the select list you explicitly call out Movies.Genres, which is the original array in that row, and that explains the output.

You can try the query below to confirm this explanation:

    SELECT title, year, Movies.Genres
    FROM Movies, Movies.Genres as Genres
    where Genres in ('Drama', ' Romance')

Here each matching element produces its own row, and every such row still shows the whole array.

While the above explains the output of the first query, I hope it is now clear why the second query returns the actual genre (Drama) instead of the array.

Hope this helps in understanding the differences :o)
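To see the array and the unnested element side by side, here is a minimal sketch in BigQuery Standard SQL. The one-row Movies table is an assumption reconstructed from the visible part of the question (the full CTE is truncated there), and the aliases all_genres, genre and pos are hypothetical names picked for illustration. Giving the element a different alias from the column removes the ambiguity the question ran into.

```sql
-- Hypothetical one-row table; only the first row of the question's CTE is visible.
WITH Movies AS (
  SELECT 'Titanic' AS title, 1997 AS year, ['Drama', ' Romance'] AS Genres
)
SELECT
  title,
  year,
  Movies.Genres AS all_genres,  -- the original array column of the row
  genre,                        -- a single element produced by UNNEST
  pos                           -- zero-based position of that element in the array
FROM Movies
CROSS JOIN UNNEST(Movies.Genres) AS genre WITH OFFSET AS pos
WHERE genre = 'Drama';
```

With this single row, the filter keeps only the element at position 0, so the result should be one row in which all_genres still holds both entries while genre is just Drama.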
