I have a table with 100k records. I want to create an array for each 10k items and loop this, i.e. first 10k items-> array(first 10k items), then seconds array(10k different items), then third and so on. In other words, each array contains 10k different records.
This is what I tried:
x
DECLARE
i int64 DEFAULT 0;
LOOP
SET
i=ROW_NUMBER() OVER();
IF
MOD(i,10000) THEN
SELECT
ARRAY(
SELECT
item
FROM
`project.dataset.table`
LIMIT
100000) AS items;
END IF
;
END LOOP
;
This is the error I get:
Query error: SELECT without FROM clause cannot use analytic functions at [5:5]
I also tried this:
DECLARE
i int64 DEFAULT 0;
LOOP
SET
i= (
SELECT
item,
ROW_NUMBER() OVER()
FROM
`project.dataset.table`);
IF
MOD(i,10000) THEN
SELECT
ARRAY(
SELECT
item
FROM
`project.dataset.table`
LIMIT
100000) AS items;
END IF
;
END LOOP
;
The error I get is:
Query error: Scalar subquery cannot have more than one column unless using SELECT AS STRUCT to build STRUCT values at [5:6]
Advertisement
Answer
Are you looking for something like this
with cte as
(
select row_number() over (order by (select null)) as rn from `data-to-insights`.ecommerce.all_sessions_raw
limit 50
),
cte2 as
(
select rn,cast(floor(((rn - 1) /10)) as int64) as bucket from cte
)
select bucket,format("%t",array_agg(rn)) as array_of_rownumber
from cte2
group by bucket