I want to ‘batch’ my result (coming from a union of several queries) with a predefined ‘batch size’ but I can’t figure out why the first batch is always incorrect?
For instance with the following code:
x
DECLARE @BATCHSIZE AS INT = 2;
DECLARE @TEMPTABLE TABLE(ITEMID VARCHAR (10))
INSERT INTO @TEMPTABLE VALUES ('100'),('200'),('300'),('400'),('500'),('600'),('700'),('800'),('900'),('1000'),('1100'),('1200'),('1300'),('1400'),('1500')
;
WITH TEMP AS
(
SELECT * FROM @TEMPTABLE
)
SELECT *, BatchId = (dense_rank() over (order by ITEMID) / @BatchSize + 1)
FROM (
SELECT * From TEMP
UNION ALL
SELECT * From TEMP
) AS temptable
I get a result:
100 1
100 1
1000 2
1000 2
1100 2
1100 2
1200 3
1200 3
1300 3
1300 3
1400 4
1400 4
1500 4
1500 4
200 5
200 5
300 5
300 5
400 6
400 6
500 6
500 6
600 7
600 7
700 7
700 7
800 8
800 8
900 8
900 8
It seems like they’re all ok except for batch 1 which only consist of itemid 100? Must be doing something wrong here..
Advertisement
Answer
dense_rank()
starts with 1. Shift it to start with 0 :
SELECT *, BatchId = (dense_rank() over (order by ITEMID) - 1 )/ @BatchSize + 1