I have “duplicated” rows in Bigquery and I need to keep just the last occurrence grouped by id of element.
As you can see, these are not duplicated rows, those are duplicated Timestamps with different values. I need to keep one registry per Timestamp.
I run this query to get the example:
x
select
oid_asset_element,
TimeStamp_600s,
AmplifierTemperature_S0 ,
AmplifierTemperature_S1
from
`XXX.YYY.ZZZ`
where TimeStamp_600s = '2020-03-29T03:50:00'
and oid_asset_element in (5483, 5490)
order by oid_asset_element;
Table with data example:
Advertisement
Answer
In your sample data, the timestamps all look the same. Assuming they are really different, you can use qualify
:
select t.*
from `table` t
where 1=1
qualify row_number() over (partition by oid_asset_element order by timestamp_600s desc) = 1;
If you want to actually delete rows (and I would instead suggest putting the data into a new table), you could do:
delete from `table` t
where timestamp_600s < (select max(t2.timestamp_600s)
from `table` t2
where t2.oid_asset_element = t.oid_asset_element
);