
Azure Stream Analytics job delays output of non-matching records by 1 minute when joining two event hubs

Could anyone help me understand why the non-matching records are delayed by exactly 1 minute, while the matching records are written to the blob storage container immediately?

Is there any way to avoid the delay even when an eventA record has no matching eventB record? (My downstream system handles the unmatched case in my use case.)

SELECT b.eventenqueuedutctime AS btime, a.Id, a.SysTime, a.UTCTime,
       b.Id AS BId, b.SysTime AS BSysTime
INTO outputStorage -- to blob storage (container)
FROM eventA a TIMESTAMP BY eventenqueuedutctime
LEFT OUTER JOIN eventB b TIMESTAMP BY eventenqueuedutctime
ON a.id = b.id
AND DATEDIFF(minute, b, a) BETWEEN 0 AND 180 -- join with last 3 hours of eventB data

Below is the output. Note the last row (Id 99): its currentTime is T19:42:13.1690000Z, which is 1 minute later than the first 4 rows (currentTime T19:41:13.1690000Z).

FYI, all the eventA Ids (2, 4, 1, 101, 99) are sent at once in a single EventDataBatch, serialized as JSON.

{"btime":"2020-11-03T17:00:50.6360000Z","Id":2,"SysTime":"2020-11-03T11:41:12.860466-08:00","UTCTime":"2020-11-03T19:41:12.8604646Z","BId":2,"BSysTime":"2020-11-03T09:00:49.6751336-08:00","fullname":"cc","currentTime":"2020-11-03T19:41:13.1690000Z"}
{"btime":"2020-11-03T17:00:50.6360000Z","Id":4,"SysTime":"2020-11-03T11:41:12.8605138-08:00","UTCTime":"2020-11-03T19:41:12.8605135Z","BId":4,"BSysTime":"2020-11-03T09:00:49.6751371-08:00","fullname":null,"currentTime":"2020-11-03T19:41:13.1690000Z"}
{"btime":"2020-11-03T17:00:50.6360000Z","Id":1,"SysTime":"2020-11-03T11:41:12.8605561-08:00","UTCTime":"2020-11-03T19:41:12.8605559Z","BId":1,"BSysTime":"2020-11-03T09:00:49.6749841-08:00","fullname":"test","currentTime":"2020-11-03T19:41:13.1690000Z"}
{"btime":"2020-11-03T19:39:04.0100000Z","Id":101,"SysTime":"2020-11-03T11:41:12.860598-08:00","UTCTime":"2020-11-03T19:41:12.8605978Z","BId":101,"BSysTime":"2020-11-03T11:39:03.7462454-08:00","fullname":"test-101","currentTime":"2020-11-03T19:41:13.1690000Z"}
{"btime":null,"Id":99,"SysTime":"2020-11-03T11:41:12.860322-08:00","UTCTime":"2020-11-03T19:41:12.8602803Z","BId":null,"BSysTime":null,"fullname":null,"currentTime":"2020-11-03T19:42:13.1690000Z"}


Answer

This happens because the query uses a temporal join (JOIN with DATEDIFF).

With temporal joins, such as JOIN with DATEDIFF:

Matches are generated as soon as both sides of the matched events arrive.

Data that lacks a match, as in a LEFT OUTER JOIN, is generated at the end of the DATEDIFF window, for each event on the left side.

That is why the unmatched row (Id 99) is written later than the matched rows.

For more details, see https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-troubleshoot-output#the-first-output-is-delayed.
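If a shorter wait for unmatched rows is acceptable rather than no wait at all, one thing to try is expressing the same 3-hour window in seconds instead of minutes. This is only a sketch: it assumes the wait for unmatched left-side rows follows the DATEDIFF unit, which is an assumption I have not verified, so test it on your own job. Note that DATEDIFF counts unit boundaries, so rows right at the edge of the window may match slightly differently than with the minute-based condition.

```sql
-- Sketch only: the join from the question with the DATEDIFF unit changed
-- from minute to second (180 minutes = 10800 seconds).
-- Assumption (not verified): unmatched eventA rows wait on the order of
-- one DATEDIFF unit, so they may be emitted after about 1 second
-- instead of about 1 minute.
SELECT b.eventenqueuedutctime AS btime, a.Id, a.SysTime, a.UTCTime,
       b.Id AS BId, b.SysTime AS BSysTime
INTO outputStorage
FROM eventA a TIMESTAMP BY eventenqueuedutctime
LEFT OUTER JOIN eventB b TIMESTAMP BY eventenqueuedutctime
ON a.id = b.id
AND DATEDIFF(second, b, a) BETWEEN 0 AND 10800
```

Some wait is still expected for unmatched rows, because the job has to reach the end of the DATEDIFF window before it can conclude that no eventB match will arrive.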

User contributions licensed under: CC BY-SA