Skip to content
Advertisement

Getting count depending on subquery

In my SELECT statement below, I’m looking to get a COUNT of connections from a table, where connections are defined as 1 connection per unique transactionID.
Thus, in my COUNT, I want to check to make sure that a connection has not already been made (i.e. same transactionID but a lower primary key id).

I tried using COUNT(DISTINCT pc.TransactionID) but the problem was that the same transactionID can occur within each grouping (ThisDate, SiteName, UserName, Command), so it was counting some connections more than once.

Then I tried joining with the same table and used SUM instead of COUNT, which seemed to work, but that increased BytesTransferred value due to the extra records that were pulled in.

Ideas on the best way to handle this?

ProtocolCommands table

ProtocolCommandID, int, not null, primary key
Time_Stamp Datetime
SiteName nvarchar(50) null
UserName nvarchar(50) null
Command nvarchar(50) null
BytesTransferred numeric(18,0) null
TransactionID int not null

Select statement

SELECT
    CONVERT(date,pc.Time_stamp) As ThisDate,
    pc.SiteName,
    pc.UserName,
    pc.Command,
    SUM(CASE WHEN p.TransactionID IS NULL THEN 1 ELSE 0 END) As Connections,    
    -- COUNT(DISTINCT pc.TransactionID) AS Connections,
    SUM(pc.BytesTransferred) AS DataTransferred 
FROM ProtocolCommands pc
    LEFT OUTER JOIN ProtocolCommands p ON p.TransactionID = pc.TransactionID AND p.ProtocolCommandID < pc.ProtocolCommandID
WHERE pc.Time_Stamp >= '01/01/2019'
GROUP BY CONVERT(date,pc.Time_stamp),pc.SiteName, pc.UserName, pc.Command

To give an example, here are some records:

Time_stamp  SiteName    UserName    Command BytesTransferred    TransactionID
2019-04-20  MySite  Joe connect 2000    12345
2019-04-20  MySite  Joe send    20000   12345
2019-04-21  MySite  Joe retrieve    25952   12345
2019-04-21  MySite  Joe send    30000   12345

When executing the SELECT with the COUNT(DISTINCT pc.TransactionID) and removing the inner join, the results are this (note incorrect DataTransferred for the last two rows):

ThisDate    SiteName    UserName    Command Connections DataTransferred
2019-04-20  MySite  Joe connect 1   2000
2019-04-20  MySite  Joe send    0   20000
2019-04-21  MySite  Joe retrieve    0   51904
2019-04-21  MySite  Joe send    0   90000

When executing the SELECT with the SUM and adding the inner join, the results are this (note incorrect Connections – there should only be 1 total):

ThisDate    SiteName    UserName    Command Connections DataTransferred
2019-04-20  MySite  Joe connect 1   2000
2019-04-20  MySite  Joe send    1   20000
2019-04-21  MySite  Joe retrieve    1   25952
2019-04-21  MySite  Joe send    1   30000

Expected result:

ThisDate    SiteName    UserName    Command Connections DataTransferred
2019-04-20  MySite  Joe connect 1   2000
2019-04-20  MySite  Joe send    0   20000
2019-04-21  MySite  Joe retrieve    0   25952
2019-04-21  MySite  Joe send    0   30000

Advertisement

Answer

Not sure if I understood correctly, but you seem to want to detect the oldest occurance of a connection. You can do that using this query:

SELECT ROW_NUMBER() OVER (PARTITION BY TransactionId ORDER BY ProtocolCommandID ASC) as RowNum
, ProtocolCommandId
FROM ProtocolCommands

This will have RowNum = 1 for the record where connection appears for the very first time.

Joining that to your main table you can get the correct connection count

SELECT 
  CONVERT(date,pc.Time_stamp) As ThisDate,
    pc.SiteName,
    pc.UserName,
    pc.Command,
    CASE WHEN first_conn.RowNum = 1 THEN SUM(first_conn.RowNum) ELSE 0 END AS Connection,
    SUM(pc.BytesTransferred) AS DataTransferred 
FROM ProtocolCommands pc
    LEFT OUTER JOIN 
        (SELECT ROW_NUMBER() OVER (PARTITION BY TransactionId ORDER BY ProtocolCommandID ASC) as RowNum
            , ProtocolCommandId
        FROM ProtocolCommands) as first_conn ON pc.ProtocolCommandID = first_conn.ProtocolCommandID

GROUP BY CONVERT(date,pc.Time_stamp),pc.SiteName, pc.UserName, pc.Command, first_conn.RowNum
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement