I have two tables with many-to-many relationship. I am trying to get values from both of the table where UserId
is unique (I’m joining these table on this value)
I am rying to use pre aggregated query, but I get error
Column 'clv.ProbabilityAlive' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I understand that I should add these all values to group by clause, but then I am getting duplicates because peakClv values repeat.
If i am using simple join then it takes forever because of many to many relationship.
this is my query:
SELECT
distinct(s.userid) as userId,
s.ProbabilityAlive AS ProbabilityAlive,
a.PeakClv as PeakClv
FROM (
SELECT [UserId], ([sb].[ProbabilityAlive]) AS ProbabilityAlive
FROM clv as sb
WHERE sb.[CalculationDate] = '20200311'
GROUP BY [UserId]
) s
LEFT JOIN (
SELECT [UserId], PeakClv
FROM [dbo].[AdditionalClvData] where peakClv > 1
GROUP BY [UserId]
) a ON a.[UserId] = s.[UserId]
I am a bit out of ideas could someone lend a hand?
I also tried using distinct like one answer suggested:
SELECT
distinct (s.userid) as userId,
s.ProbabilityAlive AS ProbabilityAlive,
a.PeakClv as PeakClv
FROM (
SELECT DISTINCT ([UserId]), ([sb].[ProbabilityAlive]) AS
ProbabilityAlive
FROM clv as sb
WHERE sb.[CalculationDate] = '10/09/2020 00:00:00' AND sb.
[EstimatedNumberOfTransactionsLong] >= 0 AND sb.
[EstimatedNumberOfTransactionsLong] <= 5680 AND sb.[ClientId] = '16'
AND sb.[Product] = 'Total'
ORDER BY sb.[userId] asc OFFSET (1 - 1) * 10 ROWS FETCH NEXT 10 ROWS
ONLY
) s
LEFT JOIN (
SELECT DISTINCT [UserId], PeakClv
FROM [dbo].[AdditionalClvData]
) a ON a.[UserId] = s.[UserId]
Advertisement
Answer
If you have not aggregation function like SUM(), MAX() .. you can’t use GROUP BY
SELECT
distinct s.userid as userId,
s.ProbabilityAlive AS ProbabilityAlive,
a.PeakClv as PeakClv
FROM (
SELECT DISTINCT [UserId], ([sb].[ProbabilityAlive]) AS ProbabilityAlive
FROM clv as sb
WHERE sb.[CalculationDate] = '20200311'
) s
LEFT JOIN (
SELECT DISTINCT [UserId], PeakClv
FROM [dbo].[AdditionalClvData] where peakClv > 1
) a ON a.[UserId] = s.[UserId]
if you need distinct (not repeated rows) use distinct
but looking to you img seems you need an aggregation function on PeakClv eg max() and group by
SELECT
s.userid as userId,
s.ProbabilityAlive AS ProbabilityAlive,
max(a.PeakClv) as PeakClv
FROM (
SELECT DISTINCT [UserId], ([sb].[ProbabilityAlive]) AS ProbabilityAlive
FROM clv as sb
WHERE sb.[CalculationDate] = '20200311'
) s
LEFT JOIN (
SELECT DISTINCT [UserId], PeakClv
FROM [dbo].[AdditionalClvData] where peakClv > 1
) a ON a.[UserId] = s.[UserId]
GROUP BY s.userid,
s.ProbabilityAlive