I have two tables with many-to-many relationship. I am trying to get values from both of the table where UserId
is unique (I’m joining these table on this value)
I am rying to use pre aggregated query, but I get error
Column 'clv.ProbabilityAlive' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I understand that I should add these all values to group by clause, but then I am getting duplicates because peakClv values repeat.
If i am using simple join then it takes forever because of many to many relationship.
this is my query:
SELECT distinct(s.userid) as userId, s.ProbabilityAlive AS ProbabilityAlive, a.PeakClv as PeakClv FROM ( SELECT [UserId], ([sb].[ProbabilityAlive]) AS ProbabilityAlive FROM clv as sb WHERE sb.[CalculationDate] = '20200311' GROUP BY [UserId] ) s LEFT JOIN ( SELECT [UserId], PeakClv FROM [dbo].[AdditionalClvData] where peakClv > 1 GROUP BY [UserId] ) a ON a.[UserId] = s.[UserId]
I am a bit out of ideas could someone lend a hand?
I also tried using distinct like one answer suggested:
SELECT distinct (s.userid) as userId, s.ProbabilityAlive AS ProbabilityAlive, a.PeakClv as PeakClv FROM ( SELECT DISTINCT ([UserId]), ([sb].[ProbabilityAlive]) AS ProbabilityAlive FROM clv as sb WHERE sb.[CalculationDate] = '10/09/2020 00:00:00' AND sb. [EstimatedNumberOfTransactionsLong] >= 0 AND sb. [EstimatedNumberOfTransactionsLong] <= 5680 AND sb.[ClientId] = '16' AND sb.[Product] = 'Total' ORDER BY sb.[userId] asc OFFSET (1 - 1) * 10 ROWS FETCH NEXT 10 ROWS ONLY ) s LEFT JOIN ( SELECT DISTINCT [UserId], PeakClv FROM [dbo].[AdditionalClvData] ) a ON a.[UserId] = s.[UserId]
Advertisement
Answer
If you have not aggregation function like SUM(), MAX() .. you can’t use GROUP BY
SELECT distinct s.userid as userId, s.ProbabilityAlive AS ProbabilityAlive, a.PeakClv as PeakClv FROM ( SELECT DISTINCT [UserId], ([sb].[ProbabilityAlive]) AS ProbabilityAlive FROM clv as sb WHERE sb.[CalculationDate] = '20200311' ) s LEFT JOIN ( SELECT DISTINCT [UserId], PeakClv FROM [dbo].[AdditionalClvData] where peakClv > 1 ) a ON a.[UserId] = s.[UserId]
if you need distinct (not repeated rows) use distinct
but looking to you img seems you need an aggregation function on PeakClv eg max() and group by
SELECT s.userid as userId, s.ProbabilityAlive AS ProbabilityAlive, max(a.PeakClv) as PeakClv FROM ( SELECT DISTINCT [UserId], ([sb].[ProbabilityAlive]) AS ProbabilityAlive FROM clv as sb WHERE sb.[CalculationDate] = '20200311' ) s LEFT JOIN ( SELECT DISTINCT [UserId], PeakClv FROM [dbo].[AdditionalClvData] where peakClv > 1 ) a ON a.[UserId] = s.[UserId] GROUP BY s.userid, s.ProbabilityAlive