I have a query to get the most popular questions from the SEDE by
SELECT TOP 10 'https://stackoverflow.com/questions/' + CONVERT(VARCHAR, Posts.Id) as url, (ViewCount / (SELECT - DATEDIFF(DAY, GETDATE(), CreationDate))) AS ave_views_per_day, Body as QUESTION, Posts.Id, Title FROM Posts LEFT JOIN PostTags ON Posts.Id = PostTags.PostId LEFT JOIN Tags ON PostTags.TagId = Tags.Id WHERE TagName IN ('##tag##') AND ViewCount > 10000 ORDER BY ave_views_per_day DESC
I have an additional query to get the highest scoring answer given a question id:
SELECT 'https://stackoverflow.com/questions/' + CONVERT(VARCHAR, Id) as url, HighestScoringAnswer FROM (SELECT TOP 1 Id, Body AS HighestScoringAnswer, DENSE_RANK() OVER (ORDER BY Score DESC) Rank FROM (SELECT p.Id, p.ParentId, p.Body, p.Score FROM Posts p WHERE p.ParentId = ##QuestionId##) x ORDER BY Rank ASC) x
How can I combine these queries such that the ##QuestionId##
from the second query is sourced from the Posts.Id
from the first query, and the resulting HighestScoringAnswer
column is added to the results of the first query? I tried using the Union
operation but could not figure out how to “pass” the Id from one query to the other.
Advertisement
Answer
You have to make a JOIN
between both queries (questions and answers); not an UNION
(an UNION
return the rows from the first query plus the rows from second query, excluding duplicated rows; but you want the top ten questions and more scored answer for each one in the same row).
Try this query, I think this is the query you are looking for:
SELECT DISTINCT TOP 10 'https://stackoverflow.com/questions/' + CONVERT(VARCHAR, q.Id) as url, (q.ViewCount / -DATEDIFF(DAY, GETDATE(), q.CreationDate)) AS ave_views_per_day, q.Body as QUESTION, q.Id, q.Title, FIRST_VALUE(a.Body) OVER (PARTITION BY q.id ORDER BY a.Score DESC) AS HighestScoringAnswer FROM Posts q LEFT JOIN Posts a ON q.id = a.ParentId WHERE q.PostTypeId = 1 AND q.Tags LIKE '%postgresql%' AND q.ViewCount > 10000 ORDER BY ave_views_per_day DESC;