Skip to content
Advertisement

Combining two sql queries into one to get most popular questions & answers in single table from SEDE

I have a query to get the most popular questions from the SEDE by

SELECT TOP 10
    'https://stackoverflow.com/questions/' + CONVERT(VARCHAR, Posts.Id) as url,
    (ViewCount / (SELECT - DATEDIFF(DAY, GETDATE(), CreationDate))) AS ave_views_per_day,
    Body as QUESTION,
    Posts.Id,
    Title
FROM
    Posts
LEFT JOIN 
    PostTags ON Posts.Id = PostTags.PostId
LEFT JOIN 
    Tags ON PostTags.TagId = Tags.Id
WHERE 
    TagName IN ('##tag##')
    AND ViewCount > 10000
ORDER BY 
    ave_views_per_day DESC

I have an additional query to get the highest scoring answer given a question id:

SELECT 
    'https://stackoverflow.com/questions/' + CONVERT(VARCHAR, Id) as url, HighestScoringAnswer
FROM
    (SELECT TOP 1 
         Id, Body AS HighestScoringAnswer, 
         DENSE_RANK() OVER (ORDER BY Score DESC) Rank 
     FROM 
         (SELECT p.Id, p.ParentId, p.Body, p.Score
          FROM Posts p 
          WHERE p.ParentId = ##QuestionId##) x
     ORDER BY 
         Rank ASC) x

How can I combine these queries such that the ##QuestionId## from the second query is sourced from the Posts.Id from the first query, and the resulting HighestScoringAnswer column is added to the results of the first query? I tried using the Union operation but could not figure out how to “pass” the Id from one query to the other.

Advertisement

Answer

You have to make a JOIN between both queries (questions and answers); not an UNION (an UNION return the rows from the first query plus the rows from second query, excluding duplicated rows; but you want the top ten questions and more scored answer for each one in the same row).

Try this query, I think this is the query you are looking for:

SELECT DISTINCT TOP 10
    'https://stackoverflow.com/questions/' + CONVERT(VARCHAR, q.Id) as url,
    (q.ViewCount / -DATEDIFF(DAY, GETDATE(), q.CreationDate)) AS ave_views_per_day,
    q.Body as QUESTION,
    q.Id,
    q.Title,
    FIRST_VALUE(a.Body) OVER (PARTITION BY q.id ORDER BY a.Score DESC) AS HighestScoringAnswer
    
FROM Posts q
LEFT JOIN Posts a ON q.id = a.ParentId

WHERE q.PostTypeId = 1 
      AND q.Tags LIKE '%postgresql%'
      AND q.ViewCount > 10000
      
ORDER BY ave_views_per_day DESC;
User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement