I am working with a bike share dataset located here and here is a sample month of it. I want to get the percent of rides per day per membership type, so I applied this query through PostgreSQL:
SELECT Start_day, member_casual AS Membership, COUNT(*) AS Trips_Count, (COUNT(*) / (SELECT count(*) from rides)) AS Trips_percent, AVG(duration) AS Average_Trip_Duration FROM (SELECT member_casual, ended_at - started_at AS duration, EXTRACT(DOW from started_at) AS Start_day_num, TO_CHAR(started_at,'DY') AS Start_day FROM rides ) AS member_duration GROUP BY Start_day, Start_day_num, member_casual ORDER BY Start_day_num, Trips_Count;
The query returned all the fields correct, except the Trips_percent
it is all zeros!
I searched through several solutions; I found one suggesting to define the sum first through WITH clause, then use it in the query as follows:
WITH total AS (SELECT COUNT(member_casual) AS records FROM rides) SELECT Start_day, member_casual AS Membership, COUNT(*) AS Trips_Count, (COUNT(*) / total.records) AS Trips_percent, AVG(duration) AS Average_Trip_Duration FROM (SELECT member_casual, ended_at - started_at AS duration, EXTRACT(DOW from started_at) AS Start_day_num, TO_CHAR(started_at,'DY') AS Start_day FROM rides ) AS member_duration GROUP BY Start_day, Start_day_num, member_casual ORDER BY Start_day_num, Trips_Count;
Unfortunately it gave me the following error:
ERROR: missing FROM-clause entry for table "total" LINE 7: (count(*) / total.records) AS Trips_percent, ^ SQL state: 42P01 Character: 178
Is it because the FROM
clause in the main query does not refer to the table of the WITH
query?
What is wrong with either query and how can I do to solve them?
Advertisement
Answer
I would suggest phrasing this as a window function. But the key is avoiding integer division:
SELECT Start_day, member_casual AS Membership, COUNT(*) AS Trips_Count, (COUNT(*) * 1.0 / SUM(COUNT(*)) OVER ()) AS Trips_percent, AVG(duration) AS Average_Trip_Duration FROM (SELECT member_casual, ended_at - started_at AS duration, EXTRACT(DOW from started_at) AS Start_day_num, TO_CHAR(started_at,'DY') AS Start_day FROM rides ) AS member_duration GROUP BY Start_day, Start_day_num, member_casual ORDER BY Start_day_num, Trips_Count;