I have a database with tables for
- equipment we service (table
e
, fielde_id
) - contracts on the equipment (table
c
, fieldsc_id
,e_id
,c_start
,c_end
) - maintenance we have performed in the past (table
m
,e_id
,m_id
,m_date
)
I am trying to build a query that will show me all equipment records, if it is currently in contract with the start/end date, and a count of any maintenance performed since the start date of the contract.
I have a subquery to get the current contract (this table is large and has a new line for each contract revision), but I can’t work out how to use the result of the contract subquery to return the maintenance visits since that date without returning multiple lines.
select e.e_id, c2.c_id, c2.c_start, c2.c_end, m2.count from e left join ( select c_id, c_start, c_end, e_id ...other things and filtering by joining the table to itself from c ) as c2 on c2.e_id = e.e_id
I would also like to be able to add this
m-subquery v1
left join ( select count(*), e_id from m where m.m_date >= c2.start ) as m2 on m2.e_id = e.e_id
But I’m unable to access c2.C_start
from within the second subquery.
I am able to return this table by joining outside the subquery, but this returns multiple lines.
m-subquery v2
left join ( select e_id, m_date, from m ) as m2 on m2.e_id = e.e_id and m.m_date >= c2.start
Is there a way to:
- Get the subquery field c2.start into the m-subquery v1?
- Aggregate the result of the m-subquery v2 without using group by (there are a lot of columns in the main select query)?
- Do this differently?
I’ve seen lateral
which I kind of think might be what I need but I have tried the keyword in front of both subqueries individually and together and it didn’t work to let me use c2.c_start inside at any point.
I am a little averse to using group by, mainly as the BI analyst at work says “slap a group by on it” when there are duplicates in reports rather than trying to understand the business process/database properly. I feel like having a group by on the main query shouldn’t be needed when I know for certain that the e
table has one record per e_id
, and the mess that having probably 59 out of 60 columns named in the group by would cause might make the query less maintainable.
Thanks, Sam
Advertisement
Answer
Since not all RDBMS support lateral
, I would like to present you the following general solution. You can make use of CTEs (WITH queries) to help structuring the query and reuse partial results. E.g. in the following code, you can think of current_contracts as a kind of virtual table existing only during query execution.
Part 1: DDLs and test data
DROP TABLE IF EXISTS e; CREATE TABLE e ( e_id INTEGER ); DROP TABLE IF EXISTS c; CREATE TABLE c ( c_id INTEGER, e_id INTEGER, c_start DATE, c_end DATE ); DROP TABLE IF EXISTS m; CREATE TABLE m ( e_id INTEGER, m_id INTEGER, m_date DATE ); INSERT INTO e VALUES (101),(102),(103); INSERT INTO c VALUES (201, 101, DATE '2021-01-01', DATE '2021-12-31'), (202, 102, DATE '2021-03-01', DATE '2021-04-15'), (203, 102, DATE '2021-04-16', DATE '2021-04-30'), (204, 103, DATE '2003-01-01', DATE '2003-12-31'), (205, 103, DATE '2021-04-01', DATE '2021-04-30'); INSERT INTO m VALUES (101, 301, DATE '2021-01-01'), (101, 302, DATE '2021-02-01'), (101, 303, DATE '2021-03-01'), (102, 304, DATE '2021-04-02'), (102, 305, DATE '2021-04-03'), (103, 306, DATE '2021-04-03');
Part 2: the actual query
WITH -- find currently active contracts per equipment: -- we assume there is 0 or 1 contract active per equipment at any time current_contracts AS ( SELECT * FROM c WHERE c.c_start <= CURRENT_DATE -- only active contracts AND c.c_end >= CURRENT_DATE -- only active contracts ), -- count maintenance visits during the (single) active contract per equipment, if any: current_maintenance AS ( SELECT m.e_id, COUNT(*) AS count_m_per_e -- a count of maintenance visits per equipment FROM m INNER JOIN current_contracts cc ON cc.e_id = m.e_id -- match maintenance to current contracts via equipment AND cc.c_start <= m.m_date -- only maintenance that was done during the current contract GROUP BY m.e_id ) -- bring the parts together for our result: -- we start with equipment and use LEFT JOINs to assure we retain all equipment SELECT e.*, cc.c_start, cc.c_end, CASE WHEN cc.e_id IS NOT NULL THEN 'yes' ELSE 'no' END AS has_contract, COALESCE(cm.count_m_per_e, 0) -- to replace NULL when no contract is active FROM e LEFT JOIN current_contracts cc ON cc.e_id = e.e_id LEFT JOIN current_maintenance cm ON cm.e_id = e.e_id ORDER BY e.e_id;
Please note that your real pre-processing logic for contracts and maintenance visits may be more complex, e.g. due to overlapping periods of active contracts per equipment.