I’d like to have one row for each different hadm_id this is my query :
select DISTINCT (hadm_id) as Admission_ID, subject_id as Patient_ID, icustay_id as Care_Unit_ID , intime as In_Time , outtime Out_Time, age as Age, heartrate_max as Heartrate_max, -- heartrate_min as Heartrate_min, sysbp_max as Systolic_Blood_Pressure_max, -- sysbp_min as Systolic_Blood_Pressure_min, tempc_max as Temperature_max, -- tempc_min as Temperature_min, pao2fio2_vent_min as PAO2_FIO2, urineoutput as Urine_Otput, -- bun_min as Blood_Urea_Nitrogen_min, bun_max as Blood_Urea_Nitrogen_max, wbc_min as White_Blood_Cells_min, -- wbc_max as White_Blood_Cells_max, -- potassium_min as Potassium_min, potassium_max as Potassium_max, -- sodium_min as Sodium_min, sodium_max as Sodium_max, -- bicarbonate_min as Bicarbonate_min, bicarbonate_max as Bicarbonate_max, -- bilirubin_min as Bilirubin_min, bilirubin_max as Bilirubin_max, mingcs as mingcs, aids as aids, hem as hem, mets as mets, admissiontype as admissiontype, resprate_max as resprate_max, pco2 as pco2, infection as infection, organ_dysfunction as organ_dysfunction from sepsiscategories -- where hadm_id in (select DISTINCT(hadm_id) from sepsiscategories) and subject_id in (select DISTINCT(subject_id) from sepsiscategories) -- group by -- hadm_id,subject_id order by hadm_id asc limit 100
the result is not what I am expecting:
admission_id | patient_id | care_unit_id | in_time | out_time | age | heartrate_max | systolic_blood_pressure_max | temperature_max | pao2_fio2 | urine_otput |> blood_urea_nitrogen_max | white_blood_cells_min | potassium_max | sodium_max | bicarbonate_max | bilirubin_max | mingcs | aids | hem | mets | admissiontype | resprate_max | pco2 | inf 100003 | 54610 | 209281 | 2150-04-17 15:35:42 | 2150-04-19 14:12:52 | 59.91 | 104 | 146 | 36.7777777777778 | | 2580 | 51 | 13.4 | 5.7 | 133 | 21 | 5.5 | 14 | | | | Medical | 21 | 29 | 0 | 0 100006 | 9895 | 291788 | 2108-04-06 15:50:15 | 2108-04-11 15:18:03 | 48.92 | 127 | 149 | 36.5555572509766 | | 2950 | 16 | 9.6 | 4.4 | 131 | 24 | | 15 | | 1 | | Medical | 29 | 37 | 1 | 0 100006 | 9895 | 291788 | 2108-04-06 15:50:15 | 2108-04-11 15:18:03 | 48.92 | 127 | 149 | 36.5555572509766 | | 2950 | 16 | 9.6 | 4.4 | 131 | 24 | | 15 | | 1 | | Medical | 29 | 40 | 1 | 0 100006 | 9895 | 291788 | 2108-04-06 15:50:15 | 2108-04-11 15:18:03 | 48.92 | 127 | 149 | 36.5555572509766 | | 2950 | 16 | 9.6 | 4.4 | 131 | 24 | | 15 | | 1 | | Medical | 29 | 48 | 1 | 0 100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 30 | 0 | 0 100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 35 | 0 | 0 100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 37 | 0 | 0 100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 39 | 0 | 0 100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 40 | 0 | 0 100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 41 | 0 | 0 100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 44 | 0 | 0 100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 45 | 0 | 0 100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 59 | 0 | 0 100011 | 87977 | 214619 | 2177-08-29 04:52:21 | 2177-09-09 16:56:35 | 21.50 | 142 | 165 | 38.7222222222222 | 308.333333333333 | 5160 | 12 | 7.6 | 4.5 | 144 | 24 | | 15 | | | | Medical | 30 | 33 | 1 | 0 100011 | 87977 | 214619 | 2177-08-29 04:52:21 | 2177-09-09 16:56:35 | 21.50 | 142 | 165 | 38.7222222222222 | 308.333333333333 | 5160 | 12 | 7.6 | 4.5 | 144 | 24 | | 15 | | | | Medical | 30 | 35 | 1 | 0 100011 | 87977 | 214619 | 2177-08-29 04:52:21 | 2177-09-09 16:56:35 | 21.50 | 142 | 165 | 38.7222222222222 | 308.333333333333 | 5160 | 12 | 7.6 | 4.5 | 144 | 24 | | 15 | | | | Medical | 30 | 47 | 1 | 0
How can I do to fix it.
Advertisement
Answer
SELECT DISTINCT
operates on the whole dataset, not a particular column. So you get one record per distinct combination of all values in all columns returned by the query.
Since you are using Postgres, you might want to try DISTINCT ON
. This gives you one record per group defined in the ON clause. The ORDER BY
clauses determines which row should be returned in each group.
In your query:
select distinct on(hadm_id) hadm_id as Admission_ID, subject_id as Patient_ID, icustay_id as Care_Unit_ID , ... from sepsiscategories order by hadm_id, intime limit 100
This would give you one record per hadm_id
, with the record that has the smallest intime
. You would need to adapt the order by
clause with the correct column for your use case.