I know there are many similar questions, but I haven’t managed to solve my problem by reading them. I’d appreciate some pointers.
Here’s some example data from my dummy
table below:
id foo bar baz moo ins_date percentage yes no maybe 38 foothing bar_one pizazz amoosing 2018-05-26 06:59:00 81 25 529 196 41 foothing bar_one pizazz amoosing 2018-05-29 06:43:00 83 441 144 49 23 foothing bar_one pizazz amoosing 2018-06-24 08:48:00 62 9 1 16 20 foothing bar_one pizazz amoosing 2018-06-27 10:37:00 94 676 16 400 65 foothing bar_one pizazz amoosing 2018-07-01 08:34:00 92 121 64 225 68 foothing bar_one pizazz amoosing 2018-07-04 01:46:00 91 324 25 289 71 foothing bar_one pizazz amoosing 2018-07-06 23:44:00 65 196 676 100 74 foothing bar_one pizazz amoosing 2018-07-10 09:41:00 92 1024 121 81 77 foothing bar_one pizazz amoosing 2018-07-13 06:47:00 64 576 169 1 96 foothing bar_one pizazz amoosing 2018-08-02 10:34:00 78 1369 256 81 99 foothing bar_one pizazz amoosing 2018-08-04 08:25:00 82 2809 9 256 102 foothing bar_one pizazz amoosing 2018-08-07 06:49:00 87 576 9 676 105 foothing bar_one pizazz amoosing 2018-08-10 03:29:00 68 4225 1089 196 108 foothing bar_one pizazz amoosing 2018-08-13 03:59:00 92 1156 169 484 111 foothing bar_one pizazz amoosing 2018-08-16 05:34:00 63 1764 100 108
I want to achieve the following through a single query:
- Filter all rows to get an ins_date between a date range that I choose
- Get the maximum ins_date per group of foo, bar, baz and moo
- Be able to filter rows by foo, bar, baz and moo
- Additionally show non-grouped values in the query, such as percentage, yes, no and maybe.
Overall this is proving to be complex. So far I’ve managed to achieve the first 3 points in the query below, and hopefully this explains what I’m looking for:
SELECT s.foo, s.bar, s.baz, s.moo, MAX(s.ins_date) mdate FROM ( SELECT * FROM dummy WHERE ins_date -- My arbitrary date range goes here BETWEEN '2018-07-01 00:00:00' AND '2019-11-01 23:59:59' ) s GROUP BY foo, bar, baz, moo -- I could add other filters into the 'HAVING' clause HAVING moo LIKE "%moo%" AND baz = "baz"
This gives the output:
foo bar baz moo mdate foothing bar_one baz amoosing 2018-11-29 05:31:00 foothing bar_one baz mooman_being 2019-04-21 10:31:00 foothing bar_one baz strawberry_moosse 2019-03-17 06:37:00
In this example, if I were to change the date constraint to show only dates between 2018-05-01
and 2018-05-29
, then for the first row mdate
would show 2018-05-29 06:43:00
, because that’s the latest (most recent) date within that date range, for that particular grouping of foo/bar/baz/moo.
But I haven’t been able to append the other columns which aren’t related to this grouping. I’ve tried by using a JOIN …
SELECT s1.foo, s1.bar, s1.baz, s1.moo, MAX(s1.ins_date) mdate, s2.percentage, s2.yes, s2.maybe, s2.no FROM ( SELECT * FROM dummy WHERE ins_date -- My arbitrary date range goes here BETWEEN '2018-07-01 00:00:00' AND '2019-11-01 23:59:59' ) s1 -- Attempting to a do a self-join to get the non-aggregated columns INNER JOIN ( SELECT id, percentage, yes, maybe, no FROM dummy ) s2 ON s2.id = s1.id GROUP BY foo, bar, baz, moo -- I could add other filters into the 'HAVING' clause HAVING moo LIKE "%moo%" AND baz = "baz"
But this returns the error:
Expression #6 of SELECT list is not in GROUP BY clause and contains nonaggregated column ‘s2.percentage’ which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
How do I add the non-aggregated columns to my query without breaking it?
I’m using mysql 5.7, so fancy mysql 8 options aren’t available.
Fiddle below:
https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=1980dd582c2235dc0938cb14c781e3c6
Advertisement
Answer
Maybe by aggregating the extra columns?
SELECT foo, bar, baz, moo , MAX(ins_date) AS mdate , AVG(percentage) AS avg_perc , MAX(yes) AS YEAHBABY , MAX(maybe) AS MAYBEBABY , MAX(no) AS NONONONONOOO FROM dummy dum WHERE ins_date BETWEEN '2018-07-01 00:00:00' AND '2019-11-01 23:59:59' AND moo LIKE '%moo%' AND baz = 'baz' GROUP BY foo, bar, baz, moofoo | bar | baz | moo | mdate | avg_perc | YEAHBABY | MAYBEBABY | NONONONONOOO :------- | :------ | :-- | :---------------- | :------------------ | -------: | -------: | --------: | -----------: foothing | bar_one | baz | amoosing | 2018-11-29 05:31:00 | 82.8000 | 11236 | 625 | 841 foothing | bar_one | baz | mooman_being | 2019-04-21 10:31:00 | 70.0000 | 3969 | 16 | 121 foothing | bar_one | baz | strawberry_moosse | 2019-03-17 06:37:00 | 80.0000 | 23716 | 529 | 49
db<>fiddle here
Or join to the grouped fields and MAX date.
Or use an emulated row_number.
Or use an EXISTS
.
SELECT foo, bar, baz, moo , ins_date , percentage , yes, maybe, no FROM dummy dum WHERE EXISTS ( SELECT 1 FROM dummy dum2 WHERE dum2.ins_date BETWEEN '2018-07-01 00:00:00' AND '2019-11-01 23:59:59' AND dum2.moo LIKE '%moo%' AND dum2.baz = 'baz' AND dum2.foo = dum.foo AND dum2.bar = dum.bar AND dum2.baz = dum.baz AND dum2.moo = dum.moo GROUP BY foo, bar, baz, moo HAVING MAX(dum2.ins_date) = dum.ins_date );foo | bar | baz | moo | ins_date | percentage | yes | maybe | no :------- | :------ | :-- | :---------------- | :------------------ | ---------: | ----: | ----: | --: foothing | bar_one | baz | strawberry_moosse | 2019-03-17 06:37:00 | 80 | 23716 | 529 | 49 foothing | bar_one | baz | mooman_being | 2019-04-21 10:31:00 | 70 | 3969 | 16 | 121 foothing | bar_one | baz | amoosing | 2018-11-29 05:31:00 | 97 | 9025 | 361 | 1
db<>fiddle here