So I have the following model in Django:
class MemberLoyalty(models.Model): date_time = models.DateField(primary_key=True) member = models.ForeignKey(Member, models.DO_NOTHING) loyalty_value = models.IntegerField()
My goal is to have all the tuples grouped by the member with the most recent date. There are many ways to do it, one of them is using a subquery that groups by the member with max date_time and filtering member_loyalty with its results. The working sql for this solution is as follows:
SELECT * FROM member_loyalty WHERE (date_time , member_id) IN (SELECT max(date_time), member_id FROM member_loyalty GROUP BY member_id);
Another way to do this would be by joining with the subquery.
How could i translate this on a django query? I could not find a way to filter with two fields using IN, nor a way to join with a subquery using a specific ON statement.
I’ve tried:
cls.objects.values('member_id', 'loyalty_value').annotate(latest_date=Max('date_time'))
But it starts grouping by the loyalty_value.
Also tried building the subquery, but cant find how to join it or use it on a filter:
subquery = cls.objects.values('member_id').annotate(max_date=Max('date_time'))
Also, I am using Mysql so I can not make use of the .distinct(‘param’) method.
Advertisement
Answer
This is a typical greatest-per-group query. Stack-overflow even has a tag for it.
I believe the most efficient way to do it with the recent versions of Django is via a window query. Something along the lines should do the trick.
MemberLoyalty.objects.all().annotate(my_max=Window( expression=Max('date_time'), partition_by=F('member') )).filter(my_max=F('date_time'))
Update: This actually won’t work, because Window
annotations are not filterable
. I think in order to filter on window annotation you need to wrap it inside a Subquery
, but with Subquery
you are actually not obligated to use a Window
function, there is another way to do it, which is my next example.
If either MySQL or Django does not support window queries, then a Subquery comes into play.
MemberLoyalty.objects.filter( date_time=Subquery( (MemberLoyalty.objects .filter(member=OuterRef('member')) .values('member') .annotate(max_date=Max('date_time')) .values('max_date')[:1] ) ) )
If event Subqueries
are not available (pre Django 1.11) then this should also work:
MemberLoyalty.objects.annotate( max_date=Max('member__memberloyalty_set__date_time') ).filter(max_date=F('date_time'))