Skip to content
Advertisement

Django ORM filter multiple fields using ‘IN’ statement

So I have the following model in Django:

class MemberLoyalty(models.Model):
    date_time = models.DateField(primary_key=True)
    member = models.ForeignKey(Member, models.DO_NOTHING)
    loyalty_value = models.IntegerField()

My goal is to have all the tuples grouped by the member with the most recent date. There are many ways to do it, one of them is using a subquery that groups by the member with max date_time and filtering member_loyalty with its results. The working sql for this solution is as follows:

SELECT 
    *
FROM
    member_loyalty
WHERE
    (date_time , member_id) IN (SELECT 
            max(date_time), member_id
        FROM
            member_loyalty
        GROUP BY member_id);

Another way to do this would be by joining with the subquery.

How could i translate this on a django query? I could not find a way to filter with two fields using IN, nor a way to join with a subquery using a specific ON statement.

I’ve tried:

cls.objects.values('member_id', 'loyalty_value').annotate(latest_date=Max('date_time'))

But it starts grouping by the loyalty_value.

Also tried building the subquery, but cant find how to join it or use it on a filter:

subquery = cls.objects.values('member_id').annotate(max_date=Max('date_time'))

Also, I am using Mysql so I can not make use of the .distinct(‘param’) method.

Advertisement

Answer

This is a typical greatest-per-group query. Stack-overflow even has a tag for it.

I believe the most efficient way to do it with the recent versions of Django is via a window query. Something along the lines should do the trick.

    MemberLoyalty.objects.all().annotate(my_max=Window(
        expression=Max('date_time'),
        partition_by=F('member')
    )).filter(my_max=F('date_time'))

Update: This actually won’t work, because Window annotations are not filterable. I think in order to filter on window annotation you need to wrap it inside a Subquery, but with Subquery you are actually not obligated to use a Window function, there is another way to do it, which is my next example.

If either MySQL or Django does not support window queries, then a Subquery comes into play.

MemberLoyalty.objects.filter(
    date_time=Subquery(
        (MemberLoyalty.objects
            .filter(member=OuterRef('member'))
            .values('member')
            .annotate(max_date=Max('date_time'))
            .values('max_date')[:1]
        )
    )
)

If event Subqueries are not available (pre Django 1.11) then this should also work:

MemberLoyalty.objects.annotate(
    max_date=Max('member__memberloyalty_set__date_time')
).filter(max_date=F('date_time'))

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement