Skip to content
Advertisement

SQLAlchemy how to group_by relation?

Here are the models:

class User(Base):
    __tablename__ = 'users'

    id = Column(CHAR, primary_key=True)
    first_name = Column(CHAR)
    last_name = Column(CHAR)
    email = Column(CHAR)
    receive_reports = Column(Boolean)


class MailPiece(Base):
    __tablename__ = 'mail_pieces'

    id = Column(CHAR, primary_key=True)
    created_at = Column(DateTime)
    template_id = Column(CHAR, ForeignKey('templates.id'))


class Template(Base):
    __tablename__ = 'templates'

    id = Column(CHAR, primary_key=True)
    name = Column(CHAR)
    created_by_id = Column(CHAR, ForeignKey('users.id'))
    user = relationship(User, backref='templates')

I want to send to users reports with: which templates they are sent and how many mail pieces was sent for each template.

I wrote the code:

        stmt = self.session.query(MailPiece.template_id, func.count('*')
                               .label('mail_pieces_count')).filter(
            MailPiece.created_at > day_ago,
            MailPiece.created_at < now,
            ).group_by(MailPiece.template_id).subquery()

        query = self.session.query(Template, stmt.c.mail_pieces_count).
            filter(Template.user.has(receive_reports=True)).
            join(stmt, Template.id == stmt.c.template_id)

But this is slightly not what I wanted. I got result grouped by templates but need list grouped by users where each user will have templates. So in that way I can iterate through users list and send report to each user with summary.

Current result:

[<Template(id='123', name='Test', mail_pieces_count='123', user=<User(id=1212, first_name='Some name', last_name='Some lastname')>)>, <Template(id='456', name='Test2', mail_pieces_count='456', user=<User(id=1212, first_name='Some name', last_name='Some lastname')>)>]

Expected result:

[<User(id=1212, first_name='Some name', last_name='Some lastname, templates=[<Template(id='123', name='Test', mail_pieces_count='123')>, <Template(id='456', name='Test2', mail_pieces_count='456')>,])>]

In other words, now it can be represented like: Current:

templates = [
    {
        "id": 123,
        "name": "Test",
        "mail_pieces_count": 123,
        "user": {
            "id": 1212,
            "first_name": "Some name",
            "last_name": "Some lastname"
        }
    },
    {
        "id": 456,
        "name": "Test2",
        "mail_pieces_count": 456,
        "user": {
            "id": 1212,
            "first_name": "Some name",
            "last_name": "Some lastname"
        }
    }
]

Expected:

users = [
    {
        "id": 1212,
        "first_name": "Some name",
        "last_name": "Some lastname",
        "templates": [
        {
            "id": 123,
            "name": "Test",
            "mail_pieces_count": 123
        },
        {
            "id": 456,
            "name": "Test2",
            "mail_pieces_count": 456
        },]
    },
]

Advertisement

Answer

The reason of current output:

You build a query using Template class:

self.session.query(Template, stmt.c.mail_pieces_count)

This is the main reason why you got Template instances instead of User. Below just an example how to get expected result.

Note! I just try to explain how it works. My answer isn’t related to optimization, performance etc.

# I skipped filters...
stmt = (session.query(MailPiece.template_id, func.count('*')
        .label('mail_pieces_count'))
        .group_by(MailPiece.template_id).subquery())
# to tie templates with upper subquery
stmt2 = (session.query(Template.created_by_id, stmt.c.mail_pieces_count)
         .join(stmt, Template.id == stmt.c.template_id)
         .subquery())
# User query - to tie template id with user id
query = session.query(User, stmt2.c.mail_pieces_count).join(
    stmt2,
    # you can add additional conditions into JOIN ON...
    and_(User.id == stmt2.c.created_by_id, User.receive_reports.is_(True))
)

for result in query:
    print("count: %s" % result.mail_pieces_count)
    print("user: %s" % result.User)
    print("templates: %s" % result.User.templates)

JFYI. If we look at console you will find that alchemy will do 1 query per iteration:

### one more select when you use result.User.templates
2019-02-11 13:02:24,702 INFO sqlalchemy.engine.base.Engine SELECT templates.id AS templates_id, templates.name AS templates_name, templates.created_by_id AS templates_created_by_id 
FROM templates 
WHERE %(param_1)s = templates.created_by_id

You can add Template into query to avoid this, but in this case you will get the number of records equal to the number of templates (not users):

stmt2 = (session.query(Template, stmt.c.mail_pieces_count)
         .join(stmt, Template.id == stmt.c.template_id)
         .subquery())
# select User and Template 
query = session.query(User, Template, stmt2.c.mail_pieces_count).join(
    stmt2,
    and_(User.id == stmt2.c.created_by_id)
)

for result in query.all():
    print("user: %s" % result.User)
    print("template: %s" % result.Template)

So summarize, you need to use a specific class if you need to get instances of the class as a result:

session.query(ExpectedClass).other_methods()...

Hope this helps.

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement