I’m trying to convert raw postgres-SQL into sqlalchemy which uses a count and filter
SELECT ( SELECT COUNT(user_model.uuid) FILTER ( WHERE email_vector @@ parse_websearch('search term')) ) AS email_vector, ( SELECT COUNT(user_model.uuid) FILTER ( WHERE first_name_vector @@ parse_websearch('search term')) ) AS first_name_vector, FROM user_model WHERE (user_model.uuid = '20d7c90d-ebfa-4b04-9ee7-4fdedabc6c0b' AND all_vectors @@ parse_websearch('search term') )
This works great and is very fast (sub 100ms – even when filtering 20 fields)
I’ve written the sqlalchemy as (simplified)
search_query = [] search_vectors = [ email_vector, first_name_vector] search_vector_names = [ 'email_vector', 'first_name_vector' ] for search_vector, search_vector_name in zip(search_vectors, search_vector_names): search_query.append(sa.sql.select( sa.func.count(user_model.uuid)) .filter(search_vector.op('@@') (sa.func.parse_websearch(search_term))) .label(search_vector_name) ) base_query = db.session.query(*search_query) base_query = base_query.filter(user_model.uuid==user_uuid) base_query = query.filter(search_vector.op('@@')(sa.func.parse_websearch(search_term))
This produces the following SQL
SELECT ( SELECT count(user_model.uuid) AS count_1 FROM user_model WHERE user_model.email_vector @@ parse_websearch(%(parse_websearch_1)s)) AS email_vector, (SELECT count(user_model.uuid) AS count_2 FROM user_model WHERE user_model.first_name_vector @@ parse_websearch(%(parse_websearch_2)s)) AS first_name_vector, WHERE user_model.uuid = %(uuid_1)s AND (user_model.all_vectors @@ parse_websearch(%(parse_websearch_30)s))
This works but is quite slow (+1 second). I’m not sure why it’s so slow or how to do a filter query on the sub queries.
I suspect the COUNT is being performed on each sub query where the SQL uses the filtered results from the final WHERE condition, but I’m not totally sure and I can’t find any reference to filter (which I think it shorthand for CASE?) in the sqlalchemy docs. I’ve tried adding .filter(user_model.uuid==uuid)
to the COUNT SQL but it is still slow.
Would anyone be able to offer an insight into what I’m missing?
Advertisement
Answer
I eventually did this using CASE
case_stmt = sa.case( [ ( full_text_search_vector.op('@@') (sa.func.parse_websearch(last_search_term)),1 ) ] ).label(col_name) query.append(case_stmt) base_query = db.session.query(*query)