I’m trying to convert raw postgres-SQL into sqlalchemy which uses a count and filter
SELECT
(
SELECT
COUNT(user_model.uuid)
FILTER ( WHERE email_vector @@ parse_websearch('search term'))
) AS email_vector,
(
SELECT
COUNT(user_model.uuid)
FILTER ( WHERE first_name_vector @@ parse_websearch('search term'))
) AS first_name_vector,
FROM
user_model
WHERE
(user_model.uuid = '20d7c90d-ebfa-4b04-9ee7-4fdedabc6c0b' AND all_vectors @@ parse_websearch('search term') )
This works great and is very fast (sub 100ms – even when filtering 20 fields)
I’ve written the sqlalchemy as (simplified)
search_query = []
search_vectors = [ email_vector, first_name_vector]
search_vector_names = [ 'email_vector', 'first_name_vector' ]
for search_vector, search_vector_name in zip(search_vectors, search_vector_names):
search_query.append(sa.sql.select(
sa.func.count(user_model.uuid))
.filter(search_vector.op('@@')
(sa.func.parse_websearch(search_term)))
.label(search_vector_name)
)
base_query = db.session.query(*search_query)
base_query = base_query.filter(user_model.uuid==user_uuid)
base_query = query.filter(search_vector.op('@@')(sa.func.parse_websearch(search_term))
This produces the following SQL
SELECT (
SELECT count(user_model.uuid) AS count_1
FROM user_model
WHERE user_model.email_vector @@ parse_websearch(%(parse_websearch_1)s)) AS email_vector,
(SELECT count(user_model.uuid) AS count_2
FROM user_model
WHERE user_model.first_name_vector @@ parse_websearch(%(parse_websearch_2)s)) AS first_name_vector,
WHERE user_model.uuid = %(uuid_1)s AND (user_model.all_vectors @@ parse_websearch(%(parse_websearch_30)s))
This works but is quite slow (+1 second). I’m not sure why it’s so slow or how to do a filter query on the sub queries.
I suspect the COUNT is being performed on each sub query where the SQL uses the filtered results from the final WHERE condition, but I’m not totally sure and I can’t find any reference to filter (which I think it shorthand for CASE?) in the sqlalchemy docs. I’ve tried adding .filter(user_model.uuid==uuid)
to the COUNT SQL but it is still slow.
Would anyone be able to offer an insight into what I’m missing?
Advertisement
Answer
I eventually did this using CASE
case_stmt = sa.case(
[
(
full_text_search_vector.op('@@')
(sa.func.parse_websearch(last_search_term)),1
)
]
).label(col_name)
query.append(case_stmt)
base_query = db.session.query(*query)