Skip to content
Advertisement

Can I ask Firebird to use index (with like on strings) in left join condition on other column?

I have table structures:

journal_entries (id integer, account varchar(20), doc_date, date, amount numeric(15,4))
selected_accounts (account varchar(20), selection_id integer)  

I can query this and SQL used indexes both on doc_date and on account:

select je.*
  from journal_entries je
  where je.doc_date>='01.01.2022' and
        je.doc_date<='31.03.2022' and
        (je.account like '23%' or 
         je.account like '24%')

But when I am filling the selected_accounts table with data:

23%, 1
24%, 1

And I am trying to use condition in left join:

select
  from selected_accounts sa
    left join journal_entries je on (
      je.doc_date>='01.01.2022' and
      je.doc_date<='31.03.2022' and
      je.account like sa.account)

Then SQL is not using index on journal_entries.account data, it is using index on je.doc_date only.

Can I give some hints to optimizer or SQL engine that condition je.account like sa.account should use index on je.account?

I am using Firebird 3.1 and Firebird 2.1 but I guess, this issue is on other SQL databases as well.

I am gravitation towards the need to accept that I can not make optimal query with the condition in left join…

Question supplemented: I copied the plan (e.g. given by IBExpert) from the first query as the plan clause to the second query but the SQL engine reports:

 index <index on journal_entries.account> cannot be used in the specified plan

So, there is something in my query that prevents the reference and use of journal_entries.account index.

Additional observation: In fact – my database have 1M journal entries in 2022Q1 (period specified in my examples), then the first (good) query reports less than 1M indexed record reads, but the second (bad) query reports 2*1M indexed record reads (indexed beacuse of the index on journal_entries.doc_date), so, this is even worse than the full read by doc_date and then just filter by selected_records entries.

One step forward: Thanks to @Damien_The_Unbeliever comment I made this test (sic! first string prefixed with %):

select je.*
  from journal_entries je
  where je.doc_date>='01.01.2022' and
        je.doc_date<='31.03.2022' and
        (je.account like '%23%' or 
         je.account like '24%')

And no more the je.account index is used and the number of reads increased. So – it seems to me that Firebird query engine/optimizer scans that string literals used in like conditions and decides on the possibility to use index on je.account.

So, maybe I can give some notice to Firebird (for my second/slow query) that I expect only post-%-fixed strings as select_accounts.account output? That would solve my issue in the case of Firebird engine.

Advertisement

Answer

It is not possible to optimize like in this case, because Firebird cannot know what values your column has. The fact somecolumn like '24%' can use an index is because Firebird will rewrite that expression to somecolumn starting with '24' (see also LIKE, specifically the note titled “About LIKE and the Optimizer”). It is not possible to do so with parameters or values obtained from columns.

In other words, the obvious solution to your problem is to have selected_accounts.account not be populated with '24%', but with '24', and using STARTING WITH in your join condition.

In the case the wildcard doesn’t always occur, and some times you need an exact match, you could use something like je.account starting with replace(sa.account, '%', '') and je.account like sa.account. This solution assumes that % only ever occurs as the last character, and there is no _ wildcard used.

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement