Skip to content
Advertisement

Querying case-insensitive columns by SQL in Tarantool

We know that string Tarantool indices can be made case-insensitive by specifying the collation option: collation = "unicode_ci". E.g.:

t = box.schema.create_space("test")
t:format({{name = "id", type = "number"}, {name = "col1", type = "string"}})
t:create_index('primary')
t:create_index("col1_idx", {parts = {{field = "col1", type = "string", collation = "unicode_ci"}}})
t:insert{1, "aaa"}
t:insert{2, "bbb"}
t:insert{3, "ccc"}

Now we can do a case-insensitive query:

tarantool> t.index.col1_idx:select("AAA")
---
- - [1, 'aaa']
...

But how to do it using SQL? This doesn’t work:

tarantool> box.execute("select * from "test" where "col1" = 'AAA'")
---
- metadata:
  - name: id
    type: number
  - name: col1
    type: string
  rows: []
...

Neither does this:

tarantool> box.execute("select * from "test" indexed by "col1_idx" where "col1" = 'AAA'")
---
- metadata:
  - name: id
    type: number
  - name: col1
    type: string
  rows: []
...

There’s a dirty trick with a poor performance (full scan). We don’t want it, do we?

tarantool> box.execute("select * from "test" indexed by "col1_idx" where upper("col1") = 'AAA'")
---
- metadata:
  - name: id
    type: number
  - name: col1
    type: string
  rows:
  - [1, 'aaa']
...

At last, we have one more workaround:

tarantool> box.execute("select * from "test" where "col1" = 'AAA' collate "unicode_ci"")
---
- metadata:
  - name: id
    type: number
  - name: col1
    type: string
  rows:
  - [1, 'aaa']
...

But the question is – does it use the index? Without an index it also works…

Advertisement

Answer

One can check query plan to figure out whether particular index is used or not. To get query plan simply add ‘EXPLAIN QUERY PLAN ‘ prefix to the original query. For instance:

tarantool>  box.execute("explain query plan select * from "test" where "col1" = 'AAA' collate "unicode_ci"")
---
- metadata:
  - name: selectid
    type: integer
  - name: order
    type: integer
  - name: from
    type: integer
  - name: detail
    type: text
  rows:
  - [0, 0, 0, 'SEARCH TABLE test USING COVERING INDEX col1_idx (col1=?) (~1 row)']
...

So the answer is ‘yes’, index is used in this case.
As for another example:

box.execute("select * from "test" indexed by "col1_idx" where "col1" = 'AAA'")

Unfortunately collation in this comparison is binary, since index’s collation is ignored. In SQL only column’s collations are considered to be used during comparison. This limitation will be resolved as soon as corresponding issue is closed.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement