Skip to content
Advertisement

How to join two tables with the same number of rows in SQLite?

I have almost the same problem as described in this question. I have two tables with the same number of rows, and I would like to join them together one by one.

The tables are ordered, and I would like to keep this order after the join, if it is possible.

There is a rowid based solution for MSSql, but in SQLite rowid can not be used if the table is coming from a WITH statement (or RECURSIVE WITH).

It is guaranteed that the two tables have the exact same number of rows, but this number is not known beforehand. It is also important to note, that the same element may occur more than twice. The results are ordered, but none of the columns are unique.

Example code:

WITH
table_a (n) AS (
  SELECT 2
  UNION ALL
  SELECT 4
  UNION ALL
  SELECT 5
),
table_b (s) AS (
  SELECT 'valuex'
  UNION ALL
  SELECT 'valuey'
  UNION ALL
  SELECT 'valuez'
)
SELECT table_a.n, table_b.s
FROM table_a
LEFT JOIN table_b ON ( table_a.rowid = table_b.rowid )

The result I would like to achieve is:

(2, 'valuex'),
(4, 'valuey'),
(5, 'valuez')

SQLFiddle: http://sqlfiddle.com/#!5/9eecb7/6888

Advertisement

Answer

This is quite complicated in SQLite — because you are allowing duplicates. But you can do it. Here is the idea:

  • Summarize the table by the values.
  • For each value, get the count and offset from the beginning of the values.
  • Then use a join to associate the values and figure out the overlap.
  • Finally use a recursive CTE to extract the values that you want.

The following code assumes that n and s are ordered — as you specify in your question. However, it would work (with small modifications) if another column specified the ordering.

You will notice that I have included duplicates in the sample data:

WITH table_a (n) AS (
      SELECT 2 UNION ALL
      SELECT 4 UNION ALL
      SELECT 4 UNION ALL
      SELECT 4 UNION ALL
      SELECT 5
     ),
     table_b (s) AS (
      SELECT 'valuex' UNION ALL
      SELECT 'valuey' UNION ALL
      SELECT 'valuey' UNION ALL
      SELECT 'valuez' UNION ALL
      SELECT 'valuez'
     ),
     a as (
      select a.n, count(*) as a_cnt,
             (select count(*) from table_a a2 where a2.n < a.n) as a_offset
      from table_a a
      group by a.n
     ),
     b as (
      select b.s, count(*) as  b_cnt,
             (select count(*) from table_b b2 where b2.s < b.s) as b_offset
      from table_b b
      group by b.s
     ),
     ab as (
      select a.*, b.*,
             max(a.a_offset, b.b_offset) as offset,
             min(a.a_offset + a.a_cnt, b.b_offset + b.b_cnt) - max(a.a_offset, b.b_offset) as cnt
      from a join
           b
           on a.a_offset + a.a_cnt - 1 >= b.b_offset and
              a.a_offset <= b.b_offset + b.b_cnt - 1
     ),
      cte as (
      select n, s, offset, cnt, 1 as ind
      from ab
      union all
      select n, s, offset, cnt, ind + 1
      from cte
      where ind < cnt
     )
select n, s
from cte
order by n, s;

Here is a DB Fiddle showing the results.

I should note that this would be much simpler in almost any other database, using window functions (or perhaps variables in MySQL).

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement