Skip to content
Advertisement

Constraint on size of array type in Postgres

I have the following schema for a table (simplified for the sake of discussion) :

Task
   max_solved_by int, # Number of users who can solve this task
   currently_solved_by int[], # Number of users who currently solved this task so far
   state VARCHAR(10), # State of a task can either be NOT_SOLVED or SOLVED

A task transitions from NOT_SOLVED TO SOLVED when array_length(current_solved_by) == max_solved_by

Questions:

  1. Is it a good practice to maintain another column, in this case state, which can be derived from other data (Based on size of currently_solved_by.

Main reason I added the state is to be able query on state: e.g. SELECT * FROM TASKS where state = 'NOT_SOLVED'. But if this query can be (efficiently) queried using array_length(currently_solved_by), I probably don’t need this state.

I am a beginner to postgres. So any help in what is a good practice in this case is appreciated.

Thanks

Advertisement

Answer

There are two options:

  • You may use generated column, which will hide all the business logic behind and just provide you meaningful values like SOLVED or NOT_SOLVED. But it will of course consume additional space.
  • You may use expression index on comparison of array length and max length. This will allow you to use fast searches on the expression, but you’ll need to use this exact expression for filtering every time and no business logic preserved in this case.

I’m not a Postgres guru, so I currently didn’t manage the query to use index on boolean value when filtering with this boolean value. Then I will use cast to char(1) as a workaround.

Two approaches are below and both use more or less the same access pattern, but generated column is business meaningful:

create table t (
  name varchar(100),
  x int[],
  y int
)
<some inserts here...>
create index ix_t_bool
on t((array_length(x, 1) = y))
explain analyze
select *
from t
where (array_length(x, 1) = y) = true
| QUERY PLAN                                                                                   |
| :------------------------------------------------------------------------------------------- |
| Seq Scan on t  (cost=0.00..20.09 rows=5 width=254) (actual time=0.014..0.092 rows=2 loops=1) |
|   Filter: (array_length(x, 1) = y)                                                           |
|   Rows Removed by Filter: 1004                                                               |
| Planning Time: 0.393 ms                                                                      |
| Execution Time: 0.105 ms                                                                     |

Index is not used for some reason. Let’s cast to use explicit = predicate.

create index ix_t_int
on t(( (array_length(x, 1) = y)::char(1) ))
explain analyze
select *
from t
where (array_length(x, 1) = y)::char(1) = 'Y'
| QUERY PLAN                                                                                                      |
| :-------------------------------------------------------------------------------------------------------------- |
| Bitmap Heap Scan on t  (cost=4.31..9.69 rows=5 width=254) (actual time=0.045..0.045 rows=0 loops=1)             |
|   Recheck Cond: (((array_length(x, 1) = y))::character(1) = 'Y'::bpchar)                                        |
|   ->  Bitmap Index Scan on ix_t_int  (cost=0.00..4.31 rows=5 width=0) (actual time=0.043..0.043 rows=0 loops=1) |
|         Index Cond: (((array_length(x, 1) = y))::character(1) = 'Y'::bpchar)                                    |
| Planning Time: 0.250 ms                                                                                         |
| Execution Time: 0.067 ms                                                                                        |
alter table t
add column
  state text generated always as (
    case array_length(x, 1)
      when y
      then 'SOLVED'
      else 'NOT_SOLVED'
    end
  ) stored
create index ix_t_calc
on t(state)
explain analyze
select *
from t
where state = 'SOLVED'
| QUERY PLAN                                                                                                       |
| :--------------------------------------------------------------------------------------------------------------- |
| Bitmap Heap Scan on t  (cost=4.31..11.31 rows=5 width=286) (actual time=0.025..0.026 rows=2 loops=1)             |
|   Recheck Cond: (state = 'SOLVED'::text)                                                                         |
|   Heap Blocks: exact=1                                                                                           |
|   ->  Bitmap Index Scan on ix_t_calc  (cost=0.00..4.31 rows=5 width=0) (actual time=0.020..0.020 rows=2 loops=1) |
|         Index Cond: (state = 'SOLVED'::text)                                                                     |
| Planning Time: 0.307 ms                                                                                          |
| Execution Time: 0.047 ms                                                                                         |

db<>fiddle here

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement