Skip to content
Advertisement

Why does this SQL query get stuck in an endless loop?

The following PostgreSQL query

gets stuck loading endlessly. I know correlated sub-queries can take long, but a SELECT using the same parameters worked quickly and returned the desired results. And I have a small data set that I let run for an entire day just to make sure it wouldn’t eventually work with time.

Table_A uses hierarchical data structures and only a specific level of hierarchy has a foreign key that I can use to join and check the second table. The idea is :

  1. Find all rows in Table_A whose associated Table_B row has its “deleted” value set to true.

  2. From this set of results get the parent_id column

  3. For any row in table_A whose id is part of the parent_id column, so for all parents, check if their is_active is true and if so make it false.

The EXPLAIN:

Advertisement

Answer

Sometimes proper use of aliases can make a difference.

Compare the following 2 query plans.
The first is the original query run on sample data.
cost=0.00..314144.00 rows=4975 Estimated to update 4975 rows on a table of less than 10 rows?

The second is a slightly modified version of the first.
cost=92.26..122.20 rows=2

| QUERY PLAN                                                                                     |
| :--------------------------------------------------------------------------------------------- |
| Update on table_a a  (cost=0.00..314144.00 rows=4975 width=25)                                 |
|   ->  Nested Loop  (cost=0.00..314144.00 rows=4975 width=25)                                   |
|         Join Filter: (SubPlan 1)                                                               |
|         ->  Seq Scan on table_a  (cost=0.00..29.90 rows=1990 width=10)                         |
|         ->  Materialize  (cost=0.00..29.93 rows=5 width=18)                                    |
|               ->  Seq Scan on table_a a  (cost=0.00..29.90 rows=5 width=18)                    |
|                     Filter: ((parent_id IS NULL) AND is_active)                                |
|         SubPlan 1                                                                              |
|           ->  Nested Loop  (cost=0.15..57.97 rows=1990 width=4)                                |
|                 ->  Index Scan using table_b_pkey on table_b  (cost=0.15..8.17 rows=1 width=0) |
|                       Index Cond: (table_a.foreign_id = id)                                    |
|                       Filter: deleted                                                          |
|                 ->  Seq Scan on table_a b  (cost=0.00..29.90 rows=1990 width=4)                |
| QUERY PLAN                                                                                       |
| :----------------------------------------------------------------------------------------------- |
| Update on table_a  (cost=92.26..122.20 rows=2 width=31)                                          |
|   ->  Hash Join  (cost=92.26..122.20 rows=2 width=31)                                            |
|         Hash Cond: (table_a.id = a2.parent_id)                                                   |
|         ->  Seq Scan on table_a  (cost=0.00..29.90 rows=5 width=18)                              |
|               Filter: ((parent_id IS NULL) AND is_active)                                        |
|         ->  Hash  (cost=89.76..89.76 rows=200 width=16)                                          |
|               ->  HashAggregate  (cost=87.76..89.76 rows=200 width=16)                           |
|                     Group Key: a2.parent_id                                                      |
|                     ->  Hash Join  (cost=50.14..85.27 rows=995 width=16)                         |
|                           Hash Cond: (a2.foreign_id = b.id)                                      |
|                           ->  Seq Scan on table_a a2  (cost=0.00..29.90 rows=1990 width=14)      |
|                           ->  Hash  (cost=34.70..34.70 rows=1235 width=10)                       |
|                                 ->  Seq Scan on table_b b  (cost=0.00..34.70 rows=1235 width=10) |
|                                       Filter: deleted                                            |

The second query only uses a few more alias names.

The update statement can also be written as a join on a sub-query.

1 rows affected
id | parent_id | is_active | foreign_id
-: | --------: | :-------- | ---------:
 2 |         1 | t         |          2
 3 |         1 | t         |          5
 4 |      null | t         |          3
 5 |         3 | t         |          4
 1 |      null | f         |          1

Test on db<>fiddle here

User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement