Skip to content
Advertisement

Best way to insert into multiple tables in a cluster

Suppose I have a wide table (CUSTOMERS) that contains lots of columns (CUSTOMER_ID, FIRST_NAME, LAST_NAME, DOB, ADDRESS, PHONE_NUMBER, etc).

This table is used in several full table scans, each of which selects only a small number of the columns of the table. Since this table is very large, these full table scans are very expensive.

In order to make these queries better, I want to create a cluster (on CUSTOMER_ID), and create several thin tables within this cluster (e.g. CUSTOMER_NAMES for FIRST_NAME and LAST_NAME). Once this is set up, the thinner table can be full scanned instead of the wider table, which will be much less expensive.

The difficulty comes from finding the best way to populate the thin tables in the cluster:

  • The INSERT ALL statement can insert into multiple tables in a single statement, but I don’t think it can be used to insert data selected from another table.
  • I can use a seperate insert statement for each of the thin tables that selects from the wide table, but that means the wide table will have to be full scanned for each of the thin tables.
  • I can use a couple big unpivot statement to generate unpivoted versions of the wide table and insert those into table that is partitioned on the original column name, then use these to insert into the thin tables. This will require full scanning the wide table for each distinct datatype, and unpivoting data to pivot it again shortly afterwards feels pretty dodgy.
  • I can create an updateable view that joins from each of the thin tables and insert into that view, but as I undersant it only a single table can be updated in such a way

Is there a better way to insert into the thin tables? I feel like there should be a way that only scans the wide table once.

Advertisement

Answer

The INSERT ALL statement can insert into multiple tables in a single statement, but I don’t think it can be used to insert data selected from another table.

Yes it can. There are examples of that in the documentation. Here you can do something like:

insert all
into customer_name (customer_id, first_name, last_name)
            values (customer_id, first_name, last_name)
into customer_dob (customer_id, dob)
           values (customer_id, dob)
select customer_id, first_name, last_name, dob
from customer;

Demo using a cut-down version of your table and one dummy row:

create table customer (
  customer_id number(38),
  first_name varchar2(20),
  last_name varchar2(20),
  dob date
);

insert into customer (customer_id, first_name, last_name, dob)
values (42, 'Alex', 'Poole', date '1972-01-01');

and a cluster with two tables (though the cluster isn’t really relevant):

create cluster customer_cluster (customer_id number(38));

create index customer_cluster_idx on cluster customer_cluster;

create table customer_name (
  customer_id number(38),
  first_name varchar2(20),
  last_name varchar2(20)
)
cluster customer_cluster (customer_id);

create table customer_dob (
  customer_id number(38),
  dob date
)
cluster customer_cluster (customer_id);

then you can do:

insert all
into customer_name (customer_id, first_name, last_name)
            values (customer_id, first_name, last_name)
into customer_dob (customer_id, dob)
           values (customer_id, dob)
select customer_id, first_name, last_name, dob
from customer;

2 rows inserted.

select * from customer_name;

CUSTOMER_ID FIRST_NAME           LAST_NAME           
----------- -------------------- --------------------
         42 Alex                 Poole               

select * from customer_dob;

CUSTOMER_ID DOB       
----------- ----------
         42 1972-01-01
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement