Inserting large amounts of generated test data into postgresql database

Question

I want to insert a billion rows of floaty data into a postgresql database so that I can test the performance of various postgis functions. My effort below takes ages and seems to be really inefficient ...

Accepted Answer

Typically using generate_series() is faster than using loops in PL/pgSQL.To generate the &#8220;position&#8221; value, you can use random()The following will insert 100 million rows with random values for the second column:insert into lat_longs(c1, c2)select g.id, random() * 100from generate_series(1,100e6) as g(id);I prefer to insert test data in chunks (e.g. 10 million at a time). That is easier to do, if you let Postgres generate a unique value for the first column, e.g. by defining it as an identify column:create table lat_longs (  c1 bigint generated always as identity,  c2 float)insert into lat_longs(c2)select random() * 100from generate_series(1,10e6) as g(id);insert into lat_longs(c2)select random() * 100from generate_series(1,10e6) as g(id);...If you really need a constant increase for the second column you can do that if you have the identity column:insert into lat_longs(c2)select g.positionfrom generate_series(0, 100, 0.0000001) as g(position);Or do it in chunks:insert into lat_longs(c2)select g.positionfrom generate_series(0, 10, 0.0000001) as g(position);insert into lat_longs(c2)select g.positionfrom generate_series(10, 20, 0.0000001) as g(position);...

Advertisement

Answer