I’ve got the following tables, of which translation is empty and I’m trying to fill:
translation { id translated language_id template_id }
language { id langname langcode }
template { id tplname source domain total }
The source data to fill translation is a temporary table that I’ve populated from an external CSV file:
tmp_table { id translated langname tplname source domain }
What I’d like to do is fill translation with the values from tmp_table. The translated field can be copied directly, but I’m not quite sure how to fetch the right language_id (tmp_table.langname could be used to determine language.id) and template_id (tmp_table.tplname, tmp_table.source, and tmp_table.domain together can be used to determine template.id).
It might be a trivial question, but I’m quite new to SQL and not sure what the best query would be to populate the translation table. Any ideas?
Answer
This can be simplified to:
INSERT INTO translation (id, translated, language_id, template_id)
SELECT tmp.id, tmp.translated, l.id, t.id
FROM   tmp_table tmp
JOIN   language l USING (langname)
JOIN   template t USING (tplname, source, domain)
ORDER  BY tmp.id;
I added an ORDER BY clause that you don’t strictly need, but certain queries may benefit if you insert your data clustered like that (or in some other order).
If you want to avoid losing rows where you can’t find a matching row in language or template, make it LEFT JOIN instead of JOIN for both tables (provided that language_id and template_id can be NULL).
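As a sketch of that LEFT JOIN variant, assuming the same table and column names as in the question and that language_id and template_id are nullable, the insert could look like the first statement below. The second statement is an optional check (not part of the original answer) that lists the tmp_table rows for which no language or template was found, which may be worth running before deciding between JOIN and LEFT JOIN.

```sql
-- Keeps every row of tmp_table. Where no matching language or template
-- exists, language_id / template_id are inserted as NULL, so both
-- columns must allow NULL for this to succeed.
INSERT INTO translation (id, translated, language_id, template_id)
SELECT tmp.id, tmp.translated, l.id, t.id
FROM   tmp_table tmp
LEFT   JOIN language l USING (langname)
LEFT   JOIN template t USING (tplname, source, domain)
ORDER  BY tmp.id;

-- Optional check: which source rows have no match in language or template?
SELECT tmp.id, tmp.langname, tmp.tplname, tmp.source, tmp.domain
FROM   tmp_table tmp
LEFT   JOIN language l USING (langname)
LEFT   JOIN template t USING (tplname, source, domain)
WHERE  l.id IS NULL
   OR  t.id IS NULL;
```

If the check returns no rows, the plain JOIN and the LEFT JOIN versions insert the same data.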
In addition to what I already listed under your previous question: If the INSERT is huge and constitutes a large proportion of the target table, it is probably faster to drop all indexes on the target table and recreate them afterwards. Creating indexes from scratch is a lot faster than updating them incrementally for every row.
Unique indexes also serve as constraints, so you’ll have to consider whether to enforce the rules later or leave them in place.
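A rough sketch of the drop-and-recreate approach follows, assuming PostgreSQL-style syntax. The index name translation_template_id_idx and the constraint name translation_lang_tpl_uni are hypothetical placeholders, as is the choice of indexed columns; substitute whatever indexes and constraints actually exist on your translation table.

```sql
BEGIN;

-- Hypothetical plain index: drop it before the bulk load.
DROP INDEX IF EXISTS translation_template_id_idx;

-- Hypothetical unique constraint: dropping it means uniqueness is NOT
-- enforced during the load. Leave it in place if you need that guarantee.
ALTER TABLE translation DROP CONSTRAINT IF EXISTS translation_lang_tpl_uni;

INSERT INTO translation (id, translated, language_id, template_id)
SELECT tmp.id, tmp.translated, l.id, t.id
FROM   tmp_table tmp
JOIN   language l USING (langname)
JOIN   template t USING (tplname, source, domain)
ORDER  BY tmp.id;

-- Rebuild from scratch once all rows are in place.
CREATE INDEX translation_template_id_idx ON translation (template_id);

-- Re-adding the constraint fails if the loaded data contains duplicates,
-- which rolls back the whole transaction.
ALTER TABLE translation
  ADD CONSTRAINT translation_lang_tpl_uni UNIQUE (language_id, template_id);

COMMIT;
```

Wrapping the steps in one transaction means a failure while re-adding the constraint leaves the table as it was before the load; whether that trade-off suits you depends on how large the load is and whether other sessions need the table in the meantime.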