I have an SQLite table called wcvp
constructed from a csv file downloaded from the World Check List of Vascular Plants (see https://wcvp.science.kew.org/ and http://sftp.kew.org/pub/data-repositories/WCVP/). When I run this query:
sqlite> SELECT kew_id, genus, species, infraspecies
FROM wcvp
WHERE genus = 'Quercus'
AND species = 'robur'
AND taxonomic_status = 'Accepted';
I get this result:
kew_id | genus | species | infraspecies |
---|---|---|---|
304293-2 | Quercus | robur | |
77189540-1 | Quercus | robur | broteroana |
77189379-1 | Quercus | robur | brutia |
77189383-1 | Quercus | robur | imeretina |
60459295-2 | Quercus | robur | pedunculiflo |
77171868-1 | Quercus | robur | robur |
I want to add a column to the table (which has hundreds of thousands of rows in it) called number_of_infraspecies
which would look like this:
kew_id | genus | species | infraspecies | number_of_infraspecies |
---|---|---|---|---|
304293-2 | Quercus | robur | 5 | |
77189540-1 | Quercus | robur | broteroana | NULL |
77189379-1 | Quercus | robur | brutia | NULL |
77189383-1 | Quercus | robur | imeretina | NULL |
60459295-2 | Quercus | robur | pedunculiflo | NULL |
77171868-1 | Quercus | robur | robur | NULL |
Alternatively I could construct a new table with two columns: kew_id
as a foreign key and number_of_infraspecies
as the other column.
Whichever approach I take I can only think of a procedure which would result in making a separate query for every row of the wcvp
table or at least those rows which do not have a value in the infraspecies column (AND taxonomic_status = ‘Accepted’).
Is there a way of doing it with one or just a few queries?
Advertisement
Answer
Create a VIEW
that returns the column number_of_infraspecies
:
CREATE VIEW my_view AS
SELECT kew_id, genus, species, infraspecies,
CASE
WHEN infraspecies IS NULL
THEN COUNT(infraspecies) OVER (PARTITION BY genus, species)
END number_of_infraspecies
FROM wcvp
WHERE taxonomic_status = 'Accepted';
and then select from that VIEW
specific genus
and species
:
SELECT kew_id, genus, species, infraspecies, number_of_infraspecies
FROM my_view
WHERE genus = 'Quercus' AND species = 'robur';
See the demo.