I have an SQLite table called wcvp constructed from a csv file downloaded from the World Check List of Vascular Plants (see https://wcvp.science.kew.org/ and http://sftp.kew.org/pub/data-repositories/WCVP/). When I run this query:
sqlite> SELECT kew_id, genus, species, infraspecies 
        FROM wcvp 
        WHERE genus = 'Quercus' 
          AND species = 'robur' 
          AND taxonomic_status = 'Accepted';
I get this result:
| kew_id | genus | species | infraspecies | 
|---|---|---|---|
| 304293-2 | Quercus | robur | |
| 77189540-1 | Quercus | robur | broteroana | 
| 77189379-1 | Quercus | robur | brutia | 
| 77189383-1 | Quercus | robur | imeretina | 
| 60459295-2 | Quercus | robur | pedunculiflo | 
| 77171868-1 | Quercus | robur | robur | 
I want to add a column to the table (which has hundreds of thousands of rows in it) called number_of_infraspecies which would look like this:
| kew_id | genus | species | infraspecies | number_of_infraspecies | 
|---|---|---|---|---|
| 304293-2 | Quercus | robur | 5 | |
| 77189540-1 | Quercus | robur | broteroana | NULL | 
| 77189379-1 | Quercus | robur | brutia | NULL | 
| 77189383-1 | Quercus | robur | imeretina | NULL | 
| 60459295-2 | Quercus | robur | pedunculiflo | NULL | 
| 77171868-1 | Quercus | robur | robur | NULL | 
Alternatively I could construct a new table with two columns: kew_id as a foreign key and number_of_infraspecies as the other column.
Whichever approach I take I can only think of a procedure which would result in making a separate query for every row of the wcvp table or at least those rows which do not have a value in the infraspecies column (AND taxonomic_status = ‘Accepted’).
Is there a way of doing it with one or just a few queries?
Advertisement
Answer
Create a VIEW that returns the column number_of_infraspecies:
CREATE VIEW my_view AS
SELECT kew_id, genus, species, infraspecies,
       CASE 
         WHEN infraspecies IS NULL 
           THEN COUNT(infraspecies) OVER (PARTITION BY genus, species)
       END number_of_infraspecies
FROM wcvp
WHERE taxonomic_status = 'Accepted';
and then select from that VIEW specific genus and species:
SELECT kew_id, genus, species, infraspecies, number_of_infraspecies FROM my_view WHERE genus = 'Quercus' AND species = 'robur';
See the demo.