Skip to content
Advertisement

How to add data to a field based on other fields in a SQL database

I have an SQLite table called wcvp constructed from a csv file downloaded from the World Check List of Vascular Plants (see https://wcvp.science.kew.org/ and http://sftp.kew.org/pub/data-repositories/WCVP/). When I run this query:

sqlite> SELECT kew_id, genus, species, infraspecies 
        FROM wcvp 
        WHERE genus = 'Quercus' 
          AND species = 'robur' 
          AND taxonomic_status = 'Accepted';

I get this result:

kew_id genus species infraspecies
304293-2 Quercus robur
77189540-1 Quercus robur broteroana
77189379-1 Quercus robur brutia
77189383-1 Quercus robur imeretina
60459295-2 Quercus robur pedunculiflo
77171868-1 Quercus robur robur

I want to add a column to the table (which has hundreds of thousands of rows in it) called number_of_infraspecies which would look like this:

kew_id genus species infraspecies number_of_infraspecies
304293-2 Quercus robur 5
77189540-1 Quercus robur broteroana NULL
77189379-1 Quercus robur brutia NULL
77189383-1 Quercus robur imeretina NULL
60459295-2 Quercus robur pedunculiflo NULL
77171868-1 Quercus robur robur NULL

Alternatively I could construct a new table with two columns: kew_id as a foreign key and number_of_infraspecies as the other column.

Whichever approach I take I can only think of a procedure which would result in making a separate query for every row of the wcvp table or at least those rows which do not have a value in the infraspecies column (AND taxonomic_status = ‘Accepted’).

Is there a way of doing it with one or just a few queries?

Advertisement

Answer

Create a VIEW that returns the column number_of_infraspecies:

CREATE VIEW my_view AS
SELECT kew_id, genus, species, infraspecies,
       CASE 
         WHEN infraspecies IS NULL 
           THEN COUNT(infraspecies) OVER (PARTITION BY genus, species)
       END number_of_infraspecies
FROM wcvp
WHERE taxonomic_status = 'Accepted';

and then select from that VIEW specific genus and species:

SELECT kew_id, genus, species, infraspecies, number_of_infraspecies
FROM my_view
WHERE genus = 'Quercus' AND species = 'robur';

See the demo.

User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement