SQL table with a column which may have “multiple types of NULL” values

Question

The title may be confusing so I need to clarify. I have the following survey dataset that I need to transfer to a PostgreSQL database. Where -10000, -20000, and -30000 all represent a null value, but due to different reasons. (e.g., -10000 means that the participant was not shown the question about the column, -20000 means the participant skipped the

Accepted Answer

I think the best design here would be to just add status columns for each data column, assuming you want to keep track of the status for a given column.  Something like this:| id | title | income | age    | age_code | height | height_code | weight | Education   || 1  | foo   | 10000  | 45     | 1        | 170    | 1           | 50     | College     || 2  | bar   | 15000  | NULL   | 2        | NULL   | 4           | 45     | High School || 3  | hoge  | NULL   | NULL   | 3        | 150    | 1           | 60     | NULL        || 4  | fuga  | 20000  | 20     | 1        | NULL   | 5           | 70     | College     |Here, I am using the code values 2,3 to represent some sort of error code in the age, and the values 4,5 represent other types of errors with the height.Typically the values in a column should represent just one thing.  In this case, the age and height columns should store data for those measurements, not the status of those measurements.  I backfilled NULL values for where the data be missing or otherwise totally invalid.

Advertisement

Answer