I want to normalize a staging table into two tables. One of the fields of input data contains multiple values delimited by a semicolon.
DATETIME | FILENAME | MULTIVALUEDFIELD 7-22-18 | somefile.txt | value1;value2;value3
I’m sure you data-wizards know where this is going, but I want this normalized thusly:
ID | FILENAME | 1 | somefile.txt
and
ID | SINGLEVALUES | OTHERTABLEID 1 | value1 | 1 2 | value2 | 1 3 | value3 | 1
Sorry for formatting issues: I didn’t see a way to do tables. Any help would be awesome!
Advertisement
Answer
It sounds like you actually have a many-to-many relationship here; a filename can refer to many singlevalues, and singlevalues can appear in many filenames.
The traditional way to represent that would be with a bridge table that consists of foreign key references to the primary keys of filename and singlevalues, and, if it’s applicable to your design, any additional fields that represent the relationship.
dbo.filename +-------------+---------------+ | filename_sk | filename | +-------------+---------------+ | 1 | somefile.txt | | 2 | otherfile.txt | +-------------+---------------+ dbo.value +----------+--------+ | value_sk | value | +----------+--------+ | 1 | value1 | | 2 | value2 | | 3 | value3 | +----------+--------+ dbo.bridge +-------------+----------+ | filename_sk | value_sk | +-------------+----------+ | 1 | 1 | | 1 | 2 | | 2 | 2 | | 2 | 3 | +-------------+----------+