I want to normalize a staging table into two tables. One of the fields of input data contains multiple values delimited by a semicolon.
DATETIME | FILENAME | MULTIVALUEDFIELD 7-22-18 | somefile.txt | value1;value2;value3
I’m sure you data-wizards know where this is going, but I want this normalized thusly:
ID | FILENAME | 1 | somefile.txt
and
ID | SINGLEVALUES | OTHERTABLEID 1 | value1 | 1 2 | value2 | 1 3 | value3 | 1
Sorry for formatting issues: I didn’t see a way to do tables. Any help would be awesome!
Advertisement
Answer
It sounds like you actually have a many-to-many relationship here; a filename
can refer to many singlevalues
, and singlevalues
can appear in many filename
s.
The traditional way to represent that would be with a bridge table that consists of foreign key references to the primary keys of filename
and singlevalues
, and, if it’s applicable to your design, any additional fields that represent the relationship.
dbo.filename +-------------+---------------+ | filename_sk | filename | +-------------+---------------+ | 1 | somefile.txt | | 2 | otherfile.txt | +-------------+---------------+ dbo.value +----------+--------+ | value_sk | value | +----------+--------+ | 1 | value1 | | 2 | value2 | | 3 | value3 | +----------+--------+ dbo.bridge +-------------+----------+ | filename_sk | value_sk | +-------------+----------+ | 1 | 1 | | 1 | 2 | | 2 | 2 | | 2 | 3 | +-------------+----------+