I Implemented a small ERP system for our kitchen. The system is used to track our groceries, so it contains the goods (referred as food) with its name and UPC code. Over the time the names and upcs started to overlap and creating a data problem. In a first attempt we started to add the manufacturer into the name of the food leafing us with data which looks like this.
Id | Foodname |
---|---|
1 | Acidophilus;Joghurt |
2 | Aibler;Rapsöl |
3 | Allos;Choco Confiserie |
4 | Alnatura; Dinkel Butterkeks |
5 | Alnatura; Salatkerne Mix |
The names are German, sorry for that. The first part of the name is the manufacturer and the second part is the name of the food. I now introduced a new table called ‘Manufacturer’ which holds an Id and a Name. The table for the foods was extended by the ManufacturerId
field. I want to have the following tables
Manufacturer
Id | Name |
---|---|
1 | Acidophilus |
2 | Aibler |
3 | Allos |
4 | Alnatura |
5 | Alnatura |
Food
Id | Foodname | ManufacturerId |
---|---|---|
1 | Joghurt | 1 |
2 | Rapsöl | 2 |
3 | Choco Confiserie | 3 |
4 | Dinkel Butterkeks | 4 |
5 | Salatkerne Mix | 5 |
I tried warping my head around a SQL statement which will solve this for me but I couldn’t find any solution.
Requiremnts
- Not all entries in the food table already have a manufactuerer assigened. So the query has to filter for item which contain the a ‘;’ in the name. Like
select Name from Foods where Foods.Name like '%;%' order by [Name]
- As seen in the sample data there could be whitespaces after the ‘;’ so the entries for manufactuerer name and food name have to be trimmed.
- A manufacturer has to be created (INSERT) if it isn’t already in the table
insert into Manufacturer(Name) values (left part of the string split)
- The name of the food has to be updated to just the name, meaning remove the manufacturer and the ‘;’
- The ID of the manufactuerer has to be set on the food entry (ManufacturerId)
update foods set Name= left part of the string split set ManufacturerId=id of the inserted or found manufacturer
I tried with the build in functions STRING_SPLIT, LEFT(), RIGHT() but couldn’t solve it. According to the docu string split also does not garantie that the result is always in the same order. I hope someone has an idea for me.
Thank you
Advertisement
Answer
This is very do-able, but metadata changes like this typically require several operational data transforms to complete.
First, here is my understanding of your current state:
/* Current state of things: */ CREATE TABLE Food ( ID INT IDENTITY(1,1) PRIMARY KEY, FoodName NVARCHAR(255) ) GO INSERT INTO Food (Foodname) VALUES (N'Acidophilus;Joghurt'), (N'Aibler;Rapsöl'), (N'Allos;Choco Confiserie'), (N'Alnatura; Dinkel Butterkeks'), (N'Alnatura; Salatkerne Mix');
Assuming this is correct, then here are the DDL changes:
/* New Manufacturer Table */ CREATE TABLE Manufacturer ( ID INT IDENTITY(1,1) PRIMARY KEY, Name NVARCHAR(255) ) GO /* Extend Food table with new column */ ALTER TABLE Food ADD ManufacturerID INT GO
Once the metadata/DDL has been changed, then you can do the data transforms. Your case is simple enough that it can be accomplished with two SQL statements:
/* Data Transforms */ -- Get the Manufacturer Names: WITH cteFoodOff AS (SELECT *, CHARINDEX(';', FoodName) AS offset FROM Food) , cteFoodMan AS ( SELECT *, CASE WHEN offset > 0 THEN LEFT(FoodName, offset-1) ELSE '' END AS MName FROM cteFoodOff WHERE offset > 0 ) INSERT INTO Manufacturer SELECT DISTINCT MName FROM cteFoodMan WHERE MName NOT IN(Select Name From Manufacturer) ; -- Add ManufacturerIDs into Food table and remove Manufacturer Name from FoodName WITH cteFoodOff AS (SELECT *, CHARINDEX(';', FoodName) AS offset FROM Food) , cteFoodMan AS ( SELECT *, CASE WHEN offset > 0 THEN LEFT(FoodName, offset-1) ELSE '' END AS MName FROM cteFoodOff WHERE offset > 0 ) UPDATE cteFoodMan SET ManufacturerID = (Select ID From Manufacturer Where Name = MName), FoodName = LTRIM(RIGHT(FoodName, LEN(FoodName) - offset)) ; GO
If you need to do this while there are any live users/clients (not recommended for metadata changes, because there needs to be code-switching for the client access also), then you may need to wrap this in a transaction, and you may additionally need a View to hide the changes from the older client code.