There’re the items:
item_1
- …
item_N
N is about a few hundreds. New items can be added sometimes.
Each item has a set of parameters:
parameter_1
- …
parameter_M
M is about 20-30. These parameters are measured several times a day (independently of each other, in different time) and need to be stored in a database. So parameter_j(item_i)
has a massive of time-value pairs:
<t0, v0>
(the first measurement of parameter_j(item_i))- …
<tk, vk>
(the latest measurement of parameter_j(item_i))
Base queries that need to be performed:
- all values of a parameter for an item (ordered by the time of measurement)
- the latest values of all parameters for all items
What database structure (tables) do I need to perform these quesries efficiently? Could you please provide SQL code for these two types of queries?
Advertisement
Answer
I would probably go with a structure like this:
I will provide MS SQL codes.
Example code for creating these tables:
IF NOT OBJECT_ID('tempdb..#Items') IS NULL DROP TABLE #Items
CREATE TABLE #Items (
ID int NOT NULL,
ItemName varchar(255) NOT NULL,
PRIMARY KEY (ID)
)
IF NOT OBJECT_ID('tempdb..#Measurements') IS NULL DROP TABLE #Measurements
CREATE TABLE #Measurements (
ItemID int NOT NULL,
MeasurementDate datetime NOT NULL,
Param1 varchar(255),
Param2 float,
Param3 varchar(255),
ParamN binary
FOREIGN KEY (ItemID) REFERENCES #Items (ID)
)
I filled up with dummy data via this code:
INSERT INTO #Items values ( 0, 'Item1' )
INSERT INTO #Items values ( 1, 'Item2' )
INSERT INTO #Measurements (ItemID, MeasurementDate, Param2) values (0, GetDate(), 2.3)
INSERT INTO #Measurements (ItemID, MeasurementDate, Param3) values (1, DATEADD(day, 3, GetDate()), 'red')
INSERT INTO #Measurements (ItemID, MeasurementDate, Param3) values (1, DATEADD(day, 2, GetDate()), 'blue')
INSERT INTO #Measurements (ItemID, MeasurementDate, Param2) values (0, DATEADD(day, 5, GetDate()), 4.0)
Query for ,,all values of a parameter for an item (ordered by the time of measurement)”:
Code:
SELECT #Measurements.MeasurementDate, #Measurements.Param2
FROM #Measurements INNER JOIN #Items ON #Measurements.ItemID = #Items.ID
WHERE #Items.ItemName LIKE '%Item1%'
ORDER BY #Measurements.MeasurementDate
Output:
Query for ,,the latest values of all parameters for all items”:
Code (see explanation SQL Server get latest value by date):
SELECT TOP 1 WITH TIES
*
FROM #Measurements
ORDER BY row_number() over (partition by ItemID order by MeasurementDate desc)
Output: