There is a requirement to store hourly data in SQL Server 2016 and retrieve. It’s an OLTP database.
I will explain with an example: we need to capture temperature of each city of a country and store on hourly basis. What would be the best and efficient design to do this. The data would be stored for a year and then archived
This is my plan. Can some one review and let me know if this approach is fine?
CREATE TABLE [dbo].[CityMaster] ( [CityId] [int] NULL, [CityName] [varchar](300) NULL ) ON [PRIMARY] --OPTION 1 CREATE TABLE [dbo].[WeatherData] ( [Id] [bigint] NULL, [CityId] [int] NULL, [HrlyTemp] [decimal](18, 1) NULL, [CapturedTIme] [datetime] NULL ) ON [PRIMARY] GO --OPTION2 CREATE TABLE [dbo].[WeatherData_JSon] ( [Id] [bigint] NULL, [CityId] [int] NULL, [Month] [varchar](50) NULL, [Hrlytemp] [nvarchar](max) NULL ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] GO
Advertisement
Answer
To some extent, this depends on how the data is going to be used. The most natural solution is to tweak the first option and use a partitioned table:
CREATE TABLE [dbo].CityHourlyTemperatures ( Id bigint NULL, CityId int NULL, HrlyTemp decimal(6, 1) NULL, CapturedTIme datetime NULL ) ;
Note I changed the name to something that seems to better capture the name.
Even with global warming, I think that 4 or 5 digits of precision in the temperature is quite sufficient — 18 is way overkill.
Each row here has 8 + 4 + 5 + 8 bytes = 25 bytes (it may be rounded up if there are alignment restrictions). A year has about 8,766 hours. So, if you have 100 cities, this is less than a million rows per year and just a few tens of megabytes per year.
That is quite feasible, but you might want to consider partitioning the table — the older partitions can act like an “archive”.
Your second option stores the temperatures as a blob. This would make sense under only one circumstance: you don’t care about the temperatures but you need to return the data to an application that does.
The implication from the name is that you want to store the value as JSON. This usually requires more overhead than storing the data using native types — and is often less efficient. JSON is very powerful and useful for some types of data, particularly sparse data. However, your data is quite relational and can be stored in a relational format.
If you wanted to save space, you could consider the following:
- Replacing the
datetime
value with anhourId
column. This could possibly be a shortint if you only want a few years of data. - Removing the
id
column and defining thecityid
/hourid
as the primary key.
However, your volume of data does not seem to suggest that such an approach is necessary.