I have three tables and would like to answer the following question in SQL: “Who has only certifications that do NOT have scores?” For instance, in the setup below, the query would return only “John”. Joana has the “AWS Certification”, which is in the SCORE table (id 57). Marry has the “ITIL V3 Certification”, which is not in the SCORE table,
Tag: data-warehouse
Using Identity or sequence in data warehouse
I’m new to data warehousing, so I’m trying to follow best practices by mimicking some implementation details from the Microsoft demo database WideWorldImportersDW. One of the things I have noticed is that it uses a Sequence as the default value for the PK rather than Identity. Is it preferable to use Sequence over Identity in a data warehouse in general, and which one
Using Surrogate Keys in Data Warehouse Pros and Cons
A surrogate key is a mechanism that has existed in our books for years, and I hate to bring it into discussion again. Everyone talks about the benefits of using a surrogate key instead of a business key. Even Microsoft Analysis Services Tabular and Microsoft Power BI tabular models work with surrogate keys. Both platforms mentioned give you the ability
SQL Server – Aggregate data by minute over multiple days
Context: I’m using Microsoft SQL Server 2016. There is a database table “Raw_data” that contains the status of a machine, together with its starting time. There are several machines and …
Reducing granularity of historical status table?
I have a table with what are essentially historical logs for project tasks. Each row contains the ID of a project and the date when one specific task was either started or ended. I need to reduce the grain so that all of the start/end times are in a single row (see IDs 1 and 2 in the image)
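A minimal sketch of one common way to collapse such a log, using conditional aggregation. The original table layout is only shown in an image, so the table `task_log` and its columns (`project_id`, `task`, `event`, `event_date`) are hypothetical names chosen for illustration, and SQLite stands in for whatever engine the question targets.

```python
import sqlite3


def collapse_task_log(rows):
    """Collapse one-row-per-event logs into one row per (project, task).

    `rows` is an iterable of (project_id, task, event, event_date) tuples,
    where `event` is 'start' or 'end'. The schema is assumed, not taken
    from the original question.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_log ("
        "project_id INTEGER, task TEXT, event TEXT, event_date TEXT)"
    )
    conn.executemany("INSERT INTO task_log VALUES (?, ?, ?, ?)", rows)

    # Each CASE expression yields the date only for the matching event type,
    # so the aggregate picks the start (or end) date and ignores the rest.
    result = conn.execute(
        """
        SELECT project_id,
               task,
               MIN(CASE WHEN event = 'start' THEN event_date END) AS started_on,
               MAX(CASE WHEN event = 'end'   THEN event_date END) AS ended_on
        FROM task_log
        GROUP BY project_id, task
        ORDER BY project_id, task
        """
    ).fetchall()
    conn.close()
    return result
```

A task that has started but not yet ended simply gets a NULL (`None` in Python) in the `ended_on` column, because no row satisfies that CASE branch.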
Accounting Data Warehouse Design Question
The finance module in our ERP has a general ledger and sub-ledgers (accounts receivable, accounts payable, etc.). All the sub-ledgers are rolled up into the GL. A standard accounting schema, I believe. I …
Service that does advanced queries on a data set, and automatically returns relevant updated results every time new data is added to the set?
I’m looking for a cloud service that can do advanced statistical calculations on a large number of votes submitted by users, in “real time”. In our app, users can submit different kinds of votes like …
How Do I aggregate Data By Day and Still Respect Timezone?
We are currently using a summary table that aggregates information for our users on an hourly basis in UTC. The problem is that this table is becoming too large and is slowing our system down immensely. We have applied all the tuning techniques recommended for PostgreSQL and are still experiencing slowness. Our idea was to start