Skip to content
Advertisement

How to calculate the degree of agreement by row comparisons in SQL Server?

For a minimal, reproducible example (reprex) let’s assume I have a database object (dbo) in a Microsoft SQL Server and I want to query things in T-SQL.

My dbo looks like this:

What I would like to do is calculate an allele match percentage per Marker-ID across all Animal-IDs.

Given the dbo example from above the desired result looks like this:

So far, I tried the following approaches:

First I thought selecting individual rows is sufficient.

and then compare that to the range between 4 and 6.

The problem here is that, in my real and way lager data set, not all animal-ID pairs have the same number of rows, i.e. not the same number of markers.

That is why I thought grouping might be helpful:

but that does not allow me to do comparisons and/or calculations across groups.

Thus I would like to ask how to calculate the degree of agreement in the comparison of row pairs.

Advertisement

Answer

Sample data

Solution

  1. Select all unique animals cte_AllAnimals.
  2. Select all unique markers cte_AllMarkers.
  3. Combine every animal with every animal behind it a2.AnimalId > a1.AnimalId. This will give you all unique animal combinations.
  4. Combine every pair with every marker cross join cte_AllMarkers.

This gives me:

Result

Fiddle to see it in action.

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement