Skip to content
Advertisement

Get fill rates in another table – SQL Server

I am trying to create a script to calculate the fill rates for each column in a table Data_table and insert it into a second table Metadata_table.

The Data_table has 30 columns in it, and some columns have 100% data in them, and some have less than 100% (due to nulls).

My code to calculate the fill rate looks like this,

 select 
     cast(sum(case 
                 when employee_id is null 
                    then 0 
                    else 1 
              end) / cast(count(1) as float ) * 100 as decimal(8,3)) as employee_id_fill,
     .....--/*so on for 30 columns..*/ 
from 
    [Data_table] 

The Metadata_table should look like this:

Table_name  |  Colmn_name  |  Fill_rate
[Data_table]|   Colomn_a   | 100%
[Data_table]|   Colomn_b   | 89%
[Data_table]|   Colomn_c   | 100%
 and so on... 

I think

unpivot

can work here, but i am unable to get the column names into the [Metadata_table] automatically.

I tried using this for automating the column names-

COL_NAME(OBJECT_ID('DBO.[DATA_TABLE]'),'COLOMN_A') 

but this has not worked so far.

Any help is appreciated

Advertisement

Answer

You can use sys.columns for grabbing the column names. You can join it to sys.tables by the object_id if you ever need to associate the two.

For example:

SELECT c.NAME
FROM SYS.TABLES t
     INNER JOIN SYS.COLUMNS c ON t.OBJECT_ID = c.OBJECT_ID
WHERE t.OBJECT_ID = OBJECT_ID('DBO.[Data_Table]');  

You can generate SQL from here in the format you wanted by creating an expression to query your table and then unpivot it.

Another approach could be a while loop to do inserts into your metadata table. If you’re working with a very large table this option will be more expensive so keep it in mind. I used an example table dbo.Attendance_Records and this script will print out the example SQL, not execute it. You would want to change it to call sp_executesql on that text.

DECLARE @Table NVARCHAR(128) = 'DBO.[Attendance_Records]'
       ,@MetaTable NVARCHAR(128) = 'DBO.[Metadata_Table]'
       ,@ColumnName NVARCHAR(128)
       ,@Iterator INT = 1
       ,@SQL NVARCHAR(MAX)

SELECT c.NAME
      ,c.COLUMN_ID
      ,ROW_NUMBER() OVER (ORDER BY COLUMN_ID) AS RN
INTO #Cols
FROM SYS.COLUMNS c
WHERE c.OBJECT_ID = OBJECT_ID(@Table);

WHILE @Iterator <= (SELECT ISNULL(MAX(RN),0) FROM #Cols)
    BEGIN
        SET @ColumnName = (SELECT NAME FROM #Cols WHERE RN = @Iterator)
        SET @SQL =  'INSERT INTO ' + @MetaTable + ' (Table_Name, Column_Name, Fill_Rate) '
                  + 'SELECT ''' + REPLACE(@Table,'DBO.','') + ''', ''' + @ColumnName + ''', 100 * CONVERT(DECIMAL(8,3), SUM(CASE WHEN [' + @ColumnName + '] IS NULL THEN 0 ELSE 1 END)) / COUNT(1) AS [' + @ColumnName + '_fill]' + ' FROM ' + @Table

        PRINT @SQL
        SET @Iterator += 1
    END
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement