My query doesn’t seems to parse information_type
STEP TO REPRODUCE:
- Take AdventureWork or any other test database
- Right click on your test database >
Tasks > Data Discovery and Classification > Classify Data...
and classify a bunch of data - Now use this query to check your
information_type
column (here I’m using AdventureWork2019):
Paste this select and execute
SELECT schema_name(O.schema_id) AS schema_name, O.NAME AS table_name, C.NAME AS column_name, [Type] = CASE WHEN ct.[name] IN ('varchar', 'char') THEN ct.[name] + '(' + IIF(c.max_length = -1, 'max', CAST(c.max_length AS VARCHAR(25))) + ')' WHEN ct.[name] IN ('nvarchar','nchar') THEN ct.[name] + '(' + IIF(c.max_length = -1, 'max', CAST(c.max_length / 2 AS VARCHAR(25)))+ ')' WHEN ct.[name] IN ('decimal', 'numeric') THEN ct.[name] + '(' + CAST(c.precision AS VARCHAR(25)) + ', ' + CAST(c.scale AS VARCHAR(25)) + ')' WHEN ct.[name] IN ('datetime2') THEN ct.[name] + '(' + CAST(c.scale AS VARCHAR(25)) + ')' ELSE ct.[name] END, information_type, label, rank, rank_desc, CAST( CASE WHEN information_type = 'Contact Info' THEN 'contact info' ELSE 'not contact info' END AS varchar(max)) as Checking_Content FROM sys.sensitivity_classifications sc JOIN sys.objects O ON sc.major_id = O.object_id JOIN sys.columns C ON sc.major_id = C.object_id AND sc.minor_id = C.column_id JOIN sys.types ct ON C.user_type_id = ct.user_type_id --AND EP.minor_id = C.column_id order by information_type
As you can see the query is returning not contact info
even when the column information_type
has value Contact Info
schema_name | table_name | column_name | Type | information_type | label | rank | rank_desc | Checking_Content |
---|---|---|---|---|---|---|---|---|
Person | PersonPhone | PhoneNumber | Phone | Contact Info | Confidential | 20 | MEDIUM | not contact info |
Person | PersonPhone | PhoneNumberTypeID | int | Contact Info | Confidential | 20 | MEDIUM | not contact info |
Person | PhoneNumberType | PhoneNumberTypeID | int | Contact Info | Confidential | 20 | MEDIUM | not contact info |
Person | Address | AddressLine1 | nvarchar(60) | Contact Info | Confidential | 20 | MEDIUM | not contact info |
Person | Address | AddressLine2 | nvarchar(60) | Contact Info | Confidential | 20 | MEDIUM | not contact info |
Person | Address | City | nvarchar(30) | Contact Info | Confidential | 20 | MEDIUM | not contact info |
Person | Address | PostalCode | nvarchar(15) | Contact Info | Confidential | 20 | MEDIUM | not contact info |
Production | ProductReview | EmailAddress | nvarchar(50) | Contact Info | Confidential | 20 | MEDIUM | not contact info |
Person | EmailAddress | EmailAddress | nvarchar(50) | Contact Info | Confidential | 20 | MEDIUM | not contact info |
dbo | ErrorLog | UserName | sysname | Credentials | Confidential | 20 | MEDIUM | not contact info |
Person | Password | PasswordHash | varchar(128) | Credentials | Confidential | 20 | MEDIUM | not contact info |
Person | Password | PasswordSalt | varchar(10) | Credentials | Confidential | 20 | MEDIUM | not contact info |
Sales | CreditCard | CreditCardID | int | Credit Card | Confidential | 20 | MEDIUM | not contact info |
Sales | CreditCard | CardType | nvarchar(50) | Credit Card | Confidential | 20 | MEDIUM | not contact info |
Sales | CreditCard | CardNumber | nvarchar(25) | Credit Card | Confidential | 20 | MEDIUM | not contact info |
Sales | CreditCard | ExpYear | smallint | Credit Card | Confidential | 20 | MEDIUM | not contact info |
Sales | PersonCreditCard | CreditCardID | int | Credit Card | Confidential | 20 | MEDIUM | not contact info |
Sales | SalesOrderHeader | CreditCardID | int | Credit Card | Confidential | 20 | MEDIUM | not contact info |
Sales | SalesOrderHeader | CreditCardApprovalCode | varchar(15) | Credit Card | Confidential | 20 | MEDIUM | not contact info |
HumanResources | Employee | BirthDate | date | Date Of Birth | Confidential – GDPR | 20 | MEDIUM | not contact info |
I’m very confused:
What do you think?
Advertisement
Answer
According to
exec sp_describe_first_result_set N'select * from sys.sensitivity_classifications sc'
information_type is a sql_variant. Which is not per the docs but should not be an issue as compatible types are converted to sql_variant for comparison without issue. EG
select case when cast(N'Contact Info' as sql_variant) = 'Contact Info' then 1 else 0 end
returns
1
So this is potentially a bug. You can work around by casting explitly to nvarchar. eg
SELECT schema_name(O.schema_id) AS schema_name, O.NAME AS table_name, C.NAME AS column_name, [Type] = CASE WHEN ct.[name] IN ('varchar', 'char') THEN ct.[name] + '(' + IIF(c.max_length = -1, 'max', CAST(c.max_length AS VARCHAR(25))) + ')' WHEN ct.[name] IN ('nvarchar','nchar') THEN ct.[name] + '(' + IIF(c.max_length = -1, 'max', CAST(c.max_length / 2 AS VARCHAR(25)))+ ')' WHEN ct.[name] IN ('decimal', 'numeric') THEN ct.[name] + '(' + CAST(c.precision AS VARCHAR(25)) + ', ' + CAST(c.scale AS VARCHAR(25)) + ')' WHEN ct.[name] IN ('datetime2') THEN ct.[name] + '(' + CAST(c.scale AS VARCHAR(25)) + ')' ELSE ct.[name] END, information_type, label, rank, rank_desc, CAST( CASE WHEN cast(information_type as nvarchar(200)) = 'Contact Info' THEN 'contact info' ELSE 'not contact info' END AS varchar(max)) as Checking_Content FROM sys.sensitivity_classifications sc JOIN sys.objects O ON sc.major_id = O.object_id JOIN sys.columns C ON sc.major_id = C.object_id AND sc.minor_id = C.column_id JOIN sys.types ct ON C.user_type_id = ct.user_type_id --AND EP.minor_id = C.column_id order by information_type