Skip to content
Advertisement

How can I restrict access to sensitive columns in Apache Drill view based on user permissions in another view?

Background: I have users connect to Apache Drill with Kerberos authentication to read from a Parquet file so basically a single table with multiple columns. Some of the columns in that file are known to be sensitive and only certain users can see them. Apart from the data table Drill has access to another table with information who has access to sensitive data (2 columns there: userId, sensitiveDataAccess). To emphasize, users can see all rows in the data table, but only those who have access to sensitve data can see the sensitive columns.

Advertisement

Answer

The solution is to create a view joining the data table with a row from the security table containing information about access to sensitive data for the logged in user and then using conditions in the SELECT clause to nullify sensitive columns if a user does not have access to them.

SELECT
  hc.name,
  CASE WHEN sec.`sensitiveDataAccess`=TRUE THEN hc.`salary` ELSE null END AS salary, --example of a sensitive column
FROM dfs.`/data/headcount.parquet` hc
JOIN (SELECT * FROM dfs.`/data/security.parquet` WHERE userId=session_user) sec
ON sec.userId=session_user;

You might need to enable cartesian joins in Drill to make it work or add a dummy column with zeroes in both tables and then add the below to the join condition as a workaround:

AND hc.JoinHack=sec.JoinHack 
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement