By Seppe van Winkel – Data Engineer
Hi, my name is Seppe van Winkel, data engineer at InfoFarm, and I’m taking over the blog series about cloud data platforms this week! In the previous blog on Azure data platforms, we talked about ETL on Synapse. Today, I’ll show you how to do user and access management inside Azure. How do you secure a specific column? How do you deny people access to specific files? All answers below. Enjoy reading!
Overview of user and access management
In Azure there are two main components to user and access management:
For user management there is Azure Active Directory (Azure AD). If you already have a Microsoft 365/Office 365 subscription, you already have an Azure AD. It allows you to create users in your directory or invite them to it, and to manage them afterwards.
For access management there is RBAC (role-based access control). Role management means assigning rights to a user in the Active Directory.
Azure Active Directory
Azure Active Directory is a cloud-based identity and access management service. It can be used to grant people access to external resources such as Microsoft 365, Azure and others.
It can also be used to secure internal resources such as your company’s cloud apps or intranet resources.
The AD is attached to your company domain or, by default, to a Microsoft domain ending in .onmicrosoft.com. With Microsoft 365/Office 365 this is already the case. You can then invite users inside your domain, connect a different AD or invite external users.
Azure AD also supports many features that make your AD safer. Some of them require a paid license:
MFA (multi-factor authentication).
Conditional Access: if user X signs in from a given device at a given location, they can access a specific resource.
Just-in-time (JIT) access: it’s bad practice to keep RDP/SSH ports open in the network all the time. With JIT you request access, and Azure opens the port for that period and closes it again afterwards.
And many more.
Azure Role-Based Access Control
Azure Role-Based Access Control (RBAC) is a system for managing access to Azure resources. It allows administrators to set fine-grained permissions for Azure resources, ensuring that users and applications only have access to the resources they need.
A role assignment consists of three elements: security principal, role definition and scope.
The security principal is the object that requests access to Azure resources. This can be a user, a group or even a resource itself.
The role definition is a collection of permissions. Azure contains several built-in roles, but you can also create custom roles.
Finally, the scope is the set of resources that the access applies to. A scope can be either a resource or one of its higher-level containers (resource group, subscription or management group).
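To make the three elements concrete, here is a minimal Python sketch of how a role assignment could be evaluated. It is a toy model, not the Azure SDK: the class names, the action strings, the resource IDs and the simplified prefix matching on scopes are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Toy model of an Azure RBAC role assignment; this is NOT the Azure SDK.
# All IDs, role names and action strings are made up for illustration.

@dataclass(frozen=True)
class RoleDefinition:
    name: str
    actions: frozenset  # permitted actions, e.g. "vm/read"

@dataclass(frozen=True)
class RoleAssignment:
    principal: str        # user, group or managed identity
    role: RoleDefinition
    scope: str            # resource, resource group, subscription, ...

def scope_covers(scope: str, resource_id: str) -> bool:
    """A scope covers itself and everything nested below it."""
    scope = scope.rstrip("/")
    return resource_id == scope or resource_id.startswith(scope + "/")

def is_allowed(assignments, principal, action, resource_id) -> bool:
    """Allow if any assignment of the principal grants the action in scope."""
    return any(
        a.principal == principal
        and action in a.role.actions
        and scope_covers(a.scope, resource_id)
        for a in assignments
    )

vm_contributor = RoleDefinition("VM Contributor (toy)", frozenset({"vm/read", "vm/restart"}))
assignments = [
    RoleAssignment("grp-vm-admins", vm_contributor,
                   "/subscriptions/sub1/resourceGroups/rg-data"),
]

vm_in_scope = "/subscriptions/sub1/resourceGroups/rg-data/vm/vm01"
vm_elsewhere = "/subscriptions/sub1/resourceGroups/rg-other/vm/vm02"

print(is_allowed(assignments, "grp-vm-admins", "vm/restart", vm_in_scope))   # True
print(is_allowed(assignments, "grp-vm-admins", "vm/restart", vm_elsewhere))  # False
```

Because the assignment sits on the resource group, it also applies to every resource inside that group, but not to resources in another resource group.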
Do you want to know more about Azure RBAC?
One way to manage access in Azure is to use Azure Active Directory (AD) users and groups. These are created and managed within Azure AD and can be used to grant access to Azure resources. For example, you could create an AD group called “Virtual Machine Admins” and add users to that group who should have the ability to manage virtual machines in Azure. Then, you can assign the “Virtual Machine Contributor” role to the “Virtual Machine Admins” group, giving members of that group the permissions they need to manage virtual machines.
Another way to manage access in Azure is to use managed identities. A managed identity is a special type of identity that is managed by Azure and can be used to authenticate to Azure resources. Managed identities can be used in place of AD users or groups to grant access to Azure resources. For example, you could create a managed identity for an Azure Virtual Machine and assign it the “Virtual Machine Contributor” role, giving it the permissions it needs to manage itself.
When do you use a user and when a managed identity? It depends on who or what needs the access. If your Synapse Analytics workspace needs access to a storage account, you have to grant its managed identity read/write access on that storage account.
If you want your users to access it, you have to grant those rights to the user or the group of users.
Overall, Azure RBAC provides a flexible and granular way to manage access to Azure resources, allowing you to give users and applications the access they need while still maintaining control over your Azure environment.
For example, when granting a user access to Synapse Analytics, you can give access to the whole workspace or to a specific item, and then choose a specific built-in Synapse role.
In this example we grant rights to the whole workspace and select Synapse Contributor as the role. This means the user can do most things, except grant roles to other people and manage endpoints.
More granular
But this way, a person, group or service can access the whole resource. You can’t grant access to a specific record of data. For a more granular approach we have a few options: row-level security (RLS), column-level security (CLS) and access control lists (ACLs) in storage.
Row-level security (RLS) and column-level security (CLS) are important features of Azure Synapse Analytics that help to secure data within a workspace. RLS allows administrators to control access to rows of data within a table, based on the user or group that is accessing the data. This is useful for scenarios where different users or groups of users should only have access to certain rows of data within a table. For example, an organization may have a table containing sensitive customer data, and RLS could be used to ensure that only authorized users can access this data.
CLS, on the other hand, allows administrators to control access to specific columns within a table. This is useful for scenarios where certain columns within a table contain sensitive or restricted information, and access should be restricted to only authorized users. For example, an organization may have a table containing employee data, and CLS could be used to restrict access to the salary column to only authorized users.
Both RLS and CLS can be implemented in Azure Synapse Analytics using T-SQL statements or by using the Azure portal. It is important to properly set up and manage RLS and CLS to ensure that sensitive data is protected and only accessed by authorized users.
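To make the row-level idea concrete, here is a small Python sketch of what a filter predicate does. It is only an illustration, not Synapse code: in a Synapse SQL pool the predicate would be a T-SQL function attached to the table through a security policy. The table contents, the user names and the rule that a "manager" sees everything are assumptions.

```python
# Toy illustration of a row-level security filter; not Synapse code.
# Table contents and user names are made up for this example.

SALES = [
    {"order_id": 1, "sales_rep": "alice", "amount": 120},
    {"order_id": 2, "sales_rep": "bob", "amount": 75},
    {"order_id": 3, "sales_rep": "alice", "amount": 300},
]

def filter_predicate(row: dict, current_user: str) -> bool:
    """Return True if current_user may see this row."""
    return current_user == "manager" or row["sales_rep"] == current_user

def select_all(table, current_user):
    """Every query passes through the predicate, so hidden rows never appear."""
    return [row for row in table if filter_predicate(row, current_user)]

print([r["order_id"] for r in select_all(SALES, "alice")])    # [1, 3]
print([r["order_id"] for r in select_all(SALES, "bob")])      # [2]
print([r["order_id"] for r in select_all(SALES, "manager")])  # [1, 2, 3]
```

The user does not get an error. The rows they are not allowed to see are simply left out of the result.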
For example, an employee table has a ‘Salary’ field that shouldn’t be accessible to everyone.
‘CREATE TABLE Employees (EmployeeId int IDENTITY, FirstName varchar(100) NULL, SSN char(9) NOT NULL, LastName varchar(100) NOT NULL, Phone varchar(12) NULL, Email varchar(100) NULL, Salary int NOT NULL);’
We will allow a group of users who handle onboarding to access the data, but not the salary and social security number.
‘GRANT SELECT ON Employees(EmployeeId, FirstName, LastName, Phone, Email) TO [GRP-Onboarding];’
And we’ll grant HR the ability to query those two fields as well.
‘GRANT SELECT ON Employees(EmployeeId, FirstName, LastName, Phone, Email, SSN, Salary) TO [GRP-HR];’
When a member of GRP-Onboarding runs a query that selects all fields, they get an error.
‘SELECT * FROM Employees; -- Msg 230, Level 14, State 1, Line 12 -- The SELECT permission was denied on the column 'SSN' of the object 'Employees', database 'TestDb', schema 'dbo'.’
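The behaviour above can be mimicked with a few lines of Python. This is a toy model of column-level grants, not how Synapse works internally. The group names come from the example, while the grant table, the function name and the assumption that HR may read every column are illustrative.

```python
# Toy model of column-level grants; not how Synapse is implemented.
# Assumption: HR may read every column, onboarding everything
# except SSN and Salary.

ALL_COLUMNS = ["EmployeeId", "FirstName", "SSN", "LastName", "Phone", "Email", "Salary"]

GRANTS = {
    "GRP-Onboarding": {"EmployeeId", "FirstName", "LastName", "Phone", "Email"},
    "GRP-HR": set(ALL_COLUMNS),
}

def check_select(group: str, columns=None):
    """Return the columns to read, or raise on the first column without a grant."""
    requested = ALL_COLUMNS if columns is None else columns  # None models SELECT *
    granted = GRANTS.get(group, set())
    for column in requested:
        if column not in granted:
            raise PermissionError(f"SELECT permission denied on column '{column}'")
    return list(requested)

print(check_select("GRP-Onboarding", ["FirstName", "Email"]))  # ['FirstName', 'Email']

try:
    check_select("GRP-Onboarding")  # SELECT * touches SSN first
except PermissionError as err:
    print(err)  # SELECT permission denied on column 'SSN'
```

In practice this means the onboarding group has to name the columns it is allowed to read instead of using SELECT *.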
Access control lists (ACLs) in Azure Data Lake Storage Gen2, that is, a storage account with a hierarchical namespace, provide a way to set fine-grained permissions on the directories and files inside a storage account. These permissions control who has access to them and what actions they are allowed to perform.
There are two types of ACLs: access ACLs and default ACLs. An access ACL controls access to a specific file or directory. A default ACL is a template on a directory that determines the access ACLs of the items created inside it. Changing a default ACL later does not affect items that already exist.
An ACL is made up of entries, each of which pairs an identity with a set of permissions. Identities are Azure AD security principals: users, groups, service principals and managed identities. The permissions are read, write and execute. ACLs can be managed using the Azure portal, Azure Storage Explorer, Azure PowerShell, Azure CLI or the Azure Storage REST API.
The ACL rights are combined with the RBAC rights on the resource, and RBAC is evaluated first. So if you have the Storage Blob Data Reader role on a storage account, you can read all the data in it.
This means you cannot use an ACL to deny access to a path once the RBAC role has been granted.
For example, the following table shows the additional ACL entries you need to execute certain operations, based on your role. N/A means you do not need any additional ACL.
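The evaluation order can be sketched in Python as follows. This is a simplified model and not the Azure SDK: the role-to-permission mapping, the paths and the user names are assumptions, and a real ACL check also requires execute permission on the parent directories.

```python
# Toy model of the RBAC-then-ACL evaluation order; not the Azure SDK.
# The permissions per role are simplified, paths and users are made up.

RBAC_DATA_ROLES = {
    "Storage Blob Data Reader": {"read"},
    "Storage Blob Data Contributor": {"read", "write"},
}

def can_access(user, path, operation, role_assignments, acls) -> bool:
    """Check RBAC first; fall back to the ACL on the path only if RBAC says no."""
    for role in role_assignments.get(user, []):
        if operation in RBAC_DATA_ROLES.get(role, set()):
            return True  # RBAC grants access, the ACL is never consulted
    return operation in acls.get(path, {}).get(user, set())

role_assignments = {"alice": ["Storage Blob Data Reader"]}
acls = {
    "/raw/hr/salaries.csv": {"bob": {"read"}},  # alice has no ACL entry here
}
path = "/raw/hr/salaries.csv"

print(can_access("alice", path, "read", role_assignments, acls))   # True (via RBAC)
print(can_access("bob", path, "read", role_assignments, acls))     # True (via ACL)
print(can_access("carol", path, "read", role_assignments, acls))   # False
print(can_access("alice", path, "write", role_assignments, acls))  # False
```

Alice can read the salary file even though no ACL entry mentions her. If a path really has to be off limits for her, the RBAC role must go and her access has to be granted through ACLs instead.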
If you’re excited about our content, make sure to follow the InfoFarm company page on LinkedIn and stay informed about the next blog in this series. Interested in what a data platform would look like for your organization? Book a meeting with one of our data architects and we’ll tell you all about it!
Want to start building your own data platform straight away? Take a look at the InfoFarm One Day Data Platform. A reference architecture in both AWS and Azure. We get you going with a fully operational data platform in only one day! More info on our website.