Snowflake Role-Based Access: Best Practices Design Guide

·

9 min read

This article describes a design pattern to deliver a simple but scalable and extensible role-based access control architecture for Snowflake. It removes a huge amount of the complexity in delivering an RBAC design and sets the standard for how RBAC should be deployed on Snowflake.

Three-Part Series on RBAC

This article is part of a series of three, including:

  1. Introducing Role Based Access Control, an introduction to RBAC concepts

  2. Snowflake System Defined Roles - Best Practice, which describes Best Practices for RBAC design

  3. This article is on how to architect an RBAC solution.

    Together, they describe the underlying RBAC concepts and best practices needed to deploy a robust but simple Role-Based Access architecture.

A Quick Summary of RBAC Concepts

The diagram below illustrates a typical RBAC scenario demonstrating the nature of granting access to database objects and the difference between ownership of a role and an object.

The diagram above illustrates several important RBAC concepts, which may be summarised under the headings of:

  • Object Ownership: The PROD_DATA_ADMIN The database, Schema, and Tables role owns and can, therefore, Insert, Update, and delete from the tables and grant access to other roles. This role effectively manages the objects.

  • Role Ownership: PROD_ROLE_ADMIN owns the PROD_DATA_ADMIN Role. However, it has no access to the underlying data, which would require a grant. It could, however, grant the PROD_DATA_ADMIN role to other users or roles (including itself). Effectively, this role manages a role.

  • Inheritance: PROD_DATA_ADMIN Has been granted the PROD_USER Role. This means it inherits all underlying grants and effectively owns the Database, Schema, and Tables. This means RICK has the same privileges on the database as LEON.

  • Object Grants: The PROD_READ role has been granted USAGE privilege on the Database and Schema and SELECT on the tables. This means RACHEL can query the data, but unlike LEON or RICK, it cannot alter the table structure.

It is worth pausing to fully understand these concepts, as they form the underlying RBAC principles we will use to implement the solution.

Simplifying Grants

Albert Einstein says, “Everything Should be as Simple as Possible and no Simpler.”

While this should be a key tenet of any design, it is especially so for Role-Based Access design, where there is almost unlimited scope for complexity. To understand the need for simplicity, fast-forward five years and consider the database landscape for your project. You’ll likely have over 5,000 tables and data accessed by over 20 roles accessed by thousands of users.

Consider the number of grants alone needed to support this system. As every table is potentially accessible using SELECT, INSERT, UPDATE, and DELETE access, you will need between 5,000 and 20,000 grants per role.

In the worst case, you might need up to 400,000 individual grants. It’s almost impossible to visualize that number of privileges, let alone verify they are correct. We need a simple solution.

The SQL statement below shows the way to simply object-level grants.

grant select, insert, update, delete, references
   on all tables
   in schema main
   to role PROD_ANALYST;

Using this single statement, we can reduce the number of individual grants by a factor of 20, which means you can be 100% sure of the access granted to every user with the PROD_ANALYST role.


Want to learn more from a Snowflake Expert? Click on the image below for training from Analytics Today.

Snowflake training by Analytics Today


Future Grants

We can further simplify granting access to new objects with Future Grants. The number of tables in a given schema will never remain static; additional tables and views will be added over time. To ensure any newly added tables are accessible in the same way, Snowflake introduced the concept of Future Grants.

grant select, insert, update, delete, references
   on FUTURE tables
   in schema main
   to role PROD_ANALYST;

The SQL above shows the syntax for creating a rule whereby any additional tables created in the same schema automatically trigger the creation of the same grants to the PROD_ANALYST role.

Using this feature means that once future grants are created, they are automatically applied whenever new objects are created.

Controlling Grants with Managed Access Schemas

As indicated above, whenever a role owns an object, they have full control over it, including the ability to grant access to others.

The diagram below illustrates a typical situation whereby the database and schema are owned by a central administrator (SYSADMIN), while the DATA_ANALYST role owns the Tables and can create and modify tables in the schema.

The above scenario has a major drawback in that the owner of the table can grant access to any other role. This means there’s a potential risk that LEON could grant SELECT on the tables to anybody else on the account, which is a potential security risk.
The diagram below illustrates a solution in which the schema is marked as a managed access schema.

In the above scenario, although the PROD_DATA_ANALYST role has created and, therefore owns tables in the MAIN schema, only the schema owner (SYSADMIN) can grant access to others. In this case, only the central admin (RACHEL) can grant access to the database, schema, and table for the PROD_USER role. The SQL needed to create or modify a managed access schema is listed below.

create schema main
    with managed access;

alter schema main
    with managed access;

Role-Based Access Architecture

As indicated above, it’s vital to simplify the Role Based Access solution, and this can be achieved by organizing roles in a sequence of layers. The diagram below illustrates the overall RBAC architecture.

The diagram above illustrates the overall multi-layer RBAC architecture, which is consistent with the following layers.

  • Database Objects: At the lowest level are one or more databases containing schemas and database objects, which must be secured.

  • Security Roles: These are the first layer in role-based access and provide a predefined set of access controls. Each role defines (and simplifies) access to a given database and schema and provides Read, Write, Execute, or Full access to the underlying schema objects. These roles use Future Grants to automatically provide Select, Insert, Update, Delete, or Execute access to underlying data. Full Access grants ownership access, and in the above example, the PROD_DATA_SCIENTIST role can create or drop tables in the SANDBOX schema.

  • Functional Roles: These are the second layer of role-based access, and these roles are granted to individual users to perform a task. Unlike the other roles in the account, these are named to indicate the task assigned the individual, for example PROD_BI_ANALYST or DEV_ENGINEER.

  • Environment Admin Roles: These are the first layers of account control, and they give the ability to delegate administrative responsibility to an environment*.* A Snowflake Environment is not a physical object but an arbitrary collection of Roles, Virtual Warehouses, and Databases managed as a single unit. This could be used, (for example), to provide access to a UAT, TEST or PROD environment. The roles at this level are:

    • PROD_SECURITY_ADMIN: This role creates and owns all the roles in the environment. It performs a similar role to USERADMIN but for a given environment. This role has the privilege to CREATE ROLES but can only manage the roles it owns within its own environment.

    • PROD_SYSADMIN: This role creates and owns the Database and Schemas in the environment. It performs a similar role to SYSADMIN but for a given environment. This role has the privilege to create databases.

  • Account Admin Roles: Unlike the Environment Admin roles, which are limited to a single environment, these roles have control over every Virtual Warehouse, Role, Database, and Schema in the entire Snowflake account. As defined in the RBAC Best Practices, these roles inherit privileges from the underlying Environment Admin Roles, maintaining a consistent RBAC hierarchy.


Snowflake Training by an Expert.

Click in the image below for more detail


RBAC Environment Administration

As indicated above, one of the main challenges faced by Snowflake Administrators is the mounting complexity involved in managing an ever-expanding system. The only way to realistically scale the solution is to automate the building and replication of both data and the associated RBAC hierarchy. While Snowflake provides cloning for rapidly producing data copies, there is no corresponding equivalent to replicate an RBAC hierarchy.

The diagram below illustrates how two or more environments can be created using a well-designed RBAC architecture; in this case, the PROD and TEST environments.

The diagram above illustrates how the multi-layer RBAC architecture can be scaled up to support multiple independent environments. While the USERADMIN and SYSADMIN roles control the entire account, the environment admin roles for each environment only control their own environment and cannot impact other areas in the system.

Notice the strictly defined naming convention for Roles and Databases, which must also extend to Virtual Warehouse names. Each role is prefixed with a short code to indicate the environment (eg. PROD or TEST), and the XXX_SECURITY_ADMIN role owns all roles in the environment. In addition, a set of Security Roles are automatically created for each schema consisting of: XXX_DATABASE_SCHEMA_TYPE. Where the TYPE is one of READ, WRITE, EXECUTE or FULL and DATABASE and SCHEMA Identify the corresponding database and schema to which they are granted access.

Because a strict naming convention has been observed, it is possible (indeed almost certainly necessary) to create a stored procedure to both clone a schema and replicate the associated RBAC hierarchy.

While it may not be evident at this point, the entire solution can (and has been) scripted. This means the entire task of deploying and replicating RBAC can be automated, reducing the risk of manual mistakes. It also means that once 99% of the RBAC solution is scripted, the only remaining effort is to decide which access to give to each Functional Role. For example, the SQL below grants access to several functional roles and users.

grant role PROD_EDW_SANDBOX_FULL
   to role PROD_DATA_SCIENTIST;

grant role PROD_EDW_MAIN_READ
   to role PROD_DATA_SCIENTIST;

grant role PROD_EDW_MAIN_READ
   to role PROD_BI_ANALYST;

Conclusion

Even the smallest analytics application or Data Warehouse will eventually grow and administrators risk being overwhelmed by the complexity of grants and ownership. It’s a situation I have witnessed at many Snowflake customer deployments.

If you encounter a similar challenge with Snowflake, you need to take control and deploy a simple but scalable Role-Based Access architecture.

Suppose you are lucky to read this before deploying the production system. In that case, you are strongly advised to prepare the groundwork with a scalable solution based on the advice in these articles.

During the past two years, I have helped over fifty customers understand and deploy role-based access solutions, and not one asked for a more complex solution. The fact is, RBAC is almost entirely flexible and has an unlimited capacity for complexity. You’ll be able to manage only by designing out the complexity to deliver a robust, architected solution.

Three-Part Series on RBAC

This article is part of a series of three, including:

  1. Introducing Role Based Access Control, an introduction to RBAC concepts

  2. Snowflake System Defined Roles - Best Practice, which describes Best Practices for RBAC design

  3. This article is on how to architect an RBAC solution.


Snowflake Training from an Snowflake Superhero.

Click on the image below for more information.

Analytics Today Snowflake Training