by datastudy.nl

Field notes for enterprise data engineers and scientists

Engineering

Snowflake RBAC and masking: lock it down without grinding to a halt

Snowflake access control is role-based: privileges attach to roles, not users. Splitting access roles from functional roles collapses thousands of direct grants to a few hundred, and one masking policy can protect thousands of columns at query time.

Bar chart comparing grants to manage: about 10,000 with direct user grants versus 260 with functional and access roles, on a log scale
Illustrative: a role hierarchy collapses the number of grants to manage from roughly 10,000 direct user grants to about 260. Snowflake documentation on access control.

Governance is where a fast-moving Snowflake account quietly turns into a liability. Someone needs access, an admin grants it straight to their user, and a year later nobody can answer who can see the salary table. The fix is not more process, it is using the access model Snowflake actually gives you. Snowflake is role-based: privileges are granted to roles, and roles are granted to users, never privileges to users directly. Get the role design right and you collapse what would be roughly 10,000 individual grants down to a few hundred while making access auditable instead of archaeological.

This guide is for the data engineer or platform owner who has to keep Snowflake both usable and defensible: analysts unblocked, auditors satisfied, and PII not leaking into a dashboard. We will stick to how Snowflake's RBAC and masking actually behave.

Why grant to roles and not users?

In Snowflake, a privilege (say, SELECT on a schema) is granted to a role, and a user gets access by being granted that role. Grant directly to users and the math explodes: fifty users who each need access to two hundred objects is up to ten thousand grants to create, track, and eventually revoke. Miss a few on offboarding and you have a standing audit finding.

Bar chart of grants to manage: about 10,000 with direct user grants versus 260 with functional and access roles, log scale
Illustrative: a two-layer role hierarchy collapses grant sprawl from roughly 10,000 direct user grants to about 260. Source: Snowflake documentation on access control.

The pattern that scales is a two-layer role hierarchy, which Snowflake's own guidance recommends. Access roles own the privileges on objects: analytics_db_read holds SELECT on the analytics schemas, raw_db_write holds write on the landing zone. Functional roles map to jobs: data_analyst, data_engineer, bi_developer. You grant access roles to functional roles, and functional roles to people. Now a new analyst gets exactly one grant, data_analyst, and inherits everything that role is supposed to see.

-- Access role owns the object privileges.
CREATE ROLE analytics_read;
GRANT USAGE ON DATABASE analytics TO ROLE analytics_read;
GRANT USAGE ON SCHEMA analytics.marts TO ROLE analytics_read;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics.marts TO ROLE analytics_read;

-- Functional role = the job. It inherits the access role.
CREATE ROLE data_analyst;
GRANT ROLE analytics_read TO ROLE data_analyst;

-- People get one grant: the job.
GRANT ROLE data_analyst TO USER jdoe;

Two discipline points keep this clean. Keep ACCOUNTADMIN for break-glass only and run day-to-day administration through SECURITYADMIN and SYSADMIN, because handing out the top role defeats the whole hierarchy. And give every object a deliberate owner role, so OWNERSHIP is not scattered across whoever happened to run the CREATE.

How do I protect specific columns without copying data?

RBAC controls which tables a role can touch. It does not, on its own, hide the ssn column from an analyst who legitimately needs the rest of the row. That is what Dynamic Data Masking is for, and it is the governance feature that earns its keep fastest.

A masking policy is a schema-level object that rewrites a column's value at query time based on the querying role. The data on disk never changes; what a user sees is decided per query.

CREATE MASKING POLICY mask_email AS (val string) RETURNS string ->
    CASE
        WHEN CURRENT_ROLE() IN ('PII_READER', 'SECURITYADMIN') THEN val
        ELSE REGEXP_REPLACE(val, '.+@', '****@')
    END;

ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY mask_email;

Now PII_READER sees real email addresses and everyone else sees ****@example.com, from the same table, with no copy and no second pipeline. The leverage Snowflake's docs call out: you write the policy once and apply it to thousands of columns, and you can change the policy's logic centrally without reapplying it anywhere. Pair it with tag-based masking, attach the policy to a pii_email tag and tag the columns, and protection follows the data automatically as new tables appear. Row-level filtering has a sibling feature, row access policies, which hides whole rows by role using the same query-time model.

A few realities to plan around:

  • Dynamic Data Masking is an Enterprise Edition feature. If you are on Standard, it is not available, and that may itself be the reason to upgrade.
  • A column takes one masking policy at a time. You cannot stack two; decide the policy per column.
  • Masking interacts with materialized views. Apply policies to the base table columns, not to a materialized view, or you will hit errors.

Where masking gets genuinely powerful is in combination with tags. Define the sensitivity taxonomy once, pii, financial, restricted, attach a masking policy to each tag, and then governance becomes a tagging exercise rather than a per-column chase. Snowflake can also help you find what to tag: sensitive data classification scans columns and proposes semantic categories like email or phone number, so you are not auditing thousands of columns by eye. The pattern that scales is classify, tag, mask-by-tag, because new tables that inherit a tagged column's lineage pick up protection automatically instead of waiting for someone to remember.

How do I roll this out on an account that is already messy?

You rarely get a greenfield account. Retrofit in order, and audit as you go.

  1. See what exists. SHOW GRANTS and the ACCOUNT_USAGE.GRANTS_TO_USERS view expose every direct-to-user grant. That list is your cleanup backlog and, usually, a sobering one.
  2. Stand up the hierarchy alongside the mess. Create the access and functional roles, grant the access roles into them, and move users onto functional roles one team at a time. Nothing breaks for users still on the old grants while you migrate.
  3. Revoke the direct grants last. Once a team is fully on functional roles, strip their direct-to-user grants. Now access is described entirely by which roles a person holds, which is exactly what an auditor wants to see.
  4. Classify and mask the sensitive columns. Tag PII columns, attach masking policies through the tags, and verify with the POLICY_REFERENCES view that every sensitive column is actually covered.

Audit continuously, not at year-end: ACCOUNT_USAGE.POLICY_REFERENCES lists every object a masking policy is set on, and the GRANTS_TO_ROLES view lets you answer "who can read this table" with a query instead of a meeting. The same instinct from the rest of the Snowflake guides applies, measure the real state in ACCOUNT_USAGE rather than trusting the diagram on the wiki.

The test that proves it works

Pick your most sensitive table and ask: can you answer "who can see this column, and what do they see" with a single query? If yes, your RBAC and masking are doing their job. If it takes a meeting and three Slack threads, the access model is living in people's heads instead of in Snowflake, and that gap is exactly what fails an audit.

Sources