Data Today: Snowflake

Snowflake warehouse sizing: stop paying for idle compute

2026-06-07T00:00:00Z

If your Snowflake bill jumped last quarter and nobody changed the data, the culprit is almost always compute, and almost always a warehouse that is bigger or busier than the work needs. A Snowflake virtual warehouse is just a cluster you rent by the second, and its size is the single dial that sets how fast credits drain. Get the dial wrong and you either throttle every analyst or you torch budget on idle compute. An XS warehouse burns 1 credit per hour; a 4XL burns 128 for the same wall-clock hour. That is a 128x spread on one setting, and most teams pick it once and never revisit it.

This guide is for the person who owns the warehouse and answers to finance when the credit line spikes. The goal is to size by workload, not by reflex, and to prove the change with numbers you can pull from your own account.

How does warehouse size actually behave?

Snowflake sizes go XS, S, M, L, XL, then 2XL through 6XL. Each step up doubles the compute, and it doubles the credit rate in lockstep. The pricing is not a curve you have to model; it is a clean power of two.

Credits per hour double with every Snowflake warehouse size, from 1 at XS to 128 at 4XL. Source: Snowflake documentation.

The trap is assuming bigger is wasteful and smaller is safe. It is the opposite as often as not. A larger warehouse finishes a heavy query faster, and because billing is per-second with a 60-second minimum each time the warehouse starts, a query that runs in 2 minutes on an L (8 credits per hour) costs the same as one that crawls for 8 minutes on an S (2 credits per hour): about 0.27 credits either way. The size that wins is the smallest one that does not spill to disk and does not queue. Doubling the size breaks even when it halves the runtime and saves money only when it more than halves it, which can hold for large scans and big joins but breaks for small, serial, or metadata-bound queries.
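The break-even arithmetic in that paragraph can be checked directly. This is a minimal sketch, assuming the standard rates of 2 credits per hour for S and 8 for L, and treating each run as the only work on a freshly resumed warehouse so the 60-second minimum applies. The runtimes are the illustrative ones from the text, not measurements.

```sql
-- Credits billed for one run = hourly rate * billed seconds / 3600.
-- GREATEST(..., 60) models the 60-second minimum charged on each resume.
SELECT
    column1 AS warehouse_size,
    column2 AS credits_per_hour,
    column3 AS runtime_s,
    ROUND(column2 * GREATEST(column3, 60) / 3600, 3) AS credits_billed
FROM VALUES
    ('S', 2, 480),   -- 8 minutes on a Small
    ('L', 8, 120);   -- 2 minutes on a Large
```

Both rows bill the same amount, which is the point: the larger size only costs more if it fails to cut the runtime in proportion.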

You can see the difference in the query profile. Two signals tell you a warehouse is too small for the job:

Spilling. When a query runs out of memory it spills to local then remote storage, and remote spill is brutally slow. Any remote spill is a sign to size up.
Queuing. If queries wait for a slot, the warehouse is saturated. That is a concurrency problem, not a size problem, and the fix is different.
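The queuing signal can be pulled from the same history view as spilling. A sketch, assuming read access to SNOWFLAKE.ACCOUNT_USAGE; QUEUED_OVERLOAD_TIME is the time in milliseconds a query waited because the warehouse was already busy. The 7-day window and the 50-row limit are arbitrary choices.

```sql
-- Hours in the last 7 days where queries waited for a slot,
-- per warehouse. Sustained queuing points at concurrency, not size.
SELECT
    warehouse_name,
    DATE_TRUNC('hour', start_time) AS hour_start,
    COUNT(*) AS queries,
    COUNT_IF(queued_overload_time > 0) AS queued_queries,
    ROUND(AVG(queued_overload_time) / 1000, 1) AS avg_queue_s
FROM snowflake.account_usage.query_history
WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
  AND warehouse_name IS NOT NULL
GROUP BY warehouse_name, DATE_TRUNC('hour', start_time)
HAVING COUNT_IF(queued_overload_time > 0) > 0
ORDER BY avg_queue_s DESC
LIMIT 50;
```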

What does it cost, and where does it break?

Sizing up fixes spilling. It does nothing for queuing, because a single larger cluster still runs a fixed number of concurrent queries before it starts holding them in a line. The right tool for a crowd of small queries is multi-cluster scaling: keep the size modest and let Snowflake add clusters under concurrency, then retire them when the rush passes. Reach for a bigger size when one query is slow; reach for more clusters when many queries are waiting.
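As a sketch of the multi-cluster lever: the statement below keeps a warehouse small and lets Snowflake add clusters under load. The warehouse name bi_wh and the cluster limits are assumptions for illustration, and multi-cluster warehouses require Enterprise Edition or higher.

```sql
-- Keep the size modest; scale out for concurrency instead of up.
-- bi_wh is a hypothetical warehouse name.
ALTER WAREHOUSE bi_wh SET
    WAREHOUSE_SIZE = 'SMALL'
    MIN_CLUSTER_COUNT = 1      -- one cluster when traffic is light
    MAX_CLUSTER_COUNT = 3      -- up to three during the morning rush
    SCALING_POLICY = 'STANDARD';
```

With MIN_CLUSTER_COUNT at 1 the extra clusters only bill while they are running, so the quiet hours cost the same as a single-cluster warehouse.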

Here is the comparison that matters when you are deciding which lever to pull:

Symptom	Right lever	Wrong lever	Why
One query spills to remote disk	Size up one step	Add clusters	Extra clusters do not give a single query more memory
Many short queries queue at 9am	Multi-cluster (min 1, max 3)	Size up	A bigger single cluster still serializes the crowd
Warehouse runs all day at 5% load	Auto-suspend at 60s	Bigger warehouse	You are paying for idle, not for slow
Nightly batch runs 40 minutes	Size up, measure cost	Leave it on an S	Faster finish can cost the same and frees the window

The most expensive mistake is not the wrong size at all. It is idle time. A warehouse left running with no queries bills every second until it suspends. Set auto-suspend to 60 seconds for interactive warehouses and confirm auto-resume is on so the next query wakes it. The classic 600-second default means each abandoned warehouse quietly bills ten minutes of nothing, over and over.
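A minimal sketch of that setting, plus a quick audit of every warehouse in the account. The name analyst_wh is hypothetical; the quoted lowercase column names are how SHOW WAREHOUSES returns its output.

```sql
-- Suspend after 60 idle seconds and wake on the next query.
ALTER WAREHOUSE analyst_wh SET
    AUTO_SUSPEND = 60
    AUTO_RESUME = TRUE;

-- Audit: list warehouses by auto-suspend setting, longest first.
-- A NULL or 0 value means the warehouse never suspends on its own.
SHOW WAREHOUSES;
SELECT "name", "size", "auto_suspend", "auto_resume"
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
ORDER BY "auto_suspend" DESC NULLS FIRST;
```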

How do I roll it out without guessing?

Do not size by vibes. Snowflake records every query and every credit in ACCOUNT_USAGE, so you can measure the real distribution before you touch anything. Start by finding your spillers, because those are where sizing up pays for itself:

-- Queries that spilled to remote storage in the last 7 days,
-- ranked by how much they spilled. These are your size-up candidates.
SELECT
    warehouse_name,
    query_id,
    ROUND(bytes_spilled_to_remote_storage / POWER(1024, 3), 1) AS remote_spill_gb,
    ROUND(total_elapsed_time / 1000, 1) AS elapsed_s,
    LEFT(query_text, 80) AS query_preview
FROM snowflake.account_usage.query_history
WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
  AND bytes_spilled_to_remote_storage > 0
ORDER BY remote_spill_gb DESC
LIMIT 50;

Then look at the other end: warehouses that bill credits while doing almost nothing. Pair the credit spend from WAREHOUSE_METERING_HISTORY with the query counts from QUERY_HISTORY, and any warehouse with high credits and low query volume is an auto-suspend problem, not a sizing one.
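One way to do that pairing is sketched below. It assumes read access to SNOWFLAKE.ACCOUNT_USAGE; the 7-day window is arbitrary, and "credits per query" is only a rough screen, not an official metric.

```sql
-- Credits billed vs. queries served, per warehouse, last 7 days.
-- High credits with few queries suggests an idle-time problem.
WITH credits AS (
    SELECT warehouse_name, SUM(credits_used) AS credits_7d
    FROM snowflake.account_usage.warehouse_metering_history
    WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
    GROUP BY warehouse_name
),
queries AS (
    SELECT warehouse_name, COUNT(*) AS queries_7d
    FROM snowflake.account_usage.query_history
    WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
      AND warehouse_name IS NOT NULL
    GROUP BY warehouse_name
)
SELECT
    c.warehouse_name,
    ROUND(c.credits_7d, 1) AS credits_7d,
    COALESCE(q.queries_7d, 0) AS queries_7d,
    ROUND(c.credits_7d / NULLIF(q.queries_7d, 0), 3) AS credits_per_query
FROM credits c
LEFT JOIN queries q ON q.warehouse_name = c.warehouse_name
ORDER BY credits_per_query DESC NULLS FIRST;
```

Warehouses that billed credits but served no queries sort to the top with a NULL ratio.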

Once you have the data, roll out in three controlled moves:

Set guardrails first. Put a RESOURCE MONITOR on each warehouse with a monthly credit quota and a suspend trigger, so a runaway query or a forgotten session cannot run up an open-ended bill. This is your seatbelt before you start changing sizes.
Change one warehouse, then measure. Resize a single warehouse with ALTER WAREHOUSE ... SET WAREHOUSE_SIZE, leave it for a few days, and compare credits and median runtime against the week before. Changing everything at once means you learn nothing.
Split workloads. Stop running dashboards, ad-hoc analysis, and heavy ELT through one shared warehouse. Give each workload its own warehouse so you can size and monitor them independently, and so a 3XL backfill never starves the analyst running a quick count.
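The three moves can be sketched as the statements below. Every name (analyst_wh, analyst_wh_monitor, elt_wh), the 200-credit quota, and the chosen sizes are assumptions for illustration; creating a resource monitor requires the ACCOUNTADMIN role.

```sql
-- 1. Guardrail: monthly quota, warn at 80%, suspend at 100%.
CREATE OR REPLACE RESOURCE MONITOR analyst_wh_monitor
    WITH CREDIT_QUOTA = 200
         FREQUENCY = MONTHLY
         START_TIMESTAMP = IMMEDIATELY
    TRIGGERS
        ON 80 PERCENT DO NOTIFY
        ON 100 PERCENT DO SUSPEND;

ALTER WAREHOUSE analyst_wh SET RESOURCE_MONITOR = analyst_wh_monitor;

-- 2. Change one warehouse, then measure for a few days.
--    Rolling back is the same statement with the old size.
ALTER WAREHOUSE analyst_wh SET WAREHOUSE_SIZE = 'MEDIUM';

-- 3. Split workloads: a dedicated warehouse for heavy ELT.
CREATE WAREHOUSE IF NOT EXISTS elt_wh
    WAREHOUSE_SIZE = 'LARGE'
    AUTO_SUSPEND = 60
    AUTO_RESUME = TRUE
    INITIALLY_SUSPENDED = TRUE;
```

DO SUSPEND lets running statements finish before the warehouse stops; SUSPEND_IMMEDIATE is the harsher option that cancels them.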

The rollback is trivial, which is why this is safe to try: a resize applies to queued and new queries while statements already running finish on the old size, and you can drop back a size in one statement if the cost does not move the way you expected.

If you run dbt or scheduled ELT, point those at their own warehouse and size it for the heaviest model in the run, not the average. Batch work is exactly where a larger warehouse earns its credits, because it finishes faster and hands the time window back. The rest of the Snowflake guides go deeper on the pipeline and governance side.

The one number to watch

Track credits per active query-hour, not raw credits. Raw credits go up when usage grows, which can be healthy. Credits burned per hour of actual query work is the number that exposes idle warehouses and oversized clusters, and it is the one that drops the day you right-size. Pull it weekly, watch the trend, and let the warehouse that does the least work for the most credits be the first one you fix.
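A sketch of that metric as a weekly trend. The definition used here is an assumption: "active query-hours" is taken as the sum of EXECUTION_TIME across queries, so concurrent queries each count in full. Adjust the definition to whatever your team agrees on, then keep it fixed so the trend stays comparable.

```sql
-- Credits per hour of query execution, per warehouse, per week.
WITH credits AS (
    SELECT warehouse_name,
           DATE_TRUNC('week', start_time) AS week_start,
           SUM(credits_used) AS credits
    FROM snowflake.account_usage.warehouse_metering_history
    WHERE start_time > DATEADD('week', -8, CURRENT_TIMESTAMP())
    GROUP BY warehouse_name, DATE_TRUNC('week', start_time)
),
work AS (
    SELECT warehouse_name,
           DATE_TRUNC('week', start_time) AS week_start,
           SUM(execution_time) / 1000 / 3600 AS query_hours  -- ms to hours
    FROM snowflake.account_usage.query_history
    WHERE start_time > DATEADD('week', -8, CURRENT_TIMESTAMP())
      AND warehouse_name IS NOT NULL
    GROUP BY warehouse_name, DATE_TRUNC('week', start_time)
)
SELECT
    c.warehouse_name,
    c.week_start,
    ROUND(c.credits, 1) AS credits,
    ROUND(w.query_hours, 1) AS query_hours,
    ROUND(c.credits / NULLIF(w.query_hours, 0), 2) AS credits_per_query_hour
FROM credits c
LEFT JOIN work w
  ON w.warehouse_name = c.warehouse_name
 AND w.week_start = c.week_start
ORDER BY c.week_start, credits_per_query_hour DESC NULLS FIRST;
```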

Dynamic Tables vs Streams and Tasks: which Snowflake pipeline to build

2026-06-07T00:00:00Z

Most Snowflake data teams hit the same fork. You have raw data landing in a table and you need a clean, transformed version that stays current. Do you write the transformation as a declarative Dynamic Table and let Snowflake keep it fresh, or do you wire up a Stream to capture changes and a Task to process them on a schedule you control? Pick wrong and you either fight the orchestration you did not need or hit a freshness wall you did not see coming. The single hard number that settles a lot of these arguments: Dynamic Tables cannot refresh faster than a 60-second target lag. If your SLA is tighter than a minute, that choice is already made for you.

This guide is for the engineer designing a transformation pipeline on Snowflake who wants to stop relitigating the same build-versus-declare debate on every table. We will use real Snowflake behavior, not vibes.

What is each one actually doing?

A Dynamic Table is a materialized result of a SELECT that Snowflake keeps up to date for you. You write the query and a TARGET_LAG, and Snowflake figures out the dependency graph, watches the base tables, and refreshes, incrementally when it can, so readers never see a partial result.

CREATE OR REPLACE DYNAMIC TABLE dt_orders
    TARGET_LAG = '10 minutes'
    WAREHOUSE = transform_wh
    REFRESH_MODE = INCREMENTAL
AS
    SELECT order_id, customer_id, order_date,
           TRIM(UPPER(product_name)) AS product_name,
           quantity * unit_price AS line_total
    FROM raw_orders
    WHERE order_status != 'returned';

That is the whole pipeline. No scheduler, no change-tracking code. Set TARGET_LAG = '10 minutes' and Snowflake tries to keep the table no more than ten minutes behind its source. Chain dynamic tables that read from each other and Snowflake refreshes them in dependency order against a consistent snapshot, which is the part that makes them genuinely pleasant for multi-step transforms.

Streams and Tasks are the imperative alternative. A Stream is a change-tracking cursor over a table: it records the rows inserted, updated, or deleted since you last consumed it. A Task runs a SQL statement (or calls a procedure) on a schedule or when a predecessor finishes. You compose them yourself: a Stream captures what changed, a Task reads the Stream and applies the change with a MERGE. You own the logic, the ordering, and the failure handling.
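For comparison with the Dynamic Table above, here is a minimal sketch of the same transform built imperatively. The target table orders_clean, the stream and task names, and the 10-minute schedule are assumptions for illustration. It also assumes order_id is unique in each batch and that deletes in raw_orders can be ignored; a real pipeline would handle both.

```sql
-- Change-tracking cursor over the raw table.
CREATE OR REPLACE STREAM raw_orders_stream ON TABLE raw_orders;

-- Task that applies the captured changes on a schedule.
-- The WHEN clause skips the run when the stream is empty.
CREATE OR REPLACE TASK merge_orders_task
    WAREHOUSE = transform_wh
    SCHEDULE = '10 MINUTE'
    WHEN SYSTEM$STREAM_HAS_DATA('raw_orders_stream')
AS
    MERGE INTO orders_clean AS t
    USING (
        SELECT order_id, customer_id, order_date,
               TRIM(UPPER(product_name)) AS product_name,
               quantity * unit_price AS line_total
        FROM raw_orders_stream
        -- Inserts and the new image of updated rows; deletes are ignored here.
        WHERE METADATA$ACTION = 'INSERT'
          AND order_status != 'returned'
    ) AS s
    ON t.order_id = s.order_id
    WHEN MATCHED THEN UPDATE SET
        customer_id = s.customer_id,
        order_date = s.order_date,
        product_name = s.product_name,
        line_total = s.line_total
    WHEN NOT MATCHED THEN INSERT
        (order_id, customer_id, order_date, product_name, line_total)
    VALUES
        (s.order_id, s.customer_id, s.order_date, s.product_name, s.line_total);

-- Tasks are created suspended; resume to start the schedule.
ALTER TASK merge_orders_task RESUME;
```

Everything the Dynamic Table did implicitly (change detection, scheduling, applying the delta) is now code you own.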

Illustrative: the lowest practical lag each Snowflake approach reaches. Dynamic Tables and minute-level Tasks both bottom out near 60 seconds; Snowpipe Streaming goes lower; an hourly task sits at 3,600 seconds. Source: Snowflake documentation on dynamic tables and tasks.

Which one should I reach for?

The honest answer is that Dynamic Tables are the right default for most analytical transforms now, and Streams and Tasks are the tool you keep for the cases Dynamic Tables cannot express. Snowflake's own rule of thumb is blunt: if your logic fits in a SELECT, it is a candidate for a Dynamic Table.

Reach for Dynamic Tables when the transform is expressible as SQL, you want declarative freshness you tune with one parameter, and a minute or more of lag is fine. This is the bulk of cleaning, joining, and aggregating work.
Reach for Streams and Tasks when you need procedural logic (calling a stored procedure, an external function, branching), sub-minute freshness, or fine control over exactly when and how a change is applied. CDC into a slowly changing dimension with custom merge rules is the classic case.
Reach for Snowpipe Streaming when the requirement is genuinely low-latency ingestion, with rows available in seconds, which neither of the above delivers on its own.

A practical gotcha hides in the freshness row of the comparison below: a Task scheduled "every 1 minute" is not the same as one-minute freshness. The task has to start, the warehouse has to resume if it was suspended, and the merge has to run, so worst-case end-to-end lag is the schedule interval plus the resume and run time. Snowflake's triggered tasks, which fire when an underlying Stream gets data instead of on a fixed clock, close most of that gap and are usually the better choice than a tight cron when you want responsiveness without polling an empty stream every minute.
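A triggered task is, in sketch form, a task with a stream condition and no SCHEDULE. The stream, task, and procedure names below are hypothetical, and apply_order_changes() stands in for whatever procedure holds your merge logic. That procedure must read the stream inside a DML statement, otherwise the stream never advances and the task keeps firing.

```sql
-- Stream the task will watch (no-op if it already exists).
CREATE STREAM IF NOT EXISTS raw_orders_stream ON TABLE raw_orders;

-- No SCHEDULE: the task runs when the stream has new data.
CREATE OR REPLACE TASK apply_orders_triggered
    WAREHOUSE = transform_wh
    WHEN SYSTEM$STREAM_HAS_DATA('raw_orders_stream')
AS
    CALL apply_order_changes();  -- hypothetical stored procedure

ALTER TASK apply_orders_triggered RESUME;
```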

Here is the comparison that actually drives the decision:

Dimension	Dynamic Tables	Streams + Tasks
Programming model	Declarative: write a SELECT	Imperative: you orchestrate
Lowest freshness	60-second target lag minimum	Down to ~1 minute on a schedule, faster with triggered tasks
Multi-step pipelines	Auto dependency graph, consistent snapshot	You sequence tasks yourself
Stored procedures / external functions	Not supported in the definition	Fully supported
Failure handling	Managed by Snowflake	Yours to design and monitor
Best fit	Cleaning, joins, aggregations	CDC, custom merge logic, procedural steps

What does it cost, and where does it bite?

Both approaches bill the same underlying things, so cost is rarely the deciding factor, but the shape differs. A Dynamic Table charges warehouse compute for each refresh query, Cloud Services for the dependency tracking and change detection, and storage for the materialized rows plus Time Travel. The trap is target lag: a shorter lag means more frequent refreshes and more scheduling overhead, so setting TARGET_LAG = '1 minute' on a table nobody reads more than hourly just burns credits for freshness no one consumes. Match the lag to how fresh the data genuinely needs to be, and use TARGET_LAG = DOWNSTREAM on intermediate tables so they only refresh when something downstream needs them.
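A sketch of the DOWNSTREAM pattern with two chained tables. The table names dt_orders_clean and dt_daily_revenue and the one-hour lag are assumptions; raw_orders and transform_wh follow the earlier example.

```sql
-- Intermediate table: no lag of its own, refreshes only when
-- a downstream dynamic table needs fresh input.
CREATE OR REPLACE DYNAMIC TABLE dt_orders_clean
    TARGET_LAG = DOWNSTREAM
    WAREHOUSE = transform_wh
AS
    SELECT order_id, customer_id, order_date,
           quantity * unit_price AS line_total
    FROM raw_orders
    WHERE order_status != 'returned';

-- Final table: this lag drives the refresh cadence of the chain.
CREATE OR REPLACE DYNAMIC TABLE dt_daily_revenue
    TARGET_LAG = '1 hour'
    WAREHOUSE = transform_wh
AS
    SELECT order_date, SUM(line_total) AS revenue
    FROM dt_orders_clean
    GROUP BY order_date;
```

Only the table readers actually query carries a freshness promise, so loosening or tightening the SLA is a one-line change on dt_daily_revenue.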

Streams and Tasks bill the warehouse (or serverless compute) for each task run. The classic waste here is a frequent schedule on a Stream that is usually empty: guard every task with WHEN SYSTEM$STREAM_HAS_DATA('my_stream') so it skips the run, and the credits, when nothing changed.

If you run dbt, this maps cleanly: dbt can materialize models as Dynamic Tables, so you get declarative freshness inside your existing project rather than hand-rolling tasks. Point that work at its own warehouse, sized for the heaviest refresh, as covered in the warehouse sizing guide. The other Snowflake guides go deeper on cost monitoring.

The rule that saves the most rework

Start declarative, escalate only when blocked. Build the table as a Dynamic Table with a target lag that matches the real SLA. If you hit something a SELECT cannot express, or you need freshness under a minute, drop that one table down to Streams and Tasks. Choosing imperative orchestration for a whole warehouse of tables you could have declared is the most common way teams sign up for maintenance they never needed.

Snowflake clustering: when a clustering key actually pays off

2026-06-07T00:00:00Z

When a Snowflake query is slow, the instinct is to size up the warehouse. Sometimes that is right. Often the real problem is that the query is reading the whole table to answer a question about a sliver of it. No amount of extra compute fixes a full scan; it just pays for it faster. The fix is pruning: getting Snowflake to skip the storage that cannot contain your answer. A clustering key is the main lever you have over how well that pruning works, and on the right table it is the difference between scanning 9,800 micro-partitions and scanning 140.

This guide is for the data engineer who owns a large, slow table and a query profile that says "partitions scanned" is close to "partitions total". The goal is to know when a clustering key earns its credits, and when it just quietly bills you for nothing.

How does Snowflake decide what to skip?

Every Snowflake table is silently split into micro-partitions, contiguous chunks holding 50 to 500 MB of uncompressed data each, stored column by column. For every micro-partition, Snowflake keeps metadata: the min and max value of each column, the distinct count, and more. That metadata is the whole game. When you filter WHERE event_date = '2026-06-01', Snowflake checks each micro-partition's stored range for event_date and skips any whose range cannot contain that date. Per Snowflake's own docs, a query touching 10% of a value range should ideally scan only 10% of the micro-partitions.

Pruning only works when the values you filter on are physically grouped together. Data lands in micro-partitions in roughly the order you load it. If you load by day, rows for one day already sit together and a date filter prunes beautifully for free. If you load in random order, every micro-partition holds a smear of every date, no range can be ruled out, and the query reads everything even though it returns almost nothing.

Illustrative: clustering a large event table on its date column lets a one-day query prune from 9,800 micro-partitions down to 140. Source: Snowflake documentation on micro-partitions and pruning.

A clustering key tells Snowflake to keep the table physically sorted on the columns you actually filter by, so that smear never forms. You can check how good the current grouping is without changing anything:

-- How well is this table clustered on the column you filter by?
-- Lower average_depth = better pruning. A high depth on a big table
-- that you filter by event_date is the signal a clustering key may help.
SELECT SYSTEM$CLUSTERING_INFORMATION('events', '(event_date)');

The number that matters in the result is clustering depth: the average number of overlapping micro-partitions for that column. Depth near 1 means almost no overlap and excellent pruning. A large depth means heavy overlap and wasted scans.

Clustering is not the only pruning lever, and reaching for it reflexively is a mistake. If your slow queries are highly selective point lookups, such as WHERE customer_id = 12345 against a huge table, the Search Optimization Service often fits better: it maintains a separate structure, the search access path, that prunes for equality and substring filters without physically reordering the table. It bills its own credits too, so the evaluation discipline is identical, but the two solve different shapes of problem. Clustering wins on range and date filters that scan a contiguous slice; search optimization wins on needle-in-haystack lookups across the whole table.
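A sketch of trying search optimization on one column. It assumes the events table has a customer_id column, which the text uses only as an example, and that the account is on Enterprise Edition or higher.

```sql
-- Estimate build and maintenance cost before enabling anything.
SELECT SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS('events', 'EQUALITY(customer_id)');

-- Enable it for equality lookups on that one column only.
ALTER TABLE events ADD SEARCH OPTIMIZATION ON EQUALITY(customer_id);

-- Rollback if the credits outweigh the speedup.
-- ALTER TABLE events DROP SEARCH OPTIMIZATION ON EQUALITY(customer_id);
```

Scoping it to a single column and method keeps the maintenance bill tied to the lookups you actually run.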

What does a clustering key cost, and where does it break?

A clustering key is not free, and this is where teams get burned. Once you set one with ALTER TABLE events CLUSTER BY (event_date), Snowflake's automatic clustering service reorganizes micro-partitions in the background to keep the table sorted as new data arrives. That background work is serverless compute, and it bills credits that show up in AUTOMATIC_CLUSTERING_HISTORY, separate from your warehouses. The more the table churns, the more reclustering it triggers, and the more you pay.

That cost profile decides where clustering wins and where it loses money:

Table profile	Clustering key?	Why
Large, filtered by one date/ID column, append-mostly	Yes	Pruning savings dwarf the steady reclustering cost
Small (under a few GB)	No	Snowflake already prunes it well; you would pay to reorder nothing
High-churn, rows updated all over the key	Rarely	Constant reclustering can cost more than the queries save
Filtered by many different columns each query	No	One key cannot serve every filter; pick the dominant one or none

Two rules keep you out of trouble. First, cluster on low-to-medium-cardinality columns you filter on, like a date or a region, not a unique ID with billions of values, because an ultra-high-cardinality key forces near-constant reordering. Second, do not cluster a table that fits in a handful of micro-partitions; Snowflake prunes small tables fine on its own and the service would bill you to sort data that was never the bottleneck.

How do I roll it out without guessing?

Prove the need before you turn anything on, then measure the bill after.

Confirm the table is actually scan-bound. Open the query profile for the slow query and compare "partitions scanned" to "partitions total". If you are scanning most of the table to return a small result, pruning is failing and clustering is a real candidate. If the scan is already small, your problem is elsewhere (a join, spilling, or a missing filter) and a clustering key will not help.
Baseline, then enable on one table. Record the query runtime and the SYSTEM$CLUSTERING_INFORMATION depth. Set the key with ALTER TABLE ... CLUSTER BY, let automatic clustering settle over a day or two, then re-measure both the query and the credits in AUTOMATIC_CLUSTERING_HISTORY.
Compare savings against the reclustering bill. This is the step people skip. If the query got 20x faster and reclustering costs a few credits a day, keep it. If reclustering is burning more than the warehouse time you saved, suspend it with ALTER TABLE ... SUSPEND RECLUSTER or drop the key.
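The rollout steps can be sketched as follows, using the events table and event_date key from the earlier examples. The 7-day window is arbitrary, and ACCOUNT_USAGE views lag real time, so allow for that before reading the numbers.

```sql
-- Baseline: record the clustering depth before any change.
SELECT SYSTEM$CLUSTERING_INFORMATION('events', '(event_date)');

-- Enable the key on this one table.
ALTER TABLE events CLUSTER BY (event_date);

-- After a day or two: what is the background service costing per day?
SELECT
    TO_DATE(start_time) AS day,
    ROUND(SUM(credits_used), 2) AS recluster_credits,
    SUM(num_rows_reclustered) AS rows_reclustered
FROM snowflake.account_usage.automatic_clustering_history
WHERE table_name = 'EVENTS'
  AND start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
GROUP BY TO_DATE(start_time)
ORDER BY day;

-- If the bill outruns the savings, pause the service (the key stays defined).
-- ALTER TABLE events SUSPEND RECLUSTER;
```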

The rollback is clean: ALTER TABLE events DROP CLUSTERING KEY stops the service and leaves the data in place. Nothing breaks, you just stop paying for ordering. Before you reach for a clustering key at all, make sure the warehouse running the query is the right size, because the two levers interact. The warehouse sizing guide covers that side, and the rest of the Snowflake guides go deeper on the query profile.

The number to watch

Track partitions scanned divided by partitions total for your heaviest recurring query. That ratio, not raw runtime, tells you whether pruning is working. When it is high on a big table you filter the same way every day, a clustering key is worth a test. When it is already low, leave the table alone and spend the credits somewhere they move the needle.
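A sketch of pulling that ratio from query history. The text filter on 'events' and the 1,000-partition floor are assumptions chosen to isolate one large table; in practice you would match on the specific recurring query you care about.

```sql
-- Scan ratio for recent queries against a large table.
-- A ratio near 1.0 means pruning is not helping.
SELECT
    query_id,
    start_time,
    partitions_scanned,
    partitions_total,
    ROUND(partitions_scanned / NULLIF(partitions_total, 0), 3) AS scan_ratio,
    ROUND(total_elapsed_time / 1000, 1) AS elapsed_s
FROM snowflake.account_usage.query_history
WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
  AND query_text ILIKE '%events%'
  AND partitions_total > 1000
ORDER BY scan_ratio DESC, partitions_total DESC
LIMIT 50;
```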

Snowflake Openflow: managed ingestion built on Apache NiFi

2026-06-07T00:00:00Z

Every data team has the same unglamorous problem: getting data in. Change-data-capture from a transactional database, events off a Kafka topic, PDFs from SharePoint, rows from a SaaS API. The usual answer is a patchwork of Fivetran connectors, custom scripts, and a NiFi cluster someone has to babysit. Openflow is Snowflake's bid to own that whole layer: a fully managed integration service, built on Apache NiFi, that moves structured and unstructured data from hundreds of sources into Snowflake with the platform's security and governance baked in. It went generally available across AWS, Azure, and GCP, and the pitch is that ingestion becomes a Snowflake feature instead of a separate system to run.

This guide is for the data engineer who owns the pipelines and is tired of gluing ingestion tools together. It covers what Openflow actually is, the two ways it runs, what it is good at, and the trade-off you are signing up for when you let Snowflake manage the pipe.

What is Openflow, and why build it on NiFi?

Apache NiFi is a battle-tested open-source dataflow tool: you build pipelines on a visual canvas out of processors, small building blocks that read, transform, route, and write data, wired together with connections that buffer and backpressure. It has been the quiet workhorse of enterprise data movement for a decade. Openflow takes that engine, runs it as a managed service, and wires it natively into Snowflake's security model so you are not hand-rolling credentials and network rules.

The headline capability is breadth. Openflow handles structured and unstructured data, in both batch and streaming modes, through hundreds of processors. That unstructured part is the strategically interesting bit: it can pull documents off Google Drive, Box, or SharePoint and land them in Snowflake ready for Cortex to index, which is exactly the feedstock a Cortex Search service or a RAG chatbot needs. The common use cases the docs call out:

Database CDC. Replicate change-data-capture from operational tables into Snowflake for centralized reporting.
Streaming events. Ingest real-time events from Apache Kafka for near real-time analytics.
SaaS connectors. Pull from platforms like LinkedIn Ads into Snowflake for reporting.
Unstructured for AI. Continuously ingest documents from Drive, Box, and SharePoint so you can chat with them through Cortex.

You build a flow by dropping Snowflake and NiFi processors and controller services onto the Openflow canvas, the same mental model as NiFi, with Snowflake handling the runtime underneath.

Where does Openflow actually run?

This is the decision that shapes cost, security, and operational burden, because Openflow ships in two deployment types and they are genuinely different products under one name.

Illustrative: BYOC keeps the data plane in your own VPC for maximum control, while the SPCS deployment trades that for managed simplicity. Source: Snowflake Openflow deployment documentation.

Aspect	Snowflake deployment (SPCS)	Bring Your Own Cloud (BYOC)
Where the data plane runs	Inside Snowflake on Snowpark Container Services	In your own cloud VPC
Setup and ops	Simplest, self-contained in Snowflake	You run the data plane, Snowflake runs the control plane
Billing	Compute-pool utilization, by uptime and usage	Your cloud compute, infrastructure, and storage charges
Best for	Teams that want least operational overhead	Sensitive data that must be preprocessed inside your own perimeter

The split maps to a familiar trade-off. The SPCS deployment runs entirely inside Snowflake on a compute pool, so it is the easy button: native security, no infrastructure of your own, billed on the compute pool's uptime. BYOC runs the data-processing engine inside your own VPC while Snowflake manages the control plane, which is what you reach for when sensitive data has to be handled within your own cloud boundary before it ever moves. In both, the control plane that hosts the canvas and APIs is Snowflake-managed; only the data plane location changes.

Authentication is one place Openflow genuinely simplifies life. The default is the Snowflake Managed Token, short-lived credentials that Snowflake rotates for you, so you stop generating, storing, and rotating long-lived key pairs. Setup still starts with an account admin granting the creation privilege to a role:

-- Openflow needs an account admin to grant the privileges that let a
-- role create deployments and runtimes. Fine-grained RBAC governs the rest.
GRANT CREATE OPENFLOW DATA PLANE INTEGRATION ON ACCOUNT TO ROLE openflow_admin;

In a BYOC deployment the runtime uses workload identity federation, exchanging its cloud identity such as an AWS IAM role for a Snowflake token, so there are still no long-lived secrets to leak.

What does it cost, and what should you watch?

Openflow does not have a single price tag; it inherits the cost shape of wherever it runs. An SPCS deployment bills on compute-pool utilization, meaning the uptime and compute usage of the container pool hosting your runtimes, the same serverless-ish model as anything on Snowpark Container Services. A BYOC deployment bills you directly through your cloud provider for the compute, infrastructure, and storage the data plane consumes, plus whatever Snowflake charges for the managed control plane.

The practical watch items:

Runtimes are where flows execute, and you will have several. Teams typically run multiple runtimes to isolate projects or environments, and each one is compute that costs while it is up. Idle runtimes are idle spend.
Streaming means always-on. A continuous Kafka or CDC flow keeps a runtime warm around the clock, which is a very different cost profile from a nightly batch. Size that into your estimate before you commit a real-time pipeline.
Unstructured ingestion compounds downstream. Landing documents is only step one; embedding and indexing them for Cortex is a second meter. Budget the Cortex Search cost alongside the ingestion cost, not after.

Openflow also brings real security plumbing: fine-grained RBAC, TLS in transit, PrivateLink compatibility, AWS Secrets Manager or HashiCorp Vault integration in BYOC, and Tri-Secret Secure support. That is the part that makes it credible for regulated data, and it is the reason to prefer it over a hand-built NiFi box you now have to secure and patch yourself.

Should you move your ingestion onto Openflow?

Openflow makes the most sense when ingestion is already a real cost center: many sources, a mix of structured and unstructured data, and a NiFi or scripts setup that someone maintains by hand. Consolidating that onto a managed service inside your governance boundary is a genuine simplification, and the unstructured-to-Cortex path is something the bolt-on connectors do not do as cleanly. It is less compelling if you have one or two simple batch loads that COPY INTO or Snowpipe and Dynamic Tables already handle, where adding a NiFi-based service is more machinery than the job needs.

The honest trade-off: Openflow is powerful and broad, but it is still NiFi underneath, which means a real flow-design skillset and runtimes you have to size and watch. It moves the babysitting from your own cluster to a managed service; it does not delete it. For a team drowning in connectors and custom CDC scripts, that is a clear win. For a team with a tidy handful of loads, it is a solution looking for a problem.

The honest read

Openflow is Snowflake pulling the ingestion layer inside its walls, and for enterprises with sprawling, multi-format data movement that is a strong offer: one managed, governed, NiFi-powered service from any source to Snowflake, with a clean path for the unstructured data that feeds AI. The decision that actually matters is SPCS versus BYOC, because that one choice sets your cost model, your security perimeter, and how much you operate yourself. Pick SPCS for least overhead, BYOC for data that must stay in your own cloud, and size your runtimes honestly, especially for streaming. Done right, ingestion stops being a patchwork and becomes a feature. Done lazily, it is just NiFi with a bigger invoice.

Should you put your data lake in Snowflake Iceberg tables?

2026-06-07T00:00:00Z

The question landing in every enterprise data architecture review right now: do we keep paying Snowflake to store our data, or do we put it in open Apache Iceberg tables and keep one copy that every engine can read? Iceberg is an open table format that adds database features (ACID transactions, schema evolution, hidden partitioning, and time-travel snapshots) on top of plain Parquet files sitting in your own S3, GCS, or Azure storage. Snowflake supports it natively. The headline that makes finance lean in: with an external volume, Iceberg tables incur zero Snowflake storage cost, because your cloud provider bills you for the bytes directly. The headline that should make architects slow down: not all Iceberg tables are equal, and the catalog you choose decides how much of Snowflake you actually get to keep.

This guide is for the data engineer or platform lead weighing a lakehouse move on Snowflake: when Iceberg is the right call, what you give up, and how to avoid the interoperability promise quietly costing you more than it saves.

What problem does Iceberg actually solve?

A standard Snowflake table stores data in Snowflake's proprietary format. It is fast and fully managed, but the data is locked behind Snowflake's engine: Spark, Trino, or Databricks cannot read it without exporting a copy. The moment two teams want two engines on the same data, you are either copying it or paying twice.

Iceberg breaks that lock. The table data lives as Parquet in open cloud storage, and an Iceberg catalog tracks which files make up the current table state. Any Iceberg-aware engine can read it. So the real question is not "Snowflake or open format" but who owns the catalog, because Snowflake supports two very different models and they are not interchangeable.

Snowflake as the catalog (Snowflake-managed): Snowflake owns the metadata pointer, handles maintenance like compaction, and gives you read and write with full platform support. The data still sits in your external volume, so storage stays cheap, but the table behaves almost like a native one.
External catalog (AWS Glue, Databricks Unity Catalog, a remote Iceberg REST catalog): another system owns the metadata. Snowflake connects through a catalog integration and gives you limited platform support. Snowflake does not manage the table lifecycle here.
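In DDL terms the two models look like this. It is a sketch under stated assumptions: the external volume lake_volume and the catalog integration glue_catalog_int must already exist, and every object name and the column list are hypothetical.

```sql
-- Snowflake-managed: Snowflake is the catalog and can read and write.
CREATE OR REPLACE ICEBERG TABLE events_iceberg (
    event_id INT,
    event_date DATE,
    payload STRING
)
    CATALOG = 'SNOWFLAKE'
    EXTERNAL_VOLUME = 'lake_volume'
    BASE_LOCATION = 'events_iceberg';

-- Externally managed (AWS Glue here): the schema comes from the
-- external catalog, so no column list is given.
CREATE ICEBERG TABLE events_from_glue
    EXTERNAL_VOLUME = 'lake_volume'
    CATALOG = 'glue_catalog_int'
    CATALOG_TABLE_NAME = 'events';

-- Pick up new snapshots written by the external engine.
ALTER ICEBERG TABLE events_from_glue REFRESH;
```

The CATALOG parameter is the whole decision: 'SNOWFLAKE' keeps the platform features, a catalog integration name hands the table lifecycle to the other system.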

How much of Snowflake do you give up with an external catalog?

This is the trade-off the marketing glosses over, and it is the single most important thing to get right. Going Snowflake-managed keeps nearly the whole platform. Going external-catalog trades platform features for ecosystem neutrality.

Illustrative: a Snowflake-managed Iceberg catalog retains most platform features while an external catalog supports a much smaller subset. Source: Snowflake Iceberg tables documentation, considerations and limitations.

Per Snowflake's own documentation, here is what the catalog choice actually costs you:

Capability	Snowflake-managed catalog	External catalog
Read and write	Full	Full (writes supported via Iceberg REST)
Clustering keys	Supported	Not supported
Replication	Supported	Not supported
Cloning	Supported	Not supported (externally managed)
Standard streams (CDC)	Supported	Insert-only streams only
Lifecycle maintenance (compaction)	Snowflake handles it	You own it in the external engine
Query from other engines	Sync to Snowflake Open Catalog	Native, that is the point

The pattern that follows: if Snowflake is your primary engine and you just want cheap open storage, use Snowflake as the catalog and sync to Open Catalog when another engine needs read access. If a Spark or Databricks platform is the system of record and Snowflake is a guest, use the external catalog and accept that clustering, cloning, and replication are off the table. Snowflake even supports a catalog-linked database that stays in sync with a remote Iceberg REST catalog, including bidirectional access to Databricks Unity Catalog, so the two platforms can share one copy of the data.

What does it really cost, and where does the bill hide?

The storage saving is real but narrower than it sounds. Snowflake bills you for virtual warehouse compute and cloud services whenever you query or maintain Iceberg tables, exactly as it does for native tables. What changes is storage.

Illustrative: with an external volume, Snowflake charges no storage for Iceberg, your cloud provider bills you instead; with Snowflake-managed storage, Snowflake charges storage as normal. Source: Snowflake Iceberg tables billing documentation.

The nuance that catches teams out: storage is free from Snowflake only when you use your own external volume. If you pick the convenience option of EXTERNAL_VOLUME = SNOWFLAKE_MANAGED storage, Snowflake charges for storage just like a normal table, and the headline saving evaporates. And the most common surprise bill is geography. If your Snowflake account and your external volume sit in different regions, every query triggers cross-region egress that your cloud provider charges you for, and Snowflake adds cross-region data-transfer usage on top for managed tables. Keep compute and storage in the same region or that interoperability dream turns into an egress line item.
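
The billing rules in this section can be sketched as a small lookup. This is a minimal illustration of the rules as described here, not a Snowflake API: the function name, the two storage labels, and the region strings are all hypothetical.

```python
def iceberg_billing(storage: str, account_region: str, volume_region: str) -> dict:
    """Summarize who bills what for an Iceberg table, per the rules above.

    storage: "external_volume" (your own bucket) or "snowflake_managed".
    Both labels are illustrative names, not Snowflake parameters.
    """
    if storage not in {"external_volume", "snowflake_managed"}:
        raise ValueError(f"unknown storage mode: {storage}")
    return {
        # Storage is billed by your cloud provider only on your own volume.
        "storage_billed_by": (
            "cloud provider" if storage == "external_volume" else "Snowflake"
        ),
        # Warehouse compute and cloud services are billed by Snowflake either way.
        "compute_billed_by": "Snowflake",
        # Compute and storage in different regions means egress charges.
        "cross_region_egress": account_region != volume_region,
    }
```

Calling `iceberg_billing("external_volume", "us-east-1", "eu-west-1")` flags the cross-region case, which is the one that tends to surprise people on the invoice.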

Two more honest caveats before you migrate a critical table:

No Fail-safe. Iceberg tables on an external volume get no Snowflake Fail-safe recovery, so you own data protection and recovery for that storage. A standard table comes with Snowflake's seven-day Fail-safe; moving to Iceberg quietly gives up that safety net.
Maintenance is real work. For externally managed tables, you own compaction and cleanup. Excessive position deletes can actually block table creation and refresh, and orphan files from failed writes can make your storage bill drift above what Snowflake reports. This is operational overhead a native table never asked of you.

When is Iceberg the right call, and when is it a trap?

Reach for Iceberg when you have a genuine multi-engine reality or an existing data lake you cannot or will not move into Snowflake. The open format pays off precisely when more than one tool needs the same data and the alternative is copies that drift. Snowflake says it plainly: Iceberg tables are ideal for existing data lakes you choose not to store in Snowflake.

Do not reach for Iceberg when Snowflake is your only engine and you just heard "open format" in a keynote. For a single-engine shop, a Snowflake-managed Iceberg table on an external volume buys you cheaper storage and an exit option, which is a reasonable hedge, but a standard table buys you Fail-safe, zero maintenance, and every feature with no asterisks. The worst outcome is adopting an external catalog for openness you never use, then discovering six months later that you cannot cluster your biggest table or replicate it for disaster recovery.

If cost is the driver, weigh the storage saving against the maintenance time and egress risk before you commit. It is the same discipline as in the warehouse sizing guide: measure the real number, and do not assume the cheap-looking option is cheaper once you count everything. The rest of the Snowflake guides cover the compute side that Iceberg does not change.

The decision in one line

Pick the catalog before you pick the format. Snowflake-managed Iceberg gives you open storage with almost the whole platform intact; an external catalog gives you true engine neutrality at the cost of clustering, cloning, replication, and a maintenance burden you now own. Choose external only when a second engine is genuinely the system of record; otherwise you are paying in lost features for an openness you will never spend.

Snowflake RBAC and masking: lock it down without grinding to a halt

2026-06-07T00:00:00Z

Governance is where a fast-moving Snowflake account quietly turns into a liability. Someone needs access, an admin grants it straight to their user, and a year later nobody can answer who can see the salary table. The fix is not more process, it is using the access model Snowflake actually gives you. Snowflake is role-based: privileges are granted to roles, and roles are granted to users, never privileges to users directly. Get the role design right and you collapse what would be roughly 10,000 individual grants down to a few hundred while making access auditable instead of archaeological.

This guide is for the data engineer or platform owner who has to keep Snowflake both usable and defensible: analysts unblocked, auditors satisfied, and PII not leaking into a dashboard. We will stick to how Snowflake's RBAC and masking actually behave.

Why grant to roles and not users?

In Snowflake, a privilege (say, SELECT on a schema) is granted to a role, and a user gets access by being granted that role. Grant directly to users and the math explodes: fifty users who each need access to two hundred objects is up to ten thousand grants to create, track, and eventually revoke. Miss a few on offboarding and you have a standing audit finding.

Illustrative: a two-layer role hierarchy collapses grant sprawl from roughly 10,000 direct user grants to about 260. Source: Snowflake documentation on access control.

The pattern that scales is a two-layer role hierarchy, which Snowflake's own guidance recommends. Access roles own the privileges on objects: analytics_db_read holds SELECT on the analytics schemas, raw_db_write holds write on the landing zone. Functional roles map to jobs: data_analyst, data_engineer, bi_developer. You grant access roles to functional roles, and functional roles to people. Now a new analyst gets exactly one grant, data_analyst, and inherits everything that role is supposed to see.

-- Access role owns the object privileges.
CREATE ROLE analytics_read;
GRANT USAGE ON DATABASE analytics TO ROLE analytics_read;
GRANT USAGE ON SCHEMA analytics.marts TO ROLE analytics_read;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics.marts TO ROLE analytics_read;

-- Functional role = the job. It inherits the access role.
CREATE ROLE data_analyst;
GRANT ROLE analytics_read TO ROLE data_analyst;

-- People get one grant: the job.
GRANT ROLE data_analyst TO USER jdoe;

Two discipline points keep this clean. Keep ACCOUNTADMIN for break-glass only and run day-to-day administration through SECURITYADMIN and SYSADMIN, because handing out the top role defeats the whole hierarchy. And give every object a deliberate owner role, so OWNERSHIP is not scattered across whoever happened to run the CREATE.
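
The grant arithmetic from this section can be worked through in a few lines. The 50 users and 200 objects come from the text above; the figure of 10 role-to-role grants is an assumption, chosen only to show how a total of about 260 can arise.

```python
# Worked grant arithmetic for the two models.
users = 50
objects = 200
role_to_role_grants = 10  # assumption: access roles granted into functional roles

# Direct model: every user needs a grant on every object.
direct_grants = users * objects

# Role model: one privilege grant per object to an access role,
# plus the role-to-role grants, plus one functional-role grant per user.
role_based_grants = objects + role_to_role_grants + users

print(direct_grants)      # 10000
print(role_based_grants)  # 260
```

The direct model grows with users times objects, while the role model grows with users plus objects, which is why the gap widens as the account grows.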

How do I protect specific columns without copying data?

RBAC controls which tables a role can touch. It does not, on its own, hide the ssn column from an analyst who legitimately needs the rest of the row. That is what Dynamic Data Masking is for, and it is the governance feature that earns its keep fastest.

A masking policy is a schema-level object that rewrites a column's value at query time based on the querying role. The data on disk never changes; what a user sees is decided per query.

CREATE MASKING POLICY mask_email AS (val string) RETURNS string ->
    CASE
        WHEN CURRENT_ROLE() IN ('PII_READER', 'SECURITYADMIN') THEN val
        ELSE REGEXP_REPLACE(val, '.+@', '****@')
    END;

ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY mask_email;

Now PII_READER sees real email addresses and everyone else sees ****@example.com, from the same table, with no copy and no second pipeline. The leverage Snowflake's docs call out: you write the policy once and apply it to thousands of columns, and you can change the policy's logic centrally without reapplying it anywhere. Pair it with tag-based masking: attach the policy to a pii_email tag, tag the columns, and protection follows the data automatically as new tables appear. Masking also has a sibling feature for row-level filtering, row access policies, which hide whole rows by role using the same query-time model.
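
To check what the policy's branch logic does before attaching it to a column, the same rule can be mirrored outside Snowflake. This is a sketch in Python, not the policy itself; `mask_email` and `UNMASKED_ROLES` are hypothetical names, and Python's regex engine stands in for Snowflake's.

```python
import re

# Roles allowed to see the raw value, mirroring the policy above.
UNMASKED_ROLES = {"PII_READER", "SECURITYADMIN"}

def mask_email(val: str, current_role: str) -> str:
    """Return the raw value for privileged roles, else hide the local part."""
    if current_role in UNMASKED_ROLES:
        return val
    # Same pattern as the policy: everything up to the "@" becomes "****".
    return re.sub(r".+@", "****@", val)

print(mask_email("jane.doe@example.com", "DATA_ANALYST"))  # ****@example.com
print(mask_email("jane.doe@example.com", "PII_READER"))    # jane.doe@example.com
```

One thing the sketch makes visible: in this Python version, a value with no "@" in it does not match the pattern and passes through unmasked. It is worth checking whether the real policy behaves the same way on malformed email values in your data.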

A few realities to plan around:

Dynamic Data Masking is an Enterprise Edition feature. If you are on Standard, it is not available, and that may itself be the reason to upgrade.
A column takes one masking policy at a time. You cannot stack two; decide the policy per column.
Masking interacts with materialized views. Apply policies to the base table columns, not to a materialized view, or you will hit errors.

Where masking gets genuinely powerful is in combination with tags. Define the sensitivity taxonomy once (pii, financial, restricted), attach a masking policy to each tag, and governance becomes a tagging exercise rather than a per-column chase. Snowflake can also help you find what to tag: sensitive data classification scans columns and proposes semantic categories like email or phone number, so you are not auditing thousands of columns by eye. The pattern that scales is classify, tag, mask-by-tag, because new tables that inherit a tagged column's lineage pick up protection automatically instead of waiting for someone to remember.

How do I roll this out on an account that is already messy?

You rarely get a greenfield account. Retrofit in order, and audit as you go.

See what exists. SHOW GRANTS and the ACCOUNT_USAGE.GRANTS_TO_USERS view expose every direct-to-user grant. That list is your cleanup backlog and, usually, a sobering one.
Stand up the hierarchy alongside the mess. Create the access and functional roles, grant the access roles into them, and move users onto functional roles one team at a time. Nothing breaks for users still on the old grants while you migrate.
Revoke the direct grants last. Once a team is fully on functional roles, strip their direct-to-user grants. Now access is described entirely by which roles a person holds, which is exactly what an auditor wants to see.
Classify and mask the sensitive columns. Tag PII columns, attach masking policies through the tags, and verify with the POLICY_REFERENCES view that every sensitive column is actually covered.

Audit continuously, not at year-end: ACCOUNT_USAGE.POLICY_REFERENCES lists every object a masking policy is set on, and the GRANTS_TO_ROLES view lets you answer "who can read this table" with a query instead of a meeting. The same instinct from the rest of the Snowflake guides applies: measure the real state in ACCOUNT_USAGE rather than trusting the diagram on the wiki.

The test that proves it works

Pick your most sensitive table and ask: can you answer "who can see this column, and what do they see" with a single query? If yes, your RBAC and masking are doing their job. If it takes a meeting and three Slack threads, the access model is living in people's heads instead of in Snowflake, and that gap is exactly what fails an audit.

Snowflake Gen2 warehouses: faster compute you have to opt into

2026-06-07T00:00:00Z

Snowflake quietly shipped a second generation of its compute engine, and most teams have not noticed they are now paying for the choice. Generation 2 standard warehouses (Gen2) run on faster underlying hardware with software optimizations to delete, update, merge, and table-scan operations, and Snowflake says the majority of queries finish faster as a result. That sounds like a free win, but it is not free: Gen2 bills at a higher credit rate than Gen1, and depending on when and where your account was created, you may have to ask for it explicitly with a SQL clause. The question that decides whether Gen2 helps or hurts your bill: does the speedup beat the rate increase for your workload?

This guide is for the data engineer or FinOps owner who runs the warehouses and signs off on the credit spend. If you have a heavy data-engineering pipeline or a wall of analytics queries, Gen2 is the single most consequential warehouse setting you are probably not thinking about.

What actually changed under the hood?

A Snowflake virtual warehouse is just a cluster of compute that runs your queries, sized from XS up to 6XL, and you pay credits per second it runs. Gen1, the original engine, is what every account has used for years. Gen2 keeps the same sizes and the same per-second model but swaps in faster hardware and engine improvements aimed squarely at the operations that dominate modern pipelines: the DML that rewrites tables and the scans that feed big aggregations.

Illustrative: Gen2 aims to finish the same query faster, the exact gain depends on your workload, so benchmark it. Source: Snowflake Gen2 warehouse documentation.

Snowflake is careful, and you should be too: the exact gain depends on your configuration and workload, and the docs explicitly tell you to test it rather than assume. Gen2 is not available for the two largest sizes, X5LARGE and X6LARGE, and it applies only to standard warehouses, not Snowpark-optimized ones. There is one genuine bonus that ships with it: a new Gen2 warehouse turns on the Query Acceleration Service by default with a scale factor of 2, which lets bursty queries borrow extra serverless compute for the heavy parts.

How do you actually get a Gen2 warehouse?

This is where teams trip, because the default depends on your account's age and region. For new organizations created after the mid-2025 rollout dates in supported regions, standard warehouses now default to Gen2. For everyone else, if you do not specify a generation when you create a warehouse, you still get Gen1. So an older account is almost certainly running Gen1 everywhere and paying Gen1 speeds without realizing there is a switch.

The switch is one clause. The recommended syntax is the GENERATION parameter on CREATE WAREHOUSE:

-- Create a new Gen2 standard warehouse. Without this clause an
-- older account silently gets Gen1.
CREATE OR REPLACE WAREHOUSE etl_wh
  GENERATION = '2'
  WAREHOUSE_SIZE = MEDIUM;

You can flip an existing warehouse in place, running or suspended, with a single ALTER:

-- Convert an existing warehouse to Gen2 without recreating it.
ALTER WAREHOUSE etl_wh SET GENERATION = '2';

One billing gotcha worth knowing before you convert a live warehouse: if you convert it while queries are running, the in-flight queries finish on Gen1 compute while new queries start on Gen2, and you are billed for both sets of compute until the old queries drain. Convert while suspended to avoid the double charge, or convert while running if you care more about zero downtime than a few minutes of overlap. Note also that the GENERATION clause is not in Snowsight yet, so this is a SQL-only change today, and the setting shows up in the resource_constraint column of SHOW WAREHOUSES:

-- Confirm which generation each warehouse is running.
SHOW WAREHOUSES;
-- look at the "resource_constraint" column: STANDARD_GEN_2 means Gen2.

Does Gen2 actually save you money?

This is the only question that matters, and the honest answer is: it depends, so measure. Gen2 finishes faster, which means fewer credit-seconds per query, but it charges a higher credit rate per second than Gen1, with the exact rates published in Snowflake's Service Consumption Table. The math is a race between two effects.

Factor	Pushes cost down	Pushes cost up
Query runtime	Faster, so fewer seconds billed	-
Credit rate	-	Higher per-second rate than Gen1
Concurrency	More queries done per running hour	-
Auto-suspend gaps	Shorter active windows	-

The break-even logic is simple. Cost is credit rate times runtime, so if the Gen2 rate is m times the Gen1 rate, Gen2 wins only when its runtime falls below 1/m of the Gen1 runtime. A 25% rate premium, for example, needs a runtime cut of more than 20% to pay for itself. A warehouse running heavy merges and large scans, exactly what Gen2 was tuned for, is the most likely winner. A warehouse running tiny, cheap queries where the bottleneck is not compute may see the rate increase swamp a speedup it barely benefits from. The right move is to A/B it: run a representative workload on a Gen1 and a Gen2 copy, then compare credits in ACCOUNT_USAGE:

-- Compare credits burned per warehouse over the last 7 days.
SELECT warehouse_name,
       SUM(credits_used) AS credits
FROM   snowflake.account_usage.warehouse_metering_history
WHERE  start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
GROUP  BY warehouse_name
ORDER  BY credits DESC;

Pair that with QUERY_HISTORY to confirm the runtime actually dropped, not just the credit line. The discipline here is the same one from the warehouse sizing guide: never assume the faster-or-newer option is cheaper until you have the two numbers side by side.
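
Once you have the two numbers, the comparison is one multiplication. The sketch below is a minimal illustration: the function names are made up, and the 1.25 multiplier is a placeholder, so look up the real rate for your cloud and region in the Service Consumption Table.

```python
def gen2_cost_ratio(rate_multiplier: float, runtime_ratio: float) -> float:
    """Gen2 cost divided by Gen1 cost for the same workload.

    rate_multiplier: Gen2 credit rate / Gen1 credit rate.
    runtime_ratio:   Gen2 runtime / Gen1 runtime, from your own A/B test.
    A result below 1.0 means Gen2 is cheaper.
    """
    return rate_multiplier * runtime_ratio

def break_even_runtime_ratio(rate_multiplier: float) -> float:
    """Runtime ratio at which Gen2 and Gen1 cost the same."""
    return 1.0 / rate_multiplier

premium = 1.25  # placeholder multiplier, not a published rate

print(break_even_runtime_ratio(premium))     # 0.8
print(gen2_cost_ratio(premium, 0.6) < 1.0)   # True: 40% faster wins
print(gen2_cost_ratio(premium, 0.9) < 1.0)   # False: 10% faster loses
```

Feed `runtime_ratio` from QUERY_HISTORY and sanity-check the result against the credit totals from the metering query, since idle time and auto-suspend gaps can move the real bill away from the per-query arithmetic.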

What should you do this week?

If you run an older account, you are almost certainly on Gen1 everywhere and leaving a possible speedup on the table, so this is worth ten minutes. The plan:

Find your heaviest warehouse. The one with the most credits in the query above, usually an ETL or transformation warehouse doing merges and big scans, is the best Gen2 candidate.
Clone the workload, not the prod warehouse. Point a representative job at a GENERATION = '2' copy and a Gen1 copy, same size, and compare credits and runtime over a real day.
Mind replication. If you replicate warehouses across regions, every secondary region must also support Gen2, or a Gen2 warehouse may fail to resume after a failover. Test the failover path before you standardize on Gen2.
Leave tiny, idle warehouses alone. A warehouse that mostly serves cheap dashboard queries will feel the rate increase more than the speedup. Gen2 earns its premium on heavy work.

To see how Gen2 fits with the rest of your compute strategy, the other Snowflake guides cover sizing, clustering, and the pipeline patterns that decide how much work a warehouse does in the first place.

The honest read

Gen2 is a real upgrade, not marketing: faster hardware, smarter DML, Query Acceleration on by default. But Snowflake made it an opt-in with a higher rate, which means it is a FinOps decision dressed as a performance feature. The teams that win are the ones that benchmark before they standardize, because for compute-heavy pipelines Gen2 can pay for its premium several times over, and for light, bursty workloads it can quietly raise the bill for a speedup you never needed. Run the two-warehouse test, read the two numbers, then decide. Do not let a default, old or new, make the call for you.

Cortex Search: the managed RAG engine inside Snowflake

2026-06-07T00:00:00Z

Every team building a chatbot on their own data hits the same wall: retrieval. The large language model is the easy part; the hard part is finding the right three paragraphs out of a million to feed it, keeping that index fresh as the data changes, and not standing up a separate vector database to do it. Cortex Search is Snowflake's answer: a fully managed hybrid search service that embeds your text, runs vector and keyword search together with semantic reranking, and refreshes the index automatically, all inside Snowflake. It is the retrieval engine behind enterprise RAG chatbots and high-quality search bars, and it pairs directly with the Cortex Agents framework as the tool that answers questions about unstructured data. The thing to understand before you ship it: it is genuinely low-effort to stand up, but the bill has several meters, and one of them runs even when nobody is searching.

This guide is for the engineer asked to "add search" or "build a chatbot on our docs" who needs to know what Cortex Search does, how good it is out of the box, and where the cost hides.

Why not just use a vector database?

The default 2024-era answer to RAG was: chunk your text, run it through an embedding model, load the vectors into a dedicated vector store, and query it. That works, but it is a second system to run, secure, and keep in sync with the source data, and tuning retrieval quality is a research project of its own. Cortex Search collapses that stack. You point it at a column of text in a Snowflake table, and it handles embedding, indexing, refresh, and serving.

The quality story is the part that earns the "managed" label. Cortex Search does not just do vector similarity; it takes a hybrid approach combining vector search, keyword search, and a semantic reranking step, which is what gets you good results across messy real-world queries without hand-tuning. Vector search catches "internet is down" matching "connectivity problem"; keyword search catches exact product codes and names that embeddings blur; reranking puts the genuinely relevant hits on top. You get all three with no parameters to fiddle with.

Creating a service is one SQL statement:

-- Point Cortex Search at a text column. It embeds, indexes,
-- and keeps the result fresh on its own.
CREATE OR REPLACE CORTEX SEARCH SERVICE transcript_search_service
  ON transcript_text
  ATTRIBUTES region
  WAREHOUSE = cortex_search_wh
  TARGET_LAG = '1 day'
  EMBEDDING_MODEL = 'snowflake-arctic-embed-l-v2.0'
  AS (
    SELECT transcript_text, region, agent_id
    FROM support_transcripts
  );

That TARGET_LAG = '1 day' is the freshness contract: the service checks the base table for changes about once a day and re-embeds only what changed. The embedding model is swappable, from the fast English-only snowflake-arctic-embed-m-v1.5 default up to multilingual and long-context options.

How do you query it, and how do you chunk the text?

Querying is a REST API, a Python API, or a SQL preview function. The SQL preview is the fastest way to sanity-check that retrieval works before you wire up an app:

-- Preview retrieval straight from a worksheet, with a filter on an attribute.
SELECT PARSE_JSON(
  SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
    'cortex_search_db.services.transcript_search_service',
    '{
       "query": "internet issues",
       "columns": ["transcript_text", "region"],
       "filter": {"@eq": {"region": "North America"}},
       "limit": 1
     }'
  )
)['results'] AS results;

The single biggest lever on quality you control is chunk size. Cortex Search truncates anything longer than the embedding model's context window before vectorizing, and Snowflake's own research says smaller chunks retrieve more precisely. The recommendation is concrete: split your text into chunks of no more than 512 tokens, roughly 385 English words. Snowflake gives you a built-in function so you do not have to write a splitter:

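-- Chunk long documents before indexing for sharper retrieval.
-- Note: the chunk size (512) and overlap (64) here are counted in characters,
-- not tokens, so these chunks sit well under the 512-token guidance.
SELECT SNOWFLAKE.CORTEX.SPLIT_TEXT_RECURSIVE_CHARACTER(
  document_text, 'markdown', 512, 64
) AS chunks
FROM raw_documents;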
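
If you chunk outside Snowflake, for example in a loader script, the same idea can be sketched with a word budget. This is not Snowflake's splitter: it uses word count as a rough proxy for tokens, based on the 385-words-per-512-tokens figure above, and the 40-word overlap is an arbitrary assumption.

```python
def chunk_words(text: str, max_words: int = 385, overlap: int = 40) -> list[str]:
    """Split text into overlapping chunks of at most max_words words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this chunk already reached the end of the text
    return chunks
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence that straddles a boundary still appears whole in at least one chunk. Splitting on whitespace ignores headings and paragraphs, which a format-aware splitter would respect.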

Granting access follows Snowflake's normal model, and it matters because search services run with owner's rights: the service reads data as its creator, so you control exposure through who you grant usage to, not through the caller's row access. Lock that down deliberately, the same discipline as the RBAC and masking guide:

GRANT USAGE ON CORTEX SEARCH SERVICE transcript_search_service TO ROLE support_app;

What does it actually cost?

Here is where the managed convenience meets the invoice, and it is the part teams underestimate. A Cortex Search service bills on four separate meters, not one.

Illustrative: serving compute, billed per GB of indexed data per month, runs even when no queries are served. Source: Snowflake Cortex Search cost documentation.

Meter	What it charges for	The gotcha
Embedding tokens	Vectorizing each text row	Only added or changed rows re-embed, so churn drives it
Serving compute	Per GB of indexed data per month	Bills while the service is available, even with zero queries
Warehouse refresh	Running the source query on refresh	No credits if nothing changed since last refresh
Storage	The materialized index and structures	Flat rate per terabyte

The one that surprises people is serving compute, charged per gigabyte of indexed data per month whether or not anyone searches. A big index that sits idle still costs money, because the low-latency serving layer is kept warm for you. Snowflake added a fix worth knowing: you can set AUTO_SUSPEND (minimum 30 minutes of inactivity) so an idle service parks its serving compute and resumes on the next query. The first query after a suspend waits for the resume and concurrent ones get a 429, so build retry logic into your client.

-- Park serving compute after 30 minutes idle to stop paying for an unused index.
ALTER CORTEX SEARCH SERVICE transcript_search_service SET AUTO_SUSPEND = 1800;
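
The retry logic can be as small as an exponential backoff wrapper. The sketch below is generic Python and makes no assumptions about the Snowflake client: `RateLimited` is a stand-in for whatever your HTTP layer raises on a 429, and `run_query` is any zero-argument callable you supply.

```python
import random
import time

class RateLimited(Exception):
    """Stand-in for whatever your client raises on an HTTP 429."""

def search_with_retry(run_query, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call run_query, retrying on RateLimited with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return run_query()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt (1s, 2s, 4s, ...) plus up to 25% jitter,
            # so concurrent clients do not all retry at the same instant.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.25))
```

The `sleep` parameter exists so tests can pass a fake and run instantly. Tune `max_attempts` and `base_delay` to however long a resume takes in your account, which you will need to measure.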

Track the spend per service with the CORTEX_SEARCH_DAILY_USAGE_HISTORY view, and right-size the refresh warehouse: Snowflake recommends a dedicated warehouse no larger than MEDIUM for each service.

What are the limits you will hit?

Set expectations before you scale. A service's materialized result must be under 100 million rows or the create fails, though you can ask Snowflake to raise it. By default a single service is rate-limited to 20 queries per second, and to 140 across the account, and it returns 429s when you exceed the limit; again, contact your account team to raise it. Because the source query must qualify for Dynamic Table incremental refresh, the same query restrictions apply, so an arbitrary complex join may not be eligible. Finally, Cortex Search does not yet support cloning, and the tables it reads must not be dropped or modified while the service runs, so coordinate schema changes around it.

None of these are dealbreakers for a typical knowledge base, but they are real edges, and hitting the row cap or the QPS limit in production without knowing it exists is an avoidable outage.

The honest read

Cortex Search is one of the cleaner managed-AI stories Snowflake tells: it deletes the vector-database tier, gives you genuinely good hybrid retrieval with no tuning, and keeps the index fresh on its own. For building RAG on data that already lives in Snowflake, it is the path of least resistance, and the tight integration with Cortex Agents makes it the obvious retrieval backbone. The catch is the serving meter that bills idle indexes, so treat AUTO_SUSPEND and right-sized chunks as setup, not afterthoughts. Get the chunking and the cost controls right up front and you have enterprise search and RAG without running a search stack. Skip them and you are paying rent on an index nobody is querying.

Cortex Analyst: let business users query Snowflake in plain English

2026-06-07T00:00:00Z

The most expensive bottleneck in most data teams is not compute, it is the queue. Business users have questions, only the analysts can write SQL, and every "quick number" becomes a ticket. Cortex Analyst is Snowflake's answer: a fully managed text-to-SQL service where a business user asks "what was month-over-month revenue growth in EMEA last quarter" in plain English and gets a real answer, generated as SQL, run on your warehouse, with no analyst in the loop. It ships as a REST API so you can drop it into Slack, Teams, a Streamlit app, or your own product. The thing that decides whether it delights users or embarrasses you: the semantic view you build first. Point it at a raw schema and it guesses; give it a semantic model and it gets the business logic right.

This guide is for the data engineer who will be asked to "add AI to our analytics" and needs to know what Cortex Analyst really is, what makes it accurate, what it costs, and where it falls down.

Why does a semantic model matter so much?

Generic text-to-SQL fails in enterprises for a boring reason: a database schema does not contain business meaning. A column called rev_amt does not tell a model that revenue excludes returns, that "EMEA" maps to a specific set of country codes, or that "active customer" has a precise definition your finance team agreed on. Hand a model just the schema and it hallucinates the business logic.

Cortex Analyst closes that gap with a semantic view, a schema-level Snowflake object that defines the business layer over your tables: logical tables for entities like customers and orders, dimensions for categorical context, facts for row-level numbers, metrics for the KPIs with their correct aggregation formulas, and the relationships that define how tables join. That metadata is what turns a vague question into the right SQL.

Illustrative: giving Cortex Analyst a semantic view rather than a bare schema is the single biggest lever on answer accuracy. Source: Snowflake Cortex Analyst documentation on semantic models.

The practical consequence: the work is not the AI, it is the semantic model. Snowflake handles the language model, the model selection, and the text-to-SQL machinery. Your job is to encode the business definitions once, well, so the answers are trustworthy. Skip that and you ship a confident liar.

-- A semantic view is the business layer Cortex Analyst reasons over.
-- It names entities, the metrics with their real formulas, and join paths,
-- so "revenue" means what finance means, not whatever a column is called.
-- Clause order matters: FACTS, then DIMENSIONS, then METRICS.
CREATE SEMANTIC VIEW sales_analytics
  TABLES (
    orders AS analytics.public.orders PRIMARY KEY (order_id),
    customers AS analytics.public.customers PRIMARY KEY (customer_id)
  )
  RELATIONSHIPS (
    orders (customer_id) REFERENCES customers (customer_id)
  )
  FACTS (orders.line_total AS quantity * unit_price)
  DIMENSIONS (customers.region AS region)
  -- Exclude returns inside the aggregate; a metric expression takes no WHERE clause.
  METRICS (orders.net_revenue AS SUM(IFF(status != 'returned', line_total, 0)));

You can build it with the Semantic View Autopilot or hand-write the YAML, and you should seed it with verified example questions and their correct SQL, which Snowflake uses to guide generation and which let you measure accuracy over time.

What does it cost, and how is that different from raw Cortex?

This is where Cortex Analyst surprises people in a good way. Most Cortex AI functions bill per token, so a long prompt costs more. Cortex Analyst bills per message instead: each successful request (HTTP 200) counts as one unit, and the number of tokens does not affect the price unless you call it through Cortex Agents. Failed requests are not charged.

The cost you must not forget is the second meter. Cortex Analyst's per-message charge covers only the text-to-SQL generation. The generated SQL then runs on your virtual warehouse, and that compute is billed separately. So a chatty dashboard that fires hundreds of questions an hour costs you on two lines: the Analyst messages and the warehouse time to execute every query. Track the first with the CORTEX_ANALYST_USAGE_HISTORY view in ACCOUNT_USAGE and the second the same way you watch any warehouse, as covered in the warehouse sizing guide.

Right-size the warehouse behind it. The questions are usually small aggregations, so a small warehouse with auto-suspend is plenty. You do not need a big cluster to answer "total revenue last month".
Cache the obvious. If the same ten questions arrive all day, the generated SQL is deterministic enough to cache results upstream rather than re-running every time.
Watch multi-turn cost. Follow-up questions resend the whole conversation history on every turn, and the cost grows with each round, so long rambling sessions cost more than crisp ones.
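
The two meters can be put side by side with a rough estimate. Everything numeric below is a placeholder: the per-message rate is hypothetical (take the real one from the Service Consumption Table), and the warehouse figure ignores the 60-second billing minimum and idle time before auto-suspend, so treat it as a lower bound.

```python
def analyst_cost_credits(
    messages: int,
    credits_per_message: float,
    warehouse_credits_per_hour: float,
    avg_query_seconds: float,
) -> tuple[float, float]:
    """Return (message credits, warehouse credits) for a batch of questions."""
    # Meter 1: one unit per successful request, independent of token count.
    message_credits = messages * credits_per_message
    # Meter 2: execution time of the generated SQL on your warehouse.
    warehouse_hours = messages * avg_query_seconds / 3600
    warehouse_credits = warehouse_hours * warehouse_credits_per_hour
    return message_credits, warehouse_credits

# 1,000 questions, a placeholder rate of 0.1 credits per message,
# an XS warehouse at 1 credit per hour, 3 seconds per generated query.
msg, wh = analyst_cost_credits(1000, 0.1, 1.0, 3.0)
print(round(msg, 2), round(wh, 2))
```

Under these placeholder numbers the message meter dominates, which is the usual argument for keeping the warehouse small. Rerun it with your own rates before drawing conclusions, and compare against CORTEX_ANALYST_USAGE_HISTORY once real traffic exists.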

What can it not do yet?

Set expectations honestly with stakeholders, because Cortex Analyst is narrower than "ask anything". It answers questions that can be resolved with SQL over your structured data, and nothing else. It does not generate open-ended business insight: ask "what trends do you see" and it has no answer, because that is not a SQL query. It also cannot reference the results of a previous query, so "what is the revenue of the second product you just listed" breaks, since it does not have the earlier result set in hand. And very long, intent-shifting conversations degrade, at which point the fix is to reset and start clean.

Governance is the part you can reassure security about. Cortex Analyst runs inside Snowflake's perimeter, the default models from Mistral and Meta keep data and prompts within Snowflake's governance boundary, it does not train on your data, and the generated SQL executes under the calling user's role-based access control, so a user can never get an answer from data their role cannot see. Access is gated by the SNOWFLAKE.CORTEX_ANALYST_USER database role, which you grant to a custom role rather than directly to users. For the broader access-control pattern, see the governance guide.

The honest read

Cortex Analyst is one of the few enterprise AI features where the value is real and the work is well understood: it moves the analyst queue off your team and onto a managed service, and it does so without your data leaving Snowflake. The catch is that it is only as good as the semantic model you invest in, and that model is ordinary, unglamorous data-modeling work, not magic. Teams that treat the semantic view as the product ship something business users trust. Teams that point it at a raw schema and call it AI ship a demo that quietly gives wrong numbers, which is worse than no answer at all.

Snowflake Cortex: running LLMs where your data already lives

2026-06-07T00:00:00Z

Every data team has been handed the same ask: "can we use AI on this?" The usual answer involves standing up a separate service, shipping data to an external API, and owning a new security review. Snowflake Cortex removes most of that by exposing large language models as plain SQL functions that run inside the Snowflake perimeter. You call AI_COMPLETE('llama3.1-70b', prompt) in a query, against data that never leaves your account, and get a result back in the same result set. The catch that decides your bill: these functions are billed per token by model, and the spread between a small model and a frontier one is roughly 40x. Model choice is not a detail here; it is the budget.

This guide is for the data engineer or data scientist who wants to add LLM work to a Snowflake pipeline without standing up new infrastructure, and who needs to know what is production-ready versus what is a demo.

What can you actually call?

Cortex ships LLM capability as two kinds of SQL function. The task-specific functions do one job well with no prompt engineering: AI_CLASSIFY sorts text or images into categories you define, AI_FILTER returns true or false so you can use a model inside a WHERE or JOIN, AI_SENTIMENT extracts sentiment, AI_EXTRACT pulls structured fields out of text and documents, AI_TRANSLATE localizes, and AI_AGG and AI_SUMMARIZE_AGG summarize across many rows without hitting a context-window limit. The general function is AI_COMPLETE, the one you reach for when you want a specific model to do an open-ended generation.

The point that makes this worth using is that it is just SQL, so a model call composes with everything else you do in a query:

-- Triage support tickets in one pass: classify, score sentiment,
-- and summarize, all inside the warehouse, no data leaving Snowflake.
SELECT
    ticket_id,
    AI_CLASSIFY(body, ['billing', 'bug', 'feature_request', 'churn_risk']):labels[0]::string AS topic,
    AI_SENTIMENT(body):categories[0]:sentiment::string AS sentiment,
    AI_COMPLETE('llama3.1-8b',
        'Summarize this ticket in one sentence: ' || body) AS one_line
FROM support_tickets
WHERE created_at > DATEADD('day', -1, CURRENT_TIMESTAMP());

To run any of this, the role needs the USE AI FUNCTIONS account privilege plus the CORTEX_USER database role. Models from OpenAI, Anthropic, Meta, Mistral, and DeepSeek are available depending on your region, and all of them run inside Snowflake's service perimeter rather than calling out to a third-party endpoint, which is the entire security argument for using Cortex over a raw API.
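
As a hedged sketch of those two grants, assuming a hypothetical role named cortex_pipeline_role and assuming the account privilege is granted with the usual GRANT ... ON ACCOUNT form (confirm the exact statement against the current Cortex privileges documentation):

```sql
-- Hypothetical custom role for pipeline work that calls Cortex AI functions.
CREATE ROLE IF NOT EXISTS cortex_pipeline_role;

-- Account-level privilege named in the text above (grant form is an assumption).
GRANT USE AI FUNCTIONS ON ACCOUNT TO ROLE cortex_pipeline_role;

-- Snowflake-provided database role for Cortex features.
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE cortex_pipeline_role;
```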

Beyond the one-shot text functions, Cortex also covers the building blocks for retrieval. AI_EMBED turns text or images into embedding vectors you can store in a VECTOR column and search with VECTOR_COSINE_SIMILARITY, which is the foundation of a retrieval-augmented generation pipeline that never leaves Snowflake. For teams that need more than SQL functions, Snowpark runs custom Python, including your own models, on Snowflake compute, so the classic build-versus-buy line sits between a managed Cortex function and a Snowpark job you own end to end.
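
A minimal retrieval sketch under stated assumptions: the tables raw_chunks and doc_chunks are hypothetical, and the embedding model name and its 1024-dimension output are assumptions to check against the model list available in your region.

```sql
-- Hypothetical table holding text chunks and their embeddings.
-- The VECTOR dimension must match the embedding model's output size.
CREATE OR REPLACE TABLE doc_chunks (
    chunk_id   INT,
    chunk_text STRING,
    embedding  VECTOR(FLOAT, 1024)
);

-- Embed each chunk once at load time (raw_chunks is a placeholder source table).
INSERT INTO doc_chunks
SELECT
    chunk_id,
    chunk_text,
    AI_EMBED('snowflake-arctic-embed-l-v2.0', chunk_text)
FROM raw_chunks;

-- Retrieve the five chunks closest to a question, embedded with the same model.
SELECT
    chunk_id,
    chunk_text,
    VECTOR_COSINE_SIMILARITY(
        embedding,
        AI_EMBED('snowflake-arctic-embed-l-v2.0', 'How do I reset my password?')
    ) AS score
FROM doc_chunks
ORDER BY score DESC
LIMIT 5;
```

The retrieved chunks can then be concatenated into an AI_COMPLETE prompt, which is the generation half of the pipeline.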

What does it cost, and where does it bite?

Cortex AI functions are billed by tokens processed, metered as Snowflake credits, and the rate depends on the model. That is the lever that dominates everything else.

Illustrative: relative cost per million tokens climbs steeply with model size in Cortex AI_COMPLETE, roughly 1x for a small model, 9x mid, 40x frontier. Source: Snowflake Cortex documentation; check the current consumption table for exact per-model rates.

The practical consequence: defaulting every call to the biggest model is how a proof-of-concept turns into a budget incident. Most classification, sentiment, and extraction work runs fine on a small model, and the small models are where the per-token rate is a fraction of frontier. Reserve the large models for genuinely hard generation, and you can cut the bill by an order of magnitude without anyone noticing a quality drop. Three habits keep Cortex spend honest:

Right-size the model per task. Use a small model for classification and tagging; only escalate to a frontier model when the output quality demonstrably needs it.
Count tokens before you run at scale. AI_COUNT_TOKENS tells you the token load of a prompt, so you can estimate the cost of a batch before you point it at a 50-million-row table.
Batch, do not trickle. Cortex functions are optimized for throughput over large tables. Running them row-by-row from an app is the slow, expensive path; for interactive latency Snowflake points you at the REST APIs instead.
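
The token-counting habit can be sketched as a sample-based estimate. This assumes the three-argument form of AI_COUNT_TOKENS (function name, model name, input text) and reuses the support_tickets table and prompt from the earlier example; verify the signature in the function reference before relying on it.

```sql
-- Estimate average prompt tokens from a 1,000-row sample
-- rather than counting tokens across the whole table.
SELECT
    COUNT(*) AS sampled_rows,
    AVG(AI_COUNT_TOKENS(
        'ai_complete',
        'llama3.1-8b',
        'Summarize this ticket in one sentence: ' || body
    )) AS avg_prompt_tokens
FROM support_tickets SAMPLE (1000 ROWS);
```

Multiply avg_prompt_tokens by the table's row count to get the approximate input-token load of the full batch, then price it against the current consumption table.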

Track spend with the Cortex usage-history views in SNOWFLAKE.ACCOUNT_USAGE, such as CORTEX_FUNCTIONS_USAGE_HISTORY, so model cost lands in the same FinOps view as your warehouses. The same right-sizing instinct from the warehouse sizing guide applies here: the cheapest unit that does the job, not the biggest one available.
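
One way to pull that view, sketched against the CORTEX_FUNCTIONS_USAGE_HISTORY view in ACCOUNT_USAGE; the 30-day window is an arbitrary choice, and you should confirm the column list in your account before wiring it into a dashboard.

```sql
-- Token and credit totals per function and model over the last 30 days.
SELECT
    function_name,
    model_name,
    SUM(tokens)        AS total_tokens,
    SUM(token_credits) AS total_credits
FROM SNOWFLAKE.ACCOUNT_USAGE.CORTEX_FUNCTIONS_USAGE_HISTORY
WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY function_name, model_name
ORDER BY total_credits DESC;
```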

One more cost trap worth naming: the prompt is part of the token count, not just the answer. Stuffing a 4,000-token system prompt and the full document into every row of a large table means you pay for that context on every single call. If you are classifying ten million rows, a 200-token prompt versus a 2,000-token prompt is a 10x difference on the input side alone. Trim the prompt, pass only the columns the model needs, and use the task-specific functions instead of hand-rolling the same job through AI_COMPLETE, because they are tuned for exactly that work.
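
The arithmetic behind that 10x is worth running once. This sketch uses only the figures from the paragraph above (ten million rows, 200 versus 2,000 prompt tokens) and touches no real tables:

```sql
-- Input-side token load for the two prompt sizes discussed above.
SELECT
    rows_to_classify,
    prompt_tokens,
    rows_to_classify * prompt_tokens AS total_input_tokens
FROM (VALUES
    (10000000, 200),
    (10000000, 2000)
) AS t (rows_to_classify, prompt_tokens);
```

The two rows come out at 2 billion and 20 billion input tokens, before a single output token is generated.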

What is production-ready, and what is still a demo?

Be honest with stakeholders about the maturity line, because Cortex spans both sides of it. The task-specific functions and AI_COMPLETE are generally available and fine to ship; they are SQL functions with predictable billing. Higher up the stack, Cortex Search (retrieval) and Cortex Analyst (natural-language-to-SQL) are powerful but deserve real evaluation before you put them in front of users, and some individual functions are still Preview features, which means they can change and should not anchor a production SLA. The rule: check whether the specific function you are calling is GA or Preview before it goes near a customer path. Generally available functions are safe to build on; Preview ones are for prototypes.

The number to watch

Track credits per thousand Cortex calls, broken down by model. That single view exposes the most common failure mode, an expensive model quietly handling work a cheap one could do, and it turns "can we use AI on this?" into a question you can actually price. When the per-call cost is dominated by one frontier model on a high-volume task, that is your first optimization, not a bigger budget. Get that one number on a dashboard and the AI line in your Snowflake bill stops being a mystery.

Building an AI agent on your Snowflake data with Cortex Agents

2026-06-07T00:00:00Z

Every enterprise that bought into chat-with-your-data quickly hit the same wall: a text-to-SQL tool answers questions about tables, a search tool answers questions about documents, and a real business question needs both at once. "Why did churn spike in EMEA last quarter, and what did the support tickets say" is half a SQL query and half a document search, stitched together with reasoning. Cortex Agents is Snowflake's framework for exactly that: an AI agent that plans a multi-step task, calls the right tool for each step, reflects on the result, and keeps going until it has an answer, all inside Snowflake's governance boundary. It is the orchestration layer that sits above Cortex Analyst and Cortex Search rather than replacing them.

This guide is for the builder who has been asked to "build us an AI agent" on company data and needs to know what Cortex Agents actually does, how the loop works, what privileges it needs, and where the costs hide before you ship it.

What does an agent do that a single Cortex call does not?

A plain Cortex call is one shot: a prompt in, a completion out. An agent runs a loop. Snowflake's orchestration follows a plan-act-reflect cycle that should be familiar to anyone who has built with agent frameworks:

Planning. The agent reads the request, explores what it knows, splits the task into steps, and routes each step to a tool. "Churn plus ticket sentiment" becomes a structured query step and a document search step.
Tool use. It calls the tools. Structured questions go to a Cortex Analyst tool over your semantic view. Unstructured questions go to a Cortex Search tool over indexed documents. It can also call custom tools you expose as stored procedures or UDFs, and a web-search tool backed by the Brave API with zero data retention.
Reflection. It evaluates the tool output, decides whether it has enough to answer or needs another step, and iterates.
Monitor and respond. It assembles the final answer and you can monitor, evaluate, and tune the whole trajectory.

The agent itself is a first-class Snowflake object, and it uses threads to persist conversation context across turns, so a follow-up question keeps the earlier context instead of starting cold. You access it through the agent:run REST API, which means the same agent can power a Streamlit app, an internal portal, or a product feature.

-- An agent is a Snowflake object that bundles an orchestration model
-- with the tools it is allowed to use. The model plans; the tools act.
-- Simplified for illustration: confirm the exact clauses against the CREATE AGENT reference.
CREATE AGENT support_analyst
  WITH PROFILE = '{"orchestration_model": "auto"}'
  COMMENT = 'Answers questions across sales tables and support tickets'
  TOOLS = (
    cortex_analyst_tool(semantic_view => 'sales_analytics'),
    cortex_search_tool(service => 'support_tickets_search')
  );

Setting the orchestration model to auto lets Snowflake pick, which is the recommended default; you can pin a specific model such as claude-sonnet-4-5 or openai-gpt-4.1 when you need predictable behaviour.

What does it cost to run?

This is the question that decides whether your agent reaches production, and the honest answer is that an agent has no single price. A Cortex Agent bill is the sum of four separate meters, and a careless agent can light up all four on every question.

Illustrative: a Cortex Agent combines four cost components rather than one, so a single question can bill on all four at once. Source: Snowflake Cortex Agents cost documentation.

Meter	What it charges for	What drives it up
Orchestration tokens	The planning and reflection model's reasoning	More steps, longer threads, bigger context
Cortex Analyst tokens	Each structured-data tool call	Every SQL step the agent takes
Cortex Search index	Keeping the document index live	Index size and persistence over time
Warehouse run	Executing the SQL the agent generates	Query size and warehouse uptime

The trap is that an agent multiplies calls. A single user question can trigger several planning rounds, two or three tool calls, and a reflection pass, so what feels like "one question" is a dozen billed operations under the hood. Watch the trajectory, cap the number of steps where you can, and keep threads from growing unbounded, because every turn resends accumulated context and the orchestration-token line climbs with it. The warehouse discipline from the warehouse sizing guide applies to the fourth meter unchanged.

What privileges and guardrails does it need?

Cortex Agents is governed like the rest of Snowflake, which is the reason to build the agent here rather than wiring an external LLM to a database connection. To create and run agents a role needs SNOWFLAKE.CORTEX_USER (or the more specific CORTEX_AGENT_USER database role) plus the object privileges to CREATE AGENT, and USAGE, OWNERSHIP, MODIFY, or MONITOR on the agents themselves. Every tool the agent calls still runs under the caller's role, so the agent can never read data the user could not, the same boundary that protects Cortex Analyst and is enforced by the RBAC and masking setup.

Two operational caveats before you put an agent in front of users:

Accuracy is not guaranteed. Snowflake is explicit that agent answers can be wrong and should be reviewed, especially for anything that drives a decision. Build a review step for high-stakes use; do not auto-execute on the agent's word.
Runtime matters. Agents are not supported from the Streamlit-in-Snowflake warehouse runtime; you run them in the container runtime instead. Get this wrong and your first deploy fails for an unobvious reason.

The honest read

Cortex Agents is the right tool when a real question genuinely spans structured and unstructured data and you want the reasoning, the data, and the access control to stay in one governed place. That is a meaningful capability, and keeping it inside Snowflake's perimeter is a genuine advantage over bolting an external agent onto a database. The discipline it demands is cost observability: four meters, a loop that multiplies calls, and answers you still have to verify. Build the semantic view and the search service well first, instrument the trajectory before you scale, and treat the agent as a powerful assistant that needs a reviewer, not an oracle you can trust unattended. The rest of the Snowflake guides cover those foundations, and an agent is only as good as they are.

Snowflake Adaptive Compute rewrites warehouse sizing

2026-06-07T00:00:00Z

Snowflake warehouse sizing used to be a small act of fiction. You picked a size, guessed at concurrency, argued about auto-suspend, and hoped next month’s bill did not punish last month’s optimism. Snowflake Adaptive Compute changes that bargain. It replaces fixed warehouse sizing with adaptive warehouses that set compute per query, bill by query usage, and expose only 2 primary knobs: MAX_QUERY_PERFORMANCE_LEVEL and QUERY_THROUGHPUT_MULTIPLIER.

The important correction: Snowflake’s June 2, 2026 blog says Snowflake Adaptive Compute will be generally available soon, but the current Snowflake Adaptive Compute documentation lists it as an Open Preview feature. For platform teams, that is the whole story in miniature. This is not just a faster warehouse type. It is a new operating model for Snowflake credits, observability, and accountability.

If you already moved steady workloads to Snowflake Gen2 warehouses, Adaptive Compute is the next question you need to answer. Gen2 keeps the old control surface. Adaptive Compute asks you to let Snowflake schedule and scale inside an account-level compute pool, then prove whether the bill got better.

What actually changes when you create an adaptive warehouse?

An adaptive warehouse is still a Snowflake virtual warehouse from the user’s point of view. You USE WAREHOUSE, run SQL, load data, and monitor usage in SNOWFLAKE.ACCOUNT_USAGE. The difference is what you stop managing. Snowflake says adaptive warehouses remove manual warehouse size, multi-cluster settings, Query Acceleration Service settings, and suspend or resume policies from your tuning loop.

The default create statement is deliberately boring:

CREATE ADAPTIVE WAREHOUSE bi_adaptive_wh;

That creates a warehouse with MAX_QUERY_PERFORMANCE_LEVEL = XLARGE and QUERY_THROUGHPUT_MULTIPLIER = 2, according to the SQL examples in Snowflake’s adaptive warehouse docs. Those defaults matter because they are not cosmetic. MAX_QUERY_PERFORMANCE_LEVEL is the upper bound Snowflake may apply for a single statement when it has high confidence an optimization helps. QUERY_THROUGHPUT_MULTIPLIER controls how much total query work can run at once relative to Snowflake’s computed baseline.

Here is the version you should put in Terraform or a migration script when you want intent on the page:

CREATE ADAPTIVE WAREHOUSE etl_adaptive_wh
  WITH MAX_QUERY_PERFORMANCE_LEVEL = LARGE
       QUERY_THROUGHPUT_MULTIPLIER = 4
       STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 300
       STATEMENT_TIMEOUT_IN_SECONDS = 3600;

The supported performance levels run from XSMALL through X4LARGE. A throughput multiplier is a non-negative integer, and 0 means unlimited throughput. That last value is powerful and dangerous. In a FinOps review, 0 should read as no instantaneous cap, not as a clever shortcut.

Snowflake’s claimed benchmark gains are large enough to justify a pilot, but not large enough to skip one. Its June 2026 blog reports Adaptive Compute gains of 1.6x for analytics, 2.2x for operational throughput, and 3.5x for DML-heavy workloads, based on TPC-DS and internal benchmarks measured in May 2026 against standard compute.

Snowflake reported Adaptive Compute benchmark gains of 1.6x faster analytics, 2.2x higher operational throughput, and 3.5x faster DML execution, measured in May 2026.

Read the chart as a migration hypothesis, not as your savings forecast. The strongest number, 3.5x faster DML execution, points at pipelines, ingestion, and transformation workloads. The least surprising number, 1.6x faster analytics, still matters if your analysts live in ad hoc query land and your warehouse queue is a standing meeting with better snacks.

How is Snowflake Adaptive Compute billed?

The billing change is the feature. Standard warehouses make you reason about size, run time, idle time, and cluster count. Adaptive warehouses use query-based billing. Snowflake says the cost of each query depends on compute and software resources used, including cluster sizes and capacity used by features like Query Acceleration Service. Creating the warehouse is free. Charges start when the first query runs.

That shift kills one familiar FinOps metric: idle waste. On standard warehouses, WAREHOUSE_METERING_HISTORY.CREDITS_ATTRIBUTED_COMPUTE_QUERIES helps separate query work from idle compute. Snowflake’s WAREHOUSE_METERING_HISTORY documentation says that column is NULL for adaptive warehouses. So do not port your old idle-cost dashboard and call the migration measured.

For adaptive warehouses, start at the query level:

SELECT
  query_id,
  warehouse_name,
  SUM(credits_used) AS credits_used,
  SUM(credits_used_compute) AS compute_credits,
  SUM(credits_used_cloud_services) AS cloud_services_credits
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_METERING_HISTORY
WHERE warehouse_name = 'BI_ADAPTIVE_WH'
  AND query_start_time >= DATEADD(day, -7, CURRENT_DATE())
GROUP BY query_id, warehouse_name
ORDER BY credits_used DESC;

The QUERY_METERING_HISTORY view returns per-query credit usage for queries run on adaptive warehouses over the last 365 days, with view latency of up to 1 hour. That is the view your FinOps team should care about first. It lets you find the expensive query patterns that a warehouse-level total hides.

Use warehouse metering for the monthly control plane:

SELECT
  warehouse_name,
  DATE_TRUNC(day, start_time) AS usage_day,
  SUM(credits_used) AS total_credits,
  SUM(credits_used_compute) AS compute_credits
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE warehouse_name = 'BI_ADAPTIVE_WH'
  AND start_time >= DATEADD(day, -30, CURRENT_DATE())
GROUP BY warehouse_name, usage_day
ORDER BY usage_day;

The old question was whether an XLARGE warehouse sat idle. The new question is whether an XLARGE performance cap let a handful of queries consume more instantaneous compute than your service-level objective deserved.

Which workloads should you move first?

Start with workloads where static sizing is already lying to you. Snowflake’s docs call out analytics, data loading pipelines, mixed BI and ETL, high size variance, and occasional HTAP queries as adaptive warehouse candidates. They also say you may prefer Gen2 for workloads that need direct sizing control, interactive warehouses for very low latency dashboards or applications, and Snowpark-optimized warehouses for high-memory Snowpark or ML workloads.

A practical migration order looks like this:

Workload	First adaptive setting to test	Why it belongs in the pilot
Bursty BI plus ad hoc SQL	`MAX_QUERY_PERFORMANCE_LEVEL = XLARGE`, multiplier `2`	Matches the default and tests whether queues fall without manual multi-cluster tuning.
Cost-sensitive ELT	`MAX_QUERY_PERFORMANCE_LEVEL = MEDIUM`, multiplier `4`	Lets more work run while capping per-statement optimization.
DML-heavy pipelines	`MAX_QUERY_PERFORMANCE_LEVEL = LARGE`, multiplier `4` or `6`	Snowflake reports the largest benchmark gain here: 3.5x faster DML execution.
User-facing low-latency apps	Do not start here	Snowflake points those to interactive warehouses, not adaptive warehouses.

The table is intentionally conservative. Adaptive Compute is Open Preview, requires Enterprise Edition or higher, and is currently limited in Snowflake’s docs to AWS US West 2 (Oregon), EU West 1 (Ireland), and AP Northeast 1 (Tokyo). That is enough to run serious tests. It is not enough to rewrite every warehouse standard across a global estate on Monday morning.

The most interesting candidate is the messy shared warehouse you already dislike. The one with BI dashboards at 9 a.m., analyst exploration at noon, and transformation work after someone forgot to reschedule a task. Adaptive Compute’s account-level shared pool is built for that mess. It routes jobs from all adaptive warehouses in the account to a dedicated pool that is not shared with other accounts or other warehouse types.

How do you convert without breaking running queries?

Snowflake says converting a standard warehouse to or from adaptive is an online operation. Existing queries continue on the old compute resources, while new queries use the new warehouse type. During that overlap, Snowflake says you are charged for both sets of compute resources.

That detail deserves a runbook line in bold: convert during a quiet window even when the operation is online.

The SQL is simple:

ALTER WAREHOUSE bi_wh SET WAREHOUSE_TYPE = 'ADAPTIVE';

Rolling back is just as direct:

ALTER WAREHOUSE bi_wh SET WAREHOUSE_TYPE = 'STANDARD';

On conversion, Snowflake computes adaptive values from the existing warehouse size, MAX_CLUSTER_COUNT, Query Acceleration Service scale factor, and warehouse generation. After conversion, standard properties such as WAREHOUSE_SIZE, MIN_CLUSTER_COUNT, MAX_CLUSTER_COUNT, and SCALING_POLICY no longer apply. Adaptive properties do not apply after converting back to standard.

You can inspect the new state with SHOW WAREHOUSES:

SHOW WAREHOUSES LIKE 'BI_WH';

For adaptive warehouses, Snowflake adds columns such as STATE, MAX_QUERY_PERFORMANCE_LEVEL, QUERY_THROUGHPUT_MULTIPLIER, and DISABLED_REASONS. STATE is ENABLED or DISABLED, which is separate from the old mental model of a warehouse being suspended.

There are hard conversion limits. Snowflake’s docs say conversions to or from X5LARGE or X6LARGE are not supported, and neither are conversions to or from Snowpark-optimized or interactive warehouses. If you run those, create a new adaptive warehouse for testing instead of trying to be clever with ALTER.

What should your cost guardrails look like?

Do not hand Adaptive Compute to every team with CREATE WAREHOUSE. The control surface is smaller, which makes the blast radius easier to miss. QUERY_THROUGHPUT_MULTIPLIER = 0 can remove the throughput cap, and a high MAX_QUERY_PERFORMANCE_LEVEL can raise per-query spend when Snowflake believes optimization will help.

Start with roles:

GRANT CREATE WAREHOUSE ON ACCOUNT TO ROLE platform_compute_admin;

GRANT USAGE ON WAREHOUSE bi_adaptive_wh TO ROLE bi_analyst;
GRANT MONITOR ON WAREHOUSE bi_adaptive_wh TO ROLE finops_analyst;
GRANT OPERATE ON WAREHOUSE bi_adaptive_wh TO ROLE platform_operator;

Snowflake’s access control docs say CREATE WAREHOUSE is required at the account level, while warehouse privileges include USAGE, MONITOR, OPERATE, MODIFY, and OWNERSHIP. Keep MODIFY away from workload teams unless you want multiplier changes to become the new shadow scaling policy.

Then track queuing and latency before you raise the multiplier:

SELECT
  DATE_TRUNC(hour, start_time) AS hour_start,
  warehouse_name,
  AVG(queued_overload_time) AS avg_queued_overload_ms,
  AVG(total_elapsed_time) AS avg_elapsed_ms,
  COUNT(*) AS query_count
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE warehouse_name = 'BI_ADAPTIVE_WH'
  AND start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY hour_start, warehouse_name
ORDER BY hour_start;

If queued_overload_time stays high, increase QUERY_THROUGHPUT_MULTIPLIER one step and watch query-level credits. If credits spike while latency barely moves, lower the cap. This is the new tuning loop. It is less warehouse babysitting, but it is not zero governance.

A controlled adjustment looks like this:

ALTER WAREHOUSE bi_adaptive_wh SET
  MAX_QUERY_PERFORMANCE_LEVEL = LARGE
  QUERY_THROUGHPUT_MULTIPLIER = 3;

For production, pair that with a resource monitor or Snowflake budget. Snowflake says the same cost tools work with adaptive warehouses, including budgets, resource monitors, QUERY_METERING_HISTORY, and WAREHOUSE_METERING_HISTORY. The point is to govern total spend over time while the warehouse adapts inside those guardrails.
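
A minimal resource monitor sketch, assuming a hypothetical monitor name and an illustrative 500-credit monthly quota. It uses notify-only triggers because how a suspend action interacts with an adaptive warehouse during preview is something to test rather than assume.

```sql
-- Hypothetical monthly monitor; the quota is a placeholder, not a recommendation.
CREATE OR REPLACE RESOURCE MONITOR bi_adaptive_monthly
  WITH CREDIT_QUOTA = 500
       FREQUENCY = MONTHLY
       START_TIMESTAMP = IMMEDIATELY
       TRIGGERS ON 75 PERCENT DO NOTIFY
                ON 100 PERCENT DO NOTIFY;

-- Attach the monitor to the adaptive warehouse from the earlier examples.
ALTER WAREHOUSE bi_adaptive_wh SET RESOURCE_MONITOR = bi_adaptive_monthly;
```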

What should you do before moving production spend?

Run a two-week A/B pilot, not a belief exercise. Pick one existing warehouse with volatile demand. Clone the workload routing, set query tags, and compare p50, p95, total credits, credits per successful query, and queued time. Use the same 7-day and 30-day windows on both sides. Snowflake’s benchmark chart is useful because it tells you where to look first: analytics at 1.6x, operational throughput at 2.2x, and DML execution at 3.5x.
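
The comparison can be sketched with query tags. The tag values pilot:standard and pilot:adaptive are hypothetical; set whichever tag applies in each side's sessions, then compare latency and queuing from QUERY_HISTORY.

```sql
-- In sessions routed to the adaptive side of the pilot (hypothetical tag value).
ALTER SESSION SET QUERY_TAG = 'pilot:adaptive';

-- Compare the two sides over the same 7-day window.
SELECT
    query_tag,
    COUNT(*)                                    AS query_count,
    APPROX_PERCENTILE(total_elapsed_time, 0.50) AS p50_elapsed_ms,
    APPROX_PERCENTILE(total_elapsed_time, 0.95) AS p95_elapsed_ms,
    AVG(queued_overload_time)                   AS avg_queued_overload_ms
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE query_tag IN ('pilot:standard', 'pilot:adaptive')
  AND UPPER(execution_status) = 'SUCCESS'
  AND start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY query_tag;
```

Credits belong next to these numbers: take them from the metering views for each warehouse over the same window, so latency gains and spend are read together.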

Your acceptance criteria should fit on one screen:

Credits per business event fall, or latency falls enough to justify flat credits.
QUERY_METERING_HISTORY identifies the top 10 query patterns by adaptive spend.
No team can change MAX_QUERY_PERFORMANCE_LEVEL or QUERY_THROUGHPUT_MULTIPLIER without platform approval.
Preview-region and Enterprise Edition constraints match the accounts you plan to use.
Rollback to WAREHOUSE_TYPE = 'STANDARD' has been tested once, not merely admired in a doc.
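
For the second criterion, one possible shape is a join from per-query credits to query patterns. This sketch assumes the QUERY_METERING_HISTORY columns used in the query earlier in this article and groups on QUERY_PARAMETERIZED_HASH from QUERY_HISTORY; the 14-day window matches the pilot length.

```sql
-- Top 10 query patterns by adaptive spend over the pilot window.
SELECT
    qh.query_parameterized_hash,
    ANY_VALUE(qh.query_text)    AS sample_query_text,
    COUNT(DISTINCT qh.query_id) AS executions,
    SUM(qm.credits_used)        AS credits_used
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_METERING_HISTORY qm
JOIN SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY qh
  ON qh.query_id = qm.query_id
WHERE qm.warehouse_name = 'BI_ADAPTIVE_WH'
  AND qm.query_start_time >= DATEADD(day, -14, CURRENT_DATE())
GROUP BY qh.query_parameterized_hash
ORDER BY credits_used DESC
LIMIT 10;
```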

The feature is promising because it moves Snowflake closer to the way teams actually run data platforms: mixed workloads, uneven demand, and bills that need attribution below the warehouse. The lock-in is also plain. You are outsourcing more scheduling intelligence to Snowflake, and your cost model becomes more Snowflake-specific at the query level.

That trade can be worth it. Just make Snowflake earn the credits query by query.

Snowflake Adaptive Compute: a bill owner's guide

2026-06-07T00:00:00Z

Snowflake warehouse sizing used to be a tax you paid in meetings, runbooks, and Slack threads. Someone asks whether the finance dashboard should be on Medium or Large. Someone else asks why the ELT warehouse queued for 11 minutes at 8:05 a.m. Then the bill arrives and everybody rediscovers AUTO_SUSPEND.

Snowflake Adaptive Compute is Snowflake's attempt to make that whole ritual optional. It replaces fixed warehouse sizing with an adaptive warehouse that chooses compute per query, while you set two guardrails: MAX_QUERY_PERFORMANCE_LEVEL and QUERY_THROUGHPUT_MULTIPLIER. The catch is the same as the gift: Snowflake hides more of the machinery, and during public preview it does not give you per-query cost visibility.

Snowflake lists Adaptive Compute as an open preview feature introduced in April 2026, and its documentation says it currently requires Enterprise Edition or higher and is available only in 3 AWS regions: US West 2 Oregon, EU West 1 Ireland, and AP Northeast 1 Tokyo. That makes this a migration candidate for real workloads, but not a blind replacement for your most politically sensitive warehouse.

If you already read our guide to how Snowflake Adaptive Compute rewrites warehouse sizing, treat this as the bill owner's companion piece: how it works, what to monitor, and where the control moves.

What did Snowflake actually change in the warehouse model?

Adaptive Compute gives you a new warehouse type, not a new SQL engine you call directly. You create an adaptive warehouse, point sessions and jobs at it, and Snowflake routes queries into an account-dedicated compute pool. Snowflake says that pool is not shared with other accounts, and it is separate from your standard and Snowpark-optimized warehouses.

The basic DDL is intentionally boring:

CREATE ADAPTIVE WAREHOUSE wh_adaptive_bi;

That creates an adaptive warehouse with Snowflake's documented defaults: MAX_QUERY_PERFORMANCE_LEVEL = XLARGE and QUERY_THROUGHPUT_MULTIPLIER = 2. You can also use the standard warehouse syntax with WAREHOUSE_TYPE = 'ADAPTIVE', which matters if your provisioning code already emits CREATE WAREHOUSE statements. Snowflake documents both forms in its Adaptive Compute SQL reference.

CREATE WAREHOUSE wh_adaptive_bi
  WITH WAREHOUSE_TYPE = 'ADAPTIVE'
       MAX_QUERY_PERFORMANCE_LEVEL = LARGE
       QUERY_THROUGHPUT_MULTIPLIER = 3;

What disappears is the old sizing choreography. You no longer set WAREHOUSE_SIZE, MIN_CLUSTER_COUNT, MAX_CLUSTER_COUNT, SCALING_POLICY, or a separate Query Acceleration Service scale factor on the adaptive warehouse. Snowflake's docs are explicit that standard warehouse properties such as WAREHOUSE_SIZE, MIN_CLUSTER_COUNT, MAX_CLUSTER_COUNT, and SCALING_POLICY cannot be set on an adaptive warehouse.

The chart below counts the configuration surface that matters for day-to-day tuning. It is illustrative, but the inputs come from Snowflake's documented properties: a standard warehouse exposes the usual size, cluster, scaling, and QAS controls, while an adaptive warehouse exposes 2 primary controls.

Illustrative: Standard warehouses expose 4 common tuning controls in this comparison, while Snowflake Adaptive Compute exposes 2 primary controls: MAX_QUERY_PERFORMANCE_LEVEL and QUERY_THROUGHPUT_MULTIPLIER.

That reduction is the product bet. Snowflake is saying most teams do not really want finer-grained knobs. They want bounded latency, bounded spend, and fewer 9 a.m. warehouse autopsies.

The two remaining knobs are worth reading literally:

Control	Applies to	Default or special value	What it really limits
`MAX_QUERY_PERFORMANCE_LEVEL`	Adaptive warehouse	`XLARGE` default	Upper bound for a single statement's performance level
`QUERY_THROUGHPUT_MULTIPLIER`	Adaptive warehouse	`2` default	Burst throughput as a multiplier over Snowflake's computed minimum
`QUERY_THROUGHPUT_MULTIPLIER = 0`	Adaptive warehouse	Special value `0`	Unlimited throughput, subject to available capacity
`WAREHOUSE_SIZE`	Standard warehouse	`XSMALL` default in standard `CREATE WAREHOUSE` docs	Fixed cluster size for a running warehouse

The important word is cap. Setting MAX_QUERY_PERFORMANCE_LEVEL = XLARGE does not mean every tiny lookup burns XLARGE-equivalent resources. Snowflake says smaller queries can run below the cap when they do not need that much compute. That is why this is not just auto-resize with a fresh coat of paint.

How do you migrate without breaking jobs or chargeback?

The clean migration path is conversion in place. Snowflake documents that converting to or from an adaptive warehouse is an online operation: running queries continue on existing compute, while new queries use the new warehouse type. The warehouse name survives, which is more important than it sounds if your dbt profiles, Airflow DAGs, BI connections, and stored procedures have warehouse names hardcoded.

ALTER WAREHOUSE wh_bi_prod SET WAREHOUSE_TYPE = 'ADAPTIVE';

Snowflake automatically derives the adaptive values from the old warehouse's size, MAX_CLUSTER_COUNT, QAS scale factor, and generation. That is a sensible first move because it preserves the intent of the old shape. It is not a reason to skip review.

After conversion, inspect the visible properties:

SHOW WAREHOUSES LIKE 'WH_BI_PROD';

Snowflake adds adaptive-specific SHOW WAREHOUSES columns such as MAX_QUERY_PERFORMANCE_LEVEL, QUERY_THROUGHPUT_MULTIPLIER, STATE, and DISABLED_REASONS. Properties that no longer apply show as NULL.

If the derived values are too generous for a dashboard warehouse, bring them down before the business discovers a new definition of interactive:

ALTER WAREHOUSE wh_bi_prod SET
  MAX_QUERY_PERFORMANCE_LEVEL = LARGE
  QUERY_THROUGHPUT_MULTIPLIER = 2;

If the migration goes sideways, you can convert back:

ALTER WAREHOUSE wh_bi_prod SET WAREHOUSE_TYPE = 'STANDARD';

There is one billing wrinkle in the conversion path that deserves a bright yellow sticky note. Snowflake says that while old queries finish and new queries run on the new warehouse type, you are charged for both sets of compute resources. For a quiet BI warehouse that may be noise. For a heavy ETL warehouse with 2-hour transformations already running, choose the migration window with intent.

Do not convert unsupported shapes. During preview, Snowflake says you cannot convert to or from X5Large or X6Large warehouses, Snowpark-optimized warehouses, or interactive warehouses. That means a compute-heavy Snowpark workload should stay where it is unless Snowflake broadens support.
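One way to choose that window is to check what is still executing on the warehouse right before you convert. The query below is a minimal sketch, assuming your role can call the INFORMATION_SCHEMA.QUERY_HISTORY_BY_WAREHOUSE table function and that a current database is set in the session; the 1000-row limit is an arbitrary choice, not a recommendation.

```sql
-- Sketch: list statements still running on the warehouse before conversion.
-- Assumes a current database is set so INFORMATION_SCHEMA resolves.
-- RESULT_LIMIT = 1000 is an arbitrary illustration value.
SELECT
  query_id,
  user_name,
  start_time,
  DATEDIFF(minute, start_time, CURRENT_TIMESTAMP()) AS running_minutes
FROM TABLE(information_schema.query_history_by_warehouse(
  warehouse_name => 'WH_BI_PROD',
  result_limit => 1000
))
WHERE UPPER(execution_status) = 'RUNNING'  -- keep only in-flight statements
ORDER BY start_time;
```

If this returns long-running transformations, every minute they keep running after the conversion is a minute of overlapping compute, so waiting for an empty result is the cheap option.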

How is Adaptive Compute billed, and what can you prove today?

Adaptive warehouses use query-based billing. That sounds like a clean break from classic warehouse uptime billing, but Snowflake still reports the usage as virtual warehouse credits under compute. You are not charged for creating an adaptive warehouse. Charges start when the first query runs.

This is the cost model in one sentence: you manage spend with caps and monitors, not with idle-time math.

That is a meaningful change for teams that spent years training everyone to fear idle warehouses. On a standard warehouse, the size determines the compute resources in each cluster and therefore the credits consumed while the warehouse is running. On an adaptive warehouse, Snowflake says each query's cost depends on compute and software resources used, including cluster sizes and additional capacity used by features like QAS.

Here is the part finance will ask about first: QAS does not show up as a separate adaptive warehouse credit line during preview. Snowflake says QAS usage is included in compute credits for adaptive warehouses. That simplifies showback, but it also removes a familiar line item you may have used to explain spikes.

Use WAREHOUSE_METERING_HISTORY for daily warehouse-level credits:

SELECT
  start_time::DATE AS usage_date,
  warehouse_name,
  SUM(credits_used) AS credits_used
FROM snowflake.account_usage.warehouse_metering_history
WHERE warehouse_name = 'WH_BI_PROD'
  AND start_time >= DATEADD(day, -14, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY 1;

Use QUERY_HISTORY to confirm that the workload is actually running as adaptive, and to watch latency and queuing:

SELECT
  warehouse_name,
  COUNT(*) AS queries,
  AVG(total_elapsed_time) AS avg_elapsed_ms,
  AVG(queued_overload_time) AS avg_queued_overload_ms
FROM snowflake.account_usage.query_history
WHERE warehouse_name = 'WH_BI_PROD'
  AND warehouse_size = 'ADAPTIVE'
  AND start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1;

Snowflake's docs also recommend WAREHOUSE_LOAD_HISTORY for queuing behavior. That is the view you should wire into an alert before you increase the throughput multiplier:

SELECT
  start_time,
  warehouse_name,
  avg_running,
  avg_queued_load
FROM snowflake.account_usage.warehouse_load_history
WHERE warehouse_name = 'WH_BI_PROD'
  AND start_time >= DATEADD(hour, -24, CURRENT_TIMESTAMP())
ORDER BY start_time;
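To turn that raw series into something an alert can act on, one option is to roll it up by hour and keep only the hours where queued load stayed above a threshold. This is a sketch: the 0.5 threshold and the hourly grain are assumptions to tune against your own baseline, not values from Snowflake's documentation.

```sql
-- Sketch: hours in the last 7 days with elevated queued load.
-- The 0.5 threshold is a hypothetical starting point, not a Snowflake default.
SELECT
  DATE_TRUNC('hour', start_time) AS usage_hour,
  warehouse_name,
  AVG(avg_running) AS hourly_avg_running,
  AVG(avg_queued_load) AS hourly_avg_queued_load
FROM snowflake.account_usage.warehouse_load_history
WHERE warehouse_name = 'WH_BI_PROD'
  AND start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1, 2
HAVING AVG(avg_queued_load) > 0.5  -- flag hours where queries waited on load
ORDER BY 1;
```

A handful of flagged hours clustered at the same time each day points at a scheduling problem; flagged hours spread across the week are a stronger argument for raising the multiplier.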

What you cannot prove yet is per-query cost. Snowflake says query-level cost visibility is not available during public preview and is planned for general availability. That is the biggest practical reason to pilot Adaptive Compute on workloads with good warehouse-level ownership. If 12 teams share one warehouse, adaptive billing will not magically produce clean accountability.

When is Adaptive Compute better than Gen2, and when is it just less control?

Gen2 warehouses and Adaptive Compute solve different problems. Gen2 keeps the familiar fixed warehouse model and improves the engine and hardware under it. Adaptive Compute changes the operating model.

Snowflake's 2025 product blog said Standard Warehouse Generation 2 delivered 2.1x faster performance for core analytics workloads over the 12 months ending May 2, 2025, and positioned Adaptive Compute as the next step toward less infrastructure tuning. If you moved to Gen2 and got the performance you needed, you do not have to treat Adaptive Compute as an emergency migration.

Choice	Status or number	What you still tune	Billing behavior	Best fit
Standard warehouse Gen1	Default generation `1` in `CREATE WAREHOUSE` docs	Size, clusters, scaling, suspend, QAS	Credits while the warehouse runs	Stable workloads with tight manual control
Standard warehouse Gen2	Snowflake cited 2.1x faster core analytics performance in 2025	Same warehouse model, newer generation	Credits while the warehouse runs	Existing warehouses that need faster execution without model change
Adaptive warehouse	Open preview introduced April 2026	2 primary knobs	Query based compute credits	Variable concurrency where tuning overhead is the pain

The strongest fit is a warehouse where the hard problem is workload variance: Monday dashboards, hourly ELT bursts, ad hoc analyst queries, and the occasional monster join. Adaptive Compute can choose resources per query and let you cap the worst case.

The weaker fit is a warehouse where you need deterministic cost attribution or a tightly reasoned performance envelope. If a regulated team asks why one query cost what it cost, preview Adaptive Compute may not satisfy them yet. You can show warehouse-level credits and query-level timing. You cannot show query-level credits.

There is also a lock-in angle. With standard warehouses, the mental model maps to other platforms: cluster size, concurrency, queueing, idle time. With Adaptive Compute, the most important scheduling logic is proprietary to Snowflake. That may be fine. Most teams are not looking to lovingly hand-tune cluster topology. But if your internal platform team has built a router that assigns queries to warehouses based on fingerprints, SLAs, and chargeback tags, Adaptive Compute competes with part of that control plane.

How should you lock it down before the pilot grows legs?

Start with roles. Adaptive warehouses are still warehouses, so do not give every enthusiastic analyst the ability to create or modify them. Snowflake's CREATE WAREHOUSE docs say the account-level CREATE WAREHOUSE privilege is required to create warehouses, and only SYSADMIN or higher has it by default.

A simple pattern is one owner role, one operator role, and one user role:

CREATE ROLE adaptive_wh_admin;
CREATE ROLE adaptive_wh_operator;
CREATE ROLE adaptive_wh_user;

GRANT CREATE WAREHOUSE ON ACCOUNT TO ROLE adaptive_wh_admin;

After the admin creates the warehouse, grant usage widely and modification narrowly:

GRANT USAGE ON WAREHOUSE wh_bi_prod TO ROLE adaptive_wh_user;
GRANT MONITOR ON WAREHOUSE wh_bi_prod TO ROLE adaptive_wh_operator;
GRANT OPERATE ON WAREHOUSE wh_bi_prod TO ROLE adaptive_wh_operator;
GRANT MODIFY ON WAREHOUSE wh_bi_prod TO ROLE adaptive_wh_admin;

That split matters because MODIFY can change cost-affecting properties. OPERATE changes warehouse state (on a standard warehouse, suspend and resume) and lets the role abort running queries. For adaptive warehouses, Snowflake also supports ENABLE and DISABLE; disabling rejects new jobs while already running queries continue.

ALTER WAREHOUSE wh_bi_prod DISABLE;
ALTER WAREHOUSE wh_bi_prod ENABLE;

Use this to limit the blast radius during preview. A disabled adaptive warehouse is a cleaner stop sign than dropping a warehouse that your jobs still reference.
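Before you hand the warehouse to users, it is worth confirming that the privilege split landed the way you intended. The sketch below uses the role names from the grants above; the role hierarchy at the end is an assumption about how you might roll custom roles up to SYSADMIN, not something the conversion requires.

```sql
-- Sketch: verify which roles hold USAGE, MONITOR, OPERATE, and MODIFY.
SHOW GRANTS ON WAREHOUSE wh_bi_prod;

-- Optional hierarchy (an assumption, adjust to your own role model):
-- each role inherits the one below it, and SYSADMIN sits on top.
GRANT ROLE adaptive_wh_user TO ROLE adaptive_wh_operator;
GRANT ROLE adaptive_wh_operator TO ROLE adaptive_wh_admin;
GRANT ROLE adaptive_wh_admin TO ROLE SYSADMIN;
```

If MODIFY shows up on any role other than the admin role, fix that before the pilot starts, because that role can raise the cap or the multiplier.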

Then add a resource monitor or budget. Snowflake says existing budgets and resource monitors work with adaptive warehouses. The product story is automatic performance. Your job is automatic regret prevention.
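A resource monitor for the pilot can be as small as the sketch below. The monitor name adaptive_pilot_rm, the 200-credit quota, and the 75 and 100 percent thresholds are all hypothetical; pick a quota from your own baseline. Creating a resource monitor normally requires ACCOUNTADMIN, and the source does not spell out how a SUSPEND action interacts with an adaptive warehouse during preview, so test that in a non-production account first.

```sql
-- Sketch: monthly credit quota with a warning and a hard stop.
-- adaptive_pilot_rm and the 200-credit quota are hypothetical values.
CREATE RESOURCE MONITOR adaptive_pilot_rm WITH
  CREDIT_QUOTA = 200
  FREQUENCY = MONTHLY
  START_TIMESTAMP = IMMEDIATELY
  TRIGGERS
    ON 75 PERCENT DO NOTIFY    -- warn the owner while there is room to react
    ON 100 PERCENT DO SUSPEND; -- stop once the quota is reached

-- Attach the monitor to the pilot warehouse.
ALTER WAREHOUSE wh_bi_prod SET RESOURCE_MONITOR = adaptive_pilot_rm;
```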

A practical pilot plan looks like this:

Pick 1 warehouse with a clear owner and at least 14 days of baseline history in WAREHOUSE_METERING_HISTORY.
Convert in a low-traffic window, especially if long queries are already running.
Keep QUERY_THROUGHPUT_MULTIPLIER at the derived value or the default 2 for the first week unless WAREHOUSE_LOAD_HISTORY shows sustained queueing.
Do not use QUERY_THROUGHPUT_MULTIPLIER = 0 on a shared production warehouse unless a resource monitor is already attached.
Report weekly on credits, query count, average elapsed time, and average queued overload time.

That is boring FinOps. Boring is the point.
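The weekly report from the pilot plan can be sketched as one query that joins credits to query statistics. It assumes the same WH_BI_PROD warehouse and a four-week lookback, both of which are illustration choices. ACCOUNT_USAGE views lag real time, and the first and last weeks in the window are usually partial, so read those rows with care.

```sql
-- Sketch: weekly credits, query count, latency, and queuing for one warehouse.
-- The four-week lookback is an arbitrary illustration value.
WITH credits AS (
  SELECT
    DATE_TRUNC('week', start_time)::DATE AS usage_week,
    SUM(credits_used) AS credits_used
  FROM snowflake.account_usage.warehouse_metering_history
  WHERE warehouse_name = 'WH_BI_PROD'
    AND start_time >= DATEADD(week, -4, CURRENT_TIMESTAMP())
  GROUP BY 1
),
queries AS (
  SELECT
    DATE_TRUNC('week', start_time)::DATE AS usage_week,
    COUNT(*) AS query_count,
    AVG(total_elapsed_time) AS avg_elapsed_ms,
    AVG(queued_overload_time) AS avg_queued_overload_ms
  FROM snowflake.account_usage.query_history
  WHERE warehouse_name = 'WH_BI_PROD'
    AND start_time >= DATEADD(week, -4, CURRENT_TIMESTAMP())
  GROUP BY 1
)
SELECT
  c.usage_week,
  c.credits_used,
  q.query_count,
  q.avg_elapsed_ms,
  q.avg_queued_overload_ms
FROM credits c
LEFT JOIN queries q
  ON q.usage_week = c.usage_week  -- keep weeks with credits but no queries
ORDER BY c.usage_week;
```

Because per-query credits are not available during preview, credits divided by query count is only a rough trend line, not a cost per query.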

So is this the end of warehouse sizing?

For many teams, yes, eventually. Snowflake Adaptive Compute moves the warehouse decision from "what size should this be?" to "what is the largest single query we are willing to fund, and how much burst do we allow?" That is a better conversation for most data teams.

But the preview version is not a blank check. It is available in 3 AWS regions, requires Enterprise Edition or higher, excludes several warehouse types, and lacks per-query credit visibility. If you own the Snowflake bill, your first move should be a measured pilot, not a fleet-wide conversion script.

The best use of Adaptive Compute is not to save you from understanding cost. It is to stop spending human time on knobs that Snowflake can probably tune better than your Tuesday afternoon hunch.