by datastudy.nl

Field notes for enterprise data engineers and scientists

Engineering

Building an AI agent on your Snowflake data with Cortex Agents

Cortex Agents let you build an AI agent that reasons across structured tables, unstructured documents, and custom tools inside Snowflake. It orchestrates a plan-act-reflect loop, and the bill is the sum of four meters you need to watch.

Bar chart of Cortex Agent cost components: orchestration tokens 30 percent, Cortex Analyst tokens 25 percent, Cortex Search index 20 percent, warehouse run 25 percent
Illustrative: a Cortex Agent bill combines orchestration tokens, Cortex Analyst tokens, Cortex Search index cost, and warehouse compute. Snowflake Cortex Agents documentation.

Every enterprise that bought into chat-with-your-data quickly hit the same wall: a text-to-SQL tool answers questions about tables, a search tool answers questions about documents, and a real business question needs both at once. "Why did churn spike in EMEA last quarter, and what did the support tickets say" is half a SQL query and half a document search, stitched together with reasoning. Cortex Agents is Snowflake's framework for exactly that: an AI agent that plans a multi-step task, calls the right tool for each step, reflects on the result, and keeps going until it has an answer, all inside Snowflake's governance boundary. It is the orchestration layer that sits above Cortex Analyst and Cortex Search rather than replacing them.

This guide is for the builder who has been asked to "build us an AI agent" on company data and needs to know what Cortex Agents actually does, how the loop works, what privileges it needs, and where the costs hide before you ship it.

What does an agent do that a single Cortex call does not?

A plain Cortex call is one shot: a prompt in, a completion out. An agent runs a loop. Snowflake's orchestration follows a plan-act-reflect cycle that should be familiar to anyone who has built with agent frameworks:

  1. Planning. The agent reads the request, explores what it knows, splits the task into steps, and routes each step to a tool. "Churn plus ticket sentiment" becomes a structured query step and a document search step.
  2. Tool use. It calls the tools. Structured questions go to a Cortex Analyst tool over your semantic view. Unstructured questions go to a Cortex Search tool over indexed documents. It can also call custom tools you expose as stored procedures or UDFs, and a web-search tool backed by the Brave API with zero data retention.
  3. Reflection. It evaluates the tool output, decides whether it has enough to answer or needs another step, and iterates.
  4. Monitor and respond. It assembles the final answer and you can monitor, evaluate, and tune the whole trajectory.

The agent itself is a first-class Snowflake object, and it uses threads to persist conversation context across turns, so a follow-up question keeps the earlier context instead of starting cold. You access it through the agent:run REST API, which means the same agent can power a Streamlit app, an internal portal, or a product feature.

-- An agent is a Snowflake object that bundles an orchestration model
-- with the tools it is allowed to use. The model plans; the tools act.
CREATE AGENT support_analyst
  WITH PROFILE = '{"orchestration_model": "auto"}'
  COMMENT = 'Answers questions across sales tables and support tickets'
  TOOLS = (
    cortex_analyst_tool(semantic_view => 'sales_analytics'),
    cortex_search_tool(service => 'support_tickets_search')
  );

Setting the orchestration model to auto lets Snowflake pick, which is the recommended default; you can pin a specific model such as claude-sonnet-4-5 or openai-gpt-4.1 when you need predictable behaviour.

What does it cost to run?

This is the question that decides whether your agent reaches production, and the honest answer is that an agent has no single price. A Cortex Agent bill is the sum of four separate meters, and a careless agent can light up all four on every question.

Bar chart of Cortex Agent cost components: orchestration tokens, Cortex Analyst tokens, Cortex Search index, warehouse run, each contributing a share of the total
Illustrative: a Cortex Agent combines four cost components rather than one, so a single question can bill on all four at once. Source: Snowflake Cortex Agents cost documentation.
Meter What it charges for What drives it up
Orchestration tokens The planning and reflection model's reasoning More steps, longer threads, bigger context
Cortex Analyst tokens Each structured-data tool call Every SQL step the agent takes
Cortex Search index Keeping the document index live Index size and persistence over time
Warehouse run Executing the SQL the agent generates Query size and warehouse uptime

The trap is that an agent multiplies calls. A single user question can trigger several planning rounds, two or three tool calls, and a reflection pass, so what feels like "one question" is a dozen billed operations under the hood. Watch the trajectory, cap the number of steps where you can, and keep threads from growing unbounded, because every turn resends accumulated context and the orchestration-token line climbs with it. The warehouse discipline from the warehouse sizing guide applies to the fourth meter unchanged.

What privileges and guardrails does it need?

Cortex Agents is governed like the rest of Snowflake, which is the reason to build the agent here rather than wiring an external LLM to a database connection. To create and run agents a role needs SNOWFLAKE.CORTEX_USER (or the more specific CORTEX_AGENT_USER database role) plus the object privileges to CREATE AGENT, and USAGE, OWNERSHIP, MODIFY, or MONITOR on the agents themselves. Every tool the agent calls still runs under the caller's role, so the agent can never read data the user could not, the same boundary that protects Cortex Analyst and is enforced by the RBAC and masking setup.

Two operational caveats before you put an agent in front of users:

  • Accuracy is not guaranteed. Snowflake is explicit that agent answers can be wrong and should be reviewed, especially for anything that drives a decision. Build a review step for high-stakes use, do not auto-execute on the agent's word.
  • Runtime matters. Agents are not supported from the Streamlit-in-Snowflake warehouse runtime; you run them in the container runtime instead. Get this wrong and your first deploy fails for an unobvious reason.

The honest read

Cortex Agents is the right tool when a real question genuinely spans structured and unstructured data and you want the reasoning, the data, and the access control to stay in one governed place. That is a meaningful capability, and keeping it inside Snowflake's perimeter is a genuine advantage over bolting an external agent onto a database. The discipline it demands is cost observability: four meters, a loop that multiplies calls, and answers you still have to verify. Build the semantic view and the search service well first, instrument the trajectory before you scale, and treat the agent as a powerful assistant that needs a reviewer, not an oracle you can trust unattended. The rest of the Snowflake guides cover the foundations an agent is only as good as.

Sources