One of the most frequently asked questions is how to monitor Snowflake usage and costs. This article briefly summarizes the charges on the Snowflake platform and explains how to monitor Snowflake usage and, therefore, spending over time.
Snowflake Virtual Warehouse Sizes
Unlike other cloud-based analytic platforms, which charge by the hour, Snowflake uniquely charges per second for the compute resources called Virtual Warehouses.
A Virtual Warehouse consists of a cluster of machines (CPUs, memory, and SSD) arranged as a Massively Parallel Processing (MPP) server with between 1 and 128 nodes. These are placed in a series of “T-shirt” sizes, with charges billed as “credits,” costing from as little as $3.00 per hour as illustrated below:
Warehouse Size | Servers | Virtual CPUs | Cost per Hour |
XSMALL | 1 | 8 | $3 |
SMALL | 2 | 16 | $6 |
MEDIUM | 4 | 32 | $12 |
LARGE | 8 | 64 | $24 |
XLARGE | 16 | 128 | $48 |
X2LARGE | 32 | 256 | $96 |
X3LARGE | 64 | 512 | $192 |
X4LARGE | 128 | 1,024 | $384 |
X5LARGE | Double Capacity | Double Capacity | $768 |
X6LARGE | Double Capacity | Double Capacity | $1,536 |
It’s therefore relatively easy to estimate the cost of running a server by simply taking:
Warehouse Size: A Medium warehouse has four nodes and enough power to summarize and report on gigabytes of data.
Hours per Day: The server is expected to run for at least 8 hours per working day. Note that a virtual warehouse will automatically suspend and resume when not being used to avoid unnecessary charges.
This should give a reasonable cost estimate, and the size can be dynamically adjusted depending on the expected data volumes and performance required.
Controlling Spend
Other usage should be relatively small by comparison and include the cost of storage, a simple pass-through charge from the cloud provider. This is typically less than $25 per terabyte per month, and data is automatically compressed at a ratio of 4-10 times, giving additional savings.
Finally, sophisticated features, including near-real-time data ingestion using Snowpipe, automatic data clustering, and materialized view refreshes, use internal Snowflake resources. They are charged per second per CPU core and may add an additional element.
However, you should typically expect around 80% of the spending to be on virtual warehouses, which should be the initial focus of analysis.
Custom Monitoring
Snowflake provides two locations to monitor usage:
Information Schema: This provides a set of system-defined views and metadata with real-time statistics about queries. This data is typically retained for up to 14 days and is most helpful in monitoring ongoing queries.
Snowflake Account Usage: Provides a rich set of historical metadata for up to a year and will be the primary source of usage analysis.
These areas provide a rich set of metadata that can be analyzed. However, it’s much easier to show the potential results below.
Snowflake Cost: Warehouse Credits Over Time
The diagram below shows the monthly warehouse credits and the cumulative cost during the year.
-- Warehouse Credits over time
select to_char(start_time,'YYYY-MM') as month
, sum(credits_used)
from snowflake.account_usage.warehouse_metering_history wmh
where wmh.start_time >= dateadd(month, -12, current_date())
group by to_char(start_time,'YYYY-MM')
order by 1;
The above query can be used to monitor the monthly spending by all warehouses on the account over the past 12 months and indicate growth over time.
Snowflake Cost: Credit by Type
In addition to warehouse costs (which typically amount to 80% of total spending), Snowflake provides several serverless features. These use internal Snowflake compute resources and are billed per second.
The graph below shows the breakdown by month for each type.
-- Monthly Credits By Type
select to_char(usage_date,'YYYYMM') as month
, sum(decode(service_type,
'WAREHOUSE_METERING', credits_billed)) as warehouse_credits,
sum(decode(service_type,
'PIPE', credits_billed)) as pipe_credits,
sum(decode(service_type,
'MATERIALIZED_VIEW', credits_billed)) as mview_credits,
sum(decode(service_type,
'AUTO_CLUSTERING', credits_billed)) as clustering_credits,
sum(decode(service_type,
'WAREHOUSE_METERING_READER', credits_billed)) as reader_credits
, sum(credits_billed) as total
from snowflake.account_usage.metering_daily_history wmh
where wmh.usage_date >= dateadd(month, -12, current_date());
The above query shows a breakdown of the total monthly credits, indicating the type of spend. This includes:
Warehouses: Credits used by compute resources
Clustering: Credits used by the automatic clustering mechanism
Query Acceleration Service: This improves query performance for large table scans.
Hybrid Tables: These support both transactional and analytic workloads
Database Replication: This is used to replicate data across accounts.
Search Optimization Service: This speeds point lookup, which fetches a few rows from massive data sets.
Snowpipe: Credits used by the near real-time loading facility, Snowpipe
Materialized Views: Credits used to refresh Materialized Views
Reader Accounts: Credits used by Reader Accounts, typically used to share access to data
Serverless Tasks: These execute predefined tasks using Snowflake-provided compute resources
Want to learn more from a Snowflake Expert? Click on the image below for training from Analytics Today.
Snowflake Cost: Credits by Warehouse
When trying to understand warehouse spending, it's sensible to start with this graph, which shows the total expenditure by warehouse. This can be adapted to limit spending to the current month or year.
select warehouse_name, sum(credits_used) as credits_used
from snowflake.account_usage.warehouse_metering_history wmh
group by warehouse_name
order by 2 desc;
Snowflake Cost: Credits by Hour
The graph below indicates the time of day queries were executed and credits spent. This may be useful for indicating peak times during the day and identifying unexpected high spending during a quiet time overnight.
-- Credits by hour of the day
select to_char(start_time,'HH24') as hour
, sum(credits_used)
from snowflake.account_usage.warehouse_metering_history wmh
where wmh.start_time >= dateadd(month, -1, current_date())
group by to_char(start_time,'HH24')
order by 1;
Notably, any warehouse running 24x7 is potentially expensive and relatively unusual. While some use cases always need to compute resources, you should verify that these are being used efficiently. If, for example, you find more than one warehouse of the same size running, there may be an opportunity to consolidate workloads.
Data Storage Cost by Month
Storage costs are a relatively small part of the overall Snowflake cost. However, tuning opportunities are often easy to identify, and I usually find Snowflake customers spending thousands of dollars on unnecessary storage.
-- Data Storage by Month and Type
select to_char(usage_date,'YYYYMM') as sort_month
, to_char(usage_date,'Mon-YYYY') as month
, avg(storage_bytes) as storage
, avg(stage_bytes) as stage
, avg(failsafe_bytes) as failsafe
from snowflake.account_usage.storage_usage
group by month, sort_month
order by sort_month;
The report above shows the monthly cost for storage broken by type. These indicate the cost by:
Storage: This shows the data stored in Snowflake tables.
Stages: This indicates the data stored in files in Snowflake stages
Failsafe: This shows the storage in the 7-day failsafe area to protect against accidental data loss.
The graph below illustrates a typical situation I've frequently seen. In this case, the customer loaded data into a Snowflake internal stage before using a COPY command to load the data warehouse. However, they didn't remove the old data files, which led to the situation below.
Another common mistake is loading data from a stage into a staging table, which is deleted before the next load. This leads to very high volumes of Failsafe Storage as, by default, Snowflake holds data for 7 days in Failsafe. This is easily fixed by converting the staging tables to TRANSIENT.
My article outlines more ways to cut Snowflake Data Storage Costs by avoiding issues with zero copy clones.
Snowflake Cost Monitoring: Free Dashboards
There are also several freely available dashboards for the major Business Intelligence vendors below:
If you know of another vendor or an alternative dashboard, please get in touch with me on LinkedIn or comment below.