Cost optimization

Problem

As data teams grow and mature, a common pain point we hear from our customers is managing their warehouse spend. Growing warehouse spend can be a sign of a successful data engineering team as their data usage grows. However, growing spend can also indicate inefficiencies in how the data is stored and being queried. It is important to not only get alerted when spend management exceeds expectations, but also to have strategies for mitigating spend.

Solution

Metaplane helps users detect when their warehouse spend is anomalous and provides insights into where data teams can focus their efforts to reduce warehouse usage and decrease spend. Metaplane has custom snowflake spend monitors that can immediately start tracking warehouse compute usage over time. Metaplane will then alert users when their warehouse compute usage is anomalous. Once a user is alerted that warehouse spend has increased, they can use the usage analytics functionality to see which tables are being queried the most and which queries specifically are the slowest. These are indicators of places where the data team might want to focus to reduce compute usage in their warehouse.

How to monitor snowflake compute usage in Metaplane

Visit the database that you’re interested in monitoring spend and click the ‘Add monitors’ button. You will see the Snowflake monitors option.

Once selected, you can choose which metric you are interested in monitoring. “Compute” will track overall compute usage over time whereas “Compute overage” will only track compute usage over your allowed amount over time.

When you click ‘Create’, Metaplane will create a monitor that will track and model your Snowflake compute usage over time. As time passes, Metaplane will learn your data team’s usage patterns and build a model to determine expected usage over time.

When Metaplane detects that usage has increased in an anomalous way, we will send you an alert.

Strategies to manage warehouse spend

The usage analytics feature in Metaplane helps data teams gain insight into how their warehouse is being used and where they may be able to optimize their warehouse usage. The main strategies for decreasing warehouse spend is to reduce the number of queries and optimize the slowest queries. Metaplane can help with both of these goals.

On the ‘Connection’ page for the database you are interested in, there is a card for Usage Analytics which is where you can enable usage analytics.

Once you have enabled usage analytics and the backfill has completed, you can click “See results” to see usage reports. Here you will see the tables within your database ordered by how many times they have been queried. This is a great starting point for optimizing your warehouse spend.

You should see many familiar tables in this list. The most commonly used tables are not usually surprising to a data team. However, if you do see any surprising tables, those are a great place to dig in and see if you can reduce the number of queries. If you see a table you do not expect to be queried, Metaplane can tell you which roles and users are querying that table.

You should also look at the “Most Common Queries” section to see which queries are being run the most. This should give you an indication for whether or not these queries are important and if you can remove them.

Once you’ve tackled the frequently queried tables, you can take a look at tables with the slowest queries next. Navigate back to the “Table usage” table and sort by “Avg query time”.

When you click into a table, you can see query performance over time which can give you insight into whether queries on this table have gotten slower. This can indicate that your data has grown in size and existing queries have gotten slower or there are new, slow queries being executed against this table.

Next, you can check out the “Slowest Queries” section to see what the slowest queries are. You should look for the slowest queries that are queried the most and focus on those. You can consider adding where clauses to the queries to minimize the amount of data scanned or creating intermediate tables to cache expensive calculations.

Summary

Managing warehouse spend is a tough problem to solve. It not only requires knowing when warehouse spend is increasing anomalously, but it also requires knowing which areas to focus on for reducing warehouse usage. Metaplane can help with both of these problems. Using Metaplane, data teams can create monitors that will track warehouse spend over time so that teams are not surprised by an unexpected Snowflake bill. Metaplane will also analyze query usage and provide insights into where data teams can focus to make the biggest impact on reducing warehouse usage. For many data teams it can be overwhelming to tackle the spend management problem and teams struggle with knowing where to start. With Metaplane’s usage analytics functionality, data teams can feel confident that when they start looking at most commonly queried tables and tables with the slowest queries, they will be able to make a dent in addressing warehouse usage.