What Do I Do When My Snowflake Query Is Slow? Part 1: Diagnosis
January 20, 2022
Because Rockset helps organizations achieve the data freshness and query speeds needed for real-time analytics, we sometimes are asked about approaches to improving query speed in databases in general, and in popular databases such as Snowflake, MongoDB, DynamoDB, MySQL and others. We turn to industry experts to get their insights and we pass on their recommendations. In this case, the series of two posts that follow address how to improve query speed in Snowflake.
Every developer wants peak performance from their software services. When it comes to Snowflake performance issues, you may have decided that the occasional slow query is just something that you have to live with, right? Or maybe not. In this post we’ll discuss why Snowflake queries are slow and options you have to achieve better Snowflake query performance.
It’s not always easy to tell why your Snowflake queries are running slowly, but before you can fix the problem, you have to know what’s happening. In part one of this two-part series, we’ll help you diagnose why your Snowflake queries are executing slower than usual. In our second article, What Do I Do When My Snowflake Query Is Slow? Part 2: Solutions, we look at the best options for improving Snowflake query performance.
Diagnosing Queries in Snowflake
First, let’s unmask common misconceptions of why Snowflake queries are slow. Your hardware and operating system (OS) don’t play a role in execution speed because Snowflake runs as a cloud service.
The network could be one reason for slow queries, but it’s not significant enough to slow execution all the time. So, let’s dive into the other reasons your queries might be lagging.
Check the Information Schema
In short, the
INFORMATION_SCHEMA is the blueprint for every database you create in Snowflake. It allows you to view historical data on tables, warehouses, permissions, and queries.
You cannot manipulate its data as it is read-only. Among the principal functions in the
INFORMATION_SCHEMA, you will find the
QUERY_HISTORY_BY_* tables. These tables help uncover the causes of slow Snowflake queries. You'll see both of these tables in use below.
Keep in mind that this tool only returns data to which your Snowflake account has access.
Check the Query History Page
Snowflake’s query history page retrieves columns with valuable information. In our case, we get the following columns:
EXECUTION_STATUSdisplays the state of the query, whether it is running, queued, blocked, or success.
QUEUED_PROVISIONING_TIMEdisplays the time spent waiting for the allocation of a suitable warehouse.
QUEUED_REPAIR_TIMEdisplays the time it takes to repair the warehouse.
QUEUED_OVERLOAD_TIMEdisplays the time spent while an ongoing query is overloading the warehouse.
Overloading is the more common phenomenon, and
QUEUED_OVERLOAD_TIME serves as a crucial diagnosing factor.
Here is a sample query:
select * from table(information_schema.query_history_by_session()) order by start_time;
This gives you the last 100 queries that Snowflake executed in the current session. You can also get the query history based on the user and the warehouse as well.
Check the Query Profile
In the previous section, we saw what happens when multiple queries are affected collectively. It’s equally important to address the individual queries. For that, use the query profile option.
You can find a query’s profile on Snowflake’s History tab.
The query profile interface looks like an advanced flowchart with step-by-step query execution. You should focus mainly on the operator tree and nodes.
The operator nodes are spread out based on their execution time. Any operation that consumed over one percent of the total execution time appears in the operator tree.
The pane on the right side shows the query’s execution time and attributes. From there, you can figure out which step took too much time and slowed the query.
Check Your Caching
To execute a query and fetch the results, it might take 500 milliseconds. If you use that query frequently to fetch the same results, Snowflake gives you the option to cache it so the next time it is faster than 500 milliseconds.
Snowflake caches data in the result cache. When it needs data, it checks the result cache first. If it does not find data, it checks the local hard drive. If it still does not find the data, it checks the remote storage.
Retrieving data from the result cache is faster than from the hard drive or remote memory. So, it is best practice to use the result cache effectively. Data remains in the result cache for 24 hours. After that, you have to execute the query again to get the data from the hard disk.
You can check out how effectively Snowflake used the result cache. Once you execute the query using Snowflake, check the Query Profile tab.
You find out how much Snowflake used the cache on a tab like this.
Check Snowflake Join Performance
If you experience slowdowns during query execution, you should compare the expected output to the actual result. You could have encountered a row explosion.
A row explosion is a query result that returns far more rows than anticipated. Therefore, it takes far more time than anticipated. For example, you might expect an output of four million records, but the outcome could be exponentially higher. This problem occurs with joins in your queries that combine rows from multiple tables. The join order matters. You can do two things: look for the join condition you used, or use Snowflake’s optimizer to see the join order.
An easy way to determine whether this is the problem is to check the query profile for join operators that display more rows in the output than in the input links. To avoid a row explosion, ensure the query result does not contain more rows than all its inputs combined.
Similar to the query pattern, using joins is in the hands of the developer. One thing is clear — bad joins result in slow Snowflake join performance, and slow queries.
Check for Disk Spilling
Accessing data from a remote drive consumes more time than accessing it from a local drive or the result cache. But, when query results don’t fit on the local hard drive, Snowflake must use remote storage.
When data moves to a remote hard drive, we call it disk spilling. Disk spilling is a common cause of slow queries. You can identify instances of disk spilling on the Query Profile tab. Take a look at “Bytes spilled to local storage.”
In this example, the execution time is over eight minutes, out of which only two percent was for the local disk IO. That means Snowflake did not access the local disk to fetch data.
The warehouse may be busy executing other queries. Snowflake cannot start incoming queries until adequate resources are free. In Snowflake, we call this queuing.
Queries are queued so as not to compromise Snowflake query performance. Queuing may happen because:
- The warehouse you are using is overloaded.
- Queries in line are consuming the necessary computing resources.
- Queries occupy all the cores in the warehouse.
You can rely on the queue overload time as a clear indicator. To check this, look at the query history by executing the query below.
QUERY_HISTORY_BY_SESSION( [ SESSION_ID => <constant_expr> ] [, END_TIME_RANGE_START => <constant_expr> ] [, END_TIME_RANGE_END => <constant_expr> ] [, RESULT_LIMIT => <num> ] )
You can determine how long a query should sit in the queue before Snowflake aborts it. To determine how long a query should remain in line before aborting it, set the value of the
STATEMENT_QUEUED_TIMEOUT_IN_SECONDS column. The default is zero, and it can take any number.
Analyze the Warehouse Load Chart
Snowflake offers charts to read and interpret data. The warehouse load chart is a handy tool, but you need the MONITOR privilege to view it.
Here is an example chart for the past 14 days. When you hover over the bars, you find two statistics:
- Load from running queries — from the queries that are executing
- Load from queued queries — from all the queries waiting in the warehouse
The total warehouse load is the sum of the running load and the queued load. When there is no contention for resources, this sum is one. The more the queued load, the longer it takes for your query to execute. Snowflake may have optimized the query, but it may take a while to execute because several other queries were ahead of it in the queue.
Use the Warehouse Load History
You can find data on warehouse loads using the
Three parameters help diagnose slow queries:
AVG_RUNNING— the average number of queries executing
AVG_QUEUED_LOAD— the average number of queries queued because the warehouse is overloaded
AVG_QUEUED_PROVISIONING— the average number of queries queued because Snowflake is provisioning the warehouse
This query retrieves the load history of your warehouse for the past hour:
use warehouse mywarehouse; select * from table(information_schema.warehouse_load_history(date_range_start=>dateadd ('hour',-1,current_timestamp())));
Use the Maximum Concurrency Level
Every Snowflake warehouse has a limited amount of computing power. In general, the larger (and more expensive) your Snowflake plan, the more computing horsepower it has.
A Snowflake warehouse's
MAX_CONCURRENCY_LEVEL setting determines how many queries are allowed to run in parallel. In general, the more queries running simultaneously, the slower each of them. But if your warehouse's concurrency level is too low, it might cause the perception that queries are slow.
If there are queries that Snowflake can't immediately execute because there are too many concurrent queries running, they end up in the query queue to wait their turn. If a query remains in the line for a long time, the user who ran the query may think the query itself is slow. And if a query stays queued for too long, it may be aborted before it even executes.
Next Steps for Improving Snowflake Query Performance
Your Snowflake query may run slowly for various reasons. Caching is effective but doesn’t happen for all your queries. Check your joins, check for disk spilling, and check to see if your queries are spending time stuck in the query queue.
When investigating slow Snowflake query performance, the query history page, warehouse loading chart, and query profile all offer valuable data, giving you insight into what is going on.
Now that you understand why your Snowflake query performance may not be all that you want it to be, you can narrow down possible culprits. Your next step is to get your hands dirty and fix them.
Don't miss the second part of this series, What Do I Do When My Snowflake Query Is Slow? Part 2: Solutions, for tips on optimizing your Snowflake queries and other choices you can make if real-time query performance is a priority for you.