Michael J. Swart

August 28, 2012

Sharpshooting Query Plans From The Cache

Filed under: SQL Scripts,SQLServerPedia Syndication,Technical Articles — Tags: — Michael J. Swart @ 12:00 pm

Takeaway: I talk a bit about DBCC FREEPROCCACHE, using the plan_handle parameter to remove exactly one query plan from cache. I also explain why and when you might want to.

This is another post where I share a script of mine. It’s one of my favorite kinds of posts so I’ve created a new category for it: SQL Scripts

The Situation

  • Crisis mode: The database server is slow (all of a sudden) and the performance is unacceptable.
  • You used your troubleshooting checklist and you see CPU is suddenly pegged on the database server.
  • And sp_whoisactive (or other monitoring software) tells you that there’s a particular query that is running frequently. In this example, the query starts with “SELECT TOP (@ResultLimit) M.Name FROM MyTable M JOIN … etc“.
  • The query is called in exactly one place in your application (Oh man! A fix will take time to recompile, QA and deploy).
  • The query usually runs for a few minutes and using up almost all the CPU.
  • You’ve collected the query plan and the query text.
  • But, if you run the query in SQL Server Management Studio, the query completes quickly.

Some Analysis

  • The query plan looks different when run in Management Studio
  • The production database must have picked a plan based on unlucky compile-time parameters.
  • Confirmed! When running the query in Management Studio using the OPTIMIZE FOR clause with similar parameters, you can reproduce the bad plan.
  • To buy some time, it would be good to have SQL Server recompile that plan.

This is what I want to explore in this post. What is the best way to get SQL Server to recompile that plan?

Before SQL Server 2008

How do we get SQL Server to recompile this plan? Before SQL Server 2008, there were a couple things I could do:

  • DBCC FREEPROCCACHE: Holy cow! That’s a bit extreme. We ask SQL Server to recompile all of its plans. (Missed opportunity: Illustration featuring Darth Sidious saying “Wipe them out. All of them.”)
  • UPDATE STATISTICS [tablename]: Find a table that is used in the query and have SQL Server update statistics on it. All plans using the table will be recompiled afterwards (including our troubled plan!) We might not actually need updated statistics; it’s the side effect of recompiled plans that we want here. I talked a bit about that in Updating Statistics Helps, But Not For The Reasons You Think.
  • EXEC sp_recompile [tablename]: This technique is similar to UPDATE STATISTICS in that plans are dropped. The benefit is that we don’t have all the overhead and time spent creating the statistics. It’s still not ideal though. It takes a schema modification lock and based on the server’s current busy state, that will mean at least a couple minutes of blocked processes.

Targeting a Single Query Plan

SQL Server version 2008 and later allows you to take out a single query plan from cache using the FREEPROCCACHE command with a plan handle. This is a sharp-shooter technique compared to the other techniques.

I like this technique because it takes no important locks. Existing calls to this query will continue to execute and complete (using the old bad plan). But in the meantime, new calls to this query will use a recompiled plan based on the different (and hopefully better) parameters. If so you’ve just bought yourself time to find and fix this query correctly.

But you can’t just use FREEPROCCACHE without some preparation. Finding the plan handle is not straightforward. So I’ve got a script that only asks that you identify a query based on its text.

The Script

First find out how bad your query is. I’m arbitrarily defining more than 10 active queries as bad. Change the query text here. Use enough text to uniquely identify the problem query. In my case I used “TOP (@ResultLimit)”.

-- Part 1: 
-- Use this query to look for bad performing queries using "TOP (@ResultLimit)"
select 
       count(1) as [count of running queries, should be less than 10], 
       max(datediff(second, start_time, getdate())) as [longest running such query in seconds],
       min(datediff(second, start_time, getdate())) as [shortest running such query in seconds]
from sys.dm_exec_requests er
cross apply sys.dm_exec_sql_text(er.sql_handle) st
where st.text like '%TOP (@ResultLimit)%'
    and st.text not like '%80B82FDD-0297-4057-9AEB-D811F3B5C7DB%' -- filter self
GO

Then run this part to knock out that one query (remembering again to adjust the query text).

-- Part 2:
-- If the above indicates poor performance (i.e. many active MyTable queries), 
-- use the following to clear bad plan from cache.
declare @count int = 0;
declare @planhandle varbinary(64)
 
select 
    @count = count(1), 
    @planhandle = max(plan_handle) -- any arbitrary plan
from sys.dm_exec_requests er
cross apply sys.dm_exec_sql_text(er.sql_handle) st
where st.text like '%TOP (@ResultLimit)%'
    and st.text not like '%80B82FDD-0297-4057-9AEB-D811F3B5C7DB%' -- filter self
 
-- this query recompiles the one identified plan.
if (@count > 10)
    --working
    dbcc freeproccache (@planhandle);
GO

Hopefully You’ve Bought Yourself Time

The query is recompiled with a better plan and you can sit back (for now). Your query is still vulnerable to bad plans and you’ve got to fix that, but you’ve got time.

In practice I’ve used this script maybe three or four times successfully. Your mileage may vary.

How Risky Is This Script?

Well, What’s the worst thing that could happen?

To be honest, I can’t think of much that could go wrong other than it doesn’t solve your problem. In the worst case scenario:

  • your analysis was wrong, the plan is recompiled and still performs poorly.
  • Or perhaps somehow the script picks the wrong plan to recompile, that’s not bad news either, it’s just news. I certainly can’t say the same for the other techniques that were available before version 2008.
  • Maybe expectations were set incorrectly. You propose this change, It doesn’t fix anything and now it may look like your guessing (This isn’t a technical issue, but it’s something to keep in mind).

By the way, don’t take my word for it. You need to tell others that the risk is small and you have to back this claim up without my help. So you should understand what’s going on here.

 

7 Comments »

  1. […] Sharpshooting Query Plans From The Cache – Here’s a script from Michael J. Swart (Blog|Twitter) to help you immediately treat a query with a poor execution plan using laser like precision. […]

    Pingback by Something for the Weekend - SQL Server Links 31/08/12 — August 31, 2012 @ 6:02 am

  2. Thanks, that’s very useful, it can help with testing performance of specific queries as well, without affecting other people.

    I believe there is also a fourth option for the scenario you mentioned – plan guides.

    Comment by Glen Turnbull — September 1, 2012 @ 8:39 pm

  3. Well put Glen,

    Plan guides. I had forgotten about those. Plan guides are a way of “hardening” SQL Server (By hardening I mean avoiding changing plans, changing statistics, etc… even in the face of changing data). Plan guides could be a good way to “pin” a query plan to buy time. Personally I don’t have much experience with it…

    Hmmm…. Maybe it’s something for me to practice until either I know it well enough to recommend as a solution, or know it well enough to know not to recommend it (for some reason :-).

    Comment by Michael J. Swart — September 4, 2012 @ 9:06 am

  4. In addition to “hardening” the plan, you can also use plan guides in SQL 2005 to remove a single plan from the cache, or prevent it from ever being cached in the first place by setting @hints = N’OPTION(RECOMPILE)’ when calling sp_create_plan_guide. If you just want to remove the cached plan, you can then immediately drop the plan guide with sp_control_plan_guide.

    Comment by Daniel Brown — September 4, 2012 @ 9:28 am

  5. Thank you for this post.
    Wouldn’t recompiling the sproc by running “ALTER PROC procName…” achieve a similar outcome?

    Comment by Marios Philippopoulos — September 4, 2012 @ 8:00 pm

  6. Hi Marios,

    That would certainly do it. The procedure is recompiled and any processes that happen to be running the procedure continue to run without blocking. I really wish I mentioned that in my article.

    In my case specifically, I’m not dealing with procedures but with SQL coming directly from the application and so that option isn’t open to me. That’s why I mentioned that “a fix will take time to recompile, QA and deploy”.

    It’s interesting that the command is called FREEPROCCACHE when it should be called FREEPLANCACHE.

    Comment by Michael J. Swart — September 5, 2012 @ 8:22 am

  7. Just finished this article by Kimberly Tripp called Stored procedures, recompilation and .NetRocks and she reminded me of one thing:

    Say the scenario I described was different in that the query wasn’t an ad hoc query coming from an application, but was a query coming from a stored procedure. Well in that scenario, it’s dead simple to sharpshoot that plan out of the cache: Execute

    sp_recompile procname

    Comment by Michael J. Swart — October 24, 2012 @ 8:31 am

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress