Michael J. Swart

March 12, 2019

Lonely Tables in SQL Server

Filed under: SQL Scripts,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 12:00 pm

Takeaway: I provide a script that looks at the procedure cache and reports tables that are never joined to other tables.

Recently, I’ve been working hard to reduce our use of SQL Server as much as possible. In other words, I’ve been doing some spring cleaning. I pick up a table in my hands and I look at it. If it doesn’t spark joy then I drop it.

If only it were that easy. That’s not quite the process I’m using. The specific goals I’m chasing are about reducing cost. I’m moving data to cheaper data stores when it makes sense.

So let’s get tidying. But where do I start?

Getting rid of SQL Server tables should accomplish a couple of things. First, it should “move the needle”. If my goal is cost, then the tables I choose to remove should reduce my hardware or licensing costs in a tangible way. Second, dropping the table should be achievable without 10 years of effort. So I want to focus on “achievability” for a bit.

Achievable

What’s achievable? I want to identify tables whose extraction from the database won’t take years. Large monolithic systems can have a lot of dependencies to unravel.

So what tables in the database have the fewest dependencies? How do I tell without a trustworthy data model? Is it the ones with the fewest foreign keys (in or out)? Maybe, but foreign keys aren’t always defined properly or they can be missing altogether.
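
For what it’s worth, the foreign key count itself is easy enough to get. Here is a minimal sketch, assuming only the standard catalog views sys.tables, sys.schemas and sys.foreign_keys. It can only see relationships that were actually declared, which is exactly the weakness described above.

```sql
-- Sketch: count the declared foreign keys touching each table (in or out).
-- Relationships that were never declared as foreign keys are invisible here.
SELECT s.name AS [schema_name],
       t.name AS table_name,
       COUNT(fk.object_id) AS foreign_key_count  -- 0 when no foreign key touches the table
  FROM sys.tables t
  JOIN sys.schemas s
       ON s.schema_id = t.schema_id
  LEFT JOIN sys.foreign_keys fk
       ON t.object_id IN (fk.parent_object_id, fk.referenced_object_id)
 GROUP BY s.name, t.name
 ORDER BY foreign_key_count ASC, s.name, t.name;
```

Tables at the top of that list have the fewest declared dependencies, but a zero only means “nothing declared”, not “nothing related”.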

My thought is that if two tables are joined together in some query, then they’re related or connected in some fashion. So that’s my idea. I can look at the procedure cache of a database in production to see where the connections are. And when I know that, I can figure out what tables are not connected.

Lonely Tables

This script gives me a set of tables that aren’t joined to any other table in any query in cache:

use [your db name here];
 
-- Gather the cached statements whose plans were compiled in the context of this database.
SELECT qs.query_hash,
       qs.plan_handle,
       cast(null as xml) as query_plan
  INTO #myplans
  FROM sys.dm_exec_query_stats qs
 CROSS APPLY sys.dm_exec_plan_attributes(qs.plan_handle) pa
 WHERE pa.attribute = 'dbid'
   AND pa.value = db_id();
 
-- Keep one arbitrary row per query_hash so each query shape is examined once.
WITH duplicate_queries AS
(
  SELECT ROW_NUMBER() OVER (PARTITION BY query_hash ORDER BY (SELECT 1)) n
  FROM #myplans
)
DELETE duplicate_queries
 WHERE n > 1;
 
-- Fetch the showplan XML for the plans that remain.
UPDATE #myplans
   SET query_plan = qp.query_plan
  FROM #myplans mp
 CROSS APPLY sys.dm_exec_query_plan(mp.plan_handle) qp;
 
-- Pull the schema and table name out of every IndexScan operator in each plan.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
my_cte AS 
(
    SELECT q.query_hash,
           obj.value('(@Schema)[1]', 'sysname') AS [schema_name],
           obj.value('(@Table)[1]', 'sysname') AS table_name
      FROM #myplans q
     CROSS APPLY q.query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') as nodes(stmt)
     CROSS APPLY stmt.nodes('.//IndexScan/Object') AS index_object(obj)
)
SELECT query_hash, [schema_name], table_name
  INTO #myExecutions
  FROM my_cte
 WHERE [schema_name] IS NOT NULL
   AND OBJECT_ID([schema_name] + '.' + table_name) IN (SELECT object_id FROM sys.tables)
 GROUP BY query_hash, [schema_name], table_name;
 
-- A table is lonely when it never appears in a query alongside another table.
WITH multi_table_queries AS
(
    SELECT query_hash
      FROM #myExecutions
     GROUP BY query_hash
    HAVING COUNT(*) > 1
),
lonely_tables as
(
    SELECT [schema_name], table_name
      FROM #myExecutions
    EXCEPT
    SELECT [schema_name], table_name
      FROM #myExecutions
     WHERE query_hash IN (SELECT query_hash FROM multi_table_queries)
)
SELECT l.*, ps.row_count
  FROM lonely_tables l
  JOIN sys.dm_db_partition_stats ps
       ON OBJECT_ID(l.[schema_name] + '.' + l.table_name) = ps.object_id
 WHERE ps.index_id in (0,1)
 ORDER BY ps.row_count DESC;
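
One thing to keep in mind when reading the results: the script can only report tables that show up in at least one cached plan, and it only looks at IndexScan operators. The following is a hedged companion sketch, not part of the script above. It assumes it runs in the same session while #myExecutions still exists, and it lists the user tables the script never saw at all.

```sql
-- Sketch: user tables that appear in none of the plans examined above.
-- Assumes #myExecutions was just populated by the script in this session.
SELECT s.name AS [schema_name],
       t.name AS table_name
  FROM sys.tables t
  JOIN sys.schemas s
       ON s.schema_id = t.schema_id
 WHERE NOT EXISTS (
         SELECT *
           FROM #myExecutions e
          WHERE OBJECT_ID(e.[schema_name] + '.' + e.table_name) = t.object_id
       )
 ORDER BY s.name, t.name;
```

A table on this list is not necessarily unused. The plans that touch it may never have been cached, or they may have been evicted before the script ran.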

Caveats

So many caveats.
There are so many things that take away from the accuracy and utility of this script that I hesitated to even publish it.
Here’s the way I used the script. The list of tables was something that helped me begin an investigation. For me, I didn’t use it to give answers, but to generate questions. For example, taking each table in the list, I asked: “How hard would it be to get rid of table X and what would that save us?” I found it useful to consider those questions. Your mileage of course will vary.

5 Comments »

  1. […] Michael J. Swart wants to help you find single tables in your area: […]

    Pingback by Finding Singleton Tables – Curated SQL — March 13, 2019 @ 8:11 am

  2. Did you thank the tables for their service before dropping them?

    Comment by DBAMa — March 25, 2019 @ 2:45 pm

  3. @DBAMa Hahaha! Of course!

    Comment by Michael J. Swart — March 27, 2019 @ 9:27 am

  4. Michael, I tried this out. Didn’t find anything I needed to fix, but the ones it returned would be worth looking at if you didn’t know the db. Thanks for posting it!

    Comment by Andy Warren — April 10, 2019 @ 8:12 am

  5. That’s cool Andy. I found the same thing. It’s good to verify my understanding of a schema I already know fairly well.
    Have you tried out the related script yet? The one at https://michaeljswart.com/2019/04/finding-tables-with-few-dependencies/

    Comment by Michael J. Swart — April 10, 2019 @ 8:59 am
