Trigger Happy Rotating Header Image

Scripts

Query Tuning and Index Review June 2010

Here are the scripts and slides for my session on Query Tuning and Index Reviews.  I hope you enjoy(ed) the session.

QueryTuningAndIndexReview

Bullet Dodging Scripts Prairie Devcon 2010

Here are the scripts for my bullet-dodging and spoon-bending session at Prairie DevCon Regina 2010.  Enjoy!

BulletSpoon

Do your T-SQL variables have the right values?

For some of us who are bewildered by even the code we write ourselves, this might be a question we ask ourselves in more circumstances than we’d like to admit, but what I’m talking about here is assigning values to a variable with a SELECT statement, and expecting the value to be null if the query didn’t find anything.  This can be particularly nefarious in cursor loops as I will show.  First, to set up the scenario, run the following script.  It creates two temp tables – one with 10 rows and one with 5:

IF OBJECT_ID('tempdb..#one') IS NOT NULL DROP TABLE #one
IF OBJECT_ID('tempdb..#two') IS NOT NULL DROP TABLE #two
CREATE TABLE #one (number INT IDENTITY (1,1))
CREATE TABLE #two (number INT IDENTITY (1,1))
GO
INSERT #one DEFAULT VALUES
GO 10
INSERT #two DEFAULT VALUES
GO 5
SELECT * FROM #one
SELECT * FROM #two

number
-----------
1
2
3
4
5
6
7
8
9
10
(10 row(s) affected) number
-----------
1
2
3
4
5
(5 row(s) affected)

Now we will use a cursor loop to go through the first (10-record) table and look for matching values in the second table – we should obviously only find 5 matching values – and display our findings.  This is what it *should* look like:

DECLARE @CR CURSOR, @Number INT, @Found INT
SET
@CR = CURSOR LOCAL FAST_FORWARD FOR
SELECT
number FROM #one
OPEN @CR
FETCH NEXT FROM @CR INTO @Number
WHILE @@FETCH_STATUS = 0
BEGIN
   RAISERROR
('Looking for number %i...',0,0,@Number)
  
SET @Found =(SELECT number FROM #two WHERE number = @Number)
  
IF @Found IS NOT NULL
      
RAISERROR('Found: %i',0,0,@Found)
  
FETCH NEXT FROM @CR INTO @Number
END

Looking for number 1...
Found: 1
Looking
for number 2...
Found: 2
Looking
for number 3...
Found: 3
Looking
for number 4...
Found: 4
Looking
for number 5...
Found: 5
Looking
for number 6...
Looking
for number 7...
Looking
for number 8...
Looking
for number 9...
Looking
for number 10...

Of course, it only found the five (by the way, please don’t take this as a recommendation to use cursors for this type of thing – I’m illustrating an unrelated point here…) How about the following, though – there is a subtle difference – do you see it right away?

DECLARE @CR CURSOR, @Number int, @Found int
SET
@CR = CURSOR LOCAL FAST_FORWARD FOR
SELECT
number FROM #one
OPEN @CR
FETCH NEXT FROM @CR INTO @Number
WHILE @@FETCH_STATUS = 0
BEGIN
   RAISERROR
('Looking for number %i...',0,0,@Number)
  
SELECT @Found = number FROM #two WHERE number = @Number
  
IF @Found IS NOT NULL
      
RAISERROR('Found: %i',0,0,@Found)
  
FETCH NEXT FROM @CR INTO @Number
END

Looking for number 1...
Found: 1
Looking
for number 2...
Found: 2
Looking
for number 3...
Found: 3
Looking
for number 4...
Found: 4
Looking
for number 5...
Found: 5
Looking
for number 6...
Found: 5
Looking
for number 7...
Found: 5
Looking
for number 8...
Found: 5
Looking
for number 9...
Found: 5
Looking
for number 10...
Found: 5

So “SELECT @Var =” and “SET @Var =” are the same thing, right? Well, it appears not.  Using the SELECT method, the variable is not resetting to NULL.  There is a simple explanation for this.  The SET method is setting the variable to the output of a query – that query did not return anything which is a missing value – the very definition of NULL – but the SELECT method is changing the variable as part of the query.  Since the query returns no rows, the variable assignment never takes place so the value of the variable does not change.

You may want the variable to change to NULL and you may not – so choose your method accordingly, but I have seen code that uses the SELECT method followed by a test for NULL clearly expecting the variable to be nulled if no record is found.

Exercise:  What happens if you use an aggregate like MAX() with the SELECT method – do you know what it will do before you try it?  Try it and see if you’re right (hint:  aggregate queries always return at least one row).

Be careful out there – the devil is in the details.


Query Tuning Scripts Prairie DevCon 2010

Here are the scripts from the session.  I hope you enjoyed it.  Sorry that I got a bit carried away and didn’t quite finish.  At the end of the demo script you’ll see that we were going to actually optimize a query…  But the tools, and what to look for are the most important piece anyway.  Feel free to post a comment or send me mail if you have any questions.  Enjoy the rest of the conference.

IntroQueryTuning

How to run a server-side trace

This is my first “how to” post.  In stark contrast to the “for dummies” and idiot’s books, my posts will assume a little basic knowledge.  For example, when I say “run profiler” below, I assume you don’t need a screenshot showing you where in the start menu it probably is on your computer.  If you need a little more detail, that doesn’t make you a dummy or an idiot, just post a comment and I’ll do my best to help out.

I assume you already know how to run a profiler trace.  Note that when I say “profiler trace” vs. a “Server-side trace” I’m talking about a trace that is run from within profiler.  Both types run traces on the server, but a profiler trace captures the output and displays it on your screen while a server-side trace sends the output directly to an output device (usually a file on the server) with no interaction from the client while it is running.  Linchi Shea’s article illustrates why you might want to do a server-side trace – this article is not about the “why” but I will only mention that my rule of thumb is that if the server you are tracing is a “production” server or if other users are relying on their connections to it in any way (which is pretty much the definition of a production server, but it could be a UAT server or some other well-used testing server) I don’t go near it with a profiler trace (well you’ll see later that I go *near* it, but you know what I mean).

In broad strokes, we will be generating a script for our trace from profiler, running that script on our server, then looking at a couple of ways to analyze the output.

Here are the steps:

  • Run profiler and configure the trace as you would normally.  As you probably know, it’s a good idea to add some filters, limit the number of events you are tracing and limit the columns to what you are looking for – this post isn’t about “the what” either.  What to trace is dependent on what you’re looking for – I hope you know that if you’re messing with profiler in the first place.
  • Run the trace in profiler (yes, you heard me) but stop it right away.  If you feel you absolutely can’t even do this to your server, you won’t be able to generate the script from profiler, you’ll have to go through the tedious process of writing the script yourself.  We need to do this in order to get access to the menu item we’ll need in the next step.
  • Now that you have access to the File menu, go File | Export | Script Trace Definition and choose the appropriate option for your server after that.  You will be asked to save a script file somewhere
  • Open the script file in Management Studio.  You will need to make one change – You will see the string “InsertFileNameHere” (probably on line 19).  The comment above this line has useful information – read it.  I will only add that, depending on the nature of your server and your trace, your DBA might be quite particular about where files like this go.  If you’re not sure, run the following line to get the default backup directory and put it there – this should be relatively safe.

EXEC master.dbo.xp_instance_regread N'hkey_local_machine',N'software\microsoft\mssqlserver\mssqlserver',N'backupdirectory'

  • Execute the script – take note of the number it gives you at the end for TraceID – you will need it to shut down the trace.  At this point your trace is running – if you shut down your computer and go on holidays, it will continue running on the server until something happens – it is a server side trace, after all.
  • You can view the results of the trace at any time in profiler by just double-clicking on the TRC file you chose in the script – even while it is running (note that in Windows 7 you probably have to explicitly give yourself access to the file).  You can close the results and re-open them to get fresh entries.  This is not the same as a profiler trace because it’s just showing you what was in the file when you double-clicked it – profiler is not actively retrieving updates from your server.  This approach has most of the advantages of running profiler traces without the disadvantages.
  • When you need to stop the trace, you’ll need the following code:

EXEC sp_trace_setstatus @TraceID, 0
EXEC sp_trace_setstatus @TraceID, 2

  • The first line stops the trace, the second line deletes the trace definition from the server.  Note that you need to put the TraceID you made not of above in here – if you didn’t make note of it, you’ll have to figure it out by looking at the results of this:

SELECT * FROM fn_trace_getinfo(NULL)

  • Hint: it’s probably the one with the highest number
  • You can get the trace results into SQL server with the following command:

SELECT * FROM fn_trace_gettable('InsertFileNameHere',1)

That’s about it – there are many options with rotating files once they are of a certain size and stuff like that – read up on it and have fun – and don’t forget to stop your trace 🙂

Explanation for TSQL Challenge 14

First I want to mention that the challenges on databasechallenges.com are a blast.  I recommend trying some of them to hone your TSQL skills.  I haven’t had as much time as I would have liked to try as many as I would have liked, but I submitted solutions to a couple of them, and one of them was chosen as an accepted solution.

Jacob Sebastian has requested that I write an explanation for my submission, so here goes…

For a statement of the challenge itself, go here: http://beyondrelational.com/blogs/tc/archive/2009/09/28/tsql-challenge-14-identify-the-longest-sequence-of-characters-in-a-string.aspx

For my posted solution, go here: http://databasechallenges.com/SQLServer/TSQL/Challenge14/Submissions/Mike_DeFehr

In a nutshell, you’re given some strings and asked to identify and quantify repeating characters within the string.  As with many TSQL problems (and particularly with problems on the challenges site) the solution begins with a table of numbers or a tally table.  A table of numbers is just that:  a table with one row for each of the integers from 1 to as many as you need.  Typically, one might have such a table with potentially millions of rows in it somewhere in your SQL server – the rules of the challenge do not allow you to create one, but they do allow you to use one of the system tables (spt_values) as a table of numbers.  You get a table of a couple of thousand numbers with the following query (which is plenty for this challenge):

    SELECT number

    FROM master..spt_values

    WHERE number > 0

        AND type = ‘P’

 

 

One of the things we can do with a table of numbers is to effectively turn a string into an array of characters.  By feeding the number from the table into the SUBSTRING function, this can be accomplished quite easily:

    DECLARE @String varchar(100)

    SET @String = ‘abcdefghijk’

 

    SELECT      number AS ArrayIndex,

                SUBSTRING(@String,number,1) AS [Character]

    FROM master..spt_values

    WHERE number > 0

          AND type = ‘P’

          AND number <= LEN(@String)

    ArrayIndex  Character

    ———– ———

    1           a

    2           b

    3           c

    4           d

    5           e

    6           f

    7           g

    8           h

    9           i

    10          j

    11          k

 

    (11 row(s) affected)         

       

  

Note that if you like a zero-based array, a few subtle changes are required, but it’s no problem:

    DECLARE @String varchar(100)

    SET @String = ‘abcdefghijk’

 

    SELECT  number AS ArrayIndex,

            SUBSTRING(@String,number+1,1) AS [Character]

    FROM master..spt_values

    WHERE number >= 0

        AND type = ‘P’

        AND number < LEN(@String)

    ArrayIndex  Character

    ———– ———

    0           a

    1           b

    2           c

    3           d

    4           e

    5           f

    6           g

    7           h

    8           i

    9           j

    10          k

 

    (11 row(s) affected)

Now that we have our string in array form, we can partition a new row number by Character.  This puts all like characters together whether they are in sequence or not.  This is where the solution starts to depart from traditional iterative thinking.  We will not be “going through” the string looking at the characters and seeing if they are in sequence.  We are establishing properties about our set of characters that we will use to form our solution.  Here is what it looks like with the Partitioned number.  I’ll use a slightly more interesting string in this example.

    DECLARE @String varchar(100)

    SET @String = ‘AFBB04BFFF5’

 

    SELECT    Number AS ArrayIndex,

            SUBSTRING(@String,Number,1) AS [Character],

            ROW_NUMBER() OVER (

                    PARTITION BY SUBSTRING(@String,Number,1)

                    ORDER BY Number) AS PartitionedNumber

    FROM master..spt_values

    WHERE Number <= LEN(@String)

        AND number > 0

        AND [type] = ‘P’

   

    ArrayIndex  Character PartitionedNumber

    ———– ——— ——————–

    5           0         1

    6           4         1

    11          5         1

    1           A         1

    3           B         1

    4           B         2

    7           B         3

    2           F         1

    8           F         2

    9           F         3

    10          F         4

 

    (11 row(s) affected)

 

 

Notice that when you partition a row number, it starts over every time the character changes.  Our string is “out of order” now, but that is fine – this is one of the challenges when starting to think in sets – order is not important.  We have preserved the property, or the idea of string order in the ArrayIndex column.  The result set happened to come out this way because the optimizer found it most efficient to change the order to satisfy the requirements of the ROW_NUMBER operation, but it could come out in any order for the purposes of our solution.  This order *is* convenient, however, to visualize the next step.  Notice that the ArrayIndex and The PartitionedNumber columns track together when there is a sequence of characters.  They can be completely different numbers, but when a sequence starts happening, they go up together.  When the same character exists, but is out of sequence, the pattern is broken (as what happens with the ‘B’).  This is a technique I adapted from Itzik Ben-Gan’s solution for finding islands (unbroken ranges of values in a set).  If you get a chance to see his “tips and tricks” demos (or any of his presentations for that matter), there is much gold there.  Taking the difference between the two columns not only illustrates this better, but gives us something we can use:

    DECLARE @String varchar(100)

    SET @String = ‘AFBB04BFFF5’

 

    SELECT    Number AS ArrayIndex,

            SUBSTRING(@String,Number,1) AS [Character],

            ROW_NUMBER() OVER (

                    PARTITION BY SUBSTRING(@String,Number,1)

                    ORDER BY Number) AS PartitionedNumber,

            number – ROW_NUMBER() OVER (

                    PARTITION BY SUBSTRING(@String,Number,1)

                    ORDER BY Number) AS RNDiff

    FROM master..spt_values

    WHERE Number <= LEN(@String)

        AND number > 0

        AND [type] = ‘P’

   

    ArrayIndex  Character PartitionedNumber    RNDiff

    ———– ——— ——————– ——————–

    5           0         1                    4

    6           4         1                    5

    11          5         1                    10

    1           A         1                    0

    3           B         1                    2

    4           B         2                    2

    7           B         3                    4

    2           F         1                    1

    8           F         2                    6

    9           F         3                    6

    10          F         4                    6

 

    (11 row(s) affected)

 

We have 2 sequences in our string:  2 B’s and 3 F’s – these are highlighted by identical amounts in the new RNDiff column.  In fact, within any given character, if the RNDiff amounts are the same, they represent a sequence – so we can apply a group by.  We will make our existing query into a derived table and feed it into a group by query:

    DECLARE @String varchar(100)

    SET @String = ‘AFBB04BFFF5’

 

    SELECT    [Character],

            MIN(ArrayIndex) AS Pos,

            COUNT(*) AS [Len]

    FROM (    SELECT    Number AS ArrayIndex,

                    SUBSTRING(@String,Number,1) AS [Character],

                    ROW_NUMBER() OVER (

                        PARTITION BY SUBSTRING(@String,Number,1)

                        ORDER BY Number) AS PartitionedNumber,

                    number – ROW_NUMBER() OVER (

                        PARTITION BY SUBSTRING(@String,Number,1)

                        ORDER BY Number) AS RNDiff

            FROM master..spt_values

            WHERE Number <= LEN(@String)

                AND number > 0

                AND [type] = ‘P’    ) a

    GROUP BY [Character], RNDiff

   

    Character Pos         Len

    ——— ———– ———–

    A         1           1

    F         2           1

    B         3           2

    0         5           1

    B         7           1

    4         6           1

    F         8           3

    5         11          1

 

    (8 row(s) affected)

 

What this is telling us is that we have 8 sequences of characters – most of which have a length of only one.  We’re only interested in sequences greater than one, so a HAVING clause will take care of that:

    DECLARE @String varchar(100)

    SET @String = ‘AFBB04BFFF5’

 

    SELECT    [Character],

            MIN(ArrayIndex) AS Pos,

            COUNT(*) AS [Len]

    FROM (    SELECT    Number AS ArrayIndex,

                    SUBSTRING(@String,Number,1) AS [Character],

                    ROW_NUMBER() OVER (

                        PARTITION BY SUBSTRING(@String,Number,1)

                        ORDER BY Number) AS PartitionedNumber,

                    number – ROW_NUMBER() OVER (

                        PARTITION BY SUBSTRING(@String,Number,1)

                        ORDER BY Number) AS RNDiff

            FROM master..spt_values

            WHERE Number <= LEN(@String)

                AND number > 0

                AND [type] = ‘P’    ) a

    GROUP BY [Character], RNDiff

   

    Character Pos         Len

    ——— ———– ———–

    A         1           1

    F         2           1

    B         3           2

    0         5           1

    B         7           1

    4         6           1

    F         8           3

    5         11          1

 

    (8 row(s) affected)

 

Now that this is starting to look suspiciously like the answer to our question – note that the minimum of the ArrayIndex column is the position of the start of the sequence and the count is the length of the sequence.  We’re ready to apply this technique to the data in the challenge.  Funny I should use the word “apply” because the cross apply operator is what we need here.  Cross apply will execute the query using each row of data as it’s input and will expand the number of rows in the result by the number of rows it returns – we see that this is exactly what is required by the expected output.  Replacing the @String variable with the data from the sample data table, here is the final result:

    DECLARE @t TABLE (Data VARCHAR(40) )

 

    INSERT @t (Data) SELECT ‘9992EDC6-D117-4DEE-B410-4E5FAE46AE97’

    INSERT @t (Data) SELECT ‘0BFC936B-BD9A-4C6A-AFB2-CF3F1752F8B1’

    INSERT @t (Data) SELECT ‘4A73E7EB-7777-4A04-9258-F1E75097977C’

    INSERT @t (Data) SELECT ‘5AAF477C-274D-400D-9067-035968F33B19’

    INSERT @t (Data) SELECT ‘725DA718-30D0-44A9-B36A-89F27CDFEEDE’

    INSERT @t (Data) SELECT ‘8083ED5A-D3B9-4694-BB04-F0B09C588888’

 

    SELECT    t.Data,

            c.[Char],

            c.[Pos],

            c.[Len]

    FROM @t t

        CROSS APPLY (SELECT    [Char],

                        MIN(RowNum) AS Pos,

                        COUNT(*) AS [Len]

                FROM    (    SELECT    Number AS RowNum,

                                SUBSTRING(t.Data,Number,1) AS [Char],

                                ROW_NUMBER() OVER (

                                    PARTITION BY SUBSTRING(t.Data,Number,1)

                                    ORDER BY Number) – Number AS RNDiff

                        FROM master..spt_values

                        WHERE Number <= LEN(t.Data)

                            AND number > 0

                            AND [type] = ‘P’

                    ) b

                GROUP BY [Char], RNDiff

                HAVING COUNT(*) > 1

                ) c

    ORDER BY MAX(c.[Len]) OVER (PARTITION BY t.Data) DESC, Data, Pos

   

    Data                                     Char Pos         Len

    —————————————- —- ———– ———–

    8083ED5A-D3B9-4694-BB04-F0B09C588888     B    20          2

    8083ED5A-D3B9-4694-BB04-F0B09C588888     8    32          5

    4A73E7EB-7777-4A04-9258-F1E75097977C     7    10          4

    4A73E7EB-7777-4A04-9258-F1E75097977C     7    34          2

    9992EDC6-D117-4DEE-B410-4E5FAE46AE97     9    1           3

    9992EDC6-D117-4DEE-B410-4E5FAE46AE97     1    11          2

    9992EDC6-D117-4DEE-B410-4E5FAE46AE97     E    17          2

    5AAF477C-274D-400D-9067-035968F33B19     A    2           2

    5AAF477C-274D-400D-9067-035968F33B19     7    6           2

    5AAF477C-274D-400D-9067-035968F33B19     0    16          2

    5AAF477C-274D-400D-9067-035968F33B19     3    32          2

    725DA718-30D0-44A9-B36A-89F27CDFEEDE     4    15          2

    725DA718-30D0-44A9-B36A-89F27CDFEEDE     E    33          2

 

    (13 row(s) affected)

Note that the final piece was to get the sorting right.  The challenge required that the string with the longest sequence be on top, but then it should be sorted by the order in which the sequences appear in the string.  This is accomplished by partitioning a maximum of the length of the sequences by Data.  The important thing to note here is that the OVER clause can be applied to any aggregate and does not require a group by since the value is simply repeated for each member of the group.

This completes the solution.  This was an excellent challenge because it required an interesting set-based solution to what is intuitively an iterative problem and made good use of the newer language elements (CROSS APPLY and the OVER clause applied to both an aggregate and ROW_NUMBER).

Enjoy and be sure to try the next TSQL challenge.

 

Winnipeg Code Camp 2010 Scripts

This post is for those who attended my session on query tuning at the Winnipeg Code Camp on Saturday, February 27.

I hope you enjoyed the session.  Attached is the demo script I was using along with a setup script that will install the database I was using.  Note that the setup script does some randomization, so the results will be similar, but not identical to what we saw in the session.  Don’t hesitate to let me know if you have any questions.

Intro Query Tuning Scripts