Feb 18 2014

The CASE Statement and Performance

In case you don’t know, this query:

UPDATE dbo.Test1
SET C2 = 2
WHERE C1 LIKE '%33%';

Will run quite a bit slower than this query:

UPDATE dbo.Test1
SET C2 = 1
WHERE C1 LIKE '333%';

Or this one:

UPDATE dbo.Test1
SET C2 = 1
WHERE C1 = '333';

That’s because the second two queries have arguments in the filter criteria that allow SQL Server to use the statistics in an index to look for specific matching values and then use the balanced tree, B-Tree, of the index to retrieve specific rows. The argument in the first query requires a full scan against the index because there is no way to know what values might match or any path through the index to simply retrieve them.

But, what if we do this:

UPDATE dbo.test1
SET C2 =
CASE
WHEN C1 LIKE '19%' THEN 3
WHEN C1 LIKE '25%' THEN 2
WHEN C1 LIKE '37%' THEN 1
END;

We’re avoiding that nasty wild card search, right? So the optimizer should just be able to immediately find those values and retrieve them… Whoa! Hold up there pardner. Let’s set up a full test:

IF (SELECT  OBJECT_ID('Test1')
   ) IS NOT NULL 
    DROP TABLE dbo.Test1; 
GO
CREATE TABLE dbo.Test1 (C1 VARCHAR(50),C2 INT, C3 INT IDENTITY);

SELECT TOP 1500
        IDENTITY( INT,1,1 ) AS n
INTO    #Nums
FROM    Master.dbo.SysColumns sC1,
        Master.dbo.SysColumns sC2;
        
INSERT  INTO dbo.Test1
        (C1,C2)
        SELECT  n, n
        FROM    #Nums;
        
DROP TABLE #Nums;

CREATE CLUSTERED INDEX i1 ON dbo.Test1 (C1) ;


UPDATE dbo.test1
SET C2 =
CASE
WHEN C1 LIKE '%42%' THEN 3
WHEN C1 LIKE '%24%' THEN 2
WHEN C1 LIKE '%36%' THEN 1
END

DBCC FREEPROCCACHE()

UPDATE dbo.test1
SET C2 =
CASE
WHEN C1 LIKE '19%' THEN 33
WHEN C1 LIKE '25%' THEN 222
WHEN C1 LIKE '37%' THEN 11
WHEN C1 LIKE '22%' THEN 5
END

I added the extra CASE evaluation in the second query in order to get a different query hash value.

Here are the execution plans from the two queries:

TwoPlans

They’re pretty identical. Well, except for me forcing a difference in the hash values, they’re identical except for the details in the Compute Scalar operator. So what’s going on? Shouldn’t that second query use the index to retrieve the values? After all, it avoided that nasty comparison operator, right? Well, yes, but… we introduced a function on the columns. What function you ask? The CASE statement itself.

This means you can’t use a CASE statement in this manner because it does result in bypassing the index and statistics in the same way as using functions against the columns do.

14 Comments

  • By Trever, February 18, 2014 @ 10:36 am

    So what would be a more efficient way to do this?

  • By Mark, February 18, 2014 @ 11:50 am

    Im guessing the faster way would be to have queries for each like and then union all them together.

  • By Grant Fritchey, February 18, 2014 @ 11:51 am

    To a degree I’m comparing apples & oranges since the top examples are updating where a certain value is equivalent and the lower methods are updating all values based on the CASE statement resolutions. But… they are actually the same thing too. It’s just one is a batch update, requiring the scan, no matter what, and the other is targeted.

    Your choices, eat the scan, or switch to targeted updates. No real choice. Personally, I’d do the targeted updates.

  • By Mark, February 18, 2014 @ 11:55 am

    EDIT above comment: in this case it was an update, my above comment was thinking of a select. You could do separate updates for each condition with a where clause, I guess.

  • By Saeid Hasani, February 19, 2014 @ 3:42 am

    Thanks Grant!
    This is an invaluable post. Just test with one condition in CASE statement, and result is the same!

    UPDATE dbo.test1
    SET C2 =
    CASE
    WHEN C1 LIKE ‘19%’ THEN 33
    –WHEN C1 LIKE ‘25%’ THEN 222
    –WHEN C1 LIKE ‘37%’ THEN 11
    –WHEN C1 LIKE ‘22%’ THEN 5
    END

  • By Grant Fritchey, February 19, 2014 @ 5:53 am

    Without the WHERE clause, it’s doing a scan, no matter what.

  • By Gianluca Sartori, February 19, 2014 @ 11:52 am

    Nice one, Grant.
    Here’s a way to achieve the same results with a couple of seeks instead of a scan:

    WITH somevalues AS (
    SELECT *
    FROM (
    VALUES
    (‘19%’,33),
    (‘25%’,222),
    (‘37%’,11),
    (‘22%’,5)
    ) v (v1, v2)
    )
    UPDATE t
    SET c2 = v.v2
    FROM dbo.test1 AS t
    INNER JOIN somevalues AS v
    ON t.c1 LIKE v.v1;

  • By Grant Fritchey, February 19, 2014 @ 12:21 pm

    That’s pretty neat. Nicely done.

  • By Dwain Camps, February 24, 2014 @ 6:33 am

    I’m not sure exactly why, but at 1M rows in the test harness, this seems to use about 10-15% less CPU and runs in nearly the same amount of time (maybe 10% more) as Spaghetti DBA’s:

    WITH somevalues AS (
    SELECT *
    FROM (
    VALUES
    (‘19%’,33),
    (‘25%’,222),
    (‘37%’,11),
    (‘22%’,5)
    ) v (v1, v2)
    )
    UPDATE t
    SET c2 = v2
    FROM dbo.test1 t
    CROSS APPLY (SELECT v2 FROM somevalues WHERE c1 LIKE v1) b
    WHERE EXISTS (SELECT v2 FROM somevalues WHERE c1 LIKE v1);

    This certainly was a learning experience.

  • By Grant Fritchey, February 24, 2014 @ 8:14 am

    Nice. So it’s about applying a mechanism that lets SQL Server filter as opposed to a straight scan. While the CASE statement works in a way that is similar to a WHERE clause, it’s not a WHERE clause and therefore, scans. Getting an actual filtering mechanism in place, performance shoots through the roof.

    Nice work.

  • By Brunno Lira, February 24, 2014 @ 9:39 am

    In my tests the following is much better to read the command and the execution plan, plus the less time and io.

    UPDATE dbo.test1
    SET C2 =
    CASE
    WHEN C1 LIKE ‘19%’ THEN 33
    WHEN C1 LIKE ‘25%’ THEN 222
    WHEN C1 LIKE ‘37%’ THEN 11
    WHEN C1 LIKE ‘22%’ THEN 5
    END
    where C1 LIKE ‘19%’ or C1 LIKE ‘25%’ or C1 LIKE ‘37%’ or C1 LIKE ‘22%’

  • By Grant Fritchey, February 24, 2014 @ 9:51 am

    I’m sure it’ll work better. It has a WHERE clause. I’ll bet tossing the entire construct and going with a series of individual updates will be even better (don’t take that bet, it just will be).

  • By Dwain Camps, March 26, 2014 @ 9:06 am

    Of course, there is always this approach that may help:
    http://dwaincsql.com/2014/03/26/getting-an-index-seek-to-speed-up-like-string-searches/

Other Links to this Post

  1. (SFTW) SQL Server Links 21/02/14 • John Sansom — February 21, 2014 @ 4:25 am

RSS feed for comments on this post. TrackBack URI

Leave a comment