A View Is Not A Table

12Apr 2016 by Grant Fritchey 8 Comments

Blog post #4 in support of Tim Fordâ€™s (b|t) #iwanttohelp, #entrylevel

In SQL Server, in the T-SQL you use to query it, a view looks just like a table (I’m using the AdventureWorks2014 database for all these examples):

SELECT  *
FROM    Production.vProductAndDescription AS vpad;

SELECT  vpad.Name,
        vpad.Description,
        vpmi.Instructions
FROM    Production.vProductAndDescription AS vpad
JOIN    Production.Product AS p
        ON p.ProductID = vpad.ProductID
JOIN    Production.vProductModelInstructions AS vpmi
        ON vpmi.ProductModelID = p.ProductModelID
WHERE   vpad.ProductID = 891
        AND vpad.CultureID = 'fr';

The above query actually combines two views and a table. This is what is commonly referred to as a “code smell”. A code smell is a coding practice that works, but that can lead to problems. In this case, we’re talking about performance problems. The performance problems when using views to join to tables and other views as if they were real tables comes about because a standard view is not a table. Its a query. For example, the second view introduced, vPorductModelInstructions looks like this:

ALTER VIEW [Production].[vProductModelInstructions] 
AS 
SELECT 
    [ProductModelID] 
    ,[Name] 
    ,[Instructions].value(N'declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; 
        (/root/text())[1]', 'nvarchar(max)') AS [Instructions] 
    ,[MfgInstructions].ref.value('@LocationID[1]', 'int') AS [LocationID] 
    ,[MfgInstructions].ref.value('@SetupHours[1]', 'decimal(9, 4)') AS [SetupHours] 
    ,[MfgInstructions].ref.value('@MachineHours[1]', 'decimal(9, 4)') AS [MachineHours] 
    ,[MfgInstructions].ref.value('@LaborHours[1]', 'decimal(9, 4)') AS [LaborHours] 
    ,[MfgInstructions].ref.value('@LotSize[1]', 'int') AS [LotSize] 
    ,[Steps].ref.value('string(.)[1]', 'nvarchar(1024)') AS [Step] 
    ,[rowguid] 
    ,[ModifiedDate]
FROM [Production].[ProductModel] 
CROSS APPLY [Instructions].nodes(N'declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; 
    /root/Location') MfgInstructions(ref)
CROSS APPLY [MfgInstructions].ref.nodes('declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; 
    step') Steps(ref);

GO

That’s a query against the XML stored in the ProductModel table. The view was created to mask the complexity of the necessary XPath code, while providing a mechanism for retrieving the data from the XML. This is a common use of views. However, when we then treat the view as a table, and join it to other tables and views, we present a problem for the optimizer. Because a view is not a table, but is instead a query, the optimizer has to resolve this query in combination with any other views or tables to arrive at an execution plan for the whole combined mess. While the optimizer is very good at what it does, because of the complexity caused by the additional unnecessary processing to figure out which parts of the view is not needed to satisfy the query, it can make poor choices. That can result in poor performance.

If I were to rewrite the query, it would look something like this:

SELECT  p.Name,
        pd.Description,
        pm.Instructions.value(N'declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; 
        (/root/text())[1]', 'nvarchar(max)') AS Instructions
FROM    Production.Product AS p
JOIN    Production.ProductModelProductDescriptionCulture AS pmpdc
        ON pmpdc.ProductModelID = p.ProductModelID
JOIN    Production.ProductDescription AS pd
        ON pd.ProductDescriptionID = pmpdc.ProductDescriptionID
JOIN    Production.ProductModel AS pm
        ON pm.ProductModelID = p.ProductModelID
CROSS APPLY Instructions.nodes(N'declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; 
    /root/Location') MfgInstructions (ref)
CROSS APPLY MfgInstructions.ref.nodes('declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; 
    step') Steps (ref)
WHERE   p.ProductID = 891
        AND pmpdc.CultureID = 'fr';

That’s a lot more complex than the query we had above that only referenced three objects and had only two JOIN operations. However, if you capture the I/O and the execution time for these queries, you’ll see a different story.

I used two methods for measuring performance. I used SET STATISTICS IO and SET STATISTICS TIME to ON for the queries for one set of tests. For another set I used Extended Events. Consistently the execution time for the query with the view was around 110ms. The query that didn’t reference any views was around 37ms. The reads were 155 for the query with views, but only 109 for the query without. If you look at the individual table I/O, you can start to see where the differences come from. These are the results from the query with the views:

Table ‘ProductDescription’. Scan count 0, logical reads 56
Table ‘ProductModelProductDescriptionCulture’. Scan count 28, logical reads 56
Table ‘xml_index_nodes_418100530_256001’. Scan count 13, logical reads 37
Table ‘ProductModel’. Scan count 0, logical reads 2
Table ‘Product’. Scan count 0, logical reads 4

These are the results for the query without the view:

Table ‘ProductDescription’. Scan count 0, logical reads 56
Table ‘xml_index_nodes_418100530_256001’. Scan count 13, logical reads 37
Table ‘ProductModelProductDescriptionCulture’. Scan count 6, logical reads 12
Table ‘ProductModel’. Scan count 0, logical reads 2
Table ‘Product’. Scan count 0, logical reads 2

You can see the differences in both ProductModelProductDescriptionCulture and Product. This is because of the differences in the execution plans caused by differences in the choices made by the optimizer.

A standard view is not a table. There is such a thing as a materialized view or indexed view, which is a table. That’s not what we’re talking about here. While you can use a view as if it wasÂ a table, don’t mistake it for a table. A view is just a mask in front of a query. It can’t be used like an object so that you avoid rewriting the same JOIN. That will lead to issues for the optimizer as this simple set of examples showed. Don’t shy away from using views, just understand what their real behavior is. A view is a query, not a table.

8 thoughts on “A View Is Not A Table”

Greg Moore

I’m going to pedantically disagree, but for what I think is for an important reason.

Codd was fairly clear that everything in a RDBMS should be represented by as a table. This was his information rule: “All information in an RDBMS is represented logically in just one way – by values in tables.”

So, strictly speaking a view IS a table. Just as is the result of a query.

That said, you’re right. Treating a view like a physically realized table can be the recipe for a disaster if you don’t understand the view.

But I think a common mistake a lot of folks make coming into RDBMSs in general is they start to think that query sets, tables, and views somehow operate differently or should be treated differently. This can lead to bad code and design. Ultimately they’re just all forms of a table, virtual, realized, or otherwise.

So I agree on one hand, I disagree on the other. On the gripping hand… I’ll leave that reference for the sci-fi geeks.

April 14, 2016 at 9:57 pm
Grant Fritchey

Ha!

First, thanks for the feedback.

I realize I’m violating Codd for stating this the way I state it. It’s semi-intentional.

More importantly, I was just having a conversation with the spawn (my son) that he needed to read “The Mote in God’s Eye”. This just reinforced that. Thanks!

April 14, 2016 at 10:27 pm
Views Aren’t Tables – Curated SQL

[…] Grant Fritchey dives into the problem with nested views: […]

April 15, 2016 at 8:01 am
Dony

There are two differences in your queries: 1. In the VIEW-query you select on vpad.ProductID while on the TABLE-query you select on p.ProductID, that might lead to different choices from the optimizer; 2. In the VIEW-query, the second CROSS APPLY ensures that only instructions with at least a single step are shown (it is not an OUTER APPLY), while this extra condition has been ommited in the TABLE-query. The later emphasizes on the difficulty of ‘stripping’ queries without changing the results. I’ve this kind of minor change quite often, it may long stay unnoticed until some day the first instruction without steps is added to this database. You were talking about code smells, but these kind of subtle changes are very hard to track down; I strongly advice to never remove an INNER JOIN or a CROSS APPLY from a query unless you can be absolutely sure (for example from a checked FOREIGN KEY constraint) that it will not have any effect on the query results. Just my two cents …

April 19, 2016 at 4:02 am
Dony

Sorry Grant, due to scrolling I missed the second CROSS APPLY, you did not miss it. Please remove my post, I was in error.

April 19, 2016 at 4:09 am
Grant Fritchey

Hi Dony,

Nah, I think it’s still worth mentioning. It is a valid point that just because you toss the view, if you toss it incorrectly, that will lead to other problems. There ain’t no such thing as a free lunch.

April 19, 2016 at 6:45 am
Stan Geiger

I am going to disagree with Greg. A view is not a table, maybe in logical terms, but what we are really talking about is bad query design and performance. From that standpoint you treat it like a query and have to understand that you are going to possibly incur performance issues when using them in joins. They can be a good thing and a bad thing. I have had to deal with people who fall in love with materialized views which have caused all kinds of issues.

We can talk semantically about them being logical tables but they don’t work that way at a physical level. I am personally not fan of having a bunch code with multi-view joins in them. IMO, the best use cases were around creating them for applications to use to simplify complex SQL. Its when you start mixing and matching as Grant mentioned that you run into issues.

April 19, 2016 at 10:08 am
Aaron

I’d like to argue.. that views are convienent.. but the optimization route for views should be to replace them with raw tables (that are properly indexed).

building a view that does some XML scrubbing MAKES SENSE to me.. with the idea that you can tehn take that DATA and push it directly into a table.

Database Development without views.. is barbaric, in my opinion.

April 20, 2016 at 3:39 pm