You could be looking at an execution plan on a query and see this message: Warning, No Join Predicate. With a very few exceptions (very few), that’s not a warning, like “Mind the gap” or “Your Mileage May Vary” or “Never fight a land war in Asia.” No. It’s a screaming shout saying “BY ALL THE GODS IN VALHALA, YOU DON’T HAVE ANY JOIN PREDICATES IN THIS QUERY!” Seriously, that’s exactly what it says.
But, you might be asking yourself, that’s what it says, but what does it mean? Please allow me to explain. The tables in your system, whether it’s a classic BI star schema, or a highly normalized OLTP system, or even (shudder) ORM objects, are related to one another. Usually they’re related through the use of primary and foreign keys. The primary key uniquely identifies a row, like a particular value in a dimension table. This then is related to some number of rows in another table, like in a fact table, through the foreign key. The foreign key acts to protect the integrity of the data, ensuring that if you insert something that’s supposed to relate to another row, then that other row really exists and if you try to delete that other row, the foreign key will remind you that, hey, that row has things that depend on it existing in another table. All of this can be quite complicated or quite simple. Heck, you may be using an ORM system and don’t have a single foreign key in site.
But, when you go to retrieve data from two tables that are related in a TSQL query, you need to define that relationship in the TSQL itself. SQL Server 2008 relies on what was defined as the ANSI standard back in 1992. But, if you learned your SQL coding skills back in the day or on Oracle, you may be used to simply listing tables and then defining their relationships through the WHERE clause. Just a side note, unless you learned your SQL skills in Oracle, if you were born AFTER the 1992 standard was enacted, you don’t have an excuse. Anyway, the old way that code looked like was this:
SELECT... FROM TableA AS a, TableB as b, TableC as c WHERE a.MyID = b.MyID AND b.YourID *= c.YourID AND....
What you have there is an inner join, or in the old parlance, an equi-join because of the use of the equals sign, between TableA and TableB and an left outer join, defined through the use of the ‘*’, between TableB and TableC. Any other criteria that defined the WHERE clause then followed.
Flat out, in SQL Server 2008, you can’t do this any more. It will not support that style of outer join. The new fangled method, remember, new as of 1992, is this:
SELECT ... FROM TableA as a INNER JOIN TableB as b ON a.MyID = b.MyID LEFT JOIN TableC as c ON b.YourID = c.YourID WHERE....
This forces you to separate the relations between the tables as defined by the JOIN’s and the filtering methods supplied through the WHERE clause. And yes, it’s a lot more complicated than I’m making it out to be, but I’m trying to establish the ground rules here. Where was I? Oh yeah, missing join criteria.
What happens is, someone that’s used to writing things in the ANSI ’89 syntax can either forget to define the JOIN criteria or accidently remove it while editing or something. Regardless of how it happened, you’ve ended up with what is known as a cartesian product, shortest possible explanation, everything joined on everything else. For example, here’s a real query against the AdvenureWorks2008 database:
SELECT * FROM Sales.SalesOrderHeader AS soh ,Sales.SalesOrderDetail AS sod ,Production.Product AS p WHERE soh.SalesOrderID = 43659;
That query does not define join criteria between any of the tables involved. This results in the following estimated execution plan (estimated because I got sick of waiting for the query to finish):
If you look at the final Nested Loop operation right before the SELECT statement operator, you’ll notice there is a little exclamation point inside a yellow triangle. This is an indication of a warning in your query. You can hover over the operator to get the tool tip, which looks like this:
You can see at the bottom the warning that no join predicate is in use. If you constructed the query appropriately it would look more like this:
SELECT * FROM Sales.SalesOrderHeader AS soh JOIN Sales.SalesOrderDetail AS sod ON soh.SalesOrderID = sod.SalesOrderID JOIN Production.Product AS p ON sod.ProductId = p.ProductID WHERE soh.SalesOrderID = 43659;
Which, instead of returning 800,000 rows before I interrupted it returned only 12 rows and had a very clean execution plan:
I would never consider “No Join Predicate” as a warning. It’s a blaring klaxon of a problem and you need to do something about it, right away.