1. About this Table Joins Article
“Table Joins” are useful for bringing data together from different tables based on their database relations. First, we will see how the join operates between tables. Then we will test the Order of Execution when Join and where condition both lies. Finally, we will move our exploration to Importance of the Join order. It is a good idea to know about database table relationships before reading this article.
2. Script For The Sample Tables
Before you begin, download the attached script. The downloaded script has T-SQL for creating three tables and data for this article. You should also run the Northwnd script as some example here uses the NorthWnd DB. Once you downloaded the script CreateObject.zip run the script in the NorthWnd Db. Below is the content of three tables created by the Script:
We will use these tables to perform the Table Joins. These tables are just for demo purpose only and so we may not have proper table relationship in terms of Primary key and Foreign keys. OK, Let us move on.
3. Cartesian Product of Table
Usually, Table Join will be done between two tables based on the key columns in them. These keys together creates the database table relationship. For Example, DeptId in the employee table and DeptId in the Department table can make a relationship between these two tables.
The below example is joining two tables using no key columns. Here, TableA and TableB are clubbed together to form the whole result-set based on “Cartesian Product”. The Cartesian product will take a single record in the first table and ties it with all the records in the second table. Then picks up the second records in the first table and attaches it with all the records the second table and this goes on till the end of the record in the first table.
The result of the Cartesian Join is shown below:
4. Mapping Column of Joining Tables
When joining two tables to avoid the bulk number of records that results as shown in the previous example, we should choose a join column from both the tables. The example below joins Table_A and Table_B based on the ID column. Since we set up column mapping between two tables, we will cut down huge redundant records when compared to Cartesian Product. Below is the result of the Table Join:
Note, SQL Server returns Row Number 1 and Row number 5 because of the Table Join. These rows satisfy the mapping condition A.Id = B.Id. The below picture shows the column mapping in Red Box. You can see that mapping produces the sub-set of the Cartesian Join.
5. Multiple Table Join
In the previous example, two tables took part in the Table Join. To join multiple table, we should consider the previous join result as one table and then join it with the third one. This way we can join multiple numbers of tables. One should consider whatever joint so far as a first table and join it with the new one. The below example shows Table Join involving three tables:
First Table_A joins with Table_B, which is nothing but our example in the previous section. We consider the join result of A and B as a single table say AB. Now, this AB is joint with the Table_Trans forming the join of three tables. The below picture shows this:
6. Types of Join
In SQL, there are various types of joins available based on the way we join columns on two different tables. The types we will discuss here are:
- Full Join
- Inner Join
- Left outer Join
- Right outer Join
What we saw in the previous two sections are the Inner Table joins. If we join the same table, we call it as Self join. Let us see an example for the join types in next coming examples. Before we go into those examples, remember that the result computed so for in the join is seen as LEFT and the new table coming to join the existing result is RIGHT. This is useful when we are joining multiple tables.
7. Full Join
A full Table Join differs from the Cartesian product. Cartesian product will get all the row combination between the two joining tables. Full join takes the matching columns plus all table rows from the left table that does not match the right and all tables rows in the right that does not match the left. It applies null for unmatched row on the other end when doing so. The below example shows the full join between Table_A and Table_C:
- In the above picture, the Blue Row is the matching row on both the table.
- Second row (Green First, red next) is the unmatched one. Row exists on the Left table and null substituted for all the columns in the Right.
- Third row (Red First, Green next) is also the unmatched one. Row exists on the Right side table, null returned for the left one.
Now let we Look at the From Clause. We took Table_A first and join it with Table_C. Here, the result set computed so for always termed as Left side of join and the new table going to be joint is treated as a Right side of the join. So, when say Left table, in our query it is Table_A and Table_C is Right Table.
8. Left Join Example
Left Join makes sure to take all the rows on the Left table. SQL Server does it by placing the null entries for the table, joining on the right side when there is no matching row in it.
In the above example, Id value of 2 in the Left table does not exist on the right side table Table_C.Id. But we still got the row ‘ 2,BBB’ from the Table_A by placing the null entries for the right side table. The above picture shows this in Green and Red boxes. Also note that when SQL is processing, it takes the rows for the Table_A first (So the rows the Table_A is LEFT) then joins it with the Table_C (Right side). It does not matter whether we provide Table Join condition as A.Id = C.Id or C.Id = A.Id.
9. Right Join Example
It is the reverse of the left join. It implies take all the rows on the right side of the table by placing the null on the left table for unmatched rows. Below is the example for it:
Blue Box : Matched rows.
Green : Row exits on the right side table Table_B and match (Based on Id column) not available on the left
Red : Null placement for the columns of Table_A
10. Inner Join Example
We already saw an example for it in section 4 of this article. In Inner Join, SQL Server returns only the matching rows. Inner join returns same result even when we interchange the joining tables.
11. Self Join Employee & Manager
Joining the table with the same table is called the Self Join. To explain Self Join, let us go to the table in the Northwnd database (Uploaded with this article). Have a look at the columns in the employee table. The EmployeeId is the Primary key column and each row in the table belongs to a single member. The ReportsTo column refers some other row in the same table stating that referred row is the manager for the referring employee row. But the referred manager row is also an employee likely having a valid entry on its ReportsTo column. So, in the NorthWnd database this relationship ends up with a hierarchical reporting structure. Also, this makes Self Join possible.
Now let us look at the Self Join Example below:
In the above Self Join, the row pointed by ReportTo column is a Manager. So, the table on the left-hand side of the join represents Employee and table on the Right-hand side denotes Manager (But also an Employee). When we pick the EmployeeName column from the Left side of the join, it is name of the Employee and if we pick it from the right side table; it is Manager Name.
12. Sequence of Execution
When the query combines the outer join with the inner join, the execution sequence is important. If we have only inner Join, the execution sequence is not important as they will provide the same result. Well, what are we talking about?
Let us say we have a query which has both Inner Join and Outer Join. Also, let us assume that we have a where clause that filters the records. In addition, we also assume that the join column does not take part in the where clause. Now, which operation SQL Server performs first? We have two options:
- Apply the where clause record filter first then perform the Table Join
- Apply the Table Join first then perform the Where Clause filter
The above two option returns same result when all the joins involved are inner joins. But the result may differ when we have at least one outer join. OK. SQL chose the second option. Let us examine and prove this. The Example and the result is below:
How the Sequence differs is shown below:
Option 1 [SQL Server 2005 does not do this]
Option 2 [SQL Server 2005 Performs Join First]
So, we have to remember that SQL Server 2005 first performs the Table Join and then applies the where clause on the Join Result. This is done especially when Table Join involves at least one outer join.
13. Order of The Joins
Like the Operation sequence, the Order of the Join also important when we want to mix the Inner Table Joins with Outer Joins. Again, if the entire join involved between the tables are Inner Joins, then the join order is not important. But it matters when we mix the inner and outer Table joins.
What is Order of the Table Join? If my query joins three tables like [X inner Y] Left Z, the order here is Inner Join performed first, and then the Left Join. OK. Let us go back to the NorthWnd Database again. The result you want to achieve is below:
“Get all customer names whether they have ordered products or not. Also, list the quantity of order placed by the customer if they actually placed at-least one order”.
13.1 Outer Join Then Inner Join
Look at the Query and result below: [Outer Join then Inner Join]
From the above query, we can see the Order Of Join as mentioned below:
We can see a Right Join between Orders and Customers. SQL first queries the Orders table (As it appears first) and treats the result as Left. Then it queries the Customers table and treats the result-set as Right. Finally, from both the result-set, SQL Server performs the Right Join. This means, SQL Server ensures us it will not lose any rows on the Right side. To put it differently, it will not lose any rows from the Customers table. So we will get all customers, including those who does not place any orders. Since there are some non-mapped orders, SQL will fill null entries in it. Now the outcome this join is available for the next join and also now it is treated as Left.
The above returned result (Left side) is joint with the Order Details table. SQL knows it already has the Left result-set so it will query the table Order Details to have the Right part of the join. Finally, an Inner join is performed between Left (Result of Right Outer Join) and Right (Order Details) based on the order id. But note that we have two null entries for the ordered column in the Left. So the Inner Table Join just skips those records. Thus, we got 2155 rows skipping the two customers who does not place any orders. This is not the result we want. Read the Underlined text at the top of this section.
13.2 Inner Join Then Outer Join
Now look at the Query and Result below: [Inner Join then Outer Join]
Here, first, Inner Join between Orders and Order Details is performed using OrderId as mapped column. This result of Left side join is then Right joint against the Customers table.
Now let us analyse how this is giving the result we want.
The inner Table join between Order and Order Details brings all the matching records based on the order id. Note, we are not losing any order id here by null values. Then by keeping this already brought result on the left, customers table is queried and kept in the Right side. With both tables on hand, we perform the right join between them using customer id as mapped column. Now, we get all the customers, including the two for which we don’t have any matching records on the Left side table.
So, we have to keep in mind that Table Join order is important when we mix the Inner Join with an Outer Join.
14. Other Way of Achieving The Same Result
When I had a chat with one of my office friend (Mr. Varaprasad), he told that the result we are expecting could be achieved without using the Right Join. How? That is the question I asked him. He told, “Crystal reports does that man! I will show you”. Well, now this section is added to this article based on what I got from him.
OK. Now let us see how this works and gives the expected result of not losing any customers. Note that the Rule remains same, whatever computed so for is Left and the Joining table is on the Right.
- SQL first queries the table Customers and keeps it as the Left side result.
- It reads the open parenthesis and queries the table Orders and keeps it as Left again. Why? SQL Server Says “OK, I know that I should not join this table now and the right side table is not yet ready because of the encountered Open Parenthesis. So I kept this also on the Left side. Now, I need two right side tables to complete the join”.
- Now the Order Details table is queried and kept as a Right side of join as a pairing Left Side is already available.
- A join between Order and Order Details is performed based on the Order Id. The resultant records are treated as right because the Customer table is already queried and kept in the Left. Now the Left Join between the Left and Right side of the result-sets brings all customers because of the final Left Outer Join & Customers being the Left side of the join.
Note: The scripts for creating the Demo tables and NorthWnd database is available as a download from Google Drive.