Novell Home

To JOIN or to LEFT JOIN

From Developer Community

Contents

Introduction

Relational databases are built specifically to allow relationships to be created between different sets of data. The way to discover the type of relationship is to JOIN data from two separate tables in a database. The two most common ways to join tables are to perform an INNER JOIN or a LEFT JOIN. Why would you use one instead of the other? To determine that, let's first look at how the two JOIN syntaxes work and then discuss when to use each of them.


INNER JOIN

Example 1
SELECT * FROM A INNER JOIN B
Table A
value
1
2
x
Table B
value
3
4
=
Result Set
A.value B.value
1 3
1 4
2 3
2 4

First, an inner join is often called a simple join or just a join. This is the most common way to join data between tables. In general, an inner join is a cross product of all tables that are joined together. For example, if you join table A with values 1 and 2 and table B with values 3 and 4, you end up with a result set similar to that shown in example 1. You can think of a result set as a newly created table that was based on some relationship between other tables. In example 1, the relationship is every possible combination of values from tables A and B, which is the most basic join.

This type of join is not very useful unless you can define a very specific relationship. It is very seldom that you want the entire cross product of two tables. The SQL ON clause allows you to narrow the cross project result to just the values you are interested in. If you are only interested in results where the value from the first table is 1, you could narrow your result set by stating something like ON A.value=1. If you wanted to see where the two tables intersect, you could use something like ON A.value=B.value. Example 2 shows a few different ON clauses and their results.

Example 2
SELECT * FROM A INNER JOIN B ON A.value=1
Table A
value
1
2
x
Table B
value
3
4
=
Result Set
A.value B.value
1 3
1 4
2 3
2 4
=>
Result Set
A.value B.value
1 3
1 4

SELECT * FROM A INNER JOIN B ON B.value=3
Table A
value
1
2
x
Table B
value
3
4
=
Result Set
A.value B.value
1 3
1 4
2 3
2 4
=>
Result Set
A.value B.value
1 3
2 3

SELECT * FROM A INNER JOIN B ON A.value=B.value
Table A
value
1
2
x
Table B
value
3
4
=
Result Set
A.value B.value
1 3
1 4
2 3
2 4
=>
Result Set
A.value B.value

SELECT * FROM A INNER JOIN B ON A.value=B.value
Table A
value
1
2
x
Table B
value
1
4
=
Result Set
A.value B.value
1 1
1 4
2 1
2 4
=>
Result Set
A.value B.value
1 1

The preceding examples show how a database engine determines what values should be returned, based on the relationship you define. The database then creates a cross product of the tables you specify and then narrows the result set based on your specifications in the ON clause. Of couse, good database systems optimize this process and employ short cuts to find the answer much quicker; but the basic process is the same.

LEFT JOIN

Think of a left join as an inner join with one added stipulation: every value in the left table (the one on the left of the words left join) are in the result set at least once. If there is no corresponding value in the right table for a row in the left table (based on the defined relationship), the row is added to the result set with the values from the right table set to null. For example, in the last select statement from Example 2 above, there were no results returned. If we change the join type to be a left join, results like those in Example 3 occur.

Example 3
SELECT * FROM A LEFT JOIN B ON A.value=B.value
Table A
value
1
2
x
Table B
value
3
4
=
Result Set
A.value B.value
1 3
1 4
2 3
2 4
=>
Result Set
A.value B.value
1 null
2 null

In essence, what happens is an inner join. If there is a value from the left table that is not represented in the result set, it is appended to the result set. In the preceding example, the inner join creates a result set where none of the left values are represented. Therefore, both of the left values are appended to the result set with null values, which represent the values from the table on the right of the join. Example 4 shows a few more queries with their result sets.

Example 4
SELECT * FROM A LEFT JOIN B ON A.value=1
Table A
value
1
2
x
Table B
value
3
4
=
Result Set
A.value B.value
1 3
1 4
2 3
2 4
=>
Result Set
A.value B.value
1 3
1 4
2 null

SELECT * FROM A LEFT JOIN B ON B.value=3
Table A
value
1
2
x
Table B
value
3
4
=
Result Set
A.value B.value
1 3
1 4
2 3
2 4
=>
Result Set
A.value B.value
1 3
2 3

SELECT * FROM A LEFT JOIN B ON A.value=1 AND B.value=3
Table A
value
1
2
x
Table B
value
3
4
=
Result Set
A.value B.value
1 3
1 4
2 3
2 4
=>
Result Set
A.value B.value
1 3
2 null


Which type of join should I use?

It depends completely on the type of information you are trying to discover. A general rule is: if you want to know about only those who have done something, you want an inner join. If you want to know about everyone, regardless of whether they have done something or not, you want a left join. For example, suppose you have a forum. Many people are registered to use your forum, but only some of those people have actually posted information to your forums. You have two tables, one with user information and another with posting information. If you want to know who has posted to your forums, join the two tables with an inner join, which strips out of your result set anyone who has not posted. If you want to know how many times every user has posted, use a left join. If a user hasn't made any posts, a null appears next to their name in the result set.

Novell® Making IT Work As One

© 2008 Novell, Inc. All Rights Reserved.