SQL Joins

SQL Joins
Image by Brand X Pictures/ Canva

Data are stored in multiple tables in real life. They simply aren't one jumbo table with everything you need. I wish such a thing existed! We often need to link the tables together through a related column. So I'm about to cover one of those essential fundamentals-Joins. You need to learn and be familiar with it if you are in the long haul for analytics. Join is a SQL clause. You'll come across this not just in the context of SQL but also in ETL (Extract, Transform, Load) tools.

In this blog, we will go over what a SQL join is and the types of joins.

What is SQL Join?

A SQL Join is an operation that allows you to combine records from two or more tables into a meaningful result set based on related column(s). Typically you'll specify two types of conditions for the join. One type is which columns from each table to use in the join. The second type is to refine what's being returned where you would combine it with logical operators such as =, <, or > x value to return only rows that meet specified conditions.

Types of Joins

There are four common types of joins you'll encounter.

Blog-21--Types-of-Join

Inner Join

Inner join retrieves all records from multiple tables that satisfy the specified join condition. It is a popular form of join and is the default join. If we omit the INNER keyword with the JOIN query, the output will be the same.
Blog-21--Inner-Join-Image

Here is inner join syntax.
Blog-21--Inner-Join-Syntax

Self Join

A self join is a regular type of join, but the table is joined with itself. You may wonder why you would want to join a table to itself. It's a pretty common practice to join a table to itself. Often used to find out duplicate data, correlate or find out data relation.

A classic example is where you have an employee table containing a SupervisorID column that points to the employee who is the current employee's boss.

Blog-21--Self-Join-Example

Cross Join

Cross join is a join that returns the Cartesian product of rows from the tables in the join. What does that mean? Cartesian product is a mathematical term in set theory. Essentially it combines each row from table 1 with each row from table 2. So, it takes all the rows present in the first table and multiplies them by all rows present in the second table.

If you're still confused, check out this illustration.
Blog-21--Cross-Join-Tables

Outer Join

Outer joins are joins that return matched values and unmatched values from either or both tables. There are three types:

  • LEFT OUTER JOIN
  • RIGHT OUTER JOIN
  • FULL OUTER JOIN

Left Outer Join

Left outer join returns all the records from the left table and matching rows from the right table. NULL is returned when no matching record is found in the right table. The keyword OUTER in SQL is optional, often, it is omitted, and people just specify LEFT JOIN.
Blog-21--Left-outer-join-image

Blog-21--LEFT-OUTER-JOIN-Syntax

Right Outer Join

Right outer join is the opposite of left outer join. Right join returns all the records from the right table and matching rows from the left table. NULL is returned when no matching record is found in the left table. Again, the keyword OUTER in SQL is optional, often it is omitted, and people just specify RIGHT JOIN.

People rarely ever use RIGHT JOIN. If you switch your tables around, you could use the left join.

Blog-21--Right-Outer-Join-Image

Full Outer Join

Full outer joins return a result that includes all rows from both tables. The columns of the right table return NULL when no matching records are found in the left table. Likewise, if no matching records are located in the right table, the left table column returns NULL.
Blog-21--Full-outer-join-image

Blog-21--FULL-OUTER-JOIN-syntax

That's it, folks. Now you know the four common joins. There are a few more joins I've come across, but I have not entirely understood it myself. I'll leave this for a future blog after I've wrapped my head around it.