Semi-Join & Anti-Join

Semi-Join & Anti-Join
Image by Danni Liu adapted from malerapaso/ Canva

In my previous blog, we covered the common joins you'll likely come across. I also mentioned that there are more, like anti and semi-joins. While these two types of joins aren't as frequently seen as the ones shared earlier, it has been mentioned in several articles that the two are important to learn quite early as we'll inevitably need them.

Both types of joins have a LEFT and RIGHT counterpart. I'm aware of the two types of joins but found it quite challenging to comprehend. I've learned about it in online courses and read several articles on it, but I just didn't get it. Well, guess what? After returning from a month of traversing Europe, it finally clicked for me!

So, feeling rather smug, I would like to share this newfound understanding of anti-join and semi-join with you.😊

Now that I understand, I'm baffled as to why I didn't get it in the first place. It's pretty straightforward. Perhaps I wasn't focused at the time. Anyhoo, that aside, let's start with semi-join.

Semi-Join

Semi-join is a type of join where the resulting table contains only the columns from the first table. It is similar to an inner join. There are two key differences:

  1. Only columns from Table 1 are returned (hence semi-join, meaning half)
    Blog-22--Semi-and-Inner-differences-1

  2. In the case of duplicates in Table 2, the inner join will return duplicates, whereas the semi-join will only return the first match.
    Blog-22--Semi-and-Inner-differences-2

Semi-Join comes in LEFT and RIGHT flavours, just like many other joins. With the left semi-join, your table on the left is the main table, and the returned values are predicated on what's in the left table. For the right semi-join, the table on the right is your main table. Most people don't use right because it can be achieved by using left semi-join just by switching the order of the tables.

How do you write a semi-join?

Semi-join is written using EXISTS or IN.
Blog-22--Semi-Join-code

Anti-Join

How do you find values from one table that are absent in another? Anti-join is designed precisely to answer this question. You use it when you want to see all those records that don't match between the two joining tables.

Anti-joins help answer business questions like:

  • customers did not place an order
  • salespeople who did not close a deal
    Blog-22--Anti-Join

Here is another visual illustration of anti-join. As mentioned in the beginning, it also comes in LEFT and RIGHT.

Image by Andreas Martinson sourced on Towards Data Science 

A left anti-join returns rows in the left table that have no matching rows in the right table.
A right anti-join returns rows in the right table that have no matching rows in the left table.
Again, just like the RIGHT semi-join, most people don't use it for the same reason explained.

How do you write an anti-join?

Unlike other joins, it doesn't have its own syntax. To perform anti-joins, a combination of other SQL queries is required. We would need to use a combination of LEFT JOIN and WHERE to find all the values from Table_1 that are absent in Table_2.

Blog-22--Anti-Join-Code

That's semi-join and anti-join in a nutshell. Both types of joins are a tad quirky because they don't have the same built-in syntax as common joins, such as INNER JOIN, LEFT JOIN, etc.
They are valuable techniques for filtering one's table's records on the records of another table.