A Visual Explanation of SQL Joins

October 11, 2007

I thought Ligaya Turmelle's post on SQL joins was a great primer for novice developers. Since SQL joins appear to be set-based, the use of Venn diagrams to explain them seems, at first blush, to be a natural fit. However, like the commenters to her post, I found that the Venn diagrams didn't quite match the SQL join syntax reality in my testing.

I love the concept, though, so let's see if we can make it work. Assume we have the following two tables. Table A is on the left, and Table B is on the right. We'll populate them with four records each.

id name       id  name
-- ----       --  ----
1  Pirate     1   Rutabaga
2  Monkey     2   Pirate
3  Ninja      3   Darth Vader
4  Spaghetti  4   Ninja

Let's join these tables by the name field in a few different ways and see if we can get a conceptual match to those nifty Venn diagrams.

SELECT * FROM TableA
INNER JOIN TableB
ON TableA.name = TableB.name

id  name       id   name
--  ----       --   ----
1   Pirate     2    Pirate
3   Ninja      4    Ninja

Inner join produces only the set of records that match in both Table A and Table B.

Venn diagram of SQL inner join
SELECT * FROM TableA
FULL OUTER JOIN TableB
ON TableA.name = TableB.name

id    name       id    name
--    ----       --    ----
1     Pirate     2     Pirate
2     Monkey     null  null
3     Ninja      4     Ninja
4     Spaghetti  null  null
null  null       1     Rutabaga       
null  null       3     Darth Vader

Full outer join produces the set of all records in Table A and Table B, with matching records from both sides where available. If there is no match, the missing side will contain null.

Venn diagram of SQL cartesian join

SELECT * FROM TableA
LEFT OUTER JOIN TableB
ON TableA.name = TableB.name

id  name       id    name
--  ----       --    ----
1   Pirate     2     Pirate
2   Monkey     null  null
3   Ninja      4     Ninja
4   Spaghetti  null  null

Left outer join produces a complete set of records from Table A, with the matching records (where available) in Table B. If there is no match, the right side will contain null.

Venn diagram of SQL left join
SELECT * FROM TableA
LEFT OUTER JOIN TableB
ON TableA.name = TableB.name
WHERE TableB.id IS null

id  name       id     name
--  ----       --     ----
2   Monkey     null   null
4   Spaghetti  null   null

To produce the set of records only in Table A, but not in Table B, we perform the same left outer join, then exclude the records we don't want from the right side via a where clause.

join-left-outer.png
SELECT * FROM TableA
FULL OUTER JOIN TableB
ON TableA.name = TableB.name
WHERE TableA.id IS null 
OR TableB.id IS null

id    name       id    name
--    ----       --    ----
2     Monkey     null  null
4     Spaghetti  null  null
null  null       1     Rutabaga
null  null       3     Darth Vader

To produce the set of records unique to Table A and Table B, we perform the same full outer join, then exclude the records we don't want from both sides via a where clause.

join-outer.png

There's also a cartesian product or cross join, which as far as I can tell, can't be expressed as a Venn diagram:

SELECT * FROM TableA
CROSS JOIN TableB

This joins "everything to everything", resulting in 4 x 4 = 16 rows, far more than we had in the original sets. If you do the math, you can see why this is a very dangerous join to run against large tables.

Posted by Jeff Atwood
190 Comments

Years ago I had diagrams like this posted on my desk and I refered to it nearly every other day. Thank you for posting this as I am sure I will be refering to it often as I try to understand and explain to others the impacts of SQL statements

Ernest on March 11, 2008 11:35 AM

This has been extremely helpful for a project that I am currently working on. I am by no means a coding expert(not a novice either), but there is a lot of stuff I have yet to learn. The way this is all displayed with the use of venn diagrams has made it very easy to understand, thanks for the help.

MattG on March 19, 2008 11:43 AM

This page is so helpful!

I am constantly needing to build MySQL queries for my php and .net applications and I come back to this page time and time again to prevent me from having to think too hard!

Just that one less moment spent on these details when you're in the middle of solving big problems makes all the difference.

Thank you!

AndrewM on March 25, 2008 12:03 PM

Fine tutorial.....

dhiman on April 1, 2008 8:14 AM

Good work.

Nash on April 14, 2008 5:40 AM

Great job.

Nicholas on April 21, 2008 8:16 AM

Thanks a lots for visual diagram. Its really helps me.

Mit on May 23, 2008 10:30 AM

YEY good thing for studdy fir IPT thanks

shade101 on May 28, 2008 10:39 AM

Rofl

Jack Jones on June 2, 2008 7:03 AM

Pretty 1337 thx!

Jack Jones on June 2, 2008 7:04 AM

hi jeff,
Thanks a lot for this ven diagram explanation. i opened the website without any understanding on sql joins and after reading ur explanation, i've understood it so well. really very helpfull.
Thanks a lot!

gayathri on June 3, 2008 10:35 AM

Great explain, thanks a lot.

Isa on June 4, 2008 6:03 AM

Its nice way of explanation of SQL Joins .Too easy!

Jolly on June 9, 2008 3:18 AM

I hope u can post such visual explanation for stored procedures also.

Jolly on June 9, 2008 3:19 AM

Very very good!!!
I'm listen!

Jazzy on June 10, 2008 6:39 AM

Good Job,

Thanks

Shamil on June 11, 2008 4:35 AM

Great Work.. Could u explain in a similar fashion joins between more than 2 tables lets say 4 tables.. I want to undersatnd the various steps in which the sql engine joins more than 2 tables

Thanks in advance

Harsh on June 20, 2008 1:08 PM

Excellent way of exhibiting the concept.
Simple , easy to understand.

SivaAnanth on June 23, 2008 1:08 PM

Excelent

Abhishek on June 24, 2008 8:25 AM

EXCELLENT

Abhishek on June 24, 2008 8:27 AM

Great and very simply to understand!

Charlie on June 25, 2008 3:51 AM

Great Work, Im teacher, and Iwill use this, to explain de joins.

Dulcinea on June 29, 2008 4:15 AM

Explained very well. Helped me a lot

Pallavi on July 17, 2008 1:01 PM

It's really great article!!!

anabel on July 17, 2008 1:29 PM

Muy bueno este articulo !! Los grficos ayudan mucho a enterder.
Gracias !
Desde Argentina, Buenos Aires.

Juan on July 25, 2008 9:44 AM

adfgfdgdgreytrtuykhjkmhjnkj

shanthi on August 2, 2008 3:28 AM

Great and very useful query and join query.............

shanthi on August 2, 2008 3:30 AM

it was great explanation , i do like the way u explained using venndiag , superb understaning on subj after long time , it was refreshing , pls do keep posting ........thnx sumanth

Sumanth Kumar on August 12, 2008 1:10 PM

2 THUMBS UP for this work

Sandeep on August 14, 2008 2:35 AM

Thanks a lot for your effort. Now i am confident about joins i mean basic concepts.. this article is the best.

Bindu on September 5, 2008 2:06 AM

Thank you very much jeff. I was always confuused about these joins. Today I have got clear idea about inner ande outer joins.

Narendra Ojha on September 5, 2008 2:57 AM

Great work. Now i can visualize my big query's into circles. Good.

Ganesh on September 15, 2008 1:01 PM

#1085;#1080;#1096;#1090;#1103;#1082;

rassa on September 16, 2008 9:01 AM

Best of rest,

Vijay on October 15, 2008 7:43 AM

Your explanation on SQL Join helped me a lot to complete my project easily. I was struggling to find out an exact query to match my requirement.

Thanks a lot

anoop on October 26, 2008 7:52 AM

So what is the volumetric shape of a circle multiplied by a circle?

Nat on October 29, 2008 2:32 AM

Good and simple explanation of joins my fears of joins are erased
Thank you

pavan on November 5, 2008 4:24 AM

very informative.

sss on November 19, 2008 2:50 AM

Its a really interesting article.. Joins was really confusing all the time and u have presented it in a very simple manner.. i gothrough some books like silbershwats and all but they have given it in very complex manner.. and u are simply great....... Please prsent such wonderfull article again sir.. Thankyou very much.

jewel jose on December 3, 2008 1:54 AM

very useful article

Konstantin on December 12, 2008 7:34 AM

Very helpful...

Kamran on December 16, 2008 12:00 PM

Thanks Jeff. Very useful and simple explanation.

Benny Hill on December 28, 2008 1:45 AM

thanks

msn on January 3, 2009 9:01 AM

Khup Chaan

G on January 5, 2009 2:41 AM

Very elegant and easy explanation. congratulations, Very useful!

Daniel Suarez on January 7, 2009 4:14 AM

I'd say the post achieves the authors purpose. I thought to myself WTF is the difference between those bloody joins again? and find this, 15 seconds later I remember all of the CS lore I needed. I don't want to put words in the authors mouth (page?) but I doubt he was thinking of the diagrams in terms of their mathematical correctness to set theory when he put this together, simply trying to demonstrate the difference between the joins, which it achieved very well.

Once modification that might make you all happy, though. Just imagine an undrawn Z axis along which the seeming-circle of the diagram can expand to hold all of the duplicate values on each side making a sort of three dimensional blob. Then imagine this as the 2 dimensional representation, and it all works. Yay!

Matt on January 9, 2009 12:16 PM

Very useful. Thanks

leela on January 22, 2009 3:53 AM

Great post.
Thanks much.
http://slsecurity.blogspot.com/

winguard on February 3, 2009 10:37 AM

Impressive thinking!

Vibzy on February 14, 2009 4:44 AM

Jeff,

thanks. been looking for some indepth on joins.

jake on February 17, 2009 3:22 AM

thanks a lot :)

Caglar on April 2, 2009 8:58 AM

Well, as a (not-so-dumb) newbie to MySQL queries I was unclear about the JOIN statement and generally what is was all about. A quick Google brought me to this site.

Personally I think that it's an excellent top-level explanation of the general principles behind the Join and great for newcomers such as myself. Sure there are exceptions here and alternatives there, and lots of live learning hurdles to overcome in given situations, but this IS a good intoductory explanation nonetheless.

Pay no attention to those that clearly know it all and are quick to sound off with nothing but those tiresome criticisms and negative sentiment, but do nothing to contribute support for newcomers.

Excellent work Jeff, and thank you for your time on this :o)

Brendan on April 4, 2009 4:05 AM

Hi,

Tanks for the great article. The visual explanation is the best way to learn Joins. :D

Rodrigo Maia on April 6, 2009 8:39 AM

Thanks for great article and this selfless help.

Sachin on April 8, 2009 9:52 AM

Great article!!

Mauro on April 10, 2009 9:58 AM

Thats a good bit of work there. I found it hard to follow at first but got it in the end, very intresting.

http://something4free.net/free-xbox-360/how-to-get-a-free-xbox-360/

http://something4free.net/

Free Xbox 360 on April 27, 2009 1:51 PM

OMG thanks for this visual explanation folks, I never understood that. Angela from http://www.squidoo.com/regcuresoftwarereview

Angela on May 5, 2009 2:03 AM

Thanks for the article..It was very useful.

Harsh on June 17, 2009 10:37 AM

I finally understand! Thanks :D hehe

MacBook on June 21, 2009 12:45 PM

It took me a while to get my head around this but when I found this page everything made sense. Your layout example is great.
Thanks
Bob

Bob on June 23, 2009 6:18 AM

Hey, I've been coding in MS SQL at my job for about a year. I still have trouble wrapping my head around the joins sometimes but I have this site bookmarked and I probably visit it at least once or twice a week.

Thanks for the great diagrams!

Matt on June 25, 2009 10:22 AM

Looks very interesting. Thanks for sharing..
http://www.mpos.net/s/p3.asp
http://coolday.blog.com/
http://www.mpos.net/s/p4.asp

COOL on June 28, 2009 9:29 AM

Looks very interesting. Thanks for sharing..

http://www.mpos.net/s/p3.asp
http://coolday.blog.com/
http://www.mpos.net/s/p4.asp
http://fleafreesmart.com/

COOL on June 28, 2009 9:30 AM

Hi,
couldn't on express the outer join as an cartesian product of the discs A and B, which in this case would be a torus ??

Mathias on June 29, 2009 3:37 AM

The graphic for the cross join (or cartesian product) would be something like the old Games magazine logic puzzles (or similarly, the sheet in the Clue boardgame) with all options from set 1 on the horizontal, and all options from set two on the vertical. Each square in this grid is an item in the resultant table.

keith on June 29, 2009 11:57 AM

Hi Jef
I think your illustrations are really good, thanks for the information.

psp repairs on July 1, 2009 5:17 AM

couldn't on express the outer join as an cartesian product of the discs A and B, which in this case would be a torus ??

Printer Parts on July 13, 2009 2:08 AM

Pictures, examples, breakdown. Best way to explain it!

free iphone 3gs on July 20, 2009 7:56 AM

please explain

free ipod on July 29, 2009 8:08 AM

Oh the horror!

free ipod on July 29, 2009 8:08 AM

great stuff

free ipod on July 29, 2009 8:09 AM

some knowledge to understand, thank you, the author
http://www.ew09.com/

Botani on August 4, 2009 3:10 AM

BuggyFunBunny, are you capable of writing even a single comment that doesn't express your cantankerous political views? Or do you simply not realize that this is a technical blog and that nobody cares?

Andreas: I was under the impression that INNER JOINs use a hash match internally (in fact, I'm quite certain of this). CROSS JOINs obviously don't. Therefore I don't see how it's reasonable to say that an IJ is just a special case of the CJ. I suppose in bread-and-butter set theory, where there's no notion of a hash or an index, this could work, but this is the world of CS, where the clean academic solution is almost never the most practical one. The IJ you speak of is the naivest possible algorithm, only used when there are no indexes (and if you're designing databases without any indexes, well... stop designing databases).

This post is a pretty decent tutorial for green developers having to work with SQL, and it's something I might pass on if I felt that somebody else wasn't "getting it"; I do agree, however, that it omits most of the non-trivial cases. If every relationship was one-to-one, you wouldn't really need a database at all.

Aaron G on February 6, 2010 10:14 PM

Nice post. It explains the results gathered from simple joins quite well. It doesn't address *how* they work, but a nice visual on what you actually get.

Clinton P on February 6, 2010 10:14 PM

Ricardo-

I believe Oracle also has the plus (+) operator as well for joins which I have seen on one or two projects. Th INNER JOIN, OUTER JOIN SYNTAX appeared in ANSI SQL 92 (1992). Most everyone uses that syntax, but most database sytems support the older way as well.

Jon Raynor on February 6, 2010 10:14 PM

I'm happy that "monkey" and "spaghetti" made the list, but I really miss "more spaghetti". Maybe that's in another table.

Also fun are self-joins:
select * from TableA first inner join TableA second on
first.name = second.name where first.id != second.id

SQL Server's always supported joins via the where clause (where TableA.Name = TableB.Name), but the left and right join operator syntax ("*=" and "=*") has been deprecated in favor of the explicit join syntax. The "*=" and "=*" syntax doesn't specify join order and doesn't clarify what's in the where clause vs. a join operation.

Jon Galloway on February 6, 2010 10:14 PM

The arguments about how inner joins are implemented are all correct, and all wrong at the same time.

They are at times done by a hash, sometimes a nested loop, sometimes a cartesian product and weeding out the ones that do not fit. You will need to look a the query plan to know which one your database will use for this particular query, and it can even change over time for the same query as the statistical distribution of data changes. The database query planner tries to pick the best for the current situation. This is why it is important to keep the statistics up to date.

Sometimes, it even does something stupid, that you know is wrong because you know how the data is, and you know a better way to do it. This is what query hints are for, to force it to do a particular join a particular way.

This was, though, just a basic overview. It was just to show the result sets, and help developers understand SQL joins if they have not used them before. Details on the more advanced topics, like what I mentioned above are available (just search the net) but those who this article was aimed at are not yet ready for that level of complexity.

Grant Johnson on February 6, 2010 10:14 PM

Powerlord-

I've only seen the (+) in an Oracle (queries), but other RDMS system may support it (Although I cannot say for certain). I believe someone mentioned that (+) predated the ANSI syntax, so hopefully you should not encounter it. I beleive I witnessed it some stored procedures in a banking system.

At this point, any new code should use the ANSI 92 syntax as that seems to be the defacto standard at this point.

Jon Raynor on February 6, 2010 10:14 PM

Thanks man, you saved my day

Remon Georgy on February 6, 2010 10:14 PM

thanks, this is the best Joins explanation, the diagrams HELPS ALOT!
this solved my problem, thanks again

Farid Mohd Ismail on June 30, 2010 10:41 PM

There are sixteen possible Venn states with two variables, you forgot eleven of them ;-) Seriously, I wonder how many people have independently made the connection between SQL join and Venn. I also worked this out a while back, it being a expansion of the UNIX sort and uniq combinations I used to.

Given two lists, and given that each list has only unique entries inside itself:

sort a b b | uniq -u (gives only what is in a)
sort a b | uniq -d (gives what is in both a and b)
sort a a b | uniq -u (gives what is only in b)

I found this useful for simple lists like hostnames or IPs

Carnright on July 9, 2010 2:00 PM

Thank you so much for this explannation!!!!!! NOW I CAN FINALLY UNDERSTAND THIS!!!

Cotyrocksteady on July 18, 2010 6:16 PM

Thanks for this great visual explanation - just what I needed.

Helpstring on December 16, 2010 2:48 AM

I think a venn diagram of a cross join would show a single filled circle equal to the area^2 of the previous sets of circles.

Ken Weston on January 6, 2011 3:24 PM

Superbly explained Jeff. I must say that I am very naive when it comes to database programming since I am an applications and networking programmer primarily and as such, have a tendency to view databases as a big black box of data, nothing more...

I liked your usage of Venn Diagrams, it helped me see joins in a completely different light, I was very narrow minded.

That being said, could Venn Diagrams be applied to three or more table join statements? I find myself running into some problems recently in my rather inferior SQL statements hence the question.

Once again, great work, really enjoy reading your posts.

Jonathon Ogden on January 24, 2011 6:55 AM

Old post, and this will probably be lost among the spam, but...

While joins themselves can't be accurately defined by Venn diagrams, the relationships between joins can be. With the left and right circles representing the left and right joins, their union is the outer join and their intersection is the inner join.

outis on April 19, 2011 1:57 AM

Awesome explanation, thanks!

Hendrik Kleine on July 10, 2011 10:43 AM

Nice Visuals...
thanks for the explanations...

jpk on August 1, 2011 2:40 AM

«Back

The comments to this entry are closed.