I thought Ligaya Turmelle's post on SQL joins was a great primer for novice developers. Since SQL joins appear to be set-based, the use of Venn diagrams to explain them seems, at first blush, to be a natural fit. However, like the commenters to her post, I found that the Venn diagrams didn't quite match the SQL join syntax reality in my testing.
I love the concept, though, so let's see if we can make it work. Assume we have the following two tables. Table A is on the left, and Table B is on the right. We'll populate them with four records each.
id name id name -- ---- -- ---- 1 Pirate 1 Rutabaga 2 Monkey 2 Pirate 3 Ninja 3 Darth Vader 4 Spaghetti 4 Ninja
Let's join these tables by the name field in a few different ways and see if we can get a conceptual match to those nifty Venn diagrams.
SELECT * FROM TableA INNER JOIN TableB ON TableA.name = TableB.name id name id name -- ---- -- ---- 1 Pirate 2 Pirate 3 Ninja 4 Ninja Inner join produces only the set of records that match in both Table A and Table B.
|
|
SELECT * FROM TableA FULL OUTER JOIN TableB ON TableA.name = TableB.name id name id name -- ---- -- ---- 1 Pirate 2 Pirate 2 Monkey null null 3 Ninja 4 Ninja 4 Spaghetti null null null null 1 Rutabaga null null 3 Darth Vader Full outer join produces the set of all records in Table A and Table B, with matching records from both sides where available. If there is no match, the missing side will contain null. |
|
|
SELECT * FROM TableA LEFT OUTER JOIN TableB ON TableA.name = TableB.name id name id name -- ---- -- ---- 1 Pirate 2 Pirate 2 Monkey null null 3 Ninja 4 Ninja 4 Spaghetti null null Left outer join produces a complete set of records from Table A, with the matching records (where available) in Table B. If there is no match, the right side will contain null. |
|
SELECT * FROM TableA LEFT OUTER JOIN TableB ON TableA.name = TableB.name WHERE TableB.id IS null id name id name -- ---- -- ---- 2 Monkey null null 4 Spaghetti null null To produce the set of records only in Table A, but not in Table B, we perform the same left outer join, then exclude the records we don't want from the right side via a where clause. |
|
SELECT * FROM TableA FULL OUTER JOIN TableB ON TableA.name = TableB.name WHERE TableA.id IS null OR TableB.id IS null id name id name -- ---- -- ---- 2 Monkey null null 4 Spaghetti null null null null 1 Rutabaga null null 3 Darth Vader To produce the set of records unique to Table A and Table B, we perform the same full outer join, then exclude the records we don't want from both sides via a where clause. |
|
There's also a cartesian product or cross join, which as far as I can tell, can't be expressed as a Venn diagram:
SELECT * FROM TableA CROSS JOIN TableB
This joins "everything to everything", resulting in 4 x 4 = 16 rows, far more than we had in the original sets. If you do the math, you can see why this is a very dangerous join to run against large tables.
Years ago I had diagrams like this posted on my desk and I refered to it nearly every other day. Thank you for posting this as I am sure I will be refering to it often as I try to understand and explain to others the impacts of SQL statements
Ernest on March 11, 2008 11:35 AMThis has been extremely helpful for a project that I am currently working on. I am by no means a coding expert(not a novice either), but there is a lot of stuff I have yet to learn. The way this is all displayed with the use of venn diagrams has made it very easy to understand, thanks for the help.
MattG on March 19, 2008 11:43 AMThis page is so helpful!
I am constantly needing to build MySQL queries for my php and .net applications and I come back to this page time and time again to prevent me from having to think too hard!
Just that one less moment spent on these details when you're in the middle of solving big problems makes all the difference.
Thank you!
AndrewM on March 25, 2008 12:03 PMFine tutorial.....
dhiman on April 1, 2008 8:14 AMGood work.
Nash on April 14, 2008 5:40 AMGreat job.
Nicholas on April 21, 2008 8:16 AMThanks a lots for visual diagram. Its really helps me.
Mit on May 23, 2008 10:30 AMYEY good thing for studdy fir IPT thanks
shade101 on May 28, 2008 10:39 AMRofl
Jack Jones on June 2, 2008 7:03 AMPretty 1337 thx!
Jack Jones on June 2, 2008 7:04 AMhi jeff,
Thanks a lot for this ven diagram explanation. i opened the website without any understanding on sql joins and after reading ur explanation, i've understood it so well. really very helpfull.
Thanks a lot!
Great explain, thanks a lot.
Isa on June 4, 2008 6:03 AMIts nice way of explanation of SQL Joins .Too easy!
Jolly on June 9, 2008 3:18 AMI hope u can post such visual explanation for stored procedures also.
Jolly on June 9, 2008 3:19 AMVery very good!!!
I'm listen!
Good Job,
Thanks
Shamil on June 11, 2008 4:35 AMGreat Work.. Could u explain in a similar fashion joins between more than 2 tables lets say 4 tables.. I want to undersatnd the various steps in which the sql engine joins more than 2 tables
Thanks in advance
Excellent way of exhibiting the concept.
Simple , easy to understand.
Excelent
Abhishek on June 24, 2008 8:25 AMEXCELLENT
Abhishek on June 24, 2008 8:27 AMGreat and very simply to understand!
Charlie on June 25, 2008 3:51 AMGreat Work, Im teacher, and Iwill use this, to explain de joins.
Dulcinea on June 29, 2008 4:15 AMExplained very well. Helped me a lot
Pallavi on July 17, 2008 1:01 PMIt's really great article!!!
anabel on July 17, 2008 1:29 PMMuy bueno este articulo !! Los grficos ayudan mucho a enterder.
Gracias !
Desde Argentina, Buenos Aires.
adfgfdgdgreytrtuykhjkmhjnkj
shanthi on August 2, 2008 3:28 AMGreat and very useful query and join query.............
shanthi on August 2, 2008 3:30 AMit was great explanation , i do like the way u explained using venndiag , superb understaning on subj after long time , it was refreshing , pls do keep posting ........thnx sumanth
Sumanth Kumar on August 12, 2008 1:10 PM2 THUMBS UP for this work
Sandeep on August 14, 2008 2:35 AMThanks a lot for your effort. Now i am confident about joins i mean basic concepts.. this article is the best.
Bindu on September 5, 2008 2:06 AMThank you very much jeff. I was always confuused about these joins. Today I have got clear idea about inner ande outer joins.
Narendra Ojha on September 5, 2008 2:57 AMGreat work. Now i can visualize my big query's into circles. Good.
Ganesh on September 15, 2008 1:01 PM#1085;#1080;#1096;#1090;#1103;#1082;
rassa on September 16, 2008 9:01 AMBest of rest,
Vijay on October 15, 2008 7:43 AMYour explanation on SQL Join helped me a lot to complete my project easily. I was struggling to find out an exact query to match my requirement.
Thanks a lot
anoop on October 26, 2008 7:52 AMSo what is the volumetric shape of a circle multiplied by a circle?
Nat on October 29, 2008 2:32 AMGood and simple explanation of joins my fears of joins are erased
Thank you
very informative.
sss on November 19, 2008 2:50 AMIts a really interesting article.. Joins was really confusing all the time and u have presented it in a very simple manner.. i gothrough some books like silbershwats and all but they have given it in very complex manner.. and u are simply great....... Please prsent such wonderfull article again sir.. Thankyou very much.
jewel jose on December 3, 2008 1:54 AMvery useful article
Konstantin on December 12, 2008 7:34 AMVery helpful...
Kamran on December 16, 2008 12:00 PMThanks Jeff. Very useful and simple explanation.
Benny Hill on December 28, 2008 1:45 AMthanks
msn on January 3, 2009 9:01 AMKhup Chaan
G on January 5, 2009 2:41 AMVery elegant and easy explanation. congratulations, Very useful!
Daniel Suarez on January 7, 2009 4:14 AMI'd say the post achieves the authors purpose. I thought to myself WTF is the difference between those bloody joins again? and find this, 15 seconds later I remember all of the CS lore I needed. I don't want to put words in the authors mouth (page?) but I doubt he was thinking of the diagrams in terms of their mathematical correctness to set theory when he put this together, simply trying to demonstrate the difference between the joins, which it achieved very well.
Once modification that might make you all happy, though. Just imagine an undrawn Z axis along which the seeming-circle of the diagram can expand to hold all of the duplicate values on each side making a sort of three dimensional blob. Then imagine this as the 2 dimensional representation, and it all works. Yay!
Matt on January 9, 2009 12:16 PMVery useful. Thanks
leela on January 22, 2009 3:53 AMGreat post.
Thanks much.
http://slsecurity.blogspot.com/
Impressive thinking!
Vibzy on February 14, 2009 4:44 AMJeff,
thanks. been looking for some indepth on joins.
jake on February 17, 2009 3:22 AMthanks a lot :)
Caglar on April 2, 2009 8:58 AMWell, as a (not-so-dumb) newbie to MySQL queries I was unclear about the JOIN statement and generally what is was all about. A quick Google brought me to this site.
Personally I think that it's an excellent top-level explanation of the general principles behind the Join and great for newcomers such as myself. Sure there are exceptions here and alternatives there, and lots of live learning hurdles to overcome in given situations, but this IS a good intoductory explanation nonetheless.
Pay no attention to those that clearly know it all and are quick to sound off with nothing but those tiresome criticisms and negative sentiment, but do nothing to contribute support for newcomers.
Excellent work Jeff, and thank you for your time on this :o)
Brendan on April 4, 2009 4:05 AMHi,
Tanks for the great article. The visual explanation is the best way to learn Joins. :D
Rodrigo Maia on April 6, 2009 8:39 AMThanks for great article and this selfless help.
Sachin on April 8, 2009 9:52 AMGreat article!!
Mauro on April 10, 2009 9:58 AMThats a good bit of work there. I found it hard to follow at first but got it in the end, very intresting.
http://something4free.net/free-xbox-360/how-to-get-a-free-xbox-360/
Free Xbox 360 on April 27, 2009 1:51 PMOMG thanks for this visual explanation folks, I never understood that. Angela from http://www.squidoo.com/regcuresoftwarereview
Angela on May 5, 2009 2:03 AMstatement and generally what is was all abouthttp://www.nike-max-tn.com
http://www.mensclothingstore.us/
http://www.cheappolos.us
http://www.polo-shirts.us
http://www.ralphlaurenpoloshirts.us
http://www.nbpal.com
Thanks for the article..It was very useful.
Harsh on June 17, 2009 10:37 AMI finally understand! Thanks :D hehe
MacBook on June 21, 2009 12:45 PMIt took me a while to get my head around this but when I found this page everything made sense. Your layout example is great.
Thanks
Bob
Hey, I've been coding in MS SQL at my job for about a year. I still have trouble wrapping my head around the joins sometimes but I have this site bookmarked and I probably visit it at least once or twice a week.
Thanks for the great diagrams!
Matt on June 25, 2009 10:22 AMLooks very interesting. Thanks for sharing..
http://www.mpos.net/s/p3.asp
http://coolday.blog.com/
http://www.mpos.net/s/p4.asp
Looks very interesting. Thanks for sharing..
http://www.mpos.net/s/p3.asp
http://coolday.blog.com/
http://www.mpos.net/s/p4.asp
http://fleafreesmart.com/
Hi,
couldn't on express the outer join as an cartesian product of the discs A and B, which in this case would be a torus ??
The graphic for the cross join (or cartesian product) would be something like the old Games magazine logic puzzles (or similarly, the sheet in the Clue boardgame) with all options from set 1 on the horizontal, and all options from set two on the vertical. Each square in this grid is an item in the resultant table.
keith on June 29, 2009 11:57 AMHi Jef
I think your illustrations are really good, thanks for the information.
couldn't on express the outer join as an cartesian product of the discs A and B, which in this case would be a torus ??
Printer Parts on July 13, 2009 2:08 AMThanks for the article..It was very useful.
http://forum.en86.com/
http://www.dk68.com/post/9/160
http://forum.96169.org/forum-63-1.html
http://www.tvdramazone.com/macau-hotel/
Pictures, examples, breakdown. Best way to explain it!
free iphone 3gs on July 20, 2009 7:56 AMplease explain
free ipod on July 29, 2009 8:08 AMOh the horror!
free ipod on July 29, 2009 8:08 AMgreat stuff
free ipod on July 29, 2009 8:09 AMsome knowledge to understand, thank you, the author
http://www.ew09.com/
BuggyFunBunny, are you capable of writing even a single comment that doesn't express your cantankerous political views? Or do you simply not realize that this is a technical blog and that nobody cares?
Andreas: I was under the impression that INNER JOINs use a hash match internally (in fact, I'm quite certain of this). CROSS JOINs obviously don't. Therefore I don't see how it's reasonable to say that an IJ is just a special case of the CJ. I suppose in bread-and-butter set theory, where there's no notion of a hash or an index, this could work, but this is the world of CS, where the clean academic solution is almost never the most practical one. The IJ you speak of is the naivest possible algorithm, only used when there are no indexes (and if you're designing databases without any indexes, well... stop designing databases).
This post is a pretty decent tutorial for green developers having to work with SQL, and it's something I might pass on if I felt that somebody else wasn't "getting it"; I do agree, however, that it omits most of the non-trivial cases. If every relationship was one-to-one, you wouldn't really need a database at all.
Aaron G on February 6, 2010 10:14 PMNice post. It explains the results gathered from simple joins quite well. It doesn't address *how* they work, but a nice visual on what you actually get.
Clinton P on February 6, 2010 10:14 PMRicardo-
I believe Oracle also has the plus (+) operator as well for joins which I have seen on one or two projects. Th INNER JOIN, OUTER JOIN SYNTAX appeared in ANSI SQL 92 (1992). Most everyone uses that syntax, but most database sytems support the older way as well.
Jon Raynor on February 6, 2010 10:14 PMI'm happy that "monkey" and "spaghetti" made the list, but I really miss "more spaghetti". Maybe that's in another table.
Also fun are self-joins:
select * from TableA first inner join TableA second on
first.name = second.name where first.id != second.id
SQL Server's always supported joins via the where clause (where TableA.Name = TableB.Name), but the left and right join operator syntax ("*=" and "=*") has been deprecated in favor of the explicit join syntax. The "*=" and "=*" syntax doesn't specify join order and doesn't clarify what's in the where clause vs. a join operation.
Jon Galloway on February 6, 2010 10:14 PMThe arguments about how inner joins are implemented are all correct, and all wrong at the same time.
They are at times done by a hash, sometimes a nested loop, sometimes a cartesian product and weeding out the ones that do not fit. You will need to look a the query plan to know which one your database will use for this particular query, and it can even change over time for the same query as the statistical distribution of data changes. The database query planner tries to pick the best for the current situation. This is why it is important to keep the statistics up to date.
Sometimes, it even does something stupid, that you know is wrong because you know how the data is, and you know a better way to do it. This is what query hints are for, to force it to do a particular join a particular way.
This was, though, just a basic overview. It was just to show the result sets, and help developers understand SQL joins if they have not used them before. Details on the more advanced topics, like what I mentioned above are available (just search the net) but those who this article was aimed at are not yet ready for that level of complexity.
Powerlord-
I've only seen the (+) in an Oracle (queries), but other RDMS system may support it (Although I cannot say for certain). I believe someone mentioned that (+) predated the ANSI syntax, so hopefully you should not encounter it. I beleive I witnessed it some stored procedures in a banking system.
At this point, any new code should use the ANSI 92 syntax as that seems to be the defacto standard at this point.
Jon Raynor on February 6, 2010 10:14 PMThanks man, you saved my day
Remon Georgy on February 6, 2010 10:14 PMthanks, this is the best Joins explanation, the diagrams HELPS ALOT!
this solved my problem, thanks again
There are sixteen possible Venn states with two variables, you forgot eleven of them ;-) Seriously, I wonder how many people have independently made the connection between SQL join and Venn. I also worked this out a while back, it being a expansion of the UNIX sort and uniq combinations I used to.
Given two lists, and given that each list has only unique entries inside itself:
sort a b b | uniq -u (gives only what is in a)
sort a b | uniq -d (gives what is in both a and b)
sort a a b | uniq -u (gives what is only in b)
I found this useful for simple lists like hostnames or IPs
Thank you so much for this explannation!!!!!! NOW I CAN FINALLY UNDERSTAND THIS!!!
Cotyrocksteady on July 18, 2010 6:16 PMThanks for this great visual explanation - just what I needed.
Helpstring on December 16, 2010 2:48 AMI think a venn diagram of a cross join would show a single filled circle equal to the area^2 of the previous sets of circles.
Ken Weston on January 6, 2011 3:24 PMSuperbly explained Jeff. I must say that I am very naive when it comes to database programming since I am an applications and networking programmer primarily and as such, have a tendency to view databases as a big black box of data, nothing more...
I liked your usage of Venn Diagrams, it helped me see joins in a completely different light, I was very narrow minded.
That being said, could Venn Diagrams be applied to three or more table join statements? I find myself running into some problems recently in my rather inferior SQL statements hence the question.
Once again, great work, really enjoy reading your posts.
Jonathon Ogden on January 24, 2011 6:55 AMOld post, and this will probably be lost among the spam, but...
While joins themselves can't be accurately defined by Venn diagrams, the relationships between joins can be. With the left and right circles representing the left and right joins, their union is the outer join and their intersection is the inner join.
outis on April 19, 2011 1:57 AMAwesome explanation, thanks!
Hendrik Kleine on July 10, 2011 10:43 AMNice Visuals...
thanks for the explanations...
The comments to this entry are closed.
|
|
Traffic Stats |