July 10, 2004
There's been a lot of discussion recently about the Object to Relational mapping problem, which is a serious one. This Clemens Vasters blog entry summarizes it best:
Maybe I am too much of a data (read: XML, Messages, SQL) guy by now, but I just lost faith that objects are any good on the "business logic" abstraction level. The whole inheritance story is usually refactored away for very pragmatic reasons and the encapsulation story isn't all that useful either.
What you end up with are elements and attributes (infoset) for the data that flows across, services that deal with the data that flows, and rows and columns that efficiently store data and let you retrieve it flexibly and quickly. Objects lost (except on the abstract and conceptional analysis level where they are useful to understand a problem space) their place in that picture for me.
A followup from Steve Maine's blog elaborates a bit:
A typical business problem is the converse of a typical object-oriented problem. Business problems are generally interested in a very limited set of operations (CRUD being the most popular). These operations are only as polymorphic as the data on which they operate. The Customer.Create() operation is really no different behaviorally than Product.Create() (if Product and Customer had the same name, you could reuse the same code modulo stored procedure or table name), however the respective data sets on which they both operate are likely to be vastly different. As collective industry experience has shown, handling polymorphic data with language techniques optimized for polymorphic behavior is tricky at best. Yes, it can be done, but it requires fits of extreme cleverness on the part of the developer. Often those fits of cleverness turn into fugues of frustration because the programming techniques designed to reduce complexity have actually compounded it.
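To make the "same behavior, different data" point concrete, here is a minimal Python sketch. It is not taken from either blog: the `create` helper, the `customer` and `product` tables, and the in-memory SQLite database are all assumptions chosen for illustration.

```python
import sqlite3

def create(conn, table, **fields):
    """Insert one row into `table`; the same code serves every entity."""
    columns = ", ".join(fields)
    placeholders = ", ".join("?" for _ in fields)
    # Table and column names are interpolated into the SQL text, so they
    # must come from trusted code, never from user input.
    cur = conn.execute(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
        tuple(fields.values()),
    )
    return cur.lastrowid

# Hypothetical schema: two entities with completely different data shapes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, sku TEXT, price REAL)")

customer_id = create(conn, "customer", name="Ada", email="ada@example.com")
product_id = create(conn, "product", sku="X-100", price=9.5)
```

The behavior is identical for both entities; only the table name and the field set vary, which is the part a class hierarchy does little to help with.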
All I can say to the above is, I concur. We've concluded the same thing in a few projects at work. We started with naive Object implementations, and then scaled back-- purely for reasons of simplicity-- to passing around raw DataSets. As one of my co-workers said:
At first you're like "whee! objects!" and then you realize-- hey, this is a lot of tedious, error-prone mapping code I didn't have to write before...
I've always maintained that the IDE should be able to support named dot-style access to the database and tables, which it automatically absorbs from the database schema behind the scenes. I know we have Typed Datasets, but those are not transparent and certainly not automatic. So instead of this syntax, which raises the hackles of Smalltalk fans worldwide:
We could use this syntax:
Again, this is only useful if it is completely automatic in the IDE, with intellisense support-- that is, zero code required from the developer! It also would force you to have a clean schema design for your DB, which can't be a bad thing.
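As a rough illustration of schema-derived dot access, here is a Python sketch. It is not the syntax from this post: the `Database`, `Table`, and `Row` classes are hypothetical, and SQLite stands in for whatever database you actually use. Table and column names are read from the live schema at runtime, so no mapping code is written by hand.

```python
import sqlite3

class Row:
    """Exposes one row's columns as attributes."""
    def __init__(self, columns, values):
        self.__dict__.update(zip(columns, values))

class Table:
    """Hypothetical wrapper around a single table."""
    def __init__(self, conn, name):
        self._conn, self._name = conn, name

    def all(self):
        cur = self._conn.execute(f'SELECT * FROM "{self._name}"')
        # Column names come from the query result, not from hand-written code.
        columns = [d[0] for d in cur.description]
        return [Row(columns, values) for values in cur]

class Database:
    """Discovers tables from the schema and exposes each as an attribute."""
    def __init__(self, conn):
        names = [r[0] for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        for name in names:
            setattr(self, name, Table(conn, name))

# Assumed example schema and data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT)")
conn.execute("INSERT INTO customers (first_name) VALUES ('Ada')")

db = Database(conn)
first = db.customers.all()[0]
print(first.first_name)
```

This covers only the runtime half of the idea. Because the attributes are created dynamically, an editor cannot offer completion for them; the design-time support asked for above would need code generation or IDE integration on top.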
Posted by Jeff Atwood
You are an ignorant stupid moron. Try pulling your head out of your ass.
Hey Curt: would you care to back that up with any solid real-world analysis of why a domain model is in fact "better"?
No; I thought not.
Rails does this kind of thing.
The ActiveRecord pattern is also great for auto-CRUD.
VB6 Enterprise (possibly Professional) can also do this, though it uses 'exclamation mark' syntax, rather than just a dot.
Personally I used it once and forgot about it, but it is there.
What you are talking about is a "simple" GUI where the mapping between objects and tables is fairly evident (Customer.FirstName - Field "FirstName" in table "Customer").
For these kinds of applications, OOP (Object-Oriented Programming) is perhaps not necessary. And some object languages/IDEs (Delphi for instance) let you directly link graphical components to database fields.
***But*** if you have a complex business logic, with several concepts that do not match directly in the database, if you want to reduce the coupling between the different parts of your project, then OOP is a must.
Re: "if you have a complex business logic, with several concepts that do not match directly in the database, if you want to reduce the coupling between the different parts of your project, then OOP is a must."
Why is this? I keep hearing the buzzphrase "coupling", but actual examples are either not applicable to the biz domain, or make poor change-pattern assumptions.
OOP lacks any native ties to set theory, and I think this is the problem. OO is based on physical decomposition, but biz apps often deal with intellectual property decomposition, in which sets are better than aggregation. Sets can be hard to get your head around, but once you do you see the world differently.
Actually, using a higher-level language can deal with a LOT of the pain. I've recently been working on a pair of projects--one at work, one at home--which both implement database-backed web sites (nothing groundbreaking, just useful stuff). I've managed to use macros to write all the tedious mapping code for me, and even have some escape hatches for when the defaults aren't what I want. A single piece of code creates the object, associates it with the database, creates an accessor function for the web page templating language and creates a predicate function for the templating language.
Several of my database tables use inheritance, so the classes do the same thing. It all Just Works, and adding a new class takes less than a minute. If anything, it's become rather boring--boring in a good way:-)
Actually, Gemstone Smalltalk has had almost that exact same notation for years (well over a decade). Screw OR mapping, get a OODBMS.
Uh Oh, somebody doesn't understand OO Programming!
I think that, like many who have lost the OOP faith, you miss a critical aspect of OOP design that is rarely mentioned. OOP is largely an outgrowth of simulation programs that sought to model a real situation through decomposition of the problem into individually comprehensible features. The system was then supposed to emerge from these pieces. This works great when you want to run a physics experiment in silicon or want an apparently intelligent game universe, but for 99% of business apps, embedded apps, and other stuff people get paid for, the problem domain is precisely the reverse. Usually the overall behavior of the system is known beforehand; if your accounting code demonstrates emergent behavior, you probably have to explain yourself to the SEC. To stop this runaway problem, most OOP developers consciously restrict the expressiveness of their code.
The real solution is to abandon oop when it is counter productive.
What do you think about the LINQ stuff?
I for one think that, for the first time, it will make sense for me to have a data layer.
In the past I have avoided them because they tend to load too much data that I don't actually need. But with LINQ, it looks like I can create lightweight classes without all the headache.
I work at the company linked with my name, so take that as a disclaimer, but it has a product that does what you ask for in this post. Given a schema definition (created by dragging and dropping tables into a designer window within VS), you automatically get a data access layer that is strongly typed and easily accessed.
It seems the winds have changed since this post was written in 2004 and most people are in love with O/R mapping, but I still think that typed datasets are the way to go.
Datasets lose for me due to their enormous memory footprint. It's worth the extra effort for the data mapping to save RAM (in my scenario).
Hey, Russell Garner... If you actually want a response you can email me at my posted email address. Making your own post over a year after mine and expecting me to answer you is indicative of your limitations. In the meantime, you and Jeff should adopt the following adage: Don't criticise what you don't understand.
I disagree completely with the statement that objects and object-oriented designs are not suitable for business software. In fact, it is business needs that force us to use object-oriented architecture in the first place.
Data in a database is just that. Data. It does not have any business rules or logic attached to it. And it is supposed to be that way. An RDBMS or database system is just a repository for data. It is not supposed to contain business rules and logic.
While building software which uses business logic and rules, we combine the data from the database and enhance it with business rules and logic by encapsulating the data into business objects. These business objects model the business that has to be modeled. And when used properly, these business objects can prevent bad/illogical data from creeping into the database.
For how to do this, read the book Expert C# 2005 Business Objects by Rockford Lhotka or Expert VB 2005 Business Objects by Rockford Lhotka (if you prefer VB) or visit his website www.lhotka.net.
Cosmic, you are 100%, completely wrong. An RDBMS is not a repository for data; it is supposed to have business rules and logic. This mysql3 attitude that inexperienced developers have been getting lately is terrible. Your database management system should manage your data. It almost seems like that's the exact purpose they were designed for or something. Maybe that's why it's right there in the name?
Putting data logic and rules into your code not only makes things far more difficult (and in fact is what creates the so-called object-relational mismatch), but it also means you have to duplicate that code in every app that touches your data for no reason, and thus artificially limits you to using one language so you can reuse that data logic easily.
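For readers following the argument, here is a small Python sketch of what a rule kept in the database looks like. The `employee` table, the non-negative salary rule, and the `try_insert` helper are assumptions made up for this example; SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The rule lives in the schema, so every application and language that
# writes to this table is subject to it.
conn.execute("""
    CREATE TABLE employee (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary REAL NOT NULL CHECK (salary >= 0)
    )
""")

def try_insert(name, salary):
    """Return True if the database accepted the row, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO employee (name, salary) VALUES (?, ?)",
                (name, salary),
            )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("Ada", 100.0)
rejected = try_insert("Bob", -1.0)
```

Constraints like this handle simple invariants; whether more elaborate rules belong in the database or in application code is exactly what this thread is arguing about.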
Harold, not all data logic is best left to the DB. Not a perfect example, but consider a list of employees -- the bigger the set of records, the faster it is to sort the list in Java than via SQL.
While it takes some semantics out of SQL and puts them into a single language, there are times when you'll want to commit such violations for performance's sake.
(Disclaimer: this comment is based on my experience with .NET/SQL, and may not be applicable to other languages/databases)
Putting data logic and rules into code does not imply that the logic must be duplicated in every subsequent application. More precisely, it DOES mean you need to reuse that logic in the new application, whether as uncompiled code or a compiled DLL.
Putting logic in code rather than the database gives the advantage of allowing new interfaces to the same data to implement different business rules (within reason!) with less risk of impacting on previous interfaces. This is especially the case where the new interface has stricter data requirements and isn't concerned with subsequent reads matching its validation rules.
Databases can mitigate the risk of changes to one consumer impacting others, but I hope we can agree that the options (versioned stored procedures or pollution of the database schema/logic with information about the consumer) aren't nice.
Check out Progress 4GL - it lets you access database records and fields from the same language/IDE (etc) as you use to output reports or GUI or whatever. I wouldn't want to write a compiler with it, but for business apps where 99% of the work involves trundling through the database, I can't see how the Java/C#'s/Rubys of the world can possibly compete.
Disclaimer: I am not involved with Progress and I hate them in some ways, but I would still rather use them (for business apps).
I would love to hear Martin Fowler, Uncle Bob, or someone of like caliber and experience weigh in on this discussion.
Please revisit topic in light of LINQ to SQL and Visual Studio 2008 (and Entity Framework beta releases). kthx!
Are there any non-OOP languages left apart from C?
I bet perl 6 (if it ever ships) will be more oop orientated.
It's oop, whether you like it or not. Doesn't seem to stop overruns, bad code, and software disasters though
Heh, you predicted LINQ to SQL.
You don't understand OOP. Admit it. You have a problem creating domain models. Admit it.
I bet your projects all look like a C program with a Main function containing all the code. Millions of ifs and cases.
Let's take an example like yours with Customers, but with Employees. There are different kinds of them. How do you calculate pay for them? You iterate rows, check a flag (column), and then calculate pay based on it? Are you kidding me? What if the calculation should be based on other tables (timecards, commissions)? You iterate other tables, check if they contain rows for Employee.Id, and include them in the calculation? Are you kidding me?
Your database management system should manage your data. It almost seems like that's the exact purpose they were designed for or something. Maybe that's why it's right there in the name?
The name Database Management System states that it manages the database. If it would manage the data, then the name would be Data Management System.
"You don't understand OOP": there is nothing funnier than this statement. The fact that one programmer can justify obvious suckiness by a lack of understanding on the part of another means that there is definitely some kind of hidden deficiency in the paradigm.
Domain models are, as the name states, models. They were never meant to be autonomous programs.
Just my opinion, after coding in everything from assembly to multiple high-level languages, from procedural to OOP style.
I've noticed that I can pretty much do 98% of any complex problem coding using just arrays (with custom data types).
I'm not saying OOP doesn't have its place, but it's probably not the best tool in every situation.
Re: "Let's take an example like yours with Customers, but with Employees. There are different kinds of them. How do you calculate pay for them?"
No, in practice there are NOT different "kinds" of employees. You are thinking in terms of hierarchical taxonomies. Hierarchical taxonomies are poor at modeling most business domains and domain objects. Modeling employees with *sets* of features or attributes is more realistic and more flexible. But OOP gets ugly when it tries to handle sets. It's not designed around sets, but rather physical objects and hierarchical taxonomies or nesting of physical objects.
If you are modeling physical objects, that's fine, but in the biz domain one is modeling intellectual and conceptual things and relationships such as laws, management decrees, and customer and vendor inter-relationships; not so much physical things. OOP can't give decent examples of how great it is without resorting to physical analogies. This is a sign of the weakness, the "physical problem".
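One way to read the "sets of features" argument, sketched in Python. This is an interpretation, not the commenter's code: the `Employee` fields, the feature names, and the `PAY_RULES` table are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Employee:
    name: str
    # An employee carries a set of features instead of belonging to one subclass.
    features: set = field(default_factory=set)
    base_salary: float = 0.0
    hourly_rate: float = 0.0
    hours: float = 0.0
    commission: float = 0.0

# Hypothetical rules: each feature contributes one component of pay.
PAY_RULES = {
    "salaried": lambda e: e.base_salary,
    "hourly": lambda e: e.hourly_rate * e.hours,
    "commissioned": lambda e: e.commission,
}

def pay(employee):
    """Sum the pay components for every feature the employee has."""
    return sum(
        rule(employee)
        for feature, rule in PAY_RULES.items()
        if feature in employee.features
    )

ada = Employee("Ada", {"salaried", "commissioned"}, base_salary=3000.0, commission=500.0)
bob = Employee("Bob", {"hourly"}, hourly_rate=20.0, hours=10.0)
print(pay(ada), pay(bob))
```

An employee who is both salaried and commissioned needs no new subclass here, only a second entry in the set. A class hierarchy can express the same thing through composition, so treat this as an illustration of the argument rather than a verdict on it.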
An important aspect of software development is being able to manage complexity. We see this everywhere in IT. One example is layered architectures, where higher layers don't care about how lower layers are doing their work, etc. This is all about managing complexity. Can you imagine how difficult it would be to try and implement TCP/IP without a layered approach? In many business domains, OO programming languages are very good at helping to manage complexity. OO design patterns, when applied appropriately and correctly, can very elegantly solve complex problems, and help developers understand and maintain that underlying complexity. There is a very good reason why OO has become increasingly popular over the years, and why it is now the norm in a countless number of industries.
OO design patterns, when applied appropriately and correctly, can very elegantly solve complex problems
A very true statement, but vague enough to be true of any paradigm. It seems to me that relational, procedural, semantic (www.w3.org/standards/semanticweb), rule-based, pipeline (*nix shell), prototype-based and functional methods of process and data abstraction are all very useful; just like OO. Ideally we would use whichever abstractions (or lack thereof) suited the problem domain best, but we don't for a few reasons:
1. It would mean more developer time spent not only learning another language but learning a whole new set of best practices
2. Integration between languages is nontrivial (perhaps less so in .NET)
3. Large tool systems (by which I mean the Java Platform) are OO
4. Good tools and techniques for modeling are either not as widely developed or are less important because the language is itself the most succinct way of modeling a certain style of programming
5. It is hard to be creative enough to model a problem space in the same way a domain expert would. By the very fact of not being a domain expert we lack the same degree of sophistication and organization in our mental models.
Ideally I would use simple, extensible languages to model a problem in the most precise and elegant way. Given that it takes a better programmer than myself to do that well, I will continue to rely on ready-made abstractions like OO and relational modeling. What is important to look out for is when our abstractions fail us (http://sites.google.com/site/steveyegge2/decision-time). OO may be effective but it is not the most general or flexible means of modeling. It is represented by trees or acyclic digraphs, whereas the Semantic Web and rule-based programming span graph theory in general. Relational algebra depends upon set theory, which was considered for a while to be the foundation of mathematics. With category theory (more foundational than set theory) being of interest to language researchers, who knows what more flexible abstractions we might see?