October 28, 2004
It's intended as sarcasm, but I believe this Daily WTF entry on Stored Procedures should be taken at face value:
I'm sure we've all heard, over and over, that inline SQL is generally a bad practice, and that we should use Stored Procedures when possible. But let's be realistic for a minute. Who wants to write a stupid stored procedure for every stupid little simple query needed.
Have you ever worked on a system where someone decreed* that all database calls must be Stored Procedures, and SQL is strictly verboten? I have, and this decision leads to incredible development pain:
- Stored Procedures typically cannot be debugged in the same IDE you write your UI. Every time I isolate an exception in the procs, I have to stop what I am doing, bust out my copy of Toad, and load up the database packages to see what's going wrong. Frequently transitioning between two totally different IDEs, with completely different interfaces and languages, is not exactly productive.
- Stored Procedures don't provide much feedback when things go wrong. Unless the proc is coded interally with weird T-SQL or PL/SQL exception handling, we get cryptic 'errors' returned based on the particular line inside the proc that failed, such as Table has no rows. Uh, ok?
- Stored Procedures can't pass objects. So, if you're not careful, you can end up with a zillion parameters. If you have to populate a table row with 20+ fields using a proc, say hello to 20+ parameters. Worst of all, if I pass a bad parameter-- either too many, not enough, or bad datatypes-- I get a generic "bad call" error. Oracle can't tell me which parameters are in error! So I have to pore over 20 parameters, by hand, to figure out which one is the culprit.
- Stored Procedures hide business logic. I have no idea what a proc is doing, or what kind of cursor (DataSet) or values it will return to me. I can't view the source code to the proc (at least, without resorting to #2 if I have appropriate access) to verify that it is actually doing what I think it is-- or what the designer intended it to do. Inline SQL may not be pretty, but at least I can see it in context, alongside the other business logic.
So why use Stored Procedures at all? Conventional wisdom
says we do it because:
- Stored procedures generally result in improved performance because the database can optimize the data access plan used by the procedure and cache it for subsequent reuse.
- Stored procedures can be individually secured within the database. A client can be granted permissions to execute a stored procedure without having any permissions on the underlying tables.
- Stored procedures result in easier maintenance because it is generally easier to modify a stored procedure than it is to change a hard-coded SQL statement within a deployed component.
- Stored procedures add an extra level of abstraction from the underlying database schema. The client of the stored procedure is isolated from the implementation details of the stored procedure and from the underlying schema.
- Stored procedures can reduce network traffic, because SQL statements can be executed in batches rather than sending multiple requests from the client.
And many people buy into this philosophy, lock stock and barrel:
At just about every talk I give I always try to make several consistent statements. One of which is: Ã¢â‚¬ËœWhenever possible use stored procedures to access your data'.
However, there's one small problem: none of these things are true in practice. The benefits are marginal, but the pain is substantial. And I'm not the only person that feels this way. John Lam also adds:
As a guy who dabbles in low-level bit twiddling stuff from time-to-time, the performance claims are quite interesting to me. The new (as of SQL Server 7.0) cached execution plan optimization in SQL Server looks to me a lot like JIT compilation. If this is, in fact, the case it seems to me that the only overhead that would be associated with dynamic SQL would be:
I can imagine quite a few scenarios where the above overhead would disappear into the noise of the network roundtrip. What upsets me are the folks who spout forth anecdotal arguments that claim stored procedures have "much better" performance than dynamic SQL.
- The amount of bandwidth + time it takes to transmit the dynamic SQL text to the database.
- The amount of time it takes to calculate the hash of the dynamic SQL text to look up the cached execution plan.
For modern databases and real world usage scenarios, I believe a Stored Procedure architecture has serious downsides and little practical benefit. Stored Procedures should be considered database assembly language: for use in only the most performance critical situations. There are plenty of ways to design a solid, high performing data access layer without resorting to Stored Procedures; you'll realize a lot of benefits if you stick with parameterized SQL and a single coherent development environment.
* "the man".
Posted by Jeff Atwood
I worked at a Pfizer Central Research in 1998 when
VIA GRA (the space was needed to post.....sheesh!) was ratified and I was the WORKFLOW ADMIN and in charge of supporting the Medical Document Imaging System for the Case Report Form Library and I did everything in SQL on INFORMIX using DB-ACCESS.
Just be sure that *ALL* drives are MIRRORED on PRODUCTION DATABASES, and the servers are powered by a tested UPS.
I couldn't disagree more (at least for SQL Server)
I have seen massive performance differences. Procedures can be cached (heck, there is even a dedicated procedure cache).
There are huge performance hits. Parsing and execution plants are just one hit. I run about 500 concurrent users and even a .5 second speed improvement for a common process can lead to hours of saved CPU time per day.
Also - I find maintenance easier and it DOES integrate with VS (it even did back in 2004) so I can't see an advantage in NOT using procedures.
No - hold that thought. If you are writting a small one-off application that is never actually going to be run live, then sure. Otherwise, I simply wouldn't use embeded code (not evel LinQ - read up the performance impact of that over-hyped technology).
Philip, I believe when you use sp_ExecuteSql for dynamic sql it also creates and caches an execution plan. There's a good chapter in this book (http://www.microsoft.com/MSPress/books/8564.aspx) about dynamic sql and how in some cases it's faster than stored procedures.
It seems like that the writer has learned a new technique Web Service and just as a newbie trying to use it anywhere he can he is refusing to accept the stored procedure and is blindly advocating the use of Web Services. I behaved same when I was 18. Jeff forgets 1 thing...not all client apps developed by many different developers may utilize a web service. Not everyone is doing development in VS.NET
Someone should take a look at the Code Image... Only a fool can prefer inline code for this.
I think some of you are picking on SPs because someone shoved business logic in there. Business logic is supposed to be its own layer folks.
Declaring using SPs a dubious practice because people aren't using them correctly is like saying guns are bad because they shoot people. Just put the weapon down and back away slowly if you don't know how to use it.
Wow, long read. I support the moderate voices here. (And the blog author's quixotic spirit at the time). I love SQL and MS SQL Server. And yet when permitted... I've had great success with dynamically compiled sql. I put it in a DAL so it's easy to find. I generate the sql for most of my procs/udf with C# so they're easy to maintain. (Not so easy to pitch the homegrown solution now that we have Linq). I have no philosophical issues that require my DAL to be dumb about persistence, and since my clients aren't made of money, I don't think I could advise my client that it's a must...
I always use a proc when a lot of data needs to flow across several steps - situations where you might use a temporary table-type variable for large intermediate results and in the end return a relatively small amount of data to the middle tier.
Way back when, mandatory crud procs was the common wisdom. For all the reasons mentioned by the free thinkers, it's not so dogmatic anymore but it's still hard to argue against a zealot with a list of undated material to back up their rhetoric, superstition and ego.
I have seen abuse of inline SQL. Especially with recursive calls... Even recently, a large corporation had amazing engineering in many areas but their DAL consisted of a couple of methods that took a concatenated string of sql and returned an untyped dataset, (with no compile). They didn't care. That was most likely politics, but reading through the comments, I guess smartly normalized tables went out of fashion for awhile?
At any rate, it's nice to see MS remove the web service from SQL Server. It's productive to have a full featured db on tap. Oslo is making no apologies.
I don't understand the whole make it a web service deal. The examples above all talk about VB with dynamic SQL versus VB with SPs. If I'm running a VB app then I already have a connection to the database. I am presumably inside a transaction. Now instead of reusing an existing resource to access the database, you want me to open a socket and call a web service? There are several issues here:
1) That web service call is outside my transaction so I can't roll it back should something downstream require a rollback. REST and transactions don't really mix.
2) That web service needs to open a database connection and rerun all the authentication code I've already run when the client logged in. If there are several calls to the abstraction layer, there are several database connections.
3) My deployment is now more complex. Before I had a client application and a database server. Now I've added an http server to the requirements.
Jeff's argument that SPs are not the only way to section off the database through APIs. That's true. MQ services and CORBA calls are also ways to do this. They benefit though from maintaining transactional integrity through the use of transaction managers. REST-based web services are the antithesis of transactional programming.
Another point that I should have made above - if you use stored procedures, users can run them directly from the database in situations where you just want the raw data. This means that your users do not require sql skills, and nor do they table-level permissions - they simply need permissions to the procedure.
learn to use database! please!
I have worked on projects architected from both perspectives, and have come to the conclusion that:
1) Using SPs for CRUD is insane. If you do this, most schema changes require changing the SP interfaces...when this happens developers will choose not to change the schema because it is too much trouble. Instead in many cases they will choose to overload fields, make use of Table1.ExtraColumn1, Table1.ExtraColumn2, etc...refactoring a schema in any useful way is a huge PITA. Making refactoring harder makes refactoring not happen.
2) Using a normal language like Java or C# to do reporting is usually way too much coding and the result is extremely slow reports. This is one case where business logic will usually end up getting duplicated in the middle tier and data tier as reports will need to do calculations that the middle tier also has to do (like sales tax calculations for example).
3) The security argument in favor of using SPs for everything doesn't hold water. Any security setup that you have can be implemented with a proper application of table-based security with roles and views. This is only even necessary if you *have* to have rock-solid database security. 90% of the time you can get by with security implemented in the middle tier. If you are paranoid about this, just stick your middle tier behind a trusted authentication wall (a physical application server) which has access to the database.
4) Business logic is *much* easier to implement in a real OO language than in T-Sql, Pl-Sql, or whatever.
5) With a modern ORM, you can detect schema changes and get compile-time errors when a field is renamed, a table is renamed, field made not-nullable, etc...try doing this with SPs that can exist not only on one server, but may be spread all over the place using linked server connections. Bottom line: if you rely on SPs for business logic, changes are HARD and very error-prone.
6) Version control...TFS has some decent database version control capabilities, but TFS is very heavyweight and expensive. In general, keeping your code in actual code files saves a lot of time when you need to roll back a change.
7) Unit testing is much easier when your business logic is contained in an actual class where dependencies can be mocked. You cannot do this with SPs...the best you could do is some awkward test database configuration which only one developer can use at a time if this database is on a centralized server.
8) If you use SPs for your business logic, all the dependencies for this business logic had better exist in the database! Need to check the results of a web service to validate some input? Have fun with that using T-Sql. Want to validate input without putting it in the database first? Try doing this in an SP for the rule: An new order must have at least one OrderItem.
Like I said, I have seen both sides of this and can tell you with certainty that a standard business application is much more maintainable and flexible with business logic in a real middle tier. I have seen no good use for SPs with the exception of reporting and mass, complicated data manipulations.
Declaring using SPs a dubious practice because people aren't using them correctly is like saying guns are bad because they shoot people. Just put the weapon down and back away slowly if you don't know how to use it.
Database engines are designed to be able to process sets of data quickly and efficiently and compile Stored Procedures for this very purpose. MS best practices also recommend the use of Procs wherever possible for reasons of performance, security and data abstraction. I work as a DBA in a development environment and every sane developer I have met firmly agrees that SPs are the way to go. Like everything in IT there is more than 1 way to achieve the same end result but some ways are most definitely better than others.
This is like debating the value of mineral deposits to a tree. The true reality is that for the majority of development environments it would be fine to put that tiny 500 row CMS database in a csv and access it with a text driver. Lots of OO developers contributing to this have never..
1. Tuned a highly concurrent RDBMS.
2. Agonized over multi-path statistics optimizations on a VLDB.
3. Understand what index optimization is.
And even if you do know how to write explicitly parametrized dynamic calls in the middle tier and want to show off your new-found skills, what about your colleges? What about that new junior developer that I have to deal with on a daily basis? What about the normal lazy sloth that just wants to spit out code as fast as possible so they don't miss Grey's Anatomy? Son, we live in a world that has walls, and those walls have to be guarded by men with guns. Whose gonna do it? You?
So will Yukon help with any of these problems?
"The issue is not how many pieces of code will be affected, the issue is how do I find all of the affected pieces of code?"
Try Edit-Find And Replace-Find In Files
This has been a good read for me. I enjoy hearing both sides of the argument and really haven't had a strong opinion either way. Until now.
I believe the security and abstraction that comes with using stored procedures make them well worth implementing. I know the maintenance aspects have really saved my team on a number of occasions.
Security wise, preventing direct access to the tables make sense. It keeps them from being abused by the general public (or casual developer). Stored procedures create a standard interface that can be enforced, keeping an application developer from unknowingly putting your data into an invalid state.
Although it's bad practice, stored procedures can be modified to add functionality to a deployed application without re-release. This makes my boss happy because he doesn't like the time or the process involved in certifying and releasing builds. A stored procedure change can often take less than a few minutes and get an application back up and running. It can also add new functionality to a well designed application.
As far as being able to swap out databases, I think you can still do this with a small amount of effort. If your database supports stored procedures, you'll have to re-code them in the new database. One plus to note here is that your code should require NO changes to support this. I should say "in theory" but this has been the case for me on almost every occasion so far. If the new database doesn't support stored procedures (mySql) then you can override your standard datalayer (that interfaces to the stored procedures) and put the logic in there. A word of experience here: it takes a lot more effort to implement logic from a stored procedure in your data layer. One plus of database script languages is that they're designed specifically to manipulate data. They're very compact and focus primarily on that function.
Think about accessing the database through stored procedures (only) as a means to using your database like a service. It provides whatever (and only) the functionality implemented by your stored procedures, regardless of what application is accessing it. It you have one application and 4 import (applications) all using the database, stored procedures will help ensure that they all do it in a consistent way. New applications will already have a good baseline of database logic to choose from. You database becomes more like an entity, or single object that has specific functionality available.
If you've never tried stored procedures, you owe it to yourself to do so and formulate an opinion based on your experience. Do some more reading on the subject too. Quite often, you'll find that a development method/practice doesn't make sense because you haven't used it in the same way that other people have successfully used it.
You guys are clueless if you think that stored procedures have no real world benefit. When you have worked on a database that has over 2 TB of data, and literally more than 3k tables/views you will understand not only the power, but NECESSITY of them. You use stored procedures to present an API for a schema, and get the benefit of faster execution, security, etc.
Dynamic SQL queries are an absolute killer in terms of database performance. They do not scale, at all. When your users are screaming because your query is taking 20+ seconds to generate output, and the plan analyzer shows that it's taking 19+ seconds trying to figure out what to do with it, you know it's time for an SP. Actually, up-front would have been the time. They are also a killer when it comes to maintaining consistency for client applcations, which may not always be your simple compiled application. When you have 70+ projects sitting on one giant shared database + schema you simply cannot have developers mucking about directly with tables or views unless ABSOLUTELY necessary. Otherwise when you change the schema under them, they are screwed. And so are all your users using that dirty code.
It's nice to bitch about how much of a pain in the ass they are (and I agree - they are), but if you're that vehement about how useless they are it would seem you don't really know pain yet.
I have to ask....why is everyone saying that the alternative to stored procedures is SQL? Has anyone heard of an ORM? I mean, seriously, this is 2006.
For thos .NET people, check out nHibernate, or DataSets directly from MS. For all you open source people (re: java), I am sure you have heard of Hibernate and are probably using it because you don't have you head up your butt!
It seems that this stored procedure debate only happens with .NET people. Why is that? This issue has been settled for years on every other platform. it would be nice if .NET people would join the rest of hte world.
Sorry , i respect your right to your point of view but...
I can not think of one VALID reason for using messy inline code, Not one, using SQL in the db gives you Scalability for one, Security and a lot easier to use.
If your supporting more than 1 database environment, don't go the stored procedure route because you will need 2 sets of procedures, one for database A and one for database B. This will be hard to maintain and a pain in general. Plus right now just two, what happens when another database is added, now you need 3 sets.
Create a DAL (Interface) that each database layer will implement
How you implement the DAL is up to you.
I have a stored procedure that validates user logins and was wondering whether this was the best approach. The sproc is called from my middle tier and performs some of the following functions depending on whether the user is valid or invalid:
1. checks to see if the user exists in the user table
2. logs the login attempt to a log table
3. logs the user IP to a log table
3. checks to see if the user IP or username is banned
4. checks to see if the user status is valid i.e active user vs not verified etc...
5. updates the user record setting last login date to current datetime
6. returns invalid user or a valid user record including role, status etc...
The sproc performs multiple functions and I only need to make one call from the middle tier into the database. The application is subject to heavy load, 1000's of concurrent users and is transaction intensive.
How could this best be achieved using non sproc methods without incurring multiple calls into the database from the application layer ?
Thanks in advance
Anyone who has worked on enterprise level software would understand the benefits of sps over ad-hoc queries.
1. Using f00's user validation example, would you rather call
"uspValidateUser 'userId'" or would you rather send the whole query each time you need to validate a user? It could a difference of couple of bytes sent to the server or a couple of hundred Kilobytes PER validation.
2. Do you want to expose the query and its inner workings over the network or would you rather send "uspValidateUser 'userId'".
3. You can deny all permissions to a login and just allow the logins to execute certain sps. This simplifies security immensely. If you allow ad-hoc queries in the validation example, you must give that user access to the underlying tables. There may be columns that you do not want the user/developers to see. How would you prevent this? Column level permissions? Give me a break. Using views? This requires additional performance or space penalties depending on whether it's an indexed view or not.
4. If you were to send ad-hoc queries for multi-step transaction processes, such as f00's validation sp, you'd need to use a transaction for the duration of the connection as you process each step with ad-hoc queries. If it is a 5 step process, you are making at least 5 trips. With the sp, you only need a transaction within the sp, which is for the during of the session. This is orders of magnitude better in terms of blocking, performance, security and simplicity.
5. How can people say that managing schema changes are easier with ad-hoc queries than with sps. Let's say there is a change to a table that is accessed by uspValidateUser. If there are many groups( web group, java group, .Net group ) that use uspValidateUser, just changing uspValidateUser would suffice. If all these groups issued ad-hoc queries, each would have to search through their code and update/test.
Also, many schema changes, such as table or column name changes would not affect the developer if sps are used because the changes can be managed at the sp level. If you use ad-hoc queries, it would involve having to update the ad-hoc queries and compiling the app if it is an app.
6. As for migrating to different servers, it's infinitely simpler having the DBAs update the sps and developers update the DAL. This way the developers only have to worry about accessing the new servers because the sps should be the same on the new servers. You don't have to worry about how the underlying objects( tables/columns/functions/views/etc) are implemented.
Having been through stints as a C++ developer of enterprise software and a DBA of enterprise level rdbms, there is no doubt in my mind that having sps as an interface for raw data access provides more flexibility, performance and security than using ad-hoc queries. Also, sps fit nicely into the N-tier development model. Don't say that SQL is "archaic", it makes you sound ignorant. Is assembly language archaic, what about interpretive languages? Is Scheme and other procedural languages archaic? Are markup-language archaic? SQL is a specialized language for a specialized task. The only reason we are having this discussion is solely due to incompetance and ignorance. If you are unable or unwilling to spend the time to learn SQL properly, you shouldn't be a developer ( especially of data-driven software ). If you are like the 75% of "developers" out there who finished an online PHP tutorial and fancy himself an HTML "hacker", then go ahead and keep using those ad-hoc queries to your hearts content.
It seems to me that instead of fighting over one or the other we should be able to come to a common agreement: there are pros and cons to either side and there isn't a simple answer. Each one of us has different goals, viewpoints, experiences, skill sets, likes/dislikes, deadlines, ROI, etc. which can lean it one way or the other. If all the pros were on one side, we wouldn't be having this discussion. Having said that, I think these discussions are good so that we can make informed decisions.
I think the closest we can ever come to a common conclusion is a decision tree, but even then there will be differences of opinion. Here are some ideas to prove my point and get you started. I flattened the tree for this post. I am by no means suggesting I know it all, so bear with me.
1. Are you creating simple throw-away software? Are you sure? Use whatever is fastest to develop (RAD, DataSet, no n-tier/n-layer whatever).
1a. Does the throw-away software run for a long time (batch) and needs to perform better? Consider optimizing your code, using faster classes (no DataSet but rather DataReader, custom class...), less recursive functions, etc.
1b. Will you be working with lots of data where memory could be a problem? Process one piece at a time.
2. Are you creating a LOB app for a small amount of users? Are you sure it won't need to scale soon? Focus more on coding/maintenance and not so much on performance. An ORM, dynamic SQL, would probably work fine.
3. Are you creating a website for a large amount of users? Consider using stored procedures to reduce roundtrips and the amount of data that goes back and forth.
4. Are you creating a website for a VERY large amount of users?
4a. Is it mostly used as read-only (e.g. news site)? Consider using application caching to reduce database load. Consider having one write-only database and replicate to read-only databases.
4b. Is it mostly used for writing? It's easier to scale a web farm than a database, so consider not using stored procedures, triggers, etc.
5. Is there a real chance you will want to switch databases?
5a. Do you need it to perform well? Consider using stored procedures to tune the queries for each database type. This will require more work.
5b. Otherwise, keep everything in a DAL/BLL to reduce dependency on database type.
6. Will you be accessing the business logic from different languages (C#, Java, etc.)?
6a. Do you want to go through the trouble of creating a web service with the pros/cons it has? Would all systems go through the web service? Keep your business logic outside of the DB and in your code.
6b. Do you want to flatten some layers and keep it simple? Put your business logic in the DB.
6c. Will it need to scale a lot (#4)? Reconsider the web service. You may need a web service farm as well.
7. Will the business logic always be accessed from the same language, but several applications, websites, etc.? Keep it in the BLL.
8. Do you have expert DBAs, able/willing to make changes as needed when needed?
8a. Are the DBAs not that expensive and/or you need performance and you're not scaling to extremes (#4)? Consider using the database more.
9. Do you only have programmers who don't understand SQL? Consider getting DBAs or keep more out of the database, use an ORM, etc.
10. Did you decide to not use stored procedures? Do you have a process that requires a lot of back/forth to the DB that could really be optimized with a stored procedure? Consider using a stored procedure anyway in this case.
Note that some questions can go back/override some of the previous "conclusions". I know it's not accurate or complete (could it ever be?) and I'm sure many of you could add and improve on this, but that's not the point. The point is there are a zillion variables and there is no "silver bullet" or "one size fits all". Yes, sometimes one way of doing things is usually better, but there is generally at least one extreme/edge circumstance that overrides "usually better". Other things are closer to middle ground and generally start "religious" wars such as this.
Choose the right tool for the task and good luck with deciding what the right tool is!
Just to clarify, it's a very complex decision and I completely skipped over some parts such as security considerations. :)
Anti-SPROC developers must be writing the equivalent of "See spot run" programs. It's a rare developer indeed that knows what they are talking about when it comes to databases. 100 million record+ systems that have dozens of applications written against them better wall off "programmers" from the database with SPROCS or face a world of woe. Every idiot thinks he can write SQL or worse yet relies on ORM to do the idiot work for them. Do your work and let the people who know do theirs.
This is a great article (at least if you read through all the comments), and as mentioned before, it all depends on the requirements. All in favor of SP's seem to only be on 1 DBMS, and in that case I would propably agree, but what about a retail package which must support more DBMSses because the client already has Oracle or MS SQL Server and doesn't want another DBMS on their network, or isn't willing to pay for a license and wants to resort to MySQL? What if you don't have the 'direct' control of the database and everything has to go through an updateapplication.
I've been reading this article because we are starting up a new projec which must at least be able to support MS-SQL Server, Oracle and MySQL, as it all depends on the client. After reading this whole article I get the impression that SP's aren't the way to go for this kind of project, unless we are willing to invest a lot of time on having to spend updating/maintaining all those SP's.
But as said before, it all depends on what your needs are, and budget/time available..