You may have noticed that my posting frequency has declined over the last three weeks. That's because I've been busy building that Stack Overflow thing we talked about.
It's going well so far. Joel Spolsky also seems to think it's going well, but he's one of the founders so he's clearly biased. For what it's worth, Robert Scoble was enthused about Stack Overflow, though it did not make him cry. Still, I was humbled by the way Robert picked this up so enthusiastically through the community. I hadn't contacted him in any way; I myself only found out about his reaction third hand.
That's not to say everything has been copacetic. One major surprise in the development of Stack Overflow was this recurring and unpredictable gem:
Transaction (Process ID 54) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Deadlocks are a classic computer science problem, often taught to computer science students as the Dining Philosophers puzzle.
Five philosophers sit around a circular table. In front of each philosopher is a large plate of rice. The philosophers alternate their time between eating and thinking. There is one chopstick between each philosopher, to their immediate right and left. In order to eat, a given philosopher needs to use both chopsticks. How can you ensure all the philosophers can eat reliably without starving to death?
Point being, you have two processes that both need access to scarce resources that the other controls, so some sort of locking is in order. Do it wrong, and you have a deadlock -- everyone starves to death. There are lots of scarce resources in a PC or server, but this deadlock is coming from our database, SQL Server 2005.
You can attach the profiler to catch the deadlock event and see the actual commands that are deadlocking. I did that, and found there was always one particular SQL command involved:
UPDATE [Posts] SET [AnswerCount] = @p1, [LastActivityDate] = @p2, [LastActivityUserId] = @p3 WHERE [Id] = @p0
If it detects a deadlock, SQL Server forces one of the deadlocking commands to lose -- specifically the one that uses the least resources. The statement on the losing side varied, but in our case the losing deadlock statement was always a really innocuous database read, like so:
SELECT * FROM [Posts] WHERE [ParentId] = @p0
(Disclaimer: above SQL is simplified for the purpose of this post). This deadlock perplexed me, on a couple levels.
If you aren't eating -- modifying data -- then how can trivial super-fast reads be blocked on rare writes? We've had good results with SQL Server so far, but I found this behavior terribly disappointing. Although these deadlocks were somewhat rare, they still occurred a few times a day, and I'm deeply uncomfortable with errors I don't fully understand. This is the kind of stuff that quite literally keeps me up at night.
I'll freely admit this could be due to some peculiarities in our code (translated: we suck), and reading through some sample SQL traces of subtle deadlock conditions, it's certainly possible. We racked our brains and our code, and couldn't come up with any obvious boneheaded mistakes. While our database is somewhat denormalized, all of our write conditions are relatively rare and hand-optimized to be small and fast. In all honesty, our app is just not all that complex. It ain't rocket surgery.
If you ever have to troubleshoot database deadlocks, you'll inevitably discover the NOLOCK statement. It works like this:
SELECT * FROM [Posts] with (nolock) WHERE [ParentId] = @p0
It isn't just a SQL Server command -- it also applies to Oracle and MySQL. This sets the transaction isolation level to read uncommitted, also known as "dirty reads". It tells the query to use the lowest possible levels of locking.
But is nolock dangerous? Could you end up reading invalid data with read uncommitted on? Yes, in theory. You'll find no shortage of database architecture astronauts who start dropping ACID science on you and all but pull the building fire alarm when you tell them you want to try nolock. It's true: the theory is scary. But here's what I think:
In theory there is no difference between theory and practice. In practice there is.
I would never recommend using nolock as a general "good for what ails you" snake oil fix for any database deadlocking problems you may have. You should try to diagnose the source of the problem first.
But in practice adding nolock to queries that you absolutely know are simple, straightforward read-only affairs never seems to lead to problems. I asked around, and I got advice from a number of people whose opinions and experience I greatly trust and they, to a (wo)man, all told me the same thing: they've never seen any adverse reaction when using nolock. As long as you know what you're doing. One related a story of working with a DBA who told him to add nolock to every query he wrote!
With nolock / read uncommitted / dirty reads, data may be out of date at the time you read it, but it's never wrong or garbled or corrupted in a way that will crash you. And honestly, most of the time, who cares? If your user profile page is a few seconds out of date, how could that possibly matter?
Adding nolock to every single one of our queries wasn't really an option. We added it to all the ones that seemed safe, but our use of LINQ to SQL made it difficult to apply the hint selectively.
I'm no DBA, but it seems to me the root of our problem is that the default SQL Server locking strategy is incredibly pessimistic out of the box:
The database philosophically expects there will be many data conflicts; with multiple sessions all trying to change the same data at the same time and corruption will result. To avoid this, Locks are put in place to guard data integrity ... there are a few instances though, when this pessimistic heavy lock design is more of a negative than a positive benefit, such as applications that have very heavy read activity with light writes.
Wow, very heavy read activity with light writes. What does that remind me of? Hmm. Oh yes, that damn website we're building. Fortunately, there is a mode in SQL Server 2005 designed for exactly this scenario: read committed snapshot:
Snapshots rely on an entirely new data change tracking method ... more than just a slight logical change, it requires the server to handle the data physically differently. Once this new data change tracking method is enabled, it creates a copy, or snapshot of every data change. By reading these snapshots rather than live data at times of contention, Shared Locks are no longer needed on reads, and overall database performance may increase.
I'm a little disappointed that SQL Server treats our silly little web app like it's a banking application. I think it's incredibly telling that a Google search for SQL Server deadlocks returns nearly twice the results of a query for MySql deadlocks. I'm guessing that MySQL, which grew up on web apps, is much less pessimistic out of the box than SQL Server.
I find that deadlocks are difficult to understand and even more difficult to troubleshoot. Fortunately, it's easy enough to fix by setting read committed snapshot on the database for our particular workload. But I can't help thinking our particular database vendor just isn't as optimistic as they perhaps should be.
first post!
Kevin on August 25, 2008 8:06 AMI just went through this on a system. Row level locking for updates may also be worth looking into for your situation as well.
Tom Willis on August 25, 2008 8:08 AMA Google search for ["sql server" tables] returns 216 times as many hits as [mysql tables]. A search for ["sql server" database] returns mroe than 6 times as many hits as [mysql database]. I'm not sure that I would draw any pro-MySQL conclusions from the statistics you present.
mph on August 25, 2008 8:15 AMYou could try caching (if not already) to eliminate some of the database reads, assuming that it's not dynamically changing data.
I've never ran into this sort of issue and I have some database heavy sites... apparently not heavy enough though.
Kris on August 25, 2008 8:17 AMAs mph said retaled to the fact:
"I think it's incredibly telling that a Google search for SQL Server deadlocks returns nearly twice the results of a query for MySql deadlocks"
you will need how many deadlocks entry per user you find at Google, not the overall.
SoMoS on August 25, 2008 8:20 AMGreat post jeff and thanks for the explanation on the blogging frequency; I was becoming a little antsy about it. I understand now. I've seen this exact problem on production systems and I had no idea what was wrong, now I at least have a chance to fix it.
o.s. on August 25, 2008 8:20 AMYou've probably got a highly normalised database and a long running SELECT, which is backing up the UPDATE, which, in turn, is backing up the other random SELECTS.
By the time you get around to looking at the queries, the long running guy is finished, the UPDATE is more important and your subsequent SELECT gets killed.
It's a common DB design problem.
Take a look at the query log and see if you can find the root cause SELECT.
James B. on August 25, 2008 8:26 AMYour drawing shows something like 6 philosophers and a guy that is maybe serving. There is not enough chopstick. I'm afraid but you might not be able to draw or choose good drawing anymore after Sql Server battle :)
I already read article where you where complaining about Sql Server. Do you think Mysql won't be any better ?
And yes i saw that you had fewer time to blog.
good luck
BobMolb on August 25, 2008 8:28 AMWhy not post it as a question on StackOverflow, Jeff?
Kyle on August 25, 2008 8:30 AMA couple of days ago, I watched a talk on the University of Washington public access channel about some research into resource locking. This talk is specifically about resource locking in the context of multi-core processors, but it applies to the whole class of problems just as readily. The solution proposed by this professor (Dan Grossman) is to build the handling of resource locking into the compiler, the same way that garbage collect has been integrated into comilers over the last 15-20 years (do you remember the days before garbage collection? Blech!)
Jeff, I know this doesn't help you guys now, but it is food for thought.
Summary from the UW website:
With multicore processors bringing parallel computing to the masses, there is an urgent need to make concurrent programming easier. Software transactions hold great promise for simplifying shared-memory concurrency, and they have received enormous attention from the research community in the last couple years. This talk will provide an overview of work done at the University of Washington to help bring transactions to the next generation of programming languages.
The program is available for download at http://www.uwtv.org/programs/displayevent.aspx?rID=22341&fID=1471.
Ken Heutmaker on August 25, 2008 8:31 AM@mph - That's hardly fair. MySQL is itself a "SQL server".
ceejayoz on August 25, 2008 8:32 AMIt's always a little disturbing to see a well-known coder ask a dumb question, but come on, database locks? This is very basic stuff! RDBMSs, in particular, are intended to preserve data integrity in addition to enabling fast read/write access to large data collections. Otherwise file systems could be made sufficient.
In your case, since your innocuous SQL statement doesn't grab the parent along with the child posts, you could easily have a difference between the answer count and the # of children actually returned. Of course it's not a huge problem that you should spend sleepless nights on, but it will lead to a bug that is just shy of impossible to reproduce or debug.
mgb on August 25, 2008 8:35 AMWasn't the diner a drive system for some ship in one of the Hitchhiker's Guide to the Galaxy books?
stussy on August 25, 2008 8:39 AMI can't really think of a justification for the out-of-the-box behavior of SQL Server as you described. As far as I know Oracle always uses snapshots for reading, so a simple SELECT can never cause a deadlock.
Come to think of it, why are you using SQL Server at all, and not something free (beer or speech, take your pick)?
Noah Yetter on August 25, 2008 8:39 AM"What about retries? I find it hard to believe that little write would take so incredibly long that a read would have to wait more than a few milliseconds at most."
You're aware, of course, that a deadlock has nothing to do with how long a particular thread is holding a resource, but instead occurs when multiple threads are all waiting on resources that each other are holding.
Gus on August 25, 2008 8:40 AMNolock is probably a reasonable solution in this case, but I'd like to point out that a deadlock can only happen when multiple resources (usually tables) are involved in simultaneous transactions. If you've got a SELECT that operates on tables A and B and an UPDATE that operates on the same ones, then the SELECT will grab a lock on table A, the UPDATE will grab table B, and then both will sit there while they try to get a lock on the other table. Or if you've got a block of statements in a single transaction, you'll get the same effect.
If your app is denormalized to the point where you only need a single table for most queries (as many simple web apps are), you shouldn't see deadlocks. You may want to check out your transactions and make sure they're not doing more than necessary within a transactional block.
Ira on August 25, 2008 8:43 AMceejayoz: So bitch at Jeff, since I modeled my queries on his. If my "sql server" queries are contaminated with MySQL results, then so are his, and my point stands.
mph on August 25, 2008 8:43 AMYou could use CouchDB, which uses optimistic commits and never ever deadlocks.
Just sayin.
Damien Katz on August 25, 2008 8:49 AMThe issue is probably going to come down to design. Is the Id/ParentId a GUID and does the table have a lot of rows? Is that select resemble the real select that is happening? Do you have indexes including a clustered? Are you sure the deadlock is caused by the select? Are you issuing the select and update in different connections?
The idea that MySQL has less deadlock issues then SQL Server based on a google search returning less results does not tell much. SQL Server is used for many more real applications then SQL Server is what it tells me.
Jason on August 25, 2008 8:50 AMHey, Jeff.
If you haven't found the exact reason why your deadlocks are occuring maybe you should try this method:
http://weblogs.sqlteam.com/mladenp/archive/2008/07/18/Immediate-deadlock-notifications-without-changing-existing-code.aspx
this way you don't have to worry about running the profiler or anything else. when the deadlock happens you get the entire deadlock graph saved to a table and get a notification my mail if that's what you wish.
This method has helped me trace a few bastardly deadlocks with great success.
hope it helps.
Mladen on August 25, 2008 8:58 AMIf your database allows you to set the isolation level as serializable, and supports MVCC, you can gain all of the safety you would need for a full banking application without getting any of the locking problems.
If the update that updates the count was still running, as long as you were careful to have the actual update of the comment in the same transaction, it would always be in sync, the comment just would not appear until the count had been updated.
A better question, however (Cobb would be proud) is why in the world are you storing the count of the replies on the post record??? The last activity and count are easily calculated using aggregate SQL functions.
There are, in effect, two issues.
1) Your database either cannot support concurrent use well, or it is not configured correctly.
2) You schema design looks to have some issues.
I'm no DBA, but this feels like a hack and hacks usually lead to headaches later on.
Hoffmann on August 25, 2008 9:00 AMI'm a big fan of using "with (nolock)" hint by default.
That doesn't mean that all queries should be written with nolock, but most of SQL queries should.
Especially for web app.
In some situations you want to be sure that your data is consistent no matter what. In this case you should use different locking hints. Which ones -- would heavily depend on the situation.
You also need to clearly understand what exactly locking does.
The locking model is relatively simple, the problem is that most of locking tutorials are just terrible.
I wrote the tutorial about how to learn about SQL Server Locking:
http://developmenttips.blogspot.com/2008/08/cure-for-deadlocked-learning-to-use.html
I recommend this episode of software engineering radio:
http://se-radio.net/podcast/2008-05/episode-99-transactions
"This episode takes a close look at transactions from different angles, starting with their fundamental properties of Atomicity, Consistency, Isolation, Durability but also investigating advanced topics like distributed or business transactions."
> If you've got a SELECT that operates on tables A and B and an UPDATE that operates on the same ones, then the SELECT will grab a lock on table A, the UPDATE will grab table B, and then both will sit there while they try to get a lock on the other table. Or if you've got a block of statements in a single transaction, you'll get the same effect.
I get that, but I don't understand why simple read SELECTS need to lock *at all*. I mean, maybe if that database was storing my bank account balance, in which case transactional orders are hugely important.
I guess what I'm saying is that Starbucks Doesn't Use Two-Phase Commit
http://www.eaipatterns.com/ramblings/18_starbucks.html
Jeff Atwood on August 25, 2008 9:06 AMThere are a couple of things you might want to look into:
- First of all SQL Server locks pages by default (not rows) which makes deadlocks more likely if the offending rows lie on the same page
- One aproach that migh help, is actually follow the advice from the error message, and retry the transaction.
1) That's ok to store number of replies for every post.
But instead of storing it into Post table, store number of replies to PostCommentCount table.
(PostId uniqueidentifier, CommentCount int)
One record in Post table would match with one or zero records in PostCommentCount table.
2) Another problem with locks in SQL Server is that typically SQL Server locks not just one record, but whole Page or even whole table.
P.S.: I'm really missing ability to edit my comments (or at least delete them so I can replace them with corrected ones).
> is actually follow the advice from the error message, and retry the transaction.
OK, but why doesn't SQL Server re-issue the transaction on my behalf, then? Automatic retries of some kind?
Jeff Atwood on August 25, 2008 9:15 AMJeff:
I agree with Hoffmann, this sounds like a hack. I deployed dozens of enterprise level .NET/SQL and Classic ASP/SQL applications and have not come across this issue.
It sounds like a transaction is being created on the call to the database, either by LINQ or in code, and it not being released. I would pose the question to Scott Guthrie, I bet he can point to one of the LINQ developers that will shed more light on this subject.
Josh.
Josh Hurley on August 25, 2008 9:15 AMI think you misunderstand what a deadlock really is. A deadlock is not a lock that is taking a long time. It is a lock that will never be released until one of the two processes is killed. Here's what happens in a deadlock:
1. Query 1 begins by locking resource A
2. Query 2 begins by locking resource B
3. Query 1, in order to continue, needs a lock on resource B, but Query 2 is locking that resource, so Query 1 starts waiting for it to release
4. In the meantime, Query 2 tries to finish, but it needs a lock on resource A in order to finish, but it can't get that because Query 1 has the lock on that.
So, you have two queries fighting over SEPARATE objects, not fighting over the same object (in which case one query simply has to wait). That's the important part - these queries will *never* finish. This is quite different from something that is simply taking a long time. There is no solution to the deadlock race condition other than to kill one of the processes.
For this reason, I would suggest focusing only on queries which involve multiple rows or multiple tables. Single-row, single-table queries, such as the one in your example, can never be deadlocked. They can be waiting on a locked resource which is involved in a deadlock condition involved two further queries, but in this case, killing the single-lock query still won't fix the problem.
I do not advise using the NOLOCK hint, but you are correct that it can't really hurt anything on website reads. Someone will see out of date information, but in most cases, that should be ok.
Jasmine on August 25, 2008 9:16 AMSQL Server does not re-issue your command because the command itself could be dependent on the data changes that were being made by the query that wasn't killed. This gives the client application a chance to run the transaction again, including any reads which may need to be done to data that was changed and is now in a consistent state.
Jasmine on August 25, 2008 9:20 AMSQL Server 2005 sucks but you only find out after you've got several millions records and it's too late to migrate off. I've lost all faith in SQL Server as an "enterprise" DB when performance started going in the toilets with heavy reads despite the fact that the behemoth it's running on has almost no disk and CPU activity.
Tae Kim on August 25, 2008 9:20 AMSQL Server 2005 sucks but you only find out after you've got several millions records and it's too late to migrate off. I've lost all faith in SQL Server as an "enterprise" DB when performance started going in the toilets with heavy reads despite the fact that the behemoth it's running on has almost no disk and CPU activity.
Tae Kim on August 25, 2008 9:20 AMThe Stack Overflow beta started already?! How did I miss it? I hope I can still get in. *hurries to send a sign up request*
TM on August 25, 2008 9:22 AM>I get that, but I don't understand why simple read SELECTS need to lock *at all*. I mean, maybe if that database was storing my bank account balance, in which case transactional orders are hugely important.
The description of the pessimistic behavior and the description of 'read committed snapshot' make the latter sound like multi-version concurrency. Near the beginning of the post, when you said that a SELECT was part of the problem, I was thinking, "Why? If it was using MVC then a SELECT wouldn't even lock."
Then you got to the part about snapshots. It seems SQL Server (which I do not use*) can do MVC, but not by default.
* I work with MySQL/PostgreSQL/Firebird.
Joel F on August 25, 2008 9:23 AMBeen there, done that. Anyone using SQL Server in a large system will run into this. As a rule, when we select from a table we use "with (nolock)". Here's another mostly unknown Gem you should know about:
[QUERY] with MAXDOP(4) -- use a max of 4 processors
The command above tells SQLServer how many max processors it can use on a query. We ran into a query that would run MUCH MUCH faster on 1 processor than multiple. Our 8 core superserver was trying to be too smart with the query I guess.
Maximum Degree Of Parallelism (MAXDOP):
http://www.sqlmag.com/Articles/ArticleID/97044/97044.html?Ad=1
I agree with Jasmine. To add to it, because you don't show the comments in coversation style but rather order of 'votes' the appearance of out of date will be minimal as well.
osp70 on August 25, 2008 9:26 AMIt should always be okay to be viewing out of date info on the web shouldn't it? That's the nature of a web page - it's as stale as the last refresh. This web page may have had new comments added since I started viewing it two minutes ago.. what's the point in using fancy locking in that case.
>Wasn't the diner a drive system for some ship in one of the Hitchhiker's Guide to the Galaxy books?
Yup. It was from Douglas Adam's _Life, The Universe, and Everything_:
Bistromathics:
"The most powerful computational force known to parascience. Bistromathics is a way of understanding the behavior of numbers. Just as Einstein observed that space was not an absolute, but depended on the observer's movement in time, so it was realized that numbers are not absolute, but depend on the observer's movement in restaurants.
The first nonabsolute number is the number of people for whom the table is reserved. This will vary during the course of the first three telephone calls to the restaurant, and then bear no apparent relation to the number of people who actually turn up, or to the number of people who subsequently join them after the show/match/party/gig, or to the number of people who leave when they see who else has turned up.
The second nonabsolute number is the given time of arrival, which is now known to be one of those most bizarre of mathematical concepts, a recipriversexclusion, a number whose existence can only be defined as being anything other than itself. In other words, the given time or arrival is the one moment of time at which it is impossible that any member of the party will arrive. Recipriversexclusions now play a vital part in many branches of math, including statistics and accountancy and also form the basic equations used to engineer the Somebody Else's Problem field.
The third and most mysterious piece of nonabsoluteness of all lies in the relationship between the number of items on the check, the cost of each item, the number of people at the table and what they are each prepared to pay for. (The number of people who have actually brought any money is only a subphenomenon in this field.)
Numbers written on restaurant checks within the confines of restaurants do not follow the same mathematical laws as numbers written on any other pieces of paper in any other parts of the Universe.."
Steve on August 25, 2008 9:29 AMI see JC already suggested this, but I have worked on two financial apps using SQL Server and both have had issues with deadlocks (and yes, we tried all the usual deadlock analysis. We arrived at two approaches:
1. NOLOCK hint. As Jeff mentions, this is rarely as bad as people make it out to be. Gee, so the data is 500 ms out of date. Tough. There are only a few cases where this actually matters.
2. Put in your own retry logic (detect the deadlock error and resubmit the query). This may feel like a hack, but guess what, it works. And would you rather have this "hack" or have your users see stupid database errors. If you have centralized database access methods, it is fairly simple to add retry logic.
You can't "fix" SQL Server, so you just have to deal with the real world here and do whatever it takes to make sure your users don't experience database errors.
J Liles on August 25, 2008 9:29 AMJeff, before you post on things like SQL Server locking you should do your homework beyond Google and blogs. Books still have a place, you know.
http://www.insidesqlserver.com/
My guess, without having access to your logs and such, is that your deadlocked transaction (the one with the SELECT) is probably part of a longer outer transaction that acquired a shared lock on the rows that are being UPDATEed by the other one.
Setting your database transactions to snapshot isolation mode can still result in problems, since it will raise an error if you try to update rows that have changed since the beginning of your transaction (version mismatch) and also takes out X locks on writes, so you can still get deadlocks.
The correct way to fix this problem beyond using snapshot isolation mode (which will help greatly, because it removes the use of shared locks) is to a) minimize the length of your transactions to minimize the number of locks and b) always set up your web server to re-run requests in the case of a update conflict or deadlock.
We've built a large service (Earth Class Mail) pretty much exclusively on LINQ to SQL. It's worked great, except that we're still battling all the transaction issues almost a year later (deadlocks, leaky transactions, etc). Make sure you FULLY understand the intersection between TransactionScope (if you're using it) and SqlTransaction. We didn't, and we're still paying for it (with daily deadlocks just like what you describe). Make especially sure you're aware how Read Committed Snapshot (and any other tx isolation level) can "leak" into your connection pool, otherwise, you'll end up running stuff at the wrong isolation level (often causing more deadlocks and "how the he*l did that happen- we're not even in a transaction!" moments. We ended up building a bunch of infrastructure into our own DataContext-extended base class to deal with it, and to ensure that the connection pool stays free of connections stuck in transactions.
Good luck!
Matt Davis on August 25, 2008 9:30 AM>OK, but why doesn't SQL Server re-issue the transaction on my behalf, then? Automatic retries of some kind?
Think of the banking application... you wouldn't want to re-issue anything if you know something slipped... it IS better that SQL server team be pessimistic by default.
Double K on August 25, 2008 9:34 AMI believe what we are seeing here is the result of using ORM magic and now the resulting pain of trying to debug the SQL the ORM is doing.
I've never encountered a deadlock situation, but I use my own handwritten sprocs, so I know exactly what the DB is doing at all times.
Jeff "I get that, but I don't understand why simple read SELECTS need to lock *at all*. I mean, maybe if that database was storing my bank account balance, in which case transactional orders are hugely important."
Because the selected data could be usefull later during the transaction. It's why you don't use transaction if you don't care about your selected data.
Arkh on August 25, 2008 9:38 AMI'm sure you could set all of your SELECT to read uncommitted by default, but -- really -- who cares?
If you have something mission-critical where you can't afford a dirty-read, isolation is good. For the rest of us, NOLOCKS are wonderful as long as you know when to apply them.
Playing it safe by default is fine. Forces you to actually think about what you're doing.
--Kevin
Kevin Fairchild on August 25, 2008 9:39 AMThe reason most people recommend accepting dirty read (with nolock) is because you get better performance, have not seen if this has changed with ms-sql 2008. However don't use it when the results really matter for instance if you are transferring money between two rows and you do a dirty read in the middle you could get the money counted twice once in the original row and also in the row it is being transferred into.
As for Oracle by default it only reads committed information at the time the statement started running
will on August 25, 2008 9:39 AMRe. Row level locks on UPDATE statements. I can demonstrate a scenario where this consistently and reliably doesn't work - it is a very small table, and the WHOLE table ALWAYS gets locked. Microsoft's semi-official response was (paraphrased) "row locking on update is a hint to the server, not an instruction."
Syd on August 25, 2008 9:43 AM> OK, but why doesn't SQL Server re-issue the transaction on my behalf, then? Automatic retries of some kind?
Jeff, would you expect any other API to retry a command if it failed? It would seem like this is a cleanly written API. I think of SQL as an API to SQL Server (or any other RDBMS). It should be up to the client to handle the error.
Jason Jackson on August 25, 2008 9:45 AMI switched to using "(nolock)" years ago -- I would say that it is the right choice at least 90% of the time.
One should build one's application correct first, then optimize. The database's default behavior is "paranoid locking for banking correctness "; this is the sane default. In a web messaging app, slightly stale data is perfectly acceptable, but for banking stale data is a fatal error.
Bob Whiteman on August 25, 2008 10:01 AMJust thinking of DB deadlock (Oracle background), the only time I've seen them is when there are multiple statements in a transaction.
Like..
begin tx
insert
update
commit
I've not worked sufficiently with SQL Server or AutoCommit to know if this could happen.
Just something to think about.
Greg on August 25, 2008 10:05 AMI've got a couple quick thoughts flying around in my head...
> "OK, but why doesn't SQL Server re-issue the transaction on my behalf, then?"
I can think of many ways to argue this point, but really, it comes down to practice... If you know the problem and solution, why not just implement the code to test for that condition and re-submit the batch when it occurs? It's not rocket science... in fact, it's good practice for any data access layer.
Instead of relying on a whole database change or dealing with (nolock) on lots of tables, have you considered using SET TRANSACTION ISOLATION LEVEL on the problematic queries? For example, SET TRANSACTION ISOLATION LEVEL SNAPSHOT: "Except when a database is being recovered, SNAPSHOT transactions do not request locks when reading data. SNAPSHOT transactions reading data do not block other transactions from writing data. Transactions writing data do not block SNAPSHOT transactions from reading data."
Maybe not applicable in your specific case, but just for completeness, I want to mention using SET DEADLOCK_PRIORITY in some instances... maybe where the read is killing the write.
There's a lot of things you can do to troubleshoot the actual cause of the deadlock, but I really hate to see someone configure a whole database setting just because they don't understand the problem, but rather want to assuage the symptoms.
Taylor Gerring on August 25, 2008 10:07 AMWhat is surprising is that SQL Server thinks there is a *deadlock*. It doesn't wait for the writer to finish and then just do the read? You're not even referencing other tables in your UPDATE or doing an INSERT SELECT.
Anyhow, the ANSI standard way to do your "nolock" thing is is to set the transaction isolation level, either globally or at the time that the transaction starts. (It sounds like SQL Server defaults to SERIALIZABLE for its transactions? That's a very strange default.)
-Max
Max Kanat-Alexander on August 25, 2008 10:07 AMJeff,
I would recommend not to use LINQ if you really want to make your site work.
LINQ is raw and not very reliable technology.
Unfortunately not everything that ScottGu promotes is good enough for production use.
ASP.NET 2.0 was really good. LINQ, Silverlight, and Atlas -- not so much.
Jeff, tell us about any foreign keys you have connected to the page table. A lot of times a deadlock can be caused by SQL server attempting to reconcile an FK.
Nolock is the ON ERROR RESUME NEXT of SQL. It's not a good practice to get into unless you understand fully why you're doing it.
Are your primary keys Autonums? If so, take a look at how SQL Server treats locking at a page level instead of a row level. If you have a page that's id #10 and id #11 is being heavily updated, it's possible that the updates to id #11 is locking the entire data page. Maybe you should change field names or something and post the entire SQL here. I get the feeling something is being abstracted out. You should only be seeing a deadlock situation in the real world maybe one out of a million transactions. Otherwise, there's a code problem in there somewhere.
http://msdn.microsoft.com/en-us/library/ms173763.aspx
lix on August 25, 2008 10:17 AM"It ain't rocket surgery"
Nice pericombobulation there :)
Mark on August 25, 2008 10:19 AMI totally agree with Justin on the parallelism business. After turning it off, the performance more than doubled in my system.
In addition, parallelism can also lead to a phenomena known as CPU starvation. It works like this, let's say you have different 20 queries in the queue and an 8 CPU box. Typically many of them would execute simultaneously since they all hit different tables. However, with parallelism, SQL Server breaks up the first query into 7 pieces and sends them off to 7 different CPUs. Now most CPUs are busy and the ones that completed processing are waiting for the results from other CPUs to trickle in.
Meanwhile, the queries in the queue, are sitting around waiting. Then, if the queue wait timeout is hit, SQL Server dumps the queries into tempdb, to be retrieved milliseconds later, when CPUs are available. So all of a sudden, your system is going to hell because of all the writes to tempdb.
As far as the deadlocks are concerned, I'd throw several traceflags on the system - they will tell you exactly why (and what keys, indexes, etc... are involved) your deadlocks are happening.
Try this for starters,
http://support.microsoft.com/kb/832524
then if you need some more help, contact me via the blog.
Regards
Robert
Weird problem I have never experienced it especially as you are saying a few times a day. I might have missed it if it was something once a month but not a few times a day.
Are you as I have noticed in your previous posts using lots of dynamic sql as oppose to stored procedures?
Even for a quick select or update statement the query plan compilation can take time on a busy server, perhaps this is causing record locks to last longer or even lock more than necessary.
pete on August 25, 2008 10:32 AMOh, you're using SQL-Server 2005...they fixed that in 2008....
PS: don't you hate it when some smart-alleck comes in and say, "Oh, they fixed that in the new version that you may never transition to for the next two years' rather than saying something constructive?...I do.
PPS: I don't know anything about 2008....there don't you feel better? 80) Happy Arbor Day (as no trees were killed in the production of this tripe...now, cattle?...that's a different story....I kill them like the plague.
PPPS: Love the show.
Keng on August 25, 2008 10:34 AMThe only reason you should be getting a deadlock is if you are waiting on someone who is in turn waiting on you. You have to ask yourself two questions: why, exactly, are they waiting on you? and why, exactly, are you waiting on them?
If your read transaction is waiting on some other transaction's write and you deadlock, that means they are waiting on you. Usually because of your own write lock. Do you really need to hang on to that lock (aka, why not just commit what you have)?
Jim McCusker on August 25, 2008 10:34 AMOMG! i was shocked when reading this : "How can a read be blocked by a write? What possible contention could there be from merely reading the data?"
like in one comment : "It's always a little disturbing to see a well-known coder ask a dumb question"
and another : "I think you misunderstand what a deadlock really is. A deadlock is not a lock that is taking a long time. It is a lock that will never be released until one of the two processes is killed."
i totally agree...
i've streched my head enough to know that threads / synchronisation = pain in the ass
...but that is basic stuff, you deceive me Jeff :-(
if you need to read stuff coming from a probe, maybe you doesn't care if the data has been written into your buffer when you read it to directly display it...[but it's bad to do that way]
Usually : you don't want to read unvalid stuff.
The reader needs to wait before the written has done this job, else you are going to read old/invalid data.... !!
---
never heard of NOLOCK option before, seems another ??!!? stupid stuff from M. $ql $erver
TiTi on August 25, 2008 10:35 AMYou really need to make sure you understand what "read committed snapshot" does and if you should apply it to your application. I have put together this quick informational post on "read committed snapshot" to help non-DBA types to understand the process.
http://www.coderjournal.com/2008/08/deadlocked-read-committed-snapshot-explained/
It can cause breaking changes in your application if you apply this with out knowing what the effects first. Your application might rely on row level locking for specific business rules. So be careful.
Nick Berardi on August 25, 2008 10:36 AMThe last place I worked had a strict "nolock on every query" policy.
Steve on August 25, 2008 10:37 AMI find that the fact that most databases treat everything like a banking system seems to be a problem that affects performance greatly. The percentage of applications that require that level of transactional stability is extremely small. If you remove the locking, and assume that reading old data is not detrimental, then you can speed things up quite a bit.
Eric Kibbee on August 25, 2008 10:41 AMThis is all well and good, but I would refer you to the following post (you may not be familiar with the blogger, but he makes a good point):
http://www.codinghorror.com/blog/archives/001079.html
If you have a simple bit of code with relatively low usage, and you're getting deadlock, then I'd be very concerned that something very bad is going on under the surface. Closing your eyes, applying a bandaid and hoping that the underlying issue goes away is the road to lots of future pain.
Daniel Blumenthal on August 25, 2008 10:43 AMWhy not just cache some of the reads into memcached? should minimize stress on the DB and thus make deadlocks less likely.
Andy on August 25, 2008 10:47 AMTo quote:
Part 1:
UPDATE [Posts]
SET [AnswerCount] = @p1, [LastActivityDate] = @p2, [LastActivityUserId] = @p3
WHERE [Id] = @p0
Part 2:
SELECT *
FROM [Posts]
WHERE [ParentId] = @p0
And, the problem is that you are seeing a deadlock...
First, we must consider WHY this is the case... Note that a SINGLE lock can't deadlock, unless the resource is never returned. Neither of these SQL statements qualifies as an "infinite loop". So, we must presume at least two resources. Note that, equally, we cannot assume NO locks, as that cannot deadlock either.
A possible answer to this is that there are separate READ and WRITE locks (there must be at least two locks). Any number of READ locks can be held by readers of a table. A WRITE lock is then exclusive (The WRITE lock can be obtained if there are no READERs). This would make sense; it is a "normal" design pattern.
This gives us the necessary two resources. Now, the READ resource should be acquired by statement 2. The WRITE resource should be acquired by statement 1. Aquiring WRITE of course implies ownership of READ as well. Note that the second statement is an UPDATE, which will of course use READ -- but it isn't needed because the WRITE resource implies the READ resource. Indeed, it would deadlock if this were not the case.
If the two resources (READ and WRITE) were not acquired atomically (relative to each other),then there may be a subtle race condition within the SQL service itself. Of course, this implies that the resources are implemented as two separate resources.
Another issue could be that the SQL optimizer could be "mis-compiling" the request and issuing a sequence like aquire(WRITE) ... do something ... aquire(READ) ... which would also lead to the deadlock.
If the optimizer ASSUMED that no updates where going on, it may begin with aquisition of READ, and begin the transaction. Later, it may determine a result set, and attempt to aquire WRITE -- which of course would fail (this would be a nasty bug indeed). Of course, it may assume that IT owns READ, and thus attempting to aquire WRITE is ok (and this naive optimization would work, AS LONG AS THERE ARE NOT CONCURRENT READERS).
Actually, there may be other causes to this. But, (and this is important) - if the problem is as a result of select/update, it is probably a defect within the database itself.
Indeed, there is NO direct solution to this deadlock, under the circumstances you have outlined. There is, however, a work-around.
Ensure that the database is NEVER UPDATEd if concurrent reads could be going on. Writes are fine, Deletes are fine, but ANY UPDATE would be suspect with this database library.
Alternatively (and my STRONG recommendation is to immediately get rid of the database), use explicit locking.
Seems to me that anyone doing any kind of database work for more than a few minutes has come across this deadlock situation. Locks on a database table are never taken because of a write, they are always taken because of a read; reads are what lock database table rows. Again, this is a fundamental concept of database design that I would expect a seasoned developer to know.
MS SQL Server's default isolation level (which controls the locking scheme) is SERIALIZABLE. Which means every read locks the rows that are read until the transaction commits. For reference, MySQL's default isolation level is REPEATABLE-READ, one down from SERIALIZABLE, but would also cause the deadlock situation described in this article.
Since nolock hints in MS SQL Server can be ignored by the query optimizer, it is advisable to change the isolation level instead, which can usually be done either in clint code (the isolation level can be set in the TransactionScope object in .net), in a stored procedure, or often at the connection object level. In this scenario, the READ UNCOMMITTED isolation level is perfectly acceptable since, as stated, no one really cares if the post count of a particular user is off by one or so (and the data is not used in a subsequent transaction, which is KEY) and will save you some memory on the database server (in contrast to READ COMMITTED SNAPSHOT). In a banking scenario, however, I'm sure the bank (but not the customers) would be somewhat perturbed if a customer made two quick money transfers between two of his accounts and suddenly had double the amount of money he started out with.
Lastly, this is why database transactions should be deliberately designed rather than be the bastard step-child of the code's business layer use-cases.
Robert C. Barth on August 25, 2008 10:56 AMIn my previous comment, I must ammend my first paragraph such that it is in the context of the deadlock situation given in the post. The update of course attempts to lock the row which is what eventually manifests the deadlock situation, however, it is not the root cause of the problem. The root cause is the locks aquired during the initial read and a subsequent read of the identical rows, with the intervening update waiting on the initial read's lock to release. Since the data read in the read situations is not used in a subsequent database transaction to update any data, it is perfectly acceptable to permit dirty reads in that situation.
Robert C. Barth on August 25, 2008 11:04 AMJeff,
I understand your concerns - you may be able to tolerate slightly older data,because most likely you only have ONE instance of your webapp that sees the old data.
Problems will occur if suppose you have multiple instances of your webapp, where you can run into a problem if your multiple instances see different data [or versions of the data] from the database. Then you run into data consistency issues.
When you have multiple application servers, you could run into a problem where at a given time some
I can tell that web applications use the "DB as shared memory", that they rely on the fact that once an update is done
You have to pick a locking strategy for your applications.
kashif on August 25, 2008 11:05 AMOf course, as much as everyone likes to talk about banking transactions, the reality is that banks heavily optimize their systems to not have these problems most of the time anyway. The majority of deposits and withdrawals in banking systems do not happen in real time. Most of the transactions begin at midnight, with deposits running first for a given account, followed by withdrawals. You have problematic categories like deposits that come from another account within the bank, but those are probably still handled as seperate transactions with the appropriate penalties if someone doesn't have the funds to cover their withdrawals after their deposits are processed.
In other words, in real time your bank statements don't do a whole lot anyway, and most people aren't checking their account balance at midnight.
Of course, investment banking and similar ventures are different, but even they have long periods of inactivity in a given 24-hour period that permit them to do large batch processes that have fewer chances of locking rows and tables in a database.
Vizeroth on August 25, 2008 11:05 AMSQL Server isn't acting up here, it's just doing what LINQ To SQL told it to do. It sounds like READ COMMITTED SNAPSHOT will solve the deadlock issue, but it also sounds like it's a pretty inefficient workaround because it's making copies of all data as it's written.
I'd recommend looking at the LINQ To SQL usage. As I understand it, you're using a static DataContext for the controller. From what I've read (and in my experience), it's better to create a new DataContext for each data operation. That may sound inefficient, but it sounds a lot more efficient to me than using a database setting which alters the way data is physically stored to create multiple snapshots of the data as it's written.
Jon Galloway on August 25, 2008 11:06 AMIs this post for all those who dabble in code without ever havign set foot in a CompSci class?
Is this the *real* audience for stackoverflow?
If you are getting deadlocks in your app, most likely cause is poor application design. Try queuing requests, get with mutexes, ... it has been done before.
BugFree on August 25, 2008 11:09 AMLiar! Posting has declined because you are too busy twittering! You my friend have become a "twitter whore".
BLACK HAL0 on August 25, 2008 11:12 AMWell, MySQL was certainly much more optimistic, to the point of not handling transactions at all. That has changed a while back (a very recent while, IMHO). Whether they do optimize for lots of reads and few writes is another story.
And speaking of stories, normalized databases are optimized for heavy reads, low writes. Is it that practical experience "teach" people not to normalize their databases? Mmmmm.
As for deadlocks, the write lock can't be taken until all shared locks are released... or made to release. Emphasis on all, not on the time each one takes.
> If you've got a SELECT that operates on tables A and B and an UPDATE
> that operates on the same ones, then the SELECT will grab a lock on
> table A, the UPDATE will grab table B, and then both will sit there
> while they try to get a lock on the other table.
The database should be smart enough to handle that deadlock. It should just execute the first incoming query first and then the other. It may be complicated to implement (or they would have already), but i should be possible.
N on August 25, 2008 11:13 AMWhy do read locks by default? So people that don't understand the issues with dirty reads don't sit around for hours trying to figure out why they get inconsistent results.
In the examples given I wonder why there would be a common set of rows returned?
Al on August 25, 2008 11:14 AMWhere is the value of the [AnswerCount] coming from?
Josh on August 25, 2008 11:17 AM"* I work with MySQL/PostgreSQL/Firebird.
Joel F"
is the only mention of Postgresql from all these entries. I would like to know why no one is using it. I find it much easier than SQLServer, its free, and it an awesome optimizing tool.
forgot the word has between "an awesome"
Mark on August 25, 2008 11:18 AMThere is more than just a dirty read issue when using NOLOCK.
Rare, but you can actually get 2 records instead of one if your NOLOCK'd query executes during a page split operation.
DB2 also has an unconditional read (WITH UR) that you can add to the end of your SQL.
slapout on August 25, 2008 11:21 AMAs we know from reading this blog for many years:
1. SQL Server is not broken
2. You, the developer, have caused this problem
I love the comments on this one - assumptions, conjecture, talk about unrelated things. A deadlock is an extremely rare and specific condition. It is not caused by inadequate hardware, or even 'over-adequate' hardware. I seriously doubt it could be cause by LINQ either.
There is a very specific situation that causes this, and I think the wrong queries have been focused on here. I can't imaging the select statement causing any kind of deadlock condition, because it only locks one row at the least, and one table at the worst. There has to be two locks on separate objects in both transactions (total of 4 locks - 2 complete and 2 waiting), for a deadlock to occur. SQL Server does not falsely report deadlocks. If it says there is a deadlock, there is. I love the way developers always take these discussions into unrelated territory - there's a blog for you Jeff.
Jasmine on August 25, 2008 11:22 AMUse Memcache to avoid doing db reads (often), which should drastically reduce the possibiliy of getting deadlocks, and you're app will run much quicker.
meow on August 25, 2008 11:30 AMI'm confused. NOLOCK is the default for select operations. I'd read this before and confirmed it now.
Brad on August 25, 2008 11:30 AM> "I get that, but I don't understand why simple read SELECTS need to lock *at all*."
The classic example is the store clerk who uses the database to check his inventory to make sure he has a particular widget in stock before making a sale. He checks and sees one in stock, so he goes ahead and makes the sale. While he is entering his order details, another salesman finalizes an order for the last item. Now he can't fill his order in time.
That explains the basic problem, but it doesn't quite satisfy in this case. After all, the example implies an "inventory transaction" of some kind, and you aren't specifically including any transacations.
An example closer to your particular app would be the sql required to display a particular question page. I'm not in the beta so I can't see what it looks like, but from the description you must have some code to get a list of responses associated with a particular question for inclusion in the page. Not only that, but updating a question will _always_ cause this code to run at nearly the same time as the update, so that the new page can be shown to the user.
Now let's say part of the retrieval code joins in another table. If it joins anything at all, it needs to do some locking for both tables in the join to make sure the join is composed correctly. Otherwise something might change while the join is in progress that could change things. And if you kick off your update in a separate thread to avoid bogging down your display code, the chances of them needing the same data at the same time are actually pretty high.
Joel Coehoorn on August 25, 2008 11:30 AM> I guess what I'm saying is that Starbucks Doesn't Use Two-Phase Commit
..and Ebay doesn't use transactions
http://www.infoq.com/news/2008/06/interview-shoup-ebay
> OK, but why doesn't SQL Server re-issue the transaction on my behalf, then? Automatic retries of some kind?
Because the transaction happens on the code, not on the database. For example:
begin transaction
x := select availability from inventory where item_id = XYZZY
y := select sum(quantity) from sale where item_id = XYZZY and sale_id = XYZZY
x := x-y
update inventory
set availability := x
where item_id = Y
end transaction
Ok, so you got through the first read and then deadlocked in the second. How can the database retry it for you? You already got the first read inside "x". The database might recalculate that value, but it can't tell your code to update x.
That's why the code has to retry the transaction.
That's also the why the read causes a lock. Suppose you read the availability, and then someone else tries to update it. If you hadn't locked it in first place, how would you know, at the time you issue the update, whether anyone changed availability or not?
It becomes even more important if the database is not properly normalized, but as my example got big and confusing, I decided to leave it at that. :-)
As it turns out, the database you are using can create deadlocks all by
itself, at any old time (well, there are reasons, but it's complicated).
The only thing that can be done is to wrap each sql series into a
sequence that retries the problem -- this is just a manifestation
of the class "pc-losering" issue.
It is likely that if it is really a read-only table, it won't deadlock,
pending implementation details of the sql library. Throw an UPDATE
into the mix -- you may well be screwed with a deadlock.
The implication is that ALL access (read, write, or update) needs to
wrapped with retry code.
So, go and wrap, be happy. Of course, if you run into a REAL deadlock,
things will get a bit ugly.
SQL Server is notoriously nasty about locking on reads. I run into these problems all of the time. NOLOCK can certainly help. Adding an index (or two) will usually resolve these issues. What often happens on a SELECT if it doesn't get an index hit SQL Server will do a table scan. Table scans (in addition to being slow) may require a table lock. So you get one user thread accessing a table at a time. Not fun.
craig on August 25, 2008 11:57 AMSadly enough, admitting you aren't a database expert draws the ire of DBAs everywhere. Be careful in admitting your own problems online, Jeff.
Hope this isn't what we can expect from StackOverflow. I really love the idea of presenting a problem and watching it followed by a flurry of insults and criticisms.
HB on August 25, 2008 11:57 AMLong, long ago, I think back in 2002, I had a similar problem with SQL Server. I was also bewildered that SQL Server, which prominently bragged about "row-level locking", could have such common deadlock problems.
What I eventually discovered is that although the *table* data was row-level locked, the *index* data was page-level. Since I was usually inserting sequential key data and reads tended to be clustered on recent inserts, that meant that reads were frequently being locked out of the *index* due to nearby inserts updating the index. MS had no useful advice at the time on how to resolve this; all they could say was to "minimize index access". I never could figure out how to reduce deadlock to an acceptable level while keeping the indices I needed to maintain query performance.
So I switched to Firebird and the problem went away.
"It's always a little disturbing to see a well-known coder ask a dumb question, but come on, database locks?"
Wrong. Dead wrong and incredibly dangerous. And egotistical. If I were interviewing you I would immediately flip the bozo bit and thank you for your time.
Its impossible to know everything. The hallmark of a good programmer is not what they know but their ability to learn what they dont. If you do not promote an environment where any question can be asked, no matter how naive and trivial, problems become intractable because people are too afraid to ask for help. Just because someone has been in the industry for a while doesnt mean they know everything.
It's actually possible for two SELECT statements to deadlock in SQL Server (at least in SQL Server 2000), if an index on the table is involved. No writes required!
Deadlocks are an almost unavoidable fact of life in any large database application. If NOLOCK works for you, go with it, but you should understand when you can and can't use it safely, and it sounds like you mostly do.
Matt Wigdahl on August 25, 2008 12:15 PMAre we all only web developers here?
Back in the day when we all wrote client apps, one common them (before disconnected recordsets) was that if you saw the data on your screen then it hadn't changed.
i.e. You "locked" the data when you loaded that set of data into your view, and then released it when you clicked Next or Update or whatever.
@Chris - Amen to that...
HB on August 25, 2008 12:19 PMToo bad that you don't seem to notice why a SELECT starts a transaction. A transacted READ makes sure you don't have dirty reads. Your example about out-of-date data is one which isn't that important as this is the same as caching.
What IF you have a rule in a table that is like an order column that containt values for a set that are unique. For example a table that contains multiple menulists but each listitem is unique per list. Doing a dirty read on that table *could* cause multiple items having the same order value as an update *could* have updated the data during reading.
This of course depends very much on the type of data you have, how (de)normalized the data is and how you manage transactions with methods like optimistic concurrency control.
The same when you READ and WRITE. For example sum all order lines, add VAT and store it at the order for quick read. You do want to make sure that no order lines are altered during this read.
When you know that certain data will never be updated like logging data then you can do dirty reads as those will not cause harm.
When you get deadlocks then you are not using the appropriate locking and/or isolation levels which definitely is an application design error. Just don't say to just perform dirty reads as that really me loose faith in your blog.
Still.. databases provide integrity and features to circumvent/loosen that model if you know what you are doing will not break integrity. If you don't know what you are doing then don't play around with the transaction isolation levels. Just increase performance by doing less reads by introducing caching and add versioning to records to perform optimistic concurrency control without sacrificing integrity.
Ramon Smits on August 25, 2008 12:21 PMNO LOCK is used in SQL Server for backward compatibility. To be really hip you want to put this at the front of your SELECT statement:
"SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED".
Put this on every SELECT you have that doesn't do an action and you will speed up everything in your programs, applications, servers, and ssurrounding vicinity. Everytime anything is doing a SELECT it is locking tables.
And don't do any SELECT INTO. That locks the TempDB. Build the table and then APPEND the rest.
Or one row SELECT INTO and then APPEND the rest.
> How can a read be blocked by a write?
Well sure, either your update statement is still running, or the transaction within which the update statement had executed has not yet been committed or rolled back. By default SqlServer's isolation level is "read committed", so your query might read uncommitted data otherwise. Did you profile how long the Update takes to execute and whether its embedded inside a longer-running transaction?
Arno on August 25, 2008 12:45 PM"But I can't help thinking our particular database vendor just isn't as optimistic as they perhaps should be."
Making a decision that is at odds with the rest of the technological world? This is *Microsoft* we're talking about, right?
Also, a credit line for the "theory vs practice" quote that you hacked up a bit:
<a href="http://en.wikiquote.org/wiki/Jan_L._A._van_de_Snepscheut">http://en.wikiquote.org/wiki/Jan_L._A._van_de_Snepscheut</a>
Frank on August 25, 2008 12:51 PMSorry everyone. That link butchering is on server's end, not mine.
en.wikiquote.org/wiki/Jan_L._A._van_de_Snepscheut
Frank on August 25, 2008 12:52 PMAs many people have already pointed out, the advantage of "nolock" is higher performance, the risk is the possibility of reading uncommitted data.
Prior to implementing "with (nolock)" or Set Transaction Isolation Level Read Uncommitted, you should ask 2 questions:
1. What (bad things) can happen if my application reads dirty data?
2. How often will a dirty read occur?
For most applications (except banking, airline reservations, etc) the answer to #1 is "nothing" and the answer to #2 is "hardly ever".
This is probably WAY too far down for anyone to read it now, but another possibility is that it ISN'T a deadlock - the message just SAYS that it is. Maybe (deep down in the SQL stack) an exception is thrown, and something slightly higher up just ASSUMES that all exceptions thrown by x() are deadlocks, and spits out the message.
Syd on August 25, 2008 1:03 PMNo Syd - SQL Server does not falsely report a deadlock. As I mentioned earlier, it is a very specific condition and simple to identify, but impossible to solve. So it kills one of the queries. If SQL Server says you have a deadlock, that's what it is.
oh and Jeff, this line:
"With nolock / read uncommitted / dirty reads, data may be out of date at the time you read it, but it's never wrong or garbled or corrupted in a way that will crash you."
is plain wrong.
read this to see why:
http://blogs.msdn.com/craigfr/archive/2007/06/12/query-failure-with-read-uncommitted.aspx
So... why are you not just using MySQL ???
Just for the record, unlike SQL Server and MySQL, Oracle does not have a READ UNCOMMITTED isolation level. As Tom Kyte says, "The goal of providing a dirty read is to supply a nonblocking read, whereby queries are not blocked by, and do not block, updates of the same data. However, Oracle Database doesn't need dirty reads to achieve this goal, nor does it support them. Dirty reads are an implementation other databases must use to provide nonblocking reads.".
Leigh Riffel on August 25, 2008 2:00 PMSimplest case :
tran1 :
begin tran
select from [A]
update [B]
commit tran
tran2 :
begin tran
select from [B]
update [A]
commit tran
this would deadlock if you happen to enter into these two simultaneously.
I suspect that the framework you are using is intiating transactions even if you are not doing so explicitly. I ran into this when I created a middle layer using .Net and the end users (developers) were facing this even though they were not using transactions for their SQL sets.
I say this because just deadlocking on selects v/s updates is very rare (if at all) and certainly not in the vicinity of the rate that you are experiencing.
Puneet on August 25, 2008 2:13 PMScoble is a moron
Steve on August 25, 2008 2:21 PMNot sure if this has already been submitted above (I didn't read each post in that much detail). You could be seeing an index deadlock. While deadlocks between tables is easier to understand you can get into the same situations with indexes as well.
For example lets say you have to indexes A and B. Lets say that A is a clustered index on Id so it contains all the data (lets say B is an index on ParentResultId and LastActivityDate).
So the select statement starts with index B, puts a read lock on it and then tries to go get the rest of the data from the data page (trying to get a read lock on the page). The update statement does the opposite, first it gets a lock on the data page (index A) and then tries to get a lock on index B so it can update it. Classic deadlock.
Not sure if this is what is happening here but it is a less straight forward version of deadlock that sometimes you miss.
Also I think the claims above of two select statements deadlocking is impossible. You need some process requesting exclusive access to a data page to get a deadlock. Any number of connections can hold read locks on the same page which makes sense because one process reading the data shouldn't prevent another process from reading data.
CT2 on August 25, 2008 2:28 PMI'd say that the fact that "SQL Server deadlocks" has twice the results amount versus "MySql deadlocks" is because the amount of users that SQL Server has much more than MySql.
Charlie on August 25, 2008 2:50 PMThe stupid question was "why should my database need to lock on a read". If calling that out flips the bozo bit, I'm happier without interviewing with your organization.
mgb on August 25, 2008 3:15 PMTo elaborate slightly: Everything works this way. Try to get a read lock on a file locked for writing. By default you can't. Try to get a read lock on a semaphore-controlled variable. By default you can't. Seeking a read lock by default before reading is one of the most common patterns of concurrency that exists.
mgb on August 25, 2008 3:19 PMI bet you do not have an index on [Posts].[ParentId]. I would also not be surprised at all to find that adding that index fixed your problem.
James Youngman on August 25, 2008 3:26 PMPersonally I'm pretty darn happy that Sql Server has good defaults that are concerned with data integrity and doesn't hope programmers will know what to do. Maybe you're happy otherwise, but people would scream bloody murder if it was otherwise.
Automatic retries .. you really don't want that do you? Retry it yourself. If you had the system software do it for you, it most certainly isn't going to make the correct choice. Additionally, should every layer retry between your app and the database? What if n classes wait rand(m) seconds between p retries? You press a button and the app goes away for 5 minutes because a failure takes n * m * p seconds to get back to you? No thanks. Fail fast and allow the application to retry if it can at the appropriate level. As long as you have concurrency detection mechanisms in place then there should be nothing wrong with retrying the write.
From an application standpoint, read-committed locks are only necessary if you plan to update the database using that data, because you must first see a snapshot the data you need to change, and then change it. If you're reading just to grab replies to a post, then you should be able to use nolock. Make sure your code can handle the case when the count is not equal to what was returned.
Also to avoid locking, you could read from views and you can partition the hot data out of the table (ie statistics, counts, etc) so you're not locking pages on a heavily accessed table to update a count that is, for most purposes, probably useless.
But is this a premature optimization? Is your goal of storing the count trying to save you time, but in effect is causing the deadlocks and slowing you down? The statistics are pretty simple, and I have to ask, why aren't you using a view? It's what they're meant for; let Sql Server work it out.
Sounds like you also need some application-level caching. While it's good to optimize the lower layers, you should be preventing every request from hitting the database without reason. Even a light cache is better than nothing. e.g. if you're getting 1000 hits a minute and the data is not expected to change in that time range, even a 30 or 60 second cache lifetime will save you 500 to 1000 roundtrips to the database during that cache period. That's something to think about.
Writing software is easy. Writing good software isn't.
someguy on August 25, 2008 3:41 PMA great read on isolation levels, concurrency, and locking:
http://www.oracle.com/technology/oramag/oracle/05-nov/o65asktom.html
While Oracle centric, this article is not necessarily Oracle specific. It compares how Oracle performs (or avoids) locking using multi-versioning to "other" databases. Detailed examples are provided.
Actually - I think this article is even more relevant to developers NOT using Oracle, since with Oracle, deadlocks are rare. In any case, understanding the basics can go a long way to solving application issues.
Regarding SQL Server vs MySQL deadlocks, doesn't MySQL use some form of multi-versioning?
John on August 25, 2008 3:42 PMwow + suck
langel on August 25, 2008 3:47 PMI think there's a good reason that Microsoft put SQL Server where it is on the Mistic Scale (opti-pessi) - it's better to design to support the high-risk environment at the expense of the low-risk one than vice versa. Since MS can't know how smart their clients' DBAs are (and I've met some donkeys responsible for databases running billions of dollars' worth of trading positions - even been one myself) they make it ultra-pessimistic out of the box.
Better financially, say, that stackoverflow.com gets some tricky deadlocks than a trader makes the wrong decision because he doesn't have full-on query consistency. Especially if the only problem is a configuration option.
Of course, MS *should* be providing better out-of-the-box configuration options - reasonable & safe setups for several common types of application, with explanations of the pros and cons of each, would be a start.
But simplicity of installation and configuration is something Microsoft often get wrong. Rick Brewster's exposition of Paint.Net's worsdt-case installation scenario is a case in point.
Mike Woodhouse on August 25, 2008 3:58 PMWhy doesn't each philosopher take the chopstick to his immediate left, break it in half, and use two small chopsticks?
dave on August 25, 2008 4:08 PMWe use NOLOCK a fair bit here for reporting - given that our production systems are doing quite a few writes per second... that said, I've found good indexing and using views to normalise data, rather than using normalised tables is a good way to go.
You can also throw the nolock hint inside the view - and for a few situations like you describe, it might be better to use a view to cache a users "last update, x answers, etc" into its own table rather than keeping in a denormalised structure - that way you aren't doing the reads directly on the tables.
Also, for any detailed stored procs, etc, I'd go about using the WITH keyword and use CTEs rather than doing any temp/in memory/@table tables.
I've found that they speed my queries up to no end.
Also, remember when using Transaction scope in LINQ, you have to wrap the first call to the data inside your transaction USING, otherwise the call to get the data is fired outside of the transaction scope:
http://www.madprops.org/blog/linq-to-sql-and-nolock-hints/
Looking forward to getting my sign up approved for the site! :)
Andrew Tobin on August 25, 2008 4:08 PM>>But in practice adding nolock to queries that you absolutely know are simple, straightforward read-only affairs never seems to lead to problems.
>>With nolock / read uncommitted / dirty reads, data may be out of date at the time you read it, but it's never wrong or garbled or corrupted in a way that will crash you
That hasn't been true in our experience: although infrequent, we do get the following error:
[SqlException] Could not continue scan with NOLOCK due to data movement.
It's an Exception - the query fails and the user sees an error. So NOLOCK _can_ still break your read query and give the user _no_ result rather than just an out-of-date one.
NOLOCK may still be useful - but it's not "guaranteed" to work every time...
Craig Dunn on August 25, 2008 4:19 PMViews are just stored SQL. Given that a view will not change a deadlock situation.
Jay on August 25, 2008 4:28 PMJeff,
Can you please show us some query plans so we can offer meaningful advice? SQL server maintains a cache of the plans for all parameterized queries by default.
I'd give instructions, but I'm not at my computer right now.
I've run into some hairy stuff at Loopt, and might be able to at least explain why it's happening, if I can't offer a real solution. I view NOLOCK as a hack and successfully eliminated it from all our DB code.
Nick Sivo on August 25, 2008 4:31 PMWell, Microsoft is based on Sybase, which I know used to default to dirty reads (this means that for instance, your bank statement could show a transaction as having occurred, but the balance from before the transaction occurred). It sounds like MS went and bandaided this behavior, with the results you see.
Try something that has always done the right thing, such as Oracle or Postgress. Reads never block (and thus never deadlock).
Ronald Pottol on August 25, 2008 4:40 PMSeriously you don't know why you need a lock when reading a database which conflicts with an update??????? And you are blogging about it?????
Say you are taking orders for the dining philosophers. Let's say philosopher B wants what philosopher B is having - you need to ensure that philosopher A's order is complete before philosopher B gets to read it.
Hey Jeff, in response to
> I get that, but I don't understand why simple read SELECTS need to lock *at all*. I mean, maybe if that database was storing my bank account balance, in which case transactional orders are hugely important.
You should consider two items:
Firstly, Read locks are much "looser" than other lock types (e.g.: Update), and won't usually cause blocking, let alone deadlocks. Several comments have suggested good articles on locking, and I'd recommend you read SQL Server Books Online (a misnomer, as you can install them locally.)
Secondly, if you read dirty data, the risk you run is of reading the entirely wrong row. For example, if your select reads an index to find your row, then the update changes the location of the rows (e.g.: due to a page split or an update to the clustered index), when your select goes to read the actual data row, it's either no longer there, or a different row altogether! How would it appear to your users if every time someone else added a comment on your site, the comments they retrieved came from other conversation threads? That's what Read locks prevent; they make your Read atomic, so that while following indexes, retrieving pages, and performing joins, you don't have invalid data creeping in (or valid data creeping out) of your results.
As to the root of your problem, I would suggest that either you are creating explicit transactions that you didn't tell us about, or that you need to turn off implicit transactions for LINQ. And while I always enjoy reading your blog, I'm afraid the old "A poor workman blames his tools" quote is rather appropriate here. If you really are getting deadlocks with simple SQL statements and no transactions, then your code is going to perform terribly as usage scales up. No amount of (NOLOCK) or Read Uncommitted will fix bad code.
Always entertaining, and often enlightening. Thank Jeff, and good luck!
Rick Townsend on August 25, 2008 4:55 PMRonald Pottol:
> Try something that has always done the right thing, such as Oracle or Postgress. Reads never block (and thus never deadlock).
Oracle accomplishes this with snapshotting, which SQL Server has supported for a few years. It's off by default because it's expensive both in terms of both CPU and Memory. Essentially, your RDMS is managing multiple versions of the "Truth", which is what we try to normalize away from.
And there are ways to get deadlocks with reads Oracle, depending on your transactions and locking levels. I know, I've done it. :( I've never used Postgres, so will defer to you on that one.
Rick Townsend on August 25, 2008 5:08 PMThe reason for the deadlock in this case is straightforward, if my obvious-seeming assumptions are correct.
I'll bet that the rendered page shows the # of answers followed by the answers themselves. Any non-sloppy developer would try to make the numbers match. The obvious thing to do is put the SELECT of the question in the same transaction as the SELECT of the answers. That's the source of the problem.
Suppose I post an answer to a question that you are currently loading. You SELECT the parent [Post], taking a shared lock on the row. I INSERT my answer, taking an exclusive lock on the new child row. I try to update the number of answers to the question, but you have a lock, so I'm blocking on you. You start SELECTing all child [Posts], get to my new answer and block because I have an exclusive lock. You wouldn't want to read a record that may get rolled back. That wouldn't be *stale* data, it would be *non-existent* data! So, you're blocking on me. Now we have a deadlock. I've assumed the [Posts] table is a tree. The problem still exists if it's not.
MVCC would fix this particular problem. You've never see my INSERTed, but uncommitted record, so your SELECT would finish, you'd release the shared lock on the question, and then I'd UPDATE it and commit, making my new record visible at the same instant that my UPDATEd answer count takes effect.
Stale data isn't so bad in certain specific cases, but using non-existent data (that got rolled back) is just amateurish. READ UNCOMMITTED is for suckers. Maybe your SELECTs don't need to take shared locks, but, for heaven's sake, don't ignore the exclusive write locks. Or switch to MVCC. I don't know how much of this is possible on SqlServer. It was news to me that it doesn't always use MVCC.
Adam Tomjack on August 25, 2008 5:13 PMJeff, you appear to have forgotten transactions are independent. SQL Server will, by default, do its utmost to preserve transaction independence (anything less is simply unacceptable.) What this means is that the world is frozen when the transaction begins, looking around at this frozen world nothing will change exept what you move.
Working without locks is okay so long as you know what it means. Working at read uncomitted means that if you go "begin transaction, select from table, select again from same table' the second select may have different results than the first. Whether this is a problem is up to you; but SQL Server has to assume (by default) it might be so by default it is more cautious.
It's not 'silly' that SQL Server treats your website as a banking operation; personally I would expect nothing less. Correctness is a vastly more important than performance. Only the developer can trade off correctness for performance; and it is rarely needed. I suspect you'd be the first complain about Microsoft's apparent shoddy coding when running the 'count of comments' query and the 'text of comments query' in the same transaction came up with different numbers!
As to why select locks records; if it didn't then the world wouldn't frozen. Your program logic may behave differently if the changes were visible (there's now an extra comment and you have a 'no more comments more than 500 rule or something) or, as I mentioned, you do a second query and it gets a different result.
S on August 25, 2008 5:21 PMmgb said, "It's always a little disturbing to see a well-known coder ask a dumb question, but come on, database locks? This is very basic stuff!"
Shut up. Seriously. What a techy-snob! You'd get negative 10,000 on stackoverflow.com. It's because of folks like you that others have a hard time asking questions. You should know better. Tool.
MGB stinks on August 25, 2008 5:31 PM> And there are ways to get deadlocks with reads Oracle, depending on your transactions and
> locking levels. I know, I've done it. :( I've never used Postgres, so will defer to you on that
> one.
I'm pretty sure this is impossible. With Oracle, writes cannot block reads, and reads cannot block writes.
In my experience with Oracle, almost all deadlocks are deletes in a parent table were a child table has an unindexed foreign key to the parent (a full table lock is required in this case.) This is of course an easy thing to fix - just add an index.
<a href="http://asktom.oracle.com/tkyte/unindex/index.html">http://asktom.oracle.com/tkyte/unindex/index.html</a>
John
Lot's of chatter over a simple SQL keyword that folks should be aware of and using if they use SQL regularly.
This is kind of like saying "I know the framework will garbage collect, but things work better when I dispose of my own trash."
Actually, it's quite easy to deadlock on Oracle, it happens quite frequently the second you open your wallet to pay for it and realize the price doesn't include the guru you need to pay as well just to set it up decently.
Ramon Leon on August 25, 2008 5:58 PMI had a very similar problem - except I was reading a large amount of data from a few tables to create a report. And on another webpage, data was being inserted into the table, but the insertion was chosen as the deadlock victim.
I think a sql server deadlock support group is needed! lol
Mitch on August 25, 2008 6:09 PMBeing real, deadlocks don't have anything to do with reads. They have to do with the acquisition order of locks and they can happen in Oracle just as in any other db.
If two transactions acquire locks to shared resources in different orders then it becomes possible each will grab a lock on a resource the other needs to finish and both processes become deadlocked. That Oracle doesn't lock reads simply reduces the likely hood of this happening. That it does MVCC by default also means the write locks are extremely short, just long enough to change the pointer from the old version to the new version rather than locking during the write itself.
To commit a write they still have to ultimately acquire a write lock on the current row long enough to swap in the new version and hold it until all other locks needed for this transaction to commit are successfully acquired. Multi version concurrency control allows multiple writers as well but ultimately only one will acquire the lock and succeed with the update and the other will be aborted. If those two transactions acquire their locks in different orders and each succeeds on locking a resource the other needs to finish, you have a deadlock.
Deadlocks are a logic problem, not a software problem. If you're getting deadlocks it's because your code is flawed and your transactions are not acquiring locks in the same order, i.e. first table A, then table B, then table C. What the order is doesn't matter as long as they all do it the same.
MVCC just happens to shorten lock time so much both by eliminating read locks and by allowing multiple writers and extremely short write locks that many such flaws in logic are never exposed because the likely hood of them happening is so rare. Sql locks during the write to disk by default, so the lock time is long, Oracle writes to disk first, all the new versions, then acquires locks just long enough to swap in the new versions.
Oracle chooses the correct approach and just suffers the performance penalty for always doing row locks and MVCC in order to achieve the correct behavior.
Sql 05 or better can be made to behave correctly, it just isn't the default behavior. MVCC is expensive so it's off by default because they don't want 05 to appear slower than previous versions. That was a mistake, it should be on by default, correct behavior is more important than any appearance of speed.
Ramon Leon on August 25, 2008 6:24 PMI know this isn't stack overflow - but I thought I'd "point" to the correct answer in this thread.
Robert G points to the trace flags described in http://support.microsoft.com/kb/832524 that give you more than enough detail on the deadlock participants.
Having gone through several of these before, this deadlock information typically highlights an application call (or rather a pair of simultaneous calls) that are not as precise as they could be - very often causing unnecessary lock escalation.
Query hints or application-level retries are bandaid solutions to address the symptoms, not the problem.
Hope that helps,
- John
1) Why do you need an AnswerCount field? Just compute it and avoid the possibility that it contains the wrong value. My suggestion is to show your schema to a good data modeler. Focus on your app code that touches anything the modeler circles in red.
2) The root cause is not using Oracle. If you are using Oracle and still get a deadlock, you have an unindexed foreign key and failed to heed the warnings in TOAD or OEM. I only need one hand to count the number of deadlocks I've encountered in a decade of using Oracle. All had obvious causes and were discovered early in development.
3) Anybody recommending nolocks should be on an IT no-hire list. You are making bugs. Why not simply store all your data in ASCII flat files and get your wrong answer even faster?
OracleGuy on August 25, 2008 6:57 PM"3) Anybody recommending nolocks should be on an IT no-hire list. You are making bugs. Why not simply store all your data in ASCII flat files and get your wrong answer even faster?"
Funny you should bring that up. I was wondering if a DB was even needed in a design like this. Sure, you need a durable store but do you really need a full ACID guarantee? You could go a long way with a simple paxos implementation rather than a full DB.
<insert argument about using a sledgehammer to drive a thumb tack here>
@Mitch
SET DEADLOCK_PRIORITY
from (http://msdn.microsoft.com/en-us/library/ms186736.aspx)
"Which session is chosen as the deadlock victim depends on each session's deadlock priority:
If both sessions have the same deadlock priority, the instance of SQL Server chooses the session that is less expensive to roll back as the deadlock victim. For example, if both sessions have set their deadlock priority to HIGH, the instance will choose as a victim the session it estimates is less costly to roll back.
If the sessions have different deadlock priorities, the session with the lowest deadlock priority is chosen as the deadlock victim."
And from (http://msdn.microsoft.com/en-us/library/ms178104.aspx)
Resources that can deadlock: Locks, worker threads, memory, parallel execution-related resources, and Multiple Active Result Sets (MARS)
There's a plethora of msdn articles on minimising and preventing deadlocks.
otherguy on August 25, 2008 7:57 PM@OracleGuy - you're a zealot. The nolocks hint is there for a reason, not just some bandaid for idiots. The only reason for a 'no-hire' is if someone can't see any possible danger of using nolock and prefers it 'for speed'.
I'll concede that if you're using nolock often, then it's a bad smell indicator that you're doing something wrong.
otherguy on August 25, 2008 8:03 PMFrom my experience SELECT's almost never could cause deadlocks, no matter with or without nolock hint. You should carefully look at your UPDATE statements to see if they could cause locking.
Andrey on August 25, 2008 11:23 PMSmall Correction:
In Oracle and PostgreSQL, a read will never block, since they both use a form of multiversion concurrency control. This feature removes a major cause of database programming suckage.
Mark Harrison on August 25, 2008 11:54 PMThe deadlock does not occur when an operation takes too long on a resource and another is waiting for the same resource.
It means, instead, that a query, of any nature, is waiting for a resource that is taken by another query, that is waiting for a resource of the first one. obvously there's no solution for this, if none of the query explicitly dies a horrible death, so that the other one can continue.
In SQL Server, there're two ways of accessing the data: Read Committed and Read Uncommitted, the first being the default (and yours).
In Read Committed, each query, even a Select, must access only data that has been completely written and processed and is not being currently modified or handled by another query. That ensures that your data is always the REAL data, and not some old data you've already overwritten during the select itself.
Read Uncommitted, on the other hand, tells the Select to look for anything it can give you right now, even if it's being updated by some other query during the select itself. I leave you to the documentation on how to switch the two modes.
Anyway there's another way: you can use a
SELECT *
from BLAHTABLE nolock
the nolock will tell the select to ignore any lock, like if it was a read uncommitted environment. Note that you can't use a nolock in an update, but you can still reduce the risk of a deadlock by using a rowlock instead. Beware rowlock will have an impact on the performance though.
AkiSkyo on August 26, 2008 12:37 AMDatabase deadlocks are not an error condition. At least they shouldn't be, if the rest of your software is coded correctly.
The point here is that database deadlocks can and will happen all the time, they are just not all that likely if your transactions are very short as in a web scenario. But they will happen, regardless of what you do with "READ UNCOMMITTED" etc., at some point of time two users will try to change the same (or connected) thing at the same time, and there you have it.
If you do any database stuff, please organize your code so that conceptually linked queries and operations are contained in blocks that you can repeat on deadlocks. That is, when serving web requests, put your request handling in a method, and put that into a while loop that retries as long as you get deadlock exceptions. Safeguard against eternal loops by counting the number of retries and aborting if they exceed some N.
This of course requires you to be careful about any variables and other state that might leak from that retrieable method, but it's the only way to do database operations correctly.
> It's as if one of the dining philosophers happened to glance over at another philosoper's plate, and the other philosopher, seeing this, screamed "meal viewing deadlock!" and quickly covered his plate with his hands.
Love this! :-)
Although I'm only an amateur DBA (and an ex-amateur-DBA at that!), I've always been under the impression that databases suffered more from the <a href="http://en.wikipedia.org/wiki/Readers-writers_problem">Readers and Writers problem</a>. This would seem to fit your starvation problem better than the Dining Philosophers' Problem, Jeff, as the large number of reads you're doing are locking out the write; it also mirror's ratboy666's comment.
Alastair Smith on August 26, 2008 1:47 AMIf you ever stuck with any big problem, you can always ask the community for hints and suggestions.
Varun Mahajan on August 26, 2008 2:14 AMThe biggest problem i have seen relating to this scenario is when the statement causes lock escalation in the database. If your first query is running a table scan rather than an index lookup it is conceivable that SQL with give up locking the row or page and lock the whole table instead. If this happens on your update statemtent then all your subsequent transactions will be blocked until the update finishes and you end up with the deadlock problems you are seeing. If it happens on a SELECT statement all your UPDATE transactions will be blocked.
This problem is compounded if your data is very large and one row spans multiple data pages.
So before setting the read committed snapshot I would first check the queries are using index lookups and try and check if the data size of a row could be reduced.
Do you really need every column in the [posts] table?
Don't if I'm to purist, but I avoid the nolock like the death, even that it gets faster, because you could show wrong informations to your user, and it could be realy bad, for example, if you are leading with money. I prefer making the query and passing the values to local variables and then killing the recordset, and consuming the variables in my next query.
Otávio on August 26, 2008 4:42 AMI'll admit I didn't read all of the posts so this may have been covered. Can you not turn on the READ_COMMITTED_SNAPSHOT option on the database? This will further explain it:
http://blog.sublogic.com/2006/02/oracles-multiversioning-now-available-in-sql-server/
Interesting article. And while I agree that dirty reads can fit the bill, I think the point is that one makes an informed decision on whether a dirty read is ok. Sure for most website "applications" few writes and mostly reads. No one cares if their profile is a few seconds out of date or if thaty aren't seeing Bob's post from 1 ms. after they submitted their own, but if your application say tracks an ordering system, it isn't appropriate to count 15 orders when 3 of them are still completing transactions before they can truly be counted. The default locking mechanism is aimed at keeping data-integrity up, not user wait time.
I feel that if it's important enough to be in a transaction in the first place, it's important enough to fix the cause of the deadlock and not just ducttape a dirty read into the query. Having said that, for non-critical data it's fine, just makes me worried when I see it. :-)
Regards!
JMinadeo on August 26, 2008 5:19 AMIf, like most web applications, you are doing mostly reads, and few writes, why use a database at all? Most web apps are just doing the same query over and over again. Just generate and store the content of each page and serve that, regenerate on change. You can use whatever granularity of locking you want that matches how you store the content. I've never understood why so many web applications are just front ends to a big pile of the same SQL queries over and over again, it's never made much sense to me but I guess that's what everyone else does, so why not do it too?
Reed on August 26, 2008 5:22 AMUPDATE [Posts]
SET [AnswerCount] = @p1, [LastActivityDate] = @p2, [LastActivityUserId] = @p3
WHERE [Id] = @p0
I think the update statement is trying to convert a "read" lock into a "write" lock. I would look at putting a locking hint on the select that first read the "post" in the transaction to make it take out a write lock at that point. Otherwise do the update in a separate transaction.
Ian Ringrose on August 26, 2008 5:33 AMSeems like a case of bad defaults... just reading what "read committed snapshot is and it sounds like how it should be implemented by default.
"If we request a row that is being modified by some other transaction, our transaction will wait until the other is finished and fully committed."
Sounds like common sense... queueing database requests is hardly rocket science. :P
Clearly, you should be using MySQL. :)
jheriko on August 26, 2008 5:35 AMRe: choice of SQL Server
He wants to use LINQ, so he must use SQL Server. LINQ doesn't talk to anything else at the present time (there's only one sql-translation provider).
Re: deadlock -
It's not really about query complexity... it's about multiple tables involved in a single transaction in a multiple user environment.
Are there any triggers that the update statement activates that reads other tables?
Is the select issued in its own transaction, or are other tables queried in the same transaction?
David B on August 26, 2008 6:25 AMSomeone has to explain to me sometime -> why every time a problem is posed, peoples first response is to tell you that you should use a different platform/technology/brand.
Same thing happened in the comments to having issues with Windows and Wordpress, half the responses were "use x blog engine instead", or "use linux".
Here it's "use x db technology".
And ta for posting this Jeff, I don't get the responses that expect you to always be an expert, and I don't think I'd learn as much if you were an expert and didn't need to pose the problems!
Andrew Tobin on August 26, 2008 6:36 AMMaybe if you open sourced your webapp, you could gain the benefits of peer code review! ;)
-Some- bugs are shallow with enough eyes...and maybe development would speed along, and then you can get back to writing more posts to enlighten the masses.
Why not open source at least some of the stackoverflow website?
MS-SQL is not a particularly strong database for online transaction processing, at least not compared to Oracle. In the mid-1990's, Oracle developed a form of multiversion concurrency control that pretty much devastated the competition: it's sort of like "nolock", but safer.
MySQL rocked the world by coming out with the blisteringly fast MyISAM table handler, which traded transaction support for performance. Not having transactions is a real loss, but many people have found MyISAM is reliable enough for what they do. MySQL also supports the InnoDB table handler, which uses MVCC. InnoDB is remarkably fast for ordinary operations and commits, but it can really burn you if you need to rollback a big transaction. I've heard of InnoDB wrecks where it took more than a week to recover from a system crash.
Evolv CS (http://www.defran.com/evolv-cs.asp) upgraded to SS 2005 and told us to add the nolock option to ALL of our custom queries. So maybe this is common practice.
We are writing an app that should work at least with Oracle and SQL Server, therefore using non standard hints like NOLOCK is not an option.
It's ok to have propietary extensions but SQL Server should support ANSI standards better. It's incredible that the concatenation operator || is not available. I guess they haven't changed the Sybase core much, that's unfortunate.
You could always try with read uncommitted
Sql on August 26, 2008 7:30 AMThese types of problems are a big part of why SQL Server introduced MVCC features in the 2005 release. Most of the DB world has moved to MVCC as their default behaviour (Oracle, Postgres, MySQL, Firebird/Interbase, etc) with DB2 being the main holdout that still uses the pure locking model for database transactions.
I suspect they didn't change the default behaviour (thus treating you as a "banking site") in order to limit breakage to backwards compatibility. But seriously, for web apps MVCC is the better approach.
Jess Sightler on August 26, 2008 7:36 AM>But instead of storing it into Post table, store number of replies to PostCommentCount table.
The whole point of storing the count instead of calculating it is making reads cheap. So now you are asking for an extra join? Why not just tell him to calculate it each time like the other people who don't get it.
a on August 26, 2008 7:42 AM>>Someone has to explain to me sometime -> why every time a problem is posed, peoples first response is to tell you that you should use a different platform/technology/brand.
It's because they don't know how to solve the actual problem. This is not a case of a bad product - it is a case of a mistake in programming, and it can be fixed. I'm sure that's why it was posted in the first place. I don't like the solution, but the point is, it was solved without re-working the whole site, changing databases, or anything else drastic - he just added two little words to the query. Simple problem, simple solution.
I agree with some of the comments (though I might phrase it differently) that you would probably be happier if you learned more about databases.
The other thing is, you know the old axiom "select(2) is not broken"? That if you think you found a bug in the OS, 99% of the time it's in your code, and the other 1% of the time you need to check your code even harder?
Not true with Microsoft products.
Chris D on August 26, 2008 9:01 AMI suspect that the reason why there aren't many google entries for MySql deadlocks is because the default storage engine in MySql (MyISAM) doesn't have deadlocks.
From the MySql docs at http://dev.mysql.com/doc/refman/5.0/en/internal-locking.html
Table locking in MySQL is deadlock-free for storage engines that use table-level locking. Deadlock avoidance is managed by always requesting all needed locks at once at the beginning of a query and always locking the tables in the same order.
However, this is had at the expense of transactions.
Kyle Cunningham on August 26, 2008 9:27 AMThe deadlock part seems pretty well covered so I'm going to echo the, "AnswerCount field is probably a bad idea and should be calculated." sentiment.
Adam on August 26, 2008 9:30 AMIf you have long running transactions, lock all the required data in the beginning to prevent deadlocks:
Thread 1:
begin tran
declare @x int
Select @x = A.x
from A with (holdlock, updlock)
join B with (holdlock, updlock)
on A.id = B.id
where id = 1
update B
set y = @x
where id = 1
commit
thread 2:
begin tran
declare @y int
Select @y = B.y
from B with (holdlock, updlock)
join A with (holdlock, updlock)
on B.id = A.id
where id = 1
update A
set x = @y
where id = 1
commit
If you don't like to join you can write:
select x from A, B where A.id = 1 and B.id = 1
------
And please never use NOLOCK.
Using the NOLOCK index hint can result in phantom data. When you query a table while another process is inserting a set of data in the table and the insert fails (duplicate key error maybe), the select query can return data that was never committed to the database.
SQL Server - That is MySql, Progresql, Oracle, MS-SQL or any other SQL server
MySql Server is MySql
MS-SQL Server/Microsoft SQL Server, however is Microsoft's SQL Server,
The Generic name has been appropriated by Microsoft for marketing reasons, Various SQL Servers (including the original engine Microsoft used) are older than Microsoft the company ....
Deadlocking on dirty reads on a web interface that does not really care if the information is a few seconds out of date sound like the perfect case for using nolock ....
The deadlock issue has nearly been beaten to death, but I want to add one more thing. I think there has been a lot of confusion in the discussion about the isolation levels that Jeff is talking about using. NOLOCK is equivalent to READ UNCOMMITTED, but if you read the entire post, that's not what Jeff is talking about using. He's planning to use READ COMMITTED SNAPSHOT, which does no dirty reads. With snapshot isolation enabled, SQL Server creates a duplicate copy of the data page before beginning any transaction that includes a write operation and purges it after the transaction is committed; if that copy exists, SQL Server reads from the copy. That eliminates the locking issue, and therefore should eliminate the deadlocks. It does require more disk resources, but in my experience, snapshots have been very fast. I wouldn't use a dirty read unless it was absolutely necessary - all sorts of nasty side effects can come of that if you're not very, very careful - but this isn't a dirty read. You're still reading the data that has already been committed to the database.
Jeff, if you haven't done it already, it would be worthwhile to use SQL Profiler to run a trace on your database while the app is running. This has helped me sort out a number of issues since it tells you everything the database is doing. You may find a transaction that LINQ is creating implicitly that you weren't aware of, and even if you do choose to use snapshot isolation, correcting a transaction like that may help the performance of the application.
Good luck!
Ed
Ed Leighton-Dick on August 26, 2008 10:30 AMYou need to read this one Jeff. You happened to mention you are using LINQ To SQL for your design. Please tell me you didn't use the Linq To SQL designer in Visual Studio to define your database tables. I'm sure LINQ To SQL works just fine if you write the code yourself, but using the designer is going to make your site SLOOOOOW!.
I recently developed a Virtual Tour posting system for real estate agents at my job and I used the Linq To SQL designer to define my tables to start with (programming in C#). I never could find out why a single page would take seconds to build (1.93s it turns out). Finally I used the trace feature and found the DataContext.ExecuteMethodCall procedure run from the designer code was taking .73 seconds to retrieve ONE row from the database, from a table with ONLY 17 rows, used a stored procedure that was ONLY a simple select statement, even with DataContext.ObjectTrackingEnabled set to FALSE.
That method uses a lot of reflection and testing of relationships to insure the data is valid. I rewrote the Data Access Layer of my system, still using attributes and reflection for parameter types, inherited their IExecuteResult interface to add a little more flexibility, and now that same call takes 0.000434 seconds, and now the page builds in 0.034 seconds. I know it sounds like an exaggeration, but I am dead serious. MS' code for that method must be incredibly slow. And so you don't think it is my system, our server is a twin dual-core (4 cores) Xeon 2.8GHz, 8GB RAM, Window Server 2008 Enterprise, SQL Server 2005 Professional, 1.6TB RAID 5 array. I could watch the page peg the processor at 26% (their code only using one thread) for over a second in the task manager.
Thought you need to know that, but only if you used the designer in Visual Studio.
Christopher Tillman on August 26, 2008 12:02 PM>2) The root cause is not using Oracle. If you are using Oracle and still get a deadlock, you have an unindexed foreign key and failed to heed the warnings in TOAD or OEM. I only need one hand to count the number of deadlocks I've encountered in a decade of using Oracle. All had obvious causes and were discovered early in development.
I guess you haven't tried running Oracle in full SERIALIZABLE isolation mode then. Try that and say hello to "Cannot serialize the transaction" error unless you're doing just really, really small transactions.
But overall, deadlocks and serialization errors are a fact of life. You just try again and hope you succeed. Happily, in our framework, we just rollback the transaction and call the continuation that was saved when the transaction was started.
TT on August 26, 2008 12:52 PM"I'm a little disappointed that SQL Server treats our silly little web app like it's a banking application."
Imagine how pissed you'd be if SQL Server treated your banking application like a silly little web app.
Ian Nelson on August 26, 2008 1:31 PMSQL server provides lots of opportunities for deadlocks. Whenever I encounter this scenario, I look at my Update/Insert/Delete statements and I add the WITH(ROWLOCK,UPDLOCK) query hints.
This "suggests" that SQL Server first locate the rows to be updated (UPDLOCK) and then lock the effected rows only rather than the entire page(s) that contains the rows.
Your new statement would look like the following:
UPDATE [Posts] WITH(ROWLOCK,UPDLOCK)
SET [AnswerCount] = @p1, [LastActivityDate] = @p2, [LastActivityUserId] = @p3
WHERE [Id] = @p0
It has worked for me. Good luck!
D Garcia on August 26, 2008 1:50 PMToo bad you donīt use Java: http://java.sun.com/j2se/1.4.2/docs/api/java/sql/Connection.html#setTransactionIsolation(int)
;)
Rubem Azenha on August 26, 2008 2:25 PMThis is an excellent post . Thanks a lot . I am grateful to you .
العاب شمس الدين on August 26, 2008 2:46 PMWould have been nice to see how many users you have hammering away to get this error. I've never seen this error from MSSQL or MySQL, and I've used them a lot in both read and write heavy environments.
Mattkins on August 26, 2008 3:07 PMJeff-
You might want to try here:
<a href="http://blogs.msdn.com/bartd/">http://blogs.msdn.com/bartd/</a>
<a href="http://blogs.msdn.com/bartd/archive/2006/09/13/751343.aspx">http://blogs.msdn.com/bartd/archive/2006/09/13/751343.aspx</a>
I don't think you have a typical deadlock situation going on (A waiting on B, B waiting on A scenario).
The above links should help with troubleshooting and offer some insights. I think you might just need some query tuning or some index adjustment.
I don't think SQL Server is broken as several have suggested. Looks like Stack Overflow needs a SQL Server DBA!
Jon Raynor
Great examples here of why comments need ratings so you can separate the rantings and insults from the answers. I did learn some things from reading the comments about the differences between database designs.
I've been writing database backed applications for 12 years now- never hit a deadlock on a read. Even in SQL Server 2005. Of course, I generally follow the rules...
matt m on August 26, 2008 7:11 PMI've never used MS SQL Server, I've used oracle for over a decade.
As has been metioned, in Oracle readers don't block writers, and writers don't block readers.
A thread above suggested that MS SQL Server was better because it turned off the 'snapshot' feature that enabled this behaviour by default : personally, I'd rather have correct than fast in the first instance anyday...
I have seen a couple of Oracle deadlocks in my day, always caused by U1 updates A and tries to update B, while U2 updates B and tries to update A.
JAycee on August 27, 2008 1:47 AMSQLServer is a work in progress for Microsoft compared to Oracle - from plenty experience deploying to both platforms we've had MUCH better performance, much fewer deadlocks and fewer headaches in general with Oracle.
Frank on August 27, 2008 3:20 AMHas anyone mentioned triggers?
I usually avoid triggers because 'tis easy to forget they exist and get included in any transactions you might generate. So that simple SELECT or UPDATE might be introducing a second table for the deadlock through a trigger.
Harbour Master on August 27, 2008 3:52 AMI amend my previous statement - no SELECT triggers in MS SQL.
Harbour Master on August 27, 2008 3:57 AMHey Jeff,
You're not storing the Id of the original post (primordial one) in it's ParentId col are you? This would obviously introduce immediate overlap and a race-condition for the offending (parent) post. I can't believe it would be that obvious but, if it is you can SQL-around that one with an extra clause to ignore that particular record. I'm reading waaay too much into a DM that i didn't design!
I think in the case of this race condition to re-acquire a Read lock on a matching row amongst the Update and Select you should perform the retry as per direction via the db error. Have you tried this as a default option before applying hints that result in dirty reads for some of your users? It smells like your data isn't that concurrent anyway - the select seems to pull those records that were related to a parent record which the UPDATE is isolated to changing. I suspect your comment about denormalisation i.e. storing responses to an inital post in the SAME table have A LOT do with deadlock as normalising parent-child fashion would result in separate tables used in each query.
Anyway fundamentally it's perhaps a usability choice - do you allow your users to see dirty data? Would they know it was dirty? Or do you trap the error when you know data to be dirty coz SQLServer is telling you it is and requery on behalf of the user. For me - the system should do this but, the cost should be measured in terms of trapping the error and requerying and it's affect on your desires to scale up.
chorleyboy on August 27, 2008 6:46 AMDeadlocks are caused by the Database Designer and the Developers. It is always possible to write a system that is incapable of causing a Deadlock. Of course, this is the same as saying it is possible to write code that has no bugs and therefore easy in theory yet incredibly hard in practice.
Turning COMMITTED SNAPSHOT on can result in two identical SELECT statements within the same transaction returning different data. These two SELECT statements are both returning fully correct and valid data. However, that is the reason is is not turned on by default.
I've seen two brilliant comments here:
Ian Nelson stated "Imagine how pissed you'd be if SQL Server treated your banking application like a silly little web app."
S stated "It's not 'silly' that SQL Server treats your website as a banking operation; personally I would expect nothing less. Correctness is a vastly more important than performance. Only the developer can trade off correctness for performance; and it is rarely needed."
I couldn't agree more with these statements, you as the developer has to be the one to choose what level of locking you require. There is a time and a place for all of them, from the pessimistic default to the much feared (nolock). However, as the majority of us Developers do "suck", I'm rather glad Microsoft have taken the default approach they have, even if it is unnecessary in a large number of cases.
If they were to justify reducing the level to READ COMMITTED SNAPSHOT, then they could in fairness reduce the default to READ UNCOMMITTED, how many people would that affect?
Robin Day
http://www.advorto.com
The devil is in the details, and in this case there are a lot of details. I'd like to echo the sentiment expressed by some people that I do hope the answers on stackoverflow will contain a little less vitriol. Lock issues are hard, even seasoned DBAs can run into them on occasion. Having done a database course or two does not make you anymore of an expert on the exact cause of these issues and it doesn't make you smarter than Jeff.
Other than that, I hope the type of person who likes to insult other people's intelligence will end up with appropiately low karma so I can just ignore their posts on the eventual stackoverflow. (which I'm not beta testing cause I'm lazy ;) )
wds on August 27, 2008 9:17 AMRobin Day Wrote: "If they were to justify reducing the level to READ COMMITTED SNAPSHOT, then they could in fairness reduce the default to READ UNCOMMITTED, how many people would that affect?"
I think you aren't understanding the difference between these two, and I think Jeff isn't either, as he said: "With nolock / read uncommitted / dirty reads, data may be out of date at the time you read it, but it's never wrong or garbled or corrupted in a way that will crash you."
Read uncomitted doesn't mean data is out of date, it means you are seeing data BEFORE its "in date." It means you are seeing the absolute latest *uncomitted* data. Read Committed Snapshot on the other hand means that you ALWAYS see the latest *committed* data. Read Committed Snapshot allows for 100% ACIDity. Read Uncomitted allows for 0%.
Tim C on August 27, 2008 11:00 AM>> Tim C:
Read uncomitted doesn't mean data is out of date, it means you are seeing data BEFORE its "in date." It means you are seeing the absolute latest *uncomitted* data. Read Committed Snapshot on the other hand means that you ALWAYS see the latest *committed* data. Read Committed Snapshot allows for 100% ACIDity. Read Uncomitted allows for 0%.
Almost, until that last sentence. The %-age is unknown a priori. The *only* time that UR data is wrong is when the Uncommitted transaction is Rollback-ed. In no other case is it wrong. Depending on the nature of the system (i.e. no programmatic Rollback, no Deadlock, and Timeout), it can be 100%.
How about this case?
CONNECTION 1: Update BankAccount Set Balance = Balance - 10 WHERE ID =1
CONNECTION 2: select Sum(Balance) From BankAccount
CONNECTION 1: Update BankAccount Set Balance = Balance + 10 WHERE ID =2
Connection 1 is moving $10 from account 1 to account 2. Connection 2 in the meantime just read that there is a total of $10 less than actually exists. Nothing rolled back, but you've just completely lost atomicity.
Tim C on August 27, 2008 5:22 PMTo illustrate read uncommitted, if all your selects used read-uncommitted, you could have
employee ID 12 is salary of $50
SELECT Salary FROM Employee WHERE ID = 12
phantom data a possibility
0s Process A - starts transaction -> multiplies all salaries by 100
1s Process B - reads salary for user X (with nolock - read-uncommitted) .. read value is $5000, or $50, or both
20s Process A - does heaps of other things and then rolls back transaction
whereas in snapshot mode
0s Process A - starts transaction -> multiplies all salaries by 100
1s Process B - reads salary for user X .. read value is $50
20s Process A - rolls back transaction or commits transaction
So there's nothing wrong with snapshot data, because it's not committed, it just prevents you from seeing data before it's committed, but doesn't make you wait. It's like you just reordered the read to come before the write started. It's a better version of NOLOCK, but it does come at a price. And maybe the reordering doesn't make sense to your application.
So the only alternative is you block readers from reading data that is under change.
0s Process A - starts transaction -> multiplies all salaries by 100
1s Process B - wants to reads salary for user X - lock & wait
20s Process A - rolls back transaction
21s Process B - wants to reads salary for user X .. read value is $50
And as pointed out earlier, if you block readers for rows that may change, and the reader is locking tables in a different order than writers, then you have deadlock.
monty on August 27, 2008 10:02 PM"Lastly, this is why database transactions should be deliberately designed rather than be the bastard step-child of the code's business layer use-cases.
Robert C. Barth on August 25, 2008 10:56 AM"
Hallelujah. Like object databases, this is another useless idea designed to make programmers think they can write code by clicking about a few times in an IDE. To design a working system like you want, you need to do it the old-fashioned, boring but reliable ways. LINQ is just like the Ruby programs that are finding they don't scale, they're nice ideas, but that's all they are.
If you are designing something that will be putting food on your table, you want it to work. If you're designing a playground for yourself, only then can you use whatever you like.
@Tim C:
How about this case?
This is a non-atomic transaction. What happens depends on the implementation. And in an OLTP system (really, any multi-process one), such a read will always be *wrong*, since there are multi-transactions (whether atomic or not) at any point in time. The read will interleave some writes; therefore it is *wrong*. If you've setup uour database (and it supports this) to read from buffers immediately (recognize commit on buffer update; don't wait for flushes, table or log), you shouldn't see the issue. But as others have mentioned, MVCC rather than locking is the answer.
BuggyFunBunny on August 28, 2008 5:06 AM>>LINQ is just like the Ruby programs that are finding they don't scale, they're nice ideas, but that's all they are.
I'd agree for using it with SQL, but in-memory objects like Arrays, Lists, loaded XML documents, etc... LINQ has been incredible.
HB on August 28, 2008 6:17 AMFWIW,
I provide an explanation of why and how deadlocks can occur with only SELECT and UPDATE statements outside of transactions
http://www.samsaffron.com/archive/2008/08/27/Deadlocked+
Sam on August 28, 2008 6:22 AMFunny thing:
During one project we found out that running parallel database queries didn't improve speed over serialized database queries.
(E.g. we had similar queries running one minute each. Running
them in parallel it took 6 minutes and then all of them finished.
So we didn't win anything by parallel execution - even worse the
response time for all users got unacceptable).
So the trouble about locking and concurrent operations and everything
was pointless - "batching" all queries and running them one after the
other didn't slow down the system and all problems about locking
and stuff were gone.
Sometimes I really wonder if todays heavily optimized x86-CPUs
really run faster than a overclocked 6502 CPU (CBM 64) running
on 3 GHz ;-)
@Sam - insightful stuff. Another thing to keep me from sleeping at night.
David B on August 28, 2008 10:29 AMIs it possible to somehow filter out suggestions like "Use MySQL" or "Switch to Linux" or "Oracle is the best!" or "Microsoft rules!" or "I love Macs!" or whatever. I don't mean just from this blog's comments, I mean from life in general.
Tippy Hedren on August 28, 2008 11:21 AMSimple solution: Change to Firebrid
Cornie on August 28, 2008 5:04 PMAssuming this is not an April Fool's type joke blog entry: the author states "I'm no DBA" and then proceeds to give DB related advice? Seriously? Even to the point of saying "But in practice adding nolock to queries that you absolutely know are simple, straightforward read-only affairs never seems to lead to problems". Never *SEEMS* to lead to problems? Really? These are to be taken as trustworthy statements? And the conclusion from the Google search on SQL Server vs MySQL deadlocks did not lead to a conclusion that twice as many (at least) companies use SQL Server vs MySQL? (and that doesn't even take into account the fact that MySQL has only supported transactions for a relatively short time in comparison to SQL Server). And no mention of the increased TempDB activity trade-off when using snapshot isolation?
I am very glad that this guy doesn't work for my bank. And, developers like this (no matter how good they are at application code and/or architecture) really do help keep DB folks like myself very well paid. Bless his heart ;-).
PS. with regards to one of the responses above (from wds): this is not an insult to the author's intelligence but rather a response to the dismissive attitude that databases are so simple that anyone who is a good developer is automagically qualified to do data modeling, solve complex DB problems, and even give DB-specific advice.
You need to look at what else is happening in the transactions that are deadlocking, its the transactions that hold the locks, not the individual SQL statements.
As Jasmine said above, for a deadlock to occur one transaction must have locked resource A and then tried to lock resource B, then the second transaction must have locked B, and tried to lock A, causing the deadlock.
In the case of read-write locks, it's also possible that the both transactions have (non-exclusive) read locks on B and one has a (exclusive) write lock on A.
Once you have understood this, you can solve the problem using a technique called "Structural Deadlock Avoidance". It's pretty simple, but it requires a bit of hard work and discipline. It works like this:
1) assign a unique number to each resource that you want to lock. E.g., resource A = 1, resource B = 2.
2) when locking multiple resources, always lock the one with the lowest number first, then the next lowest, .... E.g., lock A first, then lock B.
If you do this you can see that you'll never deadlock, because you always lock resources in the same order.
This technique works with any kind of locking construct, object locks in Java or .NET, file locks, semaphores, or database locks.
In a database the resources you number are tables. E.g., TABLE A = 1, TABLE B = 2, ... then you order your transactions like this:
Transaction 1:
BEGIN
SELECT ... FROM A ...
UPDATE B SET ...
COMMIT
Transaction 2:
BEGIN
UPDATE A SET ...
UPDATE B SET ...
COMMIT
I've been using this technique in a variety of environments for more than 15 years. As long as I don't take shortcuts, I never have deadlocks. Hope it helps you.
One of the 'underrated' causes of deadlocks is the absence of covering indexes (i.e. non-clustered covering indexes in the SQL Server world). Grossly simplfying, reads (SELECTs) that can be satisfied wholly from a covering index, will not require an extra lookup into the clustered index (which is predominately where the locking for updates or inserts will be occurring)
Mitch Wheat on August 29, 2008 7:29 PMHoly shit! These comments (inc. Jeffs original post) display the staggering amount of ignorance there is about SQL Server.
It's truly scary - A good 60-70% of comments are clueless, particularly the ones just saying use NOLOCK. Oh dear. Makes me realize how lucky I am to have a good understanding of the product. Conversely it makes me realise why so many database applications are f&cked!
If you actually understand how to program SQL Server its one of the best performing DBMS in the world, not to mention to easiest to develop for.
JackU on August 30, 2008 7:40 PMThere is a fundamental difference between database dead-lock and 'the Dining Philosophers puzzle' dead-lock you would see in concurrent programming using 'lock'.
The deadlock described is 'Philosopher' is a state at which none of process can continue because circular dependency. ( that is until you force one to give up ). This happens when a process A has lock on X and requires lock on Y and a process B has lock on Y and wants to lock on X ) There is an inherit resource inter-dependency conflict.
In the database, 'dead-lock' can happen when a rough process can lock the table - there is no lock on two resources. 'select * from table' may look innocuous but if it does not release lock on the table, any subsequent 'update' statement may cause 'dead-lock'. It really depends on the database to determine which statements to terminate. In DB2 case, it was 'update' statement but in MS-SQL it may be 'select *'.
'select *' is not innocent. If your program holds to the result set and iterate over very 'slowly', other operation that require 'write' lock maybe be dead-lock with 'select *'. It is just resource contention and no circular dependency. This problem was very self-evident on my case with DB2. The developer was doing 'select *' from very large table using DBArtisan. DBArtisan continues to hold on to the result set thus the table is locked out, causing any subsequent update to fail.
Detecting 'dead-lock' is _very_ difficult. Your first instinct ( 'SELECT *' cannot cause this ) may lead you to wrong path. I hope you find your cause soon.
Good luck.
I like Dave and his writings. But, for the life of me I can not understand why anyone would want to run Windows SQL Server when they can just as easily Install and Run MySQL.
MySQL can run on a Windows Server. I am not sure if .NET/ASP.NET has API's for MySQL, but if it does, switching to a better DataBase is what I would recommend.
I run 4 sites that have massive reads and writes and Im running a very old MySQL version (3) and have never had any problems. Other than when the server crashes, for whatever reason, and even than all I have to do is run the repair tool to reconstruct any broke record that may have corrupted.
Matt K. on August 31, 2008 4:22 PMBased on the philosopher example, simple switching is all you need
http://www.dragonlasers.com
Sod
Jeff's point in one of his followup posts is the critical one: the DBMS should, at user's option, be able to handle deadlock automatically, without user intervention. We are willing to accept some epsilon-level error on one side (dirty read, error in time synchronization, whatever) in exchange for not having our query barf on us for reasons created by the specific circumstances, i.e., someone else is competing with us for scarce resources.
In situations where code must be of very high reliability (larger organizations like mine), deadlock failures are simply unacceptable, period. The fact that they can arise through no fault of one's own query is a huge Achilles' heel to basic SQL.
The other Achilles heel, btw, is the ability of a single client with a bad query to bring the server to its knees, also a simply 100% unacceptable event. I imagine both of these server weaknesses can be tuned out by a skilled DBA, but I certainly don't know how, and we see these kinds of failures all the time.
John Pirie on September 1, 2008 6:42 AMJust a comment on deadlocks and the "with (nolock)" option - That basically says that you don't care what state the data is at. That is blocked by default because it is dangerous. Information that is "dirty" might be about to be undone. You can't seriously say that this doesn't matter because it isn't a banking app, because ALL information that is being read from the database is being read for a reason. Information is acted on, either by a computer or by a user (even if that just means displaying to a user so they can make decisions). Even somethign as simple as a date of a post: "Oh - I won't respond to that post because it was made last year." or "That concept is outdated... no - hang on, they just posted today."
Effectively, you are allowing decisions to be made based on information that will be wrong some of the time. That doesn't make sense. Seriously - why are we bringing back information from the database if no decisions are based on that information? Useless information would never need to be retreived.
SOMETIMES I will use "with (nolock)" - especially when I know the information I am returning can't be changed (one part of a row may be static, and another part may change). Also - if the "decision" being made by the information is allowed to be wrong some of the time (such as general monitoring processes), then I will also use it.
BUT - I agree with others that it is how you are trying to acheive your result that is the issue. A deadlock is a bit more complicated than 1 process trying to get access to a resource while another already has it. A deadlock is 2 processes trying to get to eachother's resource at the same time, but neither will let go of their resources until they complete. So your problem isn't with the Select, your problem will be that you have TWO updates. SQL Server only killed one of them, which is the one you had reported to you. Seriously - do a profile to determine what is running at the time the deadlock occured, and it will be clear the processes that are clashing (I use Lumigent Log Explorer, and it makes it very clear which resources are in conflict).
As for the select:
SELECT *
FROM [Posts]
WHERE [ParentId] = @p0
If you had not used an asterisk (which is a BIG NO NO) AND indexed the information you were after, you may have found that no block due to the lock would have occured at all. That is, SQL Server 2005 on allows you to include data at the bottom page of the index. Once the index has been used to find the data, it can then return the data to you without reading from the actual table!!! If this data is not being edited, the index should not be locked. So that performing this query:
SELECT [UserName], [Topic]
FROM [Posts]
WHERE [ParentId] = @p0
Stting the ParentID's Index to have username and topic columns included "with" the index (look up the TSQL on this to see how this is created). This would locate your information FROM THE INDEX and never need to touch the data, and never need to even know the row was locked (take that MYSQL).
As for database technology (others have commented on this):
I have done speed comparisons, and for small apps I agree that MySQL is OK. But for larger apps Microsoft SQL Server out-strips MySQL like a Car vs a scooter. Then there is the power of Microsoft SQL Server. Sure MySQL is getting better, but there is nowhere near the programming ability. So it is still horses for courses - and as long as the SQL is iso standard, a user can still upgrade from MySQL to SQL Server. However, my personal thoughts are that using SQL Server throughout is a prefered option due to scalability, power and it is the tool our developers are most familiar with..
I have seen something similar to this before. As another poster mentioned, if the select or the write requires a table scan then that will require a table lock and they can conflict. However, just adding an index may not help - and I would indeed assume that you already have that.
Using parameterised SQL statements (as in the shown code) can cause SQL to not use indexes, thus causing table scans, which can cause table locks.
So, if you do the thing about declaring a variable and then using that variable in a SQL statement, then the optimiser may choose not to use an index. I believe something similar is true for prepared statements - sometimes.
Because the database is small and all in memory, you are unlikely to even notice that table scans are happening.
You need to use the SQL Profiler with more details, combined with the Performance Monitor for checking for number of locks and number of table scans. Then try looking at the "estimated execution plan" in Query Analyser - using variables/prepared statements rather than substituting them.
Wasn't the diner a drive system for some ship in one of the Hitchhiker's Guide to the Galaxy books?
Debt Consolidation on September 2, 2008 12:31 PM"Using parameterised SQL statements (as in the shown code) can cause SQL to not use indexes, thus causing table scans, which can cause table locks."
On analysis, I have never found this to be true. At least not since SQL Server 2000.... But this is a very Key issue - you always need to profile your system. How things work depend on your hardware, your data, your data structures and indexes.
Philip on September 2, 2008 5:06 PM"Using parameterised SQL statements (as in the shown code) can cause SQL to not use indexes, thus causing table scans, which can cause table locks."
@Fran, this might have been true before "parameter peeking" was introduced in IIRC MSSQL7, or if you way out of date statistics.
My experience is that SQLServer is quite aggressive on using indexes for equals / like% queries (as you would expect) but it is unlikely to use indexes when using a range scan or if you are issuing SELECT * which means both the index and the table row records need to be read.
As always, don't ASSUME anything. If you want to be sure then design a test case and PROVE it. (I did this on a StackOverflow thread and surprised myself!)
Guy on September 2, 2008 9:56 PM@Chris Not sure if you're coming back to read this, but anyway, I just wanted you give you a thumbs up for your comments. I'm glad to see that someone has a different perspective on things. You also inspired me to do a little rant on "stupid" questions (hope you don't mind me quoting you):
http://blog.looplabel.net/2008/09/03/please-ask-stupid-questions/
(Sorry about hijacking your post, Jeff, but he didn't leave any contact information.)
Anders Sandvig on September 3, 2008 9:39 AMOne has to be careful while using the (nolock) hint in SQL Server due to the fact that it is possible for the query to not return all the expected results!
I have spent many days trying to figure out why a SSRS report never returned consistent results nor match information out of Microsoft Dynamics AX (an ERP system with SQL Server as a back end). This was due entirely to the (nolock) hint causing SQL Server to not return records that had a lock on them (hence the same query wouldn't always return the same rows). You need to SET ISOLATION LEVEL to READ UNCOMMITTED, or use the SNAPSHOT isolation level you mentioned in your post.
One thing of note with READ COMMITTED SNAPSHOT, it uses tempdb to create the snapshot before taking out the lock, so your tempdb files should be on an appropriate disk setup as it is being used more often.
Questions: http://www.hidesato.com/contact/ (until I can get a contact type form setup on my personal website - http://siliconsea.net/)
Redbeard on September 3, 2008 11:14 AM"The dining philosophers problem has been coined a classic synchronization problem either because it's practically inapplicable nor does computer scientists dislike scientists..."
-I read it in an operating systems book
Allan on September 4, 2008 12:56 AMFrom blog.stackoverflow.com
" We figured out our deadlock problem, finally. It had to do with Log4Net being called during a database operation, itself triggering another database operation and log event. This is normally OK, but the events were happening so close together that it resulted in a deadlock. "
SQL Server still works fine. Programmers still blame their platform for their mistakes.
Henri on September 5, 2008 7:26 AMClassic!
After all the whaling and slagging off SQL server the issue wasn't with SQL Server at all!
For fun go back through all the clueless comments and laugh at stupid attacks on SQL Server which have shown again to be unfounded.
Ignorance is truely bliss for most developers.
Jeff should write a follow up post, about being careful not to jump to conclusions when you don't understand the problem/software/library.
JackU on September 5, 2008 7:25 PMJeff said - "We figured out our deadlock problem, finally. It had to do with Log4Net being called during a database operation, itself triggering another database operation and log event. This is normally OK, but the events were happening so close together that it resulted in a deadlock."
JackU on September 5, 2008 7:28 PM@JacKU - "being careful not to jump to conclusions when you don't understand the problem/software/library."
I agree. You can easily come to an incorrect conclusion even if you follow a very logical path. I find the forums really useful. How many problems/errors/issues do you come across where nobody in the world has ever come across that issue before in any form? Not very often. Being part of on-line forum communities is becoming essential because "woops" problems like this just shouldn't happen if propper communication through forums is occuring.
We can't know everything, but everyone can know something.
Philip on September 7, 2008 8:54 PMThe article as well as the discussion just confirms what I encountered in my professional experience - many experienced application developers don't have a good understanding of the most basic DBMS properties: ACID, concurrency control, serialization levels... All of this I learnt at uni, can't remember if it was second or third year.
The underlying logical and theoretical essentials should take no more than a dozen hours to comprehend. RDBMS vendor-specific features are a few hours on top. What's everyone's excuse for not investing such relatively insignificant amounts of time?
Jeff's having problems with a simple web app. The scary part is that there are much more demanding apps designed by similarly ignorant engineers.
PS: A simple clue for raising DBMS-ignoramus alerts - look out for the word "persistence" (e.g. "our persistence layer").
Andrew from NZ/SG on September 9, 2008 6:06 PMOne of the best ways to motivate programmers to work is to have a clean and beautiful codebase.
it's a great investment. An opensource product with less features but beautiful code is preferable for a team to customize, I think.
Greg on September 12, 2008 9:08 AMMost DBAs don't know programming so don't answer these questions correctly (which is why I don't hire DBAs, becuase 95% are worthless). A few facts I'm surprised no on mentioned:
1. In the same database connection, you CANNOT block yourself. I've see lots of people chase their tails.
2. Many deveolopers use multiple database connections. This allows them to block themselves in the same "logical" set of code. If they simply reused connections, blocking can go away.
3. Try to always read and update your tables in the same order. This is old school, but just about no developer or DBA enforces these rules anymore.
4. Using NOLOCK in ALL instances is just plain wrong. Use it when you know it doesn't matter...like all things, you've gotta know what you are doing to do it well.
I encountered the same problem last week. Whenever two threads ran the transaction simultaneously, there occurred a deadlock. After reproducing the error and finding no solution, I placed a lock in the c# code so that no two thread execute that transaction at the same time. This eliminated the problem but at the cost of reduced parallelism. The idea of retrying could have backfired as there could have been an issue of thread starvation. Any comments or a better idea?
Umer Azaz on September 16, 2008 11:14 AMJeff,
There isn't any no locks option in Oracle because readers don't block writers. (and Oracle does not support dirty reads because there is no reason to do so) Not all RDBMSs work the same way. This is a prime example of that.
Jim
jim on September 17, 2008 8:37 AMJust because dirty reads don't exist doesn't mean that readers don't block writers. Dirty reads mean that readers aren't blocked by writers. Readers DO NOT block writers in SQL Server.
The one exception in SQL Server where readers block writers is when the reading is done within a transaction. That is - if a row is read in a transaction it is considered part of the transaction, but that is logical.
You wouldn't want to start a withdawal proces, read the balance, proceed with the transaction only to find that the balance was updated during the final part of the transaction! Instead, the second process trying to reduce the balance shoul wait until this transaction finishes.
I'm new to PHP and recently setup my local machine with PHP and MySQL for doing development. I was sort of stuck when I needed to post my work for the user to test and review. After looking around a bit I found a site that hosts PHP and MySQL apps. I was surprised that it was free - it seems they're offering the service at no cost until 2012. At that point they'll change over to a fee-based service. However, in the meantime, it's a great place to do anything from demo and sandbox right up to posting sites for real.
Their pitch is as follows:
"This is absolutely free, there is no catch. You get 350 MB of disk space and 100 GB bandwidth. They also have cPanel control panel which is amazing and easy to use website builder. Moreover, there is not any kind of advertising on your pages."
Check it out using this link:
http://www.000webhost.com/83188.html
Important: There's one catch in that you must make sure you visit the account every 14 days - otherwise the account is marked 'Inactive' and the files are deleted!!!
Thanks and good luck!
Jeff,
We had a similar problem where deadlocks were happening, and we found that it all came down to the MaxDop (Maximum Degree Of Parallelism) setting.
We have 90GB databases running super fast with no drop in performance by setting this value to 1. In fact we recommend it now. We also get 0 dead locks as a result.
I would definately try this.
Adam
I like your picture,it's funny.
green laser pointer on October 29, 2008 7:31 AM| Content (c) 2009 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved. |