Programmers* love to futz around with thread priorities. As if programming with threads wasn't already dangerous enough, we've got to get in there and tweak thread priorities to make things run.. er.. "better".
Let's fire up Task Manager and take a quick survey of process priorities. Out of 38 processes running on my computer right now, I have 0 at low priority, 36 at normal priority, and 2 essential system processes (csrss and winlogon) running at high priority.
I bet almost every process on your machine is running at a base priority of "Normal", too. And there's a very good reason for this.
Witness K. Scott Allen's strange threading experiment:
This program behaves badly on a single processor machine, and pegs the CPU at 100% for over two minutes. On a multi processor machine, the program finishes all the threading work in the blink of an eye - only a brief CPU spike.
Strangely, if I remove a single line of code:
t.Priority = ThreadPriority.BelowNormal;
… then the program performs just as well on a single processor machine (only a brief spike - comparable to the multi processor scenario).
This little threading demo highlights one of the reasons a dual-core computer is so desirable -- it protects you from poorly written programs. If a program goes haywire and consumes 100% of CPU time, you still have a "spare" CPU waiting to pick up the slack, whereas a single processor machine becomes totally unresponsive. That's why Task Manager itself runs at High priority-- so you can pre-empt these kinds of runaway apps.**
Hardware fixes to software problems are never pretty. What's really going on here? Joe Duffy is something of an expert on the topic of threading and concurrency-- he works for Microsoft on CPU-based parallelism in the .NET Common Language Runtime-- and he has this to say:
Messing with [thread] priorities is actually a very dangerous practice, and this is only one illustration of what can go wrong. (Other illustrations are topics for another day.) In summary, plenty of people do it and so reusable libraries need to be somewhat resilient to it; otherwise, we get bugs from customers who have some valid scenario for swapping around priorities, and then we as library developers end up fixing them in service packs. It's less costly to write the right code in the first place.
Here's the problem. If somebody begins the work that will make 'cond' true on a lower priority thread (the producer), and then the timing of the program is such that the higher priority thread that issues this spinning (the consumer) gets scheduled, the consumer will starve the producer completely. This is a classic race. And even though there's an explicit Sleep in there, issuing it doesn't allow the producer to be scheduled because it's at a lower priority. The consumer will just spin forever and unless a free CPU opens up, the producer will never produce. Oops!
The moral of the story? [Thread] priorities are evil, don't mess with them.
Although there are some edge conditions where micromanaging thread priorities can make sense, it's generally a bad idea. Set up your threads at normal priority and let the operating system deal with scheduling them. No matter how brilliant a programmer you may be, I can practically guarantee you won't be able to outsmart the programmers who wrote the scheduler in your operating system.
* and users who think they're programmers
** assuming the runaway app itself isn't running at High priority, in which case you're in a world of hurt.
Jon,
Not only do media players (not just windows media player) run at a higher priority, they also set the timer resolution to ~1 msec. This happens for the entire system not just the application that set the resolution.
I do have to disagree with some points of this thread (excuse the pun). Threads that use 100% CPU are usually indications of poorly written threads. In my experience, this occurs when the code is something like:
while (oktorun)
{
    DoSomeWork();
}
In order to stop this thread, oktorun gets set to false. Bad, Bad, Bad.
As Jon points out, one solution to this is as follows:
while (oktorun)
{
    DoSomeWork();
    Sleep(1); // give up the rest of the time slice
}
But, this is really not right either. IMHO, the Right Thing is as follows:
while (TRUE)
{
    ::WaitForSingleObject(blah, sometimeout);
    DoSomeWork();
}
Then, when you need to shut down the thread, set the shutdown event. You could use WaitForMultipleObjects as well.
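For readers who want to try the "wait on an event instead of polling a flag" pattern outside Win32, here is a minimal sketch in portable C++. It is not the commenter's code: `PeriodicWorker`, its `interval` parameter, and the `iterations()` counter are hypothetical names introduced for illustration, and a `std::condition_variable` stands in for the Win32 event. The assumption is that the wait times out once per work interval and returns early only when shutdown is signalled.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: a worker that blocks between units of work
// instead of spinning on a flag.
class PeriodicWorker {
public:
    explicit PeriodicWorker(std::chrono::milliseconds interval)
        : interval_(interval), thread_([this] { run(); }) {}

    ~PeriodicWorker() { stop(); }

    // Signal the "shutdown event" and wait for the thread to exit.
    // Safe to call more than once.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            shutdown_ = true;
        }
        wake_.notify_all();
        if (thread_.joinable()) thread_.join();
    }

    int iterations() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return iterations_;
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        // wait_for returns true as soon as shutdown_ is set; otherwise it
        // returns false after the timeout and we do one unit of work.
        while (!wake_.wait_for(lock, interval_, [this] { return shutdown_; })) {
            ++iterations_;  // stand-in for DoSomeWork()
        }
    }

    const std::chrono::milliseconds interval_;
    mutable std::mutex mutex_;
    std::condition_variable wake_;
    bool shutdown_ = false;
    int iterations_ = 0;
    std::thread thread_;  // declared last so the other members exist first
};
```

Constructing a `PeriodicWorker` starts the thread, and calling `stop()` (or letting the object go out of scope) wakes it immediately rather than waiting for the next poll. While it waits, the thread is blocked, so the scheduler has no reason to run it.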
How about setting the thread priority to realtime in Task Manager, that can be fun...
Doogal on August 31, 2006 2:19 AM
I sometimes set a conversion process to high priority because I need it to speed up. Then I regret it because it takes up so much that I can't lower it again.
[ICR] on August 31, 2006 2:21 AM
One poster has already mentioned this but I think it's important enough to mention again. The problem is NOT with thread or process priorities, it is a problem with the Windows scheduling algorithm.
If my memory serves me (lately it often seems to have a different master it serves), Windows assigns all time slices to high priority processes UNTIL all are blocked and then it assigns time to normal processes until all those are blocked and then low prio processes get their shot. This is discussed in Advanced Windows by Jeffrey Richter (from Microsoft Press no less). Why MS built their AOS (Almost an Operating System) this way I have no idea.
Any decent textbook on operating systems (like the ones I read in my undergrad degree before Windows was a twinkle in Gates' eye) will indicate that you shouldn't completely starve lower priority tasks; simply allocate a much smaller pool of time slices to them than you allocate to higher priority tasks. Of course, this would not be what you would want to do in a real time system but I trust nobody is putting my life at risk by running a real time system on Windows.
Tim Dudra on August 31, 2006 2:28 AM
Coleman: If the scheduler is doing its job, why would you need to constantly "give up" a time slice inside the loop? That's a great way to introduce unnecessary context switching.
Attempting to usurp the responsibilities of the OS is usually not a good idea.
Now, if you were to argue that the scheduler is doing a poor job, that would be another thing entirely..
Don't confuse thread priorities with process priorities. Threads inside processes can be scheduled with different priorities again.
But I do find priority tweaking useful; to down-schedule cpu-intensive, long-running tasks such as archiving and video encoding, so I can continue to use my desktop as usual. On a task that takes 15 minutes, I don't mind waiting the extra minute if I can continue doing other stuff in the meantime.
An application that fits that bill has my blessing to set itself to BelowNormal priority. Otherwise, I haven't found a good reason to mess with priorities yet, either.
RiX0R on August 31, 2006 2:44 AM
Have a look at the code implementing all those nice UI animation effects, e.g. auto-hiding windows and such.
AFAIK, they all set the UI thread to high-prio, to have that smooth and steady feel to it. It's only temporary, unless it somehow goes haywire, say by some incoming windows message.
Henry Boehlert on August 31, 2006 3:15 AM
I'm glad you started this topic as I've always wondered why pre-emptive multi-tasking doesn't seem to work as I would expect it to.
Why should one task grab 100% and freeze out the UI and everything else?
We have an IIS server and if one page is busy for a long time why on earth should it freeze everyone else out and slow the whole server to a crawl?
I don't understand why processing power is not spread equally between the other processes - I always assumed that is what PRE-EMPTIVE multitasking was supposed to do when it was launched in XP after Windows 95 et al.
There doesn't seem to be any other resource blocks BTW...besides even if there was the I/O should also be shared shouldn't it??
Steve Hurcombe on August 31, 2006 4:14 AM
Quite a predictable result. I think the problem is rather with the design of inter-thread communication, not with priorities. The consumer should actually _wait_ for the producer. In this case the scheduler not only can free CPU cycles by not giving them to the consumer, but it can also boost the producer's priority (because a high-priority thread waits for it).
Iggi on August 31, 2006 4:19 AM
When I first had access to the task manager I set my DVD player to a higher priority as it was stuttering. Whoops.
Took over my UI thread, and so I couldn't move the mouse to hit 'play'.
:S
Ian Tyrrell on August 31, 2006 4:26 AM
Grant Johnson, Tim Dudra, anyone else criticising the Windows scheduler: you're wrong. Windows does indeed run threads by priority. It is fully pre-emptive, but that word means something different from what you think it does. Pre-empting means that when something occurs that makes a thread runnable at a higher priority than the thread currently executing on the current, or any, processor, the current thread's run is ended early (with the scheduler noting how much of the quantum has been used, the thread getting the rest of its quantum next time) and the higher-priority thread begins its run immediately, rather than having to wait until either the next timer tick or the end of the current thread's quantum. The pre-empted thread gets pushed onto the front of the queue at its priority level, so it will be the next thread to run once all higher-priority threads have blocked (see next paragraph).
Windows' scheduler is a round-robin priority-based scheduler. What that means is that it will take into account the priority of all runnable threads on the system. It runs all threads at the highest priority level at which there _are_ runnable threads in a round-robin fashion - running each thread in turn. It will only start to consider lower-priority threads when those higher-priority threads are no longer runnable, which basically means when they're blocked waiting for something to happen. The exception is when it has more CPUs than high-priority threads - in this case it will find work for the otherwise-spare CPUs to do.
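The rule in that paragraph can be made concrete with a toy model. This is only a sketch of the described policy (strict priority between levels, round-robin within a level) on a single CPU; it ignores quanta, boosts, affinity, and the balance set manager, and every name in it (`ReadyQueues`, `simulate_cpu_bound`) is made up for illustration. It assumes priorities are in the range 0 to 31.

```cpp
#include <array>
#include <cstddef>
#include <deque>
#include <vector>

constexpr int kPriorityLevels = 32;  // mirrors the 0-31 range

// Toy ready queue: one FIFO of thread ids per priority level.
class ReadyQueues {
public:
    // Precondition: 0 <= priority < kPriorityLevels.
    void make_runnable(int thread_id, int priority) {
        queues_[static_cast<std::size_t>(priority)].push_back(thread_id);
    }

    // Return the thread at the front of the highest non-empty level,
    // or -1 if nothing is runnable.
    int pick_next() {
        for (int level = kPriorityLevels - 1; level >= 0; --level) {
            std::deque<int>& q = queues_[static_cast<std::size_t>(level)];
            if (!q.empty()) {
                int id = q.front();
                q.pop_front();
                return id;
            }
        }
        return -1;
    }

private:
    std::array<std::deque<int>, kPriorityLevels> queues_;
};

// Give out `quanta` time slices to threads that never block.
// Thread i has priority priorities[i]; the result counts slices per thread.
std::vector<int> simulate_cpu_bound(const std::vector<int>& priorities, int quanta) {
    ReadyQueues ready;
    for (std::size_t id = 0; id < priorities.size(); ++id)
        ready.make_runnable(static_cast<int>(id), priorities[id]);

    std::vector<int> slices(priorities.size(), 0);
    for (int i = 0; i < quanta; ++i) {
        int id = ready.pick_next();
        if (id < 0) break;  // nothing runnable
        ++slices[static_cast<std::size_t>(id)];
        // The thread is still runnable, so it goes to the back of its level.
        ready.make_runnable(id, priorities[static_cast<std::size_t>(id)]);
    }
    return slices;
}
```

With two never-blocking threads at priority 13 and one at priority 8, the two high-priority threads split every slice and the third gets none, which is the starvation the thread is arguing about.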
Multi-CPU (or {logical} core) scheduling is more tricky since Windows has both an affinity mask which governs which cores a thread is allowed to run on (which is also misused!), an 'ideal' core for each thread (set when the thread is created, rotating across the cores within a process, so thread 0 might be on CPU 0, then thread 1 will be on CPU 1, thread 2 on CPU 2, then 3 on CPU 0 again, and so on), and remembering the last core that a thread ran on. These last two settings are a hint to try to improve cache locality - with luck the thread's code or data may still be in the core/processor's cache. You might find that the two highest priority threads on a two-core system wouldn't actually both run concurrently if they have the same ideal core and there's a runnable lower-priority thread which has the other core as its ideal core.
Windows stores two priority values: a base priority, and a dynamic priority. The scheduling is done based on the dynamic priority - the base priority is a low water mark below which a thread will never go. When a thread that was blocked becomes runnable, its dynamic priority is boosted - the amount it is boosted by depends on the reason it becomes unblocked. For disk I/O it's boosted by 1, for network or serial port activity by 2, for keyboard and mouse input by 6, and if the sound card needs more data, by 8. If it was waiting for an event or a semaphore, it's normally by 1. There are special functions which can cause a higher boost for events - the event used in a critical section boosts the unblocking thread to the priority of the thread which left the critical section plus one. If a thread unblocks due to GUI activity its priority is boosted by 2.
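The boost amounts listed above fit in a small lookup. The sketch below uses only the numbers from that paragraph; the enum and function names are invented for illustration, and two details are assumptions not stated in the comment: that a boost never lifts a thread above 15, and that threads already in the real-time range (16 and up) are not boosted at all. The special critical-section case is left out.

```cpp
#include <algorithm>

// Why a blocked thread became runnable (names are illustrative).
enum class WakeReason {
    DiskIo,
    NetworkOrSerial,
    KeyboardOrMouse,
    Sound,
    EventOrSemaphore,
    GuiActivity
};

// Boost amounts as quoted in the comment above.
int boost_for(WakeReason reason) {
    switch (reason) {
        case WakeReason::DiskIo:           return 1;
        case WakeReason::NetworkOrSerial:  return 2;
        case WakeReason::KeyboardOrMouse:  return 6;
        case WakeReason::Sound:            return 8;
        case WakeReason::EventOrSemaphore: return 1;
        case WakeReason::GuiActivity:      return 2;
    }
    return 0;  // unreachable for valid enum values
}

// Dynamic priority right after the wake-up.
// Assumptions: boosts are capped at 15 and real-time threads get none.
int boosted_priority(int base_priority, WakeReason reason) {
    if (base_priority >= 16) return base_priority;
    return std::min(15, base_priority + boost_for(reason));
}
```

Under these assumptions a normal thread at base priority 8 that wakes on keyboard input runs at 14, which is why interactive threads tend to feel responsive without anyone setting a priority by hand.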
Joe Duffy mentions the balance set manager thread which wakes up once per second for various management duties. One of those is to look for threads that are runnable but haven't run for 4 seconds. Such a thread temporarily gets boosted to priority 15 (the highest non-real-time priority) then, after its quantum expires, drops back to its base priority. The balance set manager only examines 16 threads on the ready queue at a time, and will only boost 10 threads in any pass, so if there are a lot of runnable threads, it may take some time for any given thread to get its boost.
Full details are in 'Windows Internals, Fourth Edition' by Mark Russinovich and David Solomon, which I recommend to any programmer. If abstractions leak, you'd best understand the leaks.
Windows hasn't had the concept of 'yield' since Windows 3.1. 16-bit applications do still act this way with each other on Windows NT-family systems but the whole 16-bit environment is pre-emptable against 32-bit applications.
The main problems are threads which poll rather than blocking, and applications which are thread-happy. Polling is brain-dead. The operating system offers numerous ways to wait for changes to occur, and they should be used. Polling wastes laptop batteries, because the CPU cannot be put into a low power state, and often causes disks to be kept spinning because the polling thread's code and data may be pulled back into memory if trimmed from the working set.
A thread-happy application is one which creates, and keeps runnable, more threads than can be actually serviced by the system's processors. The system spends its time thrashing the processor's cache rather than doing useful work. Put one of these at high priority and suddenly very little else will get done.
Mike Dimmick on August 31, 2006 6:39 AM
Mr. Dimmick,
You wrote: "Windows' scheduler is a round-robin priority-based scheduler. What that means is that it will take into account the priority of all runnable threads on the system. It runs all threads at the highest priority level at which there _are_ runnable threads in a round-robin fashion - running each thread in turn. It will only start to consider lower-priority threads when those higher-priority threads are no longer runnable, which basically means when they're blocked waiting for something to happen. The exception is when it has more CPUs than high-priority threads - in this case it will find work for the otherwise-spare CPUs to do."
Isn't this _EXACTLY_ what I said, although admittedly from a different angle? If a high priority thread is ready to run (i.e., they are not ALL blocked), then Windows runs it, pre-empting if necessary any lower priority threads. It will then run that high priority thread EXCLUSIVELY until it is joined by another high priority thread that can be run (in which case the two or more of them share all the time quantums) or it (and any that joined it) gets blocked. Six of one, a half dozen of the other. I say Potato, you say Spud. We need to practice our critical reading skills a little before jumping out and practicing our critical writing skills.
Whether or not you believe your explanation to be more correct due to some bizarre bit of subtlety, the scheduling algorithm is absurd because it can and does result in starvation of lower priority threads, threads that may be necessary to move the high priority threads onward instead of just cycling - and we know what happens with cycling, especially American cycling; drugs, lots of drugs and, as Mr. Mackey says "Drugs are bad, don't do drugs".
I agree with Iggi here. The correct moral should be that *spinning* is evil, not thread priorities. The OS provides sync objects to handle this situation. They will even handle the priority inversion by boosting the lower thread's priority.
Code should never spin unless the duration is so short that its quicker to spin (and this is enforced with a timeout), or your code owns the CPU (embedded and/or realtime programming).
T.E.D. on August 31, 2006 8:48 AM
That's funny -- I have to run a CPU graph monitoring tool on OS X so I can tell if a process is pegging the CPU (easy to do these days on a 700MHz iBook). If it wasn't for the graph, it would take me a while to realize the machine was running slow, because the scheduler on this platform apparently actually works.
Except I've pegged the CPU under windows and not had as much trouble as people are indicating, so I wonder if paging is the real root problem here. When a process takes over all the RAM and has to page to disk, it gets hung on the I/O. So the scheduler tries to run another program, but since all the RAM is taken over, that has to reload its working set from disk too, so it hangs on paging as well. ...and so on. *Everything* starts hitting paging, and everything grinds to a halt.
Ethan on August 31, 2006 9:16 AM
I have experienced the same thing Ethan did. On the Mac sometimes you "feel" something slow, check the Activity Monitor and boom.. a process is taking up 100% cpu and not responding, but the whole OS keeps working.
And this is on my "little" G4 1.25 GB Powerbook with 512Mb RAM.
It's really amazing. :)
Martin Marconcini on August 31, 2006 10:23 AM
Don't confuse thread priorities with process priorities. Threads inside processes can be scheduled with different priorities again.
Yes, I know, but they are functionally similar. Thread priority is actually a 16-bit number from 0-31 (where 0 is reserved for system use). Here's a page that describes the relationship between application priority and thread priority in Windows:
The names are process priority, and each number goes from highest thread priority (Real Time) to lowest (Idle)
Real Time -- 31, 26, 25, 24, 23, 22, 16
High -- 15, 15, 14, 13, 12, 11, 1
Normal -- 15, 10, 9, 8, 7, 6, 1
Idle -- 15, 6, 5, 4, 3, 2, 1
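Read as a grid, the table maps a process priority class and a relative thread priority to one base priority number. The sketch below encodes it as a lookup. The column labels (`TimeCritical` through `Idle`) are not in the comment and are borrowed here from the Win32 thread priority names as an assumption; the Real Time row uses 16 as its last entry, also an assumption where the table as quoted shows only six values. The Above Normal and Below Normal process classes are omitted because the table does not list them.

```cpp
// Rows of the table above.
enum class ProcessClass { RealTime, High, Normal, Idle };

// Columns of the table above, highest to lowest (labels are assumed).
enum class ThreadLevel {
    TimeCritical, Highest, AboveNormal, Normal, BelowNormal, Lowest, Idle
};

// Base priority (0-31) for a thread at `level` in a process of class `cls`.
int base_priority(ProcessClass cls, ThreadLevel level) {
    static const int kTable[4][7] = {
        {31, 26, 25, 24, 23, 22, 16},  // Real Time
        {15, 15, 14, 13, 12, 11,  1},  // High
        {15, 10,  9,  8,  7,  6,  1},  // Normal
        {15,  6,  5,  4,  3,  2,  1},  // Idle
    };
    return kTable[static_cast<int>(cls)][static_cast<int>(level)];
}
```

For example, the `ThreadPriority.BelowNormal` line from the experiment, run in a normal-priority process, corresponds to `base_priority(ProcessClass::Normal, ThreadLevel::BelowNormal)`, which is 7 against the default of 8.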
I've pegged the CPU under windows and not had as much trouble as people are indicating, so I wonder if paging is the real root problem here.
The problem is usually some other I/O bottleneck. Like a dual-core 3GHz monster rig brought to its knees by a single unreadable CD-ROM, for example.
The correct moral should be that *spinning* is evil, not thread priorities.
There may be other fixes, but manually setting a thread priority is a recipe for problems. The OS itself rarely sets explicit priorities, and for good reason-- you should rarely need to!
Jeff Atwood on August 31, 2006 10:46 AM
T.E.D., I respectfully disagree with your statement that "[t]he correct moral should be that *spinning* is evil, not thread priorities."
If you are writing LOW-LEVEL concurrent algorithms, spinning is often a requirement for good throughput/parallel speedups. A context switch is just about the worst thing you can do when you're trying to scale up to tens or even hundreds of processors. Not only does a context switch cost in the neighborhood of 4,000 cycles, but it can have a cascading effect on your entire set of threads due to causality, effectively killing your scalability.
Using a simple spin-lock is quite effective in many such cases. The CLR's synchronization primitives build on top of the OS primitives, adding some lightweight spinning up front. We have spent many years of work fine tuning these algorithms across 3 major releases, and countless machine architectures and OS SKUs. The result is that statistically we achieve better performance on multi-processor machines when compared to the same workloads using the OS primitives directly. This is, of course, workload dependent. If code holds a lock for a long period of time or spinning is used to wait for high latency events, then the thread will eventually block anyhow. And therefore, spinning just represents wasted CPU cycles.
OK... voluntarily giving up a time slice is not REALLY the worst thing you can do. A poorly written spin wait is even worse. Only a small subset of programmers should ever need to write one, but those that do need to know how to do it correctly. It turns out that most of the programmers I interact with daily are library developers at Microsoft, which means many of them have to face this head on. And so I admit my perspective may be a little skewed.
Regards, joe
Joe Duffy on August 31, 2006 12:54 PM
jsm
Mike and Tim said it way better than I could have:
"It will only start to consider lower-priority threads when those higher-priority threads are no longer runnable, which basically means when they're blocked waiting for something to happen."
"...when they're blocked..."
This is what sleep and WFSO/WFMO do, make the thread wait so that other processes get a turn. More precisely so that that scheduler will give some time to other processes and threads. If you have a while(true) loop that's running "full tilt" you're never giving the scheduler a chance to give time to other processes -- which will lead to 100% CPU on single CPU systems. This happens whether the thread is high or normal priority.
You asked:
"If the scheduler is doing its job, why would you need to constantly "give up" a time slice inside the loop?"
I say you need to give up a time slice so the scheduler can do its job. So, agree with it or not, giving up the slice is a necessary evil in Windows threading.
Coleman on September 1, 2006 7:56 AM
Coleman: the difference is that other threads actually do get CPU time if the badly-behaved thread is at normal priority. You don't see it because the well-behaved threads have normally woken up, done their work, and gone back to sleep again before their time slice has elapsed, and are therefore using far less than 1% of the CPU time. Process Explorer from www.sysinternals.com can show fractional CPU. It can also show the Context Switch Delta, the number of times this thread was switched to - some threads are doing such tiny amounts of work that they don't even get credited with any CPU time by the thread time accounting.
If the badly-behaved thread is at high priority and the other runnable threads are at normal priority, at the end of every quantum for the badly-behaved thread, the scheduler will look at the priority queues, see that there's no other thread at this priority level and just run the badly-behaved thread again. This is what it's designed to do. It's only when the balance set manager thread boosts the priority of the starved thread that the starved thread will get to run, and it will only get up to one quantum (about 270ms with XP's default settings for the foreground application, 90ms for a background application or a special quantum of 4 ticks = 60ms on Windows Server 2003) before its dynamic priority drops back down and the badly-behaved thread gets full control again.
The scheduler _is_ doing its job. Its job is to run higher priority threads until they block. If the threads don't block, that's a programming error. Unfortunately developers have a tendency to use all the resources of the system for their task, not considering that it may not be the most important task for the user at that time.
A spinlock is meant to spin for a _short_ time. If it spins for as long as a timer tick (15ms) it's spinning too long and should have blocked.
A thread doesn't just block when Sleep or the WaitFor{Single,Multiple}Object{s} family are called. It also blocks when GetMessage is called (PeekMessage does not block and should be avoided where possible). Never use Application.DoEvents - that's a polling function (it calls PeekMessage) and can cause re-entrancy problems as well. Blocking also occurs whenever disk accesses are needed to satisfy a request, either explicitly using ReadFile (or anything on top of that, e.g. fgets, fscanf, StreamReader), or implicitly when paging code or data from the executable or the page file, or from a memory-mapped file.
If you genuinely have a compute-bound operation whose code and data fits in a small area of memory, then there's no need to yield as long as you leave the priorities alone - the scheduler will pick another thread at the same priority to run (the one which ran least recently) when your thread's quantum elapses, if there is a runnable thread. High priorities are for responsiveness where the OS dynamic boosts are not sufficient. I can't honestly think of a reason to use this.
If you find any operation where you're polling a value and the API doesn't offer a blocking version (or a version that signals a synchronisation object), complain to the maker of the API.
Mike Dimmick on September 1, 2006 8:46 AM
coleman:
"If you have a while(true) loop that's running "full tilt" you're never giving the scheduler a chance to give time to other processes ..."
Although that may be true in a cooperatively multitasking OS, it is false for a preemptively multitasking OS. In this discussion, it's the latter we're talking about.
"I say you need to give up a time slice so the scheduler can do its job."
In a preemptively multitasking OS, it is the _operating system_ that forces running threads/processes to "give up" time to other threads/processes (that's part of its job). So no, there is no need for Sleep() to give up time, unless (as has been pointed out) you are misusing priorities. If putting Sleep() inside your loop makes things better in your situation, that is a symptom of a more fundamental problem, not a real solution.
Well here are some comments from a device driver developer, who just saw a preposterous claim on his RSS feed.
Process and thread priorities are there for a reason. They have clear-cut and documented semantics. So does the scheduler. So do cars. We don't condemn cars because some people drink and drive. We don't condemn cars because most people have trouble controlling them at 130mph.
Why should people think that changing the priority of a CPU-hungry app to realtime and thus hogging the system is a more reasonable thing to try than jumping off a car moving at 60mph?
Both actions have well defined results, they are 2 clicks away and the "OS" definitely permits them.
Moreover, why should people think that an OS should magically protect them from anything evil that a clueless app/user/driver might do?
OK, sure, usability and robustness should always improve, but there is common sense as well.
You can blame the OS for a user misusing process priorities as much as you can blame it for allowing the user to delete all her files.
To sum up:
(*) Process and thread priorities are a 100% necessary feature. The design and implementation of the Windows Scheduler is quite satisfactory.
(*) Multithreaded programming is usually much tougher to get right than simple GUI programming, and involving priorities makes it a little more challenging, but nothing that a decent developer can't digest. Decent means a bit of experience and some willingness to read the documentation, rather than mindlessly copy-paste from CodeProject some code they don't really understand.
I would be interested in knowing what OS X and Linux are doing that somehow makes some people happier, and why Windows is not doing it. Anybody know? Obviously, there are pros and cons to every scheme, so nothing is going to be perfect.
I do wish Windows would be a bit more granular about how often it boosts the priority of low-priority starving threads. Instead of getting 20 ms every 4 seconds, I would prefer that the starving threads got 1 ms every half-second or so. Much easier to interact with the system that way. However, this does mean the system spends more time context switching and less time working, but in some cases that might be preferable.
In my experience, Windows *usually* does an ok job of scheduling the CPU. The biggest issues come up when one process monopolizes some other resource, such as disk IO. That is when things really start bogging down.
Doug on September 8, 2006 11:27 AM
I was just wondering if any of you guys can answer this question:
If a process is suspended (put on wait), will its threads also be suspended?
And what is an example of this?
John on September 10, 2006 4:54 AM
A comment on the last quote, the Joe Duffy quote.
The only reason the producer will be starved is that the consumer is using busy-wait instead of a system-supplied synchronizing object. Using that object, the thread would be "not-running", thus not starving the producer.
Thread priorities are not evil, people writing bad code are evil, and they hog my computer.
TG on November 2, 2006 2:01 AM
I sometimes reduce the priorities of processes that are taking 100% CPU time, but that is it for what I do with priorities.
Yuhong Bao on April 14, 2008 11:00 AM
Your real issue sits not with thread priorities, or even process priorities (your article mixes the two) but really with the Windows scheduler. It does not properly pre-empt processes, it instead appears to wait for the process to yield, and just gives starts based on priority. As another user, who uses Mac OSX, mentioned, one task at 100% does not kill the machine. I use primarily Linux, and have the same observation.
Grant Johnson on February 6, 2010 9:52 PM
I've been glad that thread priorities can be changed by me as an end user when an application is preventing me from working. My only other option at that point is to kill the process, which can leave things in a sorry state. An example was RSSBandit on 1500 feeds - it would hang like crazy on a single proc box when updating feeds, but if I set the thread priority to low I could let it update in the background and keep working.
That's a great point on the real utility of dual proc boxes being insurance against un-responsive applications rather than blazing speed on normal operations (which I haven't found to be the case). It would be nice if Windows eventually changes CD / DVD access so it only drags down a single proc, although I'm sure that would be a complex task. Anyone know if this is better in Vista?
I was a little surprised to hear that one of the new features of Windows Media Player is that it runs at a higher thread priority level. On the one hand, I understand that it's good to prevent playback stutter, but it seems odd that entertainment software would deserve a higher thread priority.
I'm not going to pretend to argue with Joe Duffy on threading, but my understanding is that for general application development (i.e. not Microsoft library development), you're better off with a spin sleep rather than a spin lock:
while (!proceed) Thread.Sleep(sleepTime);
If you need need need the high performance of a spin lock, you'll know it; otherwise please don't lock up my machine.
And for further reading, I highly recommend Joseph Albahari's e-book, Threading in C#:
http://www.albahari.com/threading/
Free PDF download here:
http://www.albahari.com/threading/threading.pdf
We build systems that require close to real time performance - i.e. if Sleep(10), I'd like to be back in 15-20, not 40-100ms, and if waiting on an event, I'd like to wake up close to the time that event occurred (e.g. network rx is caught by one thread, which swallows buffers, acknowledges the message, and sets an event to say data is available to the). Raising the *thread* priority of threads within the processes which need this performance has performed well, and lowering the *process* priority of GUIs again improved performance (let's face it - no point in having a working GUI if you're losing data because of it...).
Question is, should I be raising the priority of the *processes* so they all sit at a higher priority than the Windows GUI to start with?
Latest issue is my continuous 6MBytes per sec disk access works most of the time, but sometimes disk IO stalls for more than 2 seconds and my 16MBytes memory buffer gets full...
As a matter of fact, I think that the particular behavior that Scott Allen's program demonstrates is really a bug in .NET 2.0
What happens is that there is a loop inside the CLR (ThreadNative::StartInner) that looks like this:
while (!pNewThread->HasThreadState(Thread::TS_FailStarted) &&
       pNewThread->HasThreadState(Thread::TS_Unstarted))
{
    __SwitchToThread(0);
}
(from comsynchronizable.cpp in the sscli http://www.microsoft.com/downloads/details.aspx?FamilyId=8C09FD61-3F26-4555-AE17-3121B4F51D4D&displaylang=en)
The loop tries to wait until the thread has actually started before continuing.
The problem is that in the sample there are two threads that are simultaneously executing the code above. This means that we have two normal-priority threads, and one low-priority thread.
The two normal-prio threads are continually calling SwitchToThread which just switches to the other normal-prio thread, which means that the newly created low-prio thread never gets a chance to run.
This means that the program is stuck until the balance set manager comes in and boosts the prio of the low-prio thread.
Calling SwitchToThread in a loop waiting for another thread to do something is always dangerous.
To fix the problem, there should be a maximum value for the number of times to spin - after the maximum spins the SwitchToThread should fall back to a Sleep(1)
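A minimal sketch of that suggested fix, in portable C++ rather than CLR code: spin with a yield for a bounded number of iterations, then fall back to a real sleep. The function name, the `max_spins` default of 100, and the return value are all invented for illustration. `std::this_thread::yield()` is only a rough stand-in for SwitchToThread and `sleep_for(1ms)` for Sleep(1); how each behaves depends on the platform's scheduler.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Wait until `flag` becomes true. Yield for at most `max_spins` iterations,
// then sleep 1 ms per iteration so that lower-priority threads can run.
// Returns how many times the fallback sleep was used.
int wait_with_bounded_spin(const std::atomic<bool>& flag, int max_spins = 100) {
    int spins = 0;
    int sleeps = 0;
    while (!flag.load(std::memory_order_acquire)) {
        if (spins < max_spins) {
            ++spins;
            std::this_thread::yield();  // cheap, but only helps peers
        } else {
            ++sleeps;
            // Actually block, so the thread we are waiting on gets CPU time
            // even if it runs at a lower priority.
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }
    return sleeps;
}
```

The bound matters more than its exact value: a short wait still completes without a sleep, while a long wait stops burning the CPU that the awaited thread needs.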
Even if it is dangerous to mess with thread priorities, the .net framework should be able to create low-priority threads without risking long delays.
Another observation is that this loop was not present in .net 1.0 which means that this is also a backwards compatibility problem. The test program would have worked fine on 1.0 but fails miserably on 2.0.
/SG
Stefan Gustafsson on February 6, 2010 9:52 PM
The comments to this entry are closed.