Deep C# - The Perils of the C# Parallel For

Written by Mike James

Sunday, 15 December 2024

Article Index
Deep C# - The Perils of the C# Parallel For
Shopping Problems - Race Condition
Accessing Earlier Results

Page 2 of 3

Shopping Problems - Race Condition

While we have looked at the problems of sharing resources in earlier chapters which applies to all asynchronous and true parallel methods, there is an addition problem with true parallel algorithms. Put simply trying to turn a sequential algorithm into a parallel one introduces the possibility that the algorithm will interact with itself in ways that aren’t entirely obvious.

Parallelism is easy if you restrict your attention to completely unrelated tasks. If you have task A which has nothing to do with task B, then you can simply say to a processor “get on with task A” and to another processor “get on with task B”. For example, if you have a shopping list then you can tear it in two and give one to shopper A and one to shopper B and tell them both to do the shopping at the same time. When they meet up at the checkout there is no problem and the shopping is done in half the time.

Things start to go wrong when the tasks are interrelated. For example, if the top half of the shopping list says “buy bread” and the bottom half says “if you bought some bread buy some jam”, there are problems. Now when you tear the list into two and hand the halves to shopper A and B what you get at the checkout varies – sometimes you get bread and no jam, sometimes bread and jam, and very rarely jam and no bread when shopper A picks up a loaf, but at the last minute notices it’s stale and puts it back without getting another one.

If the list is processed by a single shopper buying things in the order the list specifies, there is no problem. As soon as you split the list then what you get depends on how the two tasks interact. Of course this is just another manifestation of the shared resource problem and the race conditions that can arise when it isn’t clear what order things will happen in. What is new is that now we have the extra concern of how you split a sequential algorithm into portions that it is safe to execute in parallel. Until recently this is a problem that the average programmer could avoid by not trying to write parallel programs, but now C# has the Parallel.For construction it is very easy to convert a sequential algorithm into a parallel one without a moment’s extra thought.

The Parallel For

TheParallel.Forloopmakes it so easy that you can start parallel programming in a few minutes - no threads, no tasks, nothing at all that looks any different from normal programming. When you get a little deeper you discover that you have to do things like lock shared resources and so on, but problems can arise much earlier and at a much more basic level if you haven't thought about the consequences of implementing something in parallel.

To see parallel programming in action start a new C# project and add:

using System.Threading; using System.Threading.Tasks;

System.Threading is the original threading library and System.Threading.Tasks is where the newer parallel features live.

Next place a button on a form and write a simple for loop in its click event handler:

int size = 1000000;
double[] data = new double[size];
Stopwatch sw = new Stopwatch();
sw.Start();
for (int i = 0; i < size; i++){
    data[i] = Math.Pow(new Random().NextDouble(), 0.6);
}
MessageBox.Show("done " + 
        sw.Elapsed.TotalMilliseconds.ToString());
sw.Reset();

The loop simply does some arithmetic on a lot of random data. The Stopwatch class, which can record time accurately across multiple threads, is used to time how long the loop takes and to make use of this very useful class all you need to add is:

using System.Diagnostics;

We can turn this loop into a parallel loop very easily. The Parallel static class has a For method which accepts the start and end value for the loop and a delegate to execute. In general the command is:

Parallel.For(start,end,delegate);

The loop is run from startto end-1 and must run in the forward direction, that is from smaller to bigger index values. There are variations on the Parallel.For command, including parameters to control the parallel iteration and a Parallel.ForEach, and the problem that we are about to encounter can happen with all of them.

The simplest way to provide the delegate is to use a lambda expression and so the following example will execute the code for values of i from 0 to size-1.

int size = 1000000;
double[] data = new double[size];
Stopwatch sw = new Stopwatch();
sw.Start();
Parallel.For(0, size, i => {
    data[i] = Math.Pow(new Random().NextDouble(), 0.6);
});
sw.Stop();
MessageBox.Show("done " + 
          sw.Elapsed.TotalMilliseconds.ToString());
sw.Reset();

This all works and the results are impressive. On a dual-core machine the simple loop takes around 4000ms and the parallel loop takes 2500ms, which isn’t quite half but still represents a very worthwhile speed up. If you use the Task Manager or the diagnostic tools to monitor CPU usage, you will see that the simple loop uses 100% of one of the cores while the parallel loop uses 100% of both cores. This in itself might not exactly be a good thing, as our parallel loop is a bit of a processor hog, but these are the sort of simple decisions that have to be made.

The bottom line is that the Parallel.For is very easy to use and produces a real gain in performance. It provides a lot for very little effort. However, it comes with its own set of dangers. We have just seen that it has apparently converted a loop that takes 6 seconds to one that takes 3 seconds, which is like doubling the machine's speed for free. However, unless you realize this isn’t magic then there could be problems ahead.

You can’t just convert any old for loop to a Parallel.For and expect everything to work – it only works if the for loop is essentially a set of independent operations. This sounds easy, but it can be difficult to spot because parallel implementation can render everything you have come to expect simply wrong.

<< Prev - Next >>

Last Updated ( Sunday, 15 December 2024 )