Programmer's Python Async - Process-Based Parallelism
Written by Mike James
Monday, 28 November 2022
Page 3 of 3
Waiting for the First to Complete

The above example waits for p1, p2 and p3 to finish. What is more difficult is to wait until one of the processes is complete, i.e. wait for p1 or p2 or p3, whichever completes first. The easiest way to do this is to make use of the Connection object, which is introduced later as a way of communicating between processes. The technique relies on the sentinel attribute, which returns a handle to a system object that becomes "ready" when the process is complete. This is a low-level feature whose implementation depends on the operating system. The good news is that at the Python level it works in the same way under Linux and Windows. The multiprocessing.connection.wait function will wait on a list of sentinel handles until one of them becomes "ready". It returns a list of the sentinel handles that have become ready, for example:

```python
import multiprocessing
import multiprocessing.connection
import random
import time

def myProcess():
    time.sleep(random.randrange(1, 4))

if __name__ == '__main__':
    p1 = multiprocessing.Process(target=myProcess)
    p2 = multiprocessing.Process(target=myProcess)
    p3 = multiprocessing.Process(target=myProcess)
    p1.start()
    p2.start()
    p3.start()
    waitList = [p1.sentinel, p2.sentinel, p3.sentinel]
    res = multiprocessing.connection.wait(waitList)
    print(res)
    print(waitList.index(res[0]) + 1)
```

The first part of the program simply creates three processes which sleep for random times, to serve as an example of waiting for the first process to complete. The final part of the program builds a list of sentinel values, one per process. Then we use the wait function to suspend the parent thread until one of the child processes completes. The return value is a list of the sentinel values that are "ready", and these values are easily converted into the numbers of the processes that have finished. Notice that the program only takes the first sentinel value in the list. In practice you might want to process them all.
Also, as all the processes in this example are non-daemon, they all run to completion even after the main process ends. As the set of sentinel values only has to be an iterable (refer to Programmer's Python: Everything Is An Object, ISBN: 978-1871962741 if you are not familiar with this distinction), you could write it as:

```python
waitDict = {p1.sentinel: p1,
            p2.sentinel: p2,
            p3.sentinel: p3}
res = multiprocessing.connection.wait(waitDict)
```

This has the advantage of making the Process object corresponding to the process that finished first easier to find, i.e. waitDict[res[0]] is that Process object.

Computing Pi

As a simple example, suppose you want to compute the mathematical constant pi to a few digits using the well-known series:

pi = 4*(1 - 1/3 + 1/5 - 1/7 ... )

This is very easy to implement, as we only need to generate the odd denominators, but to get pi to a reasonable number of digits you have to compute a lot of terms. In other words, this series is very slow to converge. The simple-minded synchronous approach is to write something like:

```python
def myPi(m, n):
    pi = 0
    for k in range(m, n + 1):
        s = 1 if k % 2 else -1
        pi += s / (2 * k - 1)
    print(4 * pi)
```

This computes the series from the mth to the nth term. The reason for this elaboration is that it allows us to compute different parts of the series in different processes. Of course, myPi(1, N) computes the full series up to the Nth term. If you try this out:

```python
if __name__ == '__main__':
    N = 10000000
    t1 = time.perf_counter()
    myPi(1, N)
    t2 = time.perf_counter()
    print((t2 - t1) * 1000)
```

you will find that it takes about 1700 ms to compute pi to five digits on a medium-speed Windows PC and 4500 ms on a four-core Raspberry Pi 4. We can easily modify the calculation by splitting the sum into two portions and using a separate process for one half of the sum:

```python
if __name__ == '__main__':
    N = 10000000
    p1 = multiprocessing.Process(target=myPi,
```

Running this reduces the time to 1200 ms on the PC and 2500 ms on the Pi 4. If you try these programs out using an IDE or a debugger then you may well discover that there is no significant speed gain.
As before, this is because of the way programs are run under the debugger; again, to appreciate the speed increase try running them from the command line. Notice that the integer arithmetic in the computation uses Python's unlimited precision, Bignum, arithmetic, so the number of terms is unlimited, although the floating point sum itself is limited to roughly 16 significant digits. To know more about Python's novel approach to large numbers see Chapter 2 of Programmer's Python: Everything Is Data, ISBN: 978-1871962598. Thus far we haven't explored any way that data can be exchanged between processes, so our only option is to print the results from each one. The subject of sharing data between isolated processes is a complicated one and is postponed until Chapter 7. Processes may be isolated from one another, but they do share a single terminal instance, so each print sends its data to the same output. Increasing the number of processes to four on the Pi decreases the time to 1600 ms, which demonstrates the diminishing returns of using parallelism. The complete program is:

```python
import time
import multiprocessing

def myPi(m, n):
    pi = 0
    for k in range(m, n + 1):
        s = 1 if k % 2 else -1
        pi += s / (2 * k - 1)
    print(4 * pi)

if __name__ == '__main__':
    N = 10000000
    p1 = multiprocessing.Process(target=myPi,
```

However, if you increase N to 100000000 the single-process version takes 45 s and the four-process version takes just 12 s. For longer-running processes the initial overheads matter less.

The rest of the chapter is not included in this extract.
Last Updated (Wednesday, 30 November 2022)