Programmer's Python Async - Process-Based Parallelism
Written by Mike James
Monday, 28 November 2022
Processes are independent of one another, which means that a process can create a child process and then end, leaving the child process running. In practice this isn’t a good idea, as the parent process should be in control of any processes it creates. By default child processes do not end when their parent terminates. This is slightly dangerous in the sense that you can create orphaned processes that just carry on running until the user notices and stops them manually.

If you want a child process to terminate automatically when its parent process terminates you have to set the daemon attribute to True. If you know what a Linux/Unix daemon process is, this will seem to be the wrong way round. A Linux/Unix daemon process runs in the background with no user interaction and has no parent process. In contrast, a Python daemon=True process is totally dependent on its parent to keep it running. To see this in action try:

    import multiprocessing

    def myProcess():
        while True:
            pass

    if __name__ == '__main__':
        p0=multiprocessing.current_process()
        p1=multiprocessing.Process(target=myProcess,

You can see that this starts two processes, one daemon and one non-daemon. The parent process comes to an end and you will see the ending message displayed. If you examine the processes that are running after the program has ended, you will discover that the non-daemon process, which you can identify by its pid, is still running. You can check which processes are running under Linux using the ps command and under Windows using the Task Manager. Under Windows the main program, i.e. the parent process, is listed as running, whereas under Linux it is shown as suspended. In either case the parent consumes no CPU time and waits for its non-daemon child process to end.

If you run this program under a debugger then the results you see will be contaminated by the action of the debugger. To see the true behavior you have to run the program from the command line.
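The listing above is cut short, so what follows is only a minimal sketch of the kind of program the text describes, not the original code. It assumes the second process is created with daemon=False, that the process ids are printed, and that the final message is "ending". The names my_process and limited_process are hypothetical, and the non-daemon child is limited to about ten seconds so that the sketch ends on its own instead of running until stopped manually.

```python
import multiprocessing
import time


def my_process():
    # Stand-in for myProcess in the text: runs until it is stopped.
    # It sleeps in the loop instead of spinning so it does not burn CPU.
    while True:
        time.sleep(0.1)


def limited_process(seconds):
    # Assumption for this sketch: the non-daemon child ends by itself
    # after the given number of seconds.
    time.sleep(seconds)


if __name__ == "__main__":
    p0 = multiprocessing.current_process()
    # daemon=True: stopped automatically when the parent ends.
    p1 = multiprocessing.Process(target=my_process, daemon=True)
    # daemon=False: the parent waits for it at exit.
    p2 = multiprocessing.Process(target=limited_process, args=(10,),
                                 daemon=False)
    p1.start()
    p2.start()
    print("parent pid:", p0.pid)
    print("daemon pid:", p1.pid, "daemon =", p1.daemon)
    print("non-daemon pid:", p2.pid, "daemon =", p2.daemon)
    print("ending")
```

Run from the command line, this should print the ending message at once, after which the non-daemon pid remains visible in ps or the Task Manager for roughly ten seconds while the daemon process is stopped when the parent exits.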
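If you are using VS Code, run the program with the command Python: Run Python File In Terminal. This runs the program without the IDE or debugger getting in the way. To summarize: a process with daemon=True is terminated automatically when its parent process ends, whereas a process with daemon=False keeps running after the parent has finished and the parent waits for it to end.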
As already mentioned, this seems to be the wrong way round compared to the usual Linux/Unix definition of a daemon process. You can test to see if a process is a daemon using:

    Process.daemon

which is True if the process is a daemon.

Waiting for Processes

Sometimes it is necessary to wait until a child process has completed its allotted task. Any process can wait on another using the join method:

    Process.join(timeout)

will put the calling process into a suspended state until the Process terminates or until the specified timeout is up. The timeout is specified in seconds and if it is None, the default, the join waits forever.

You can discover what state the Process is in using Process.exitcode, which gives you None if the sub-process is still running and its exit code otherwise. Usually an exit code of zero is used to signal that everything was OK. You can set the exit code to n using sys.exit(n). Another way is to use Process.is_alive(), which returns True if the sub-process is still running and False otherwise.

There are two methods that will terminate a process. Process.terminate() stops the process running by sending a SIGTERM signal under Linux or calling TerminateProcess under Windows. The result should be the same: the process stops running without completing any exit handlers and finally clauses. Any child processes of the process you stop will be orphaned. Clearly it is a good idea only to terminate processes that don’t have child processes. In most cases terminating a process should be a last resort and you should arrange for processes to run to completion.

Alternatively you can use Process.kill(), which is a more aggressive way to stop a process, as it uses the SIGKILL signal under Linux, which in theory always succeeds in terminating a process. If it doesn’t, it is an operating system bug.

You need to know that Process.close() doesn’t stop the process. Instead it raises a ValueError if it is still running.
What it does is to release the resources still owned by the Process object associated with the process.

The join method is very flexible in that you can join a process that has already terminated and it will return immediately, but it is an error to try to join a process that hasn’t started. You can join a process multiple times and you can join multiple processes. For example:

    p1.join(1)
    print("Possibly not finished")
    p1.join(1)

will pause the calling process while it waits for p1 to complete. If p1 doesn’t finish within one second then the join returns and we see Possibly not finished displayed. Then the program waits for another second for the process to finish. You can continue this until the process completes.

The advantage of this approach is that join suspends the calling process, allowing the CPU to run other processes. If you simply poll on the state of the child process, the CPU is kept occupied doing nothing. That is:

    while p1.is_alive():
        p1.join(1)
        print("not finished")

frees the CPU while looping every second to check that the process has finished, whereas using:

    while p1.is_alive():
        print("not finished")

the loop also finishes when the child process finishes, but in this case it keeps the CPU occupied.

You can use join to wait for multiple processes to end:

    p1.join()
    p2.join()
    p3.join()

This will only continue when all three processes have completed. Notice that the order of completion doesn’t matter.
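The calls described in this section can be tried together in one small program. This is only a sketch under stated assumptions: quick_task, endless_task and wait_with_timeout are hypothetical names invented for the illustration, and the sleep times and the exit code 3 are arbitrary choices.

```python
import multiprocessing
import sys
import time


def quick_task():
    # Hypothetical worker: finishes quickly and sets its exit code to 3.
    time.sleep(0.2)
    sys.exit(3)


def endless_task():
    # Hypothetical worker: never finishes on its own.
    while True:
        time.sleep(0.1)


def wait_with_timeout(proc, step=0.1):
    # Join repeatedly with a timeout so the caller is suspended,
    # not busy-polling. Returns how many joins timed out.
    timeouts = 0
    while proc.is_alive():
        proc.join(step)
        if proc.is_alive():
            timeouts += 1
    return timeouts


if __name__ == "__main__":
    # A process that runs to completion and reports an exit code.
    p1 = multiprocessing.Process(target=quick_task)
    p1.start()
    timeouts = wait_with_timeout(p1)
    p1_code = p1.exitcode
    print("joins that timed out:", timeouts)
    print("p1 exit code:", p1_code)

    # A process that has to be stopped from outside.
    p2 = multiprocessing.Process(target=endless_task)
    p2.start()
    p2_alive_before = p2.is_alive()
    p2.terminate()
    p2.join()  # wait until the process has actually gone
    p2_alive_after = p2.is_alive()
    p2_code = p2.exitcode
    print("p2 alive before/after terminate:",
          p2_alive_before, p2_alive_after)
    print("p2 exit code:", p2_code)

    # Both processes have ended, so close() releases their resources
    # without raising ValueError.
    p1.close()
    p2.close()
```

The exit codes are read before close() is called because, once a Process object has been closed, using its attributes and methods raises a ValueError. The exit code of the terminated process is non-zero, which is how the parent can tell that it did not end normally.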
Last Updated ( Wednesday, 30 November 2022 )