Threading is the most basic way to implement async code, but in Python threading is complicated by the GIL. Find out the basics of threading in this extract from my new book Programmer's Python: Async.
Threads are often described as lightweight processes, but while they are a lot like processes there are some important differences. In this chapter we discover how to create and control threads within a single process.
As the multiprocessing module is based on the threading module you will find much of this chapter similar to the previous chapter, but there are important differences even at this level.
The Thread Class
When you start running a Python program you have a single thread, usually referred to as the main thread. This just runs the Python interpreter, which in turn runs your Python program. You can create additional threads using the Thread class:
class threading.Thread(group=None, target=None,
name=None, args=(), kwargs={}, *, daemon=None)
where, for the moment, group isn’t used, target specifies the callable that the new thread runs, name is an optional identifier for the thread and args and kwargs are the positional and keyword arguments passed to the target. Discussion of the daemon parameter is best left until later.
Once you have a Thread object you can start the callable running in a new thread using the start method:
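import threading

def myThread():
    # This function is the target that runs in the new thread
    print("Hello Thread World")

t1 = threading.Thread(target=myThread)
t1.start()  # start the new thread, which calls myThread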
You will see Hello Thread World displayed by the new thread before the program comes to an end. The thread that runs the target is in the same process as the main thread and all of the global variables that are accessible to the main thread are accessible to it – both threads share the same memory space. This has some important consequences which we will explore in detail later.
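As a minimal sketch of how the name, args and kwargs parameters fit together, and of what sharing the same memory space means in practice, the following variation passes arguments to the target and has it append to a global list. The function worker, the list results and the thread name "worker-1" are made up for illustration. The join method, which simply waits for the thread to finish, is used here so that the main thread reads the list only after the new thread has written to it.

```python
import threading

results = []  # global list, visible to every thread in this process

def worker(greeting, punctuation="!"):
    # current_thread() returns the Thread object that is running this code
    name = threading.current_thread().name
    results.append(f"{greeting} from {name}{punctuation}")

t1 = threading.Thread(target=worker, name="worker-1",
                      args=("Hello",), kwargs={"punctuation": "?"})
t1.start()
t1.join()  # wait for the thread to finish before reading results
print(results)
```

Notice that args has to be a tuple, hence the trailing comma in ("Hello",), and that the main thread sees the string the new thread added without any data being copied or sent between them.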
The name attribute of the thread is purely for you to use to identify the thread; it is of no importance to the system. The two attributes ident and native_id are more useful as identifiers. The ident attribute is a nonzero integer assigned by Python, the same value that threading.get_ident() returns inside the thread, and it is None until the thread has been started. The native_id attribute is an integer that is assigned by the operating system. It is the system identifier of the thread and can be used in other system calls that need a thread id. The ident value identifies the thread within the Python process and native_id identifies it across the entire system, but only while the thread is running. When the thread ends, the assigned ident and native_id may be reused. Notice that native_id isn’t available on all systems and is only available from Python 3.8 onwards.
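A short sketch of how these attributes can be inspected is shown below. The function report and the thread name "reporter" are illustrative only, and getattr is used as a precaution for platforms or Python versions where native_id is missing. The actual numbers printed vary from run to run and from system to system.

```python
import threading

def report():
    me = threading.current_thread()
    # native_id may be absent on some platforms, so fall back to None
    print(me.name, me.ident, getattr(me, "native_id", None))

t = threading.Thread(target=report, name="reporter")
ident_before = t.ident  # None: the thread has not been started yet
t.start()
t.join()
print(ident_before, t.ident)
```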
Threads and the GIL
Threads are very different from processes in that they share the runtime environment. That is, they have access to the same set of variables and objects as they run within the same process. Processes, on the other hand, each have their own copies of all of the variables within the program and there is no interaction between them. This sharing of resources seems to make things simpler, but in many ways it creates additional problems.
At the time of writing another major issue is the GIL, the Global Interpreter Lock. The current implementation of CPython, and some other implementations like PyPy, allow only one thread to use the Python interpreter code at any one time. This isn’t very important on a system that has only a single CPU or core as only one thread is active at any given time anyway, but it does stop multithreaded Python code from running faster on multicore machines.
Although other implementations of Python, Jython for example, do not use the GIL, they may in fact be slower than CPython or lack support for all of the modules that CPython does. There are attempts both to remove the GIL from CPython and to improve its performance. The main reasons for the continued existence of the GIL are that it allows Python to work with C-based libraries that are not thread-safe and it keeps single-threaded programs fast.
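One rough way to see the effect of the GIL is to time a CPU-bound function run twice in sequence and then run once in each of two threads. This is only a sketch: count_down and the value of N are arbitrary choices for illustration, and the timings depend on the machine, the Python version and the implementation. On a build of CPython that uses the GIL the two times are typically similar, as the two threads cannot execute Python code at the same moment.

```python
import threading
import time

def count_down(n):
    # Pure-Python CPU-bound loop; no I/O, so the thread never waits
    while n > 0:
        n -= 1

N = 2_000_000

# Run the work twice, one call after the other, in the main thread
start = time.perf_counter()
count_down(N)
count_down(N)
serial = time.perf_counter() - start

# Run the same work as two threads started together
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"serial:   {serial:.3f}s")
print(f"threaded: {threaded:.3f}s")
```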