Programmer's Python Async - Threads
Written by Mike James
Tuesday, 31 January 2023
Thread Local Storage

It is obvious that global variables are always shared, and in most cases this isn't a problem. However, suppose you have some existing code that uses a global variable to store its state and you want to make it thread-safe. To do this you have to create a global variable that is thread-local. The threading module provides the local class, which is a global object with thread-local attributes. That is, when you create an instance of threading.local, any attributes that a thread creates on it are thread-local, for example:

import threading

myObject = threading.local()

def myThread(sym):
    myObject.temp = sym
    while True:
        print(myObject.temp)

t1 = threading.Thread(target=myThread, args=("A",))
t2 = threading.Thread(target=myThread, args=("B",))
t1.start()
t2.start()
t1.join()

This repeatedly prints runs of As and Bs. The local class is used to create a thread-local object, myObject, and t1 and t2 use it to create their own thread-local temp attributes. If myObject were just a general object, i.e. not thread-local, then t1 and t2 would share a single copy of the attribute.

At this point you should be wondering why anyone would want to use threading.local. Notice that a local object cannot be used to persist the state of a function that is repeatedly run using a thread, simply because it really is thread-local and different for each thread. For example, you cannot use it to write a function that counts the number of times it has been called irrespective of the thread that calls it. For that you need a simple global variable that is the same for all threads.

There really is no point in using threading.local if you are writing the code from scratch. If you need anything that is local to a thread, create it within the thread and it will automatically be thread-local. To see that this is true, compare this example with the previous one: they achieve the same result, but the first one doesn't use threading.local.
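The earlier example referred to here isn't reproduced in this extract, but a minimal sketch of the same idea, keeping the value in an ordinary local variable created inside the thread rather than in a threading.local object, might look like this (the details are assumed for illustration, not the book's original listing):

import threading

def myThread(sym):
    temp = sym           # an ordinary local variable - each thread gets its own copy
    while True:
        print(temp)

t1 = threading.Thread(target=myThread, args=("A",))
t2 = threading.Thread(target=myThread, args=("B",))
t1.start()
t2.start()
t1.join()

This prints the same runs of As and Bs, because anything created inside the thread's function is automatically private to that thread.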
Computing Pi with Multiple Threads

It is informative to compare the multi-process computation of pi given in the previous chapter with a multi-threaded computation:

import threading
import time

def myPi(m, n):
    pi = 0
    for k in range(m, n + 1):
        s = 1 if k % 2 else -1
        pi += s / (2 * k - 1)
    print(4 * pi)

N = 10000000
thread1 = threading.Thread(target=myPi, args=(N // 2 + 1, N))
t1 = time.perf_counter()
thread1.start()
myPi(1, N // 2)
thread1.join()
t2 = time.perf_counter()
print((t2 - t1) * 1000)

You can see that this is virtually the same program, but using the equivalent thread methods. If you try it out you will discover that, compared to the single-threaded version, the computation is slower. For example, on a dual-core Windows machine the time increased from 1700 ms to 1800 ms, and on a Pi 4 from 4500 ms to 5000 ms.

This increase in time shouldn't come as a surprise. The GIL means that the threads cannot run at the same time and hence cannot make use of the additional cores. The multi-threaded program takes longer simply because of the overhead of switching between threads. The computation of pi is a CPU-bound task, so there is no hope that using multiple threads can speed it up while the GIL is in force. However, when it comes to I/O-bound threads, the story is very different.
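The single-threaded version that these timings are compared against comes from the previous chapter and isn't included in this extract. As a rough sketch for comparison (the structure is assumed, not the book's exact listing), it simply times one call that covers the whole range:

import time

def myPi(m, n):
    pi = 0
    for k in range(m, n + 1):
        s = 1 if k % 2 else -1
        pi += s / (2 * k - 1)
    print(4 * pi)

N = 10000000
t1 = time.perf_counter()
myPi(1, N)                      # the whole series computed on the main thread
t2 = time.perf_counter()
print((t2 - t1) * 1000)

This is the version that takes roughly 1700 ms on the dual-core Windows machine and 4500 ms on the Pi 4; the threaded program does exactly the same arithmetic but adds the cost of switching between threads.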
Summary