Applying C - Pthreads
Written by Harry Fairhead
Monday, 25 September 2023
The standard way to do threading in C under Linux is to use Pthreads. This extract is from my book on C in an IoT context, Applying C For The IoT With Linux, now available as a paperback or ebook from Amazon.
Also see the companion book: Fundamental C

Threads

C has no standard way of multi-tasking and no standard way of handling interrupts, hard or soft. Until recently you could mostly ignore this deficit on small machines because they hardly had the power to run one thread of execution, let alone multiple threads, and any use of interrupts would have been highly hardware-specific. Increasingly, though, even small machines run Linux on multiple cores and the environment is multi-threaded even if your C program isn't.

In this chapter we look at how POSIX systems handle the problem of threading, i.e. more than one task running, potentially at the same time. The C11 standard introduced its own standard threading, which is similar to POSIX but lacking some facilities. For the moment, the POSIX approach is the more practical and better supported.

The subject of multi-tasking and parallel programming is a large one and this chapter doesn't aim to be complete. It is an introduction to the ideas and techniques you need to understand why creating multi-threaded programs is difficult and the general approaches to getting it right. In most cases the best advice is to avoid multi-tasking altogether, but if you can't, or you think it is desirable, then keep it as simple as possible. Reasoning about even simple parallel systems that interact is hard – if they are complex it becomes very nearly impossible.

Why Multi-task?

It seems almost obvious that getting a program to do more than one thing at a time is a good idea and that it obviously speeds things up. In practice this isn't quite so clear. Multi-tasking on a machine that has only one CPU is often slower than an equivalent program that doesn't attempt it, because there is an overhead in switching between tasks. If the machine has multiple cores then you can achieve faster performance, but even here usually not as much as you would expect, because interacting tasks generally need to cooperate and share resources, and the time this requires can reduce the speed increase and in some cases eliminate it altogether.

The bottom line is that, no matter how attractive multi-tasking appears from a common-sense point of view, it often doesn't deliver what you expect and it increases the complexity of your program. In fact, many programmers are of the opinion that multi-tasking, including interrupts, should never be used in any embedded system, even if it is available.

To Thread or Fork

The first thing to say is that the modern idea of a thread is not fundamental to POSIX operating systems. Unix introduced the fork function, which makes a complete copy of a running program. The two copies are identical and the new process continues to run from the same location in the program. The child process doesn't inherit any outstanding I/O or memory locks from the parent process and it has a unique process ID. This doesn't seem particularly useful until you know that fork returns -1 if there is an error, 0 in the child's copy of the program and a positive integer, the child's PID, in the original parent program. This means you can write code which behaves differently in each copy. For example:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char** argv) {
    if (fork() == 0)
        printf("Hello from Child\n");
    else
        printf("Hello from Parent\n");
    return (EXIT_SUCCESS);
}

The reason for this strange way of doing things is lost in the past, probably due to the limited memory available back then.
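The example above simply ignores the value fork returns in the parent. As a minimal sketch of handling all three possible return values, the parent can keep the PID that fork returns and use the standard waitpid call so that it doesn't exit before the child:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char** argv) {
    pid_t pid = fork();
    if (pid == -1) {               /* fork failed, no child was created */
        perror("fork");
        return EXIT_FAILURE;
    }
    if (pid == 0) {                /* zero means this is the child copy */
        printf("Child: my PID is %d\n", getpid());
        return EXIT_SUCCESS;
    }
    /* a positive value means this is the parent and pid is the child's PID */
    printf("Parent: child has PID %d\n", pid);
    waitpid(pid, NULL, 0);         /* wait for the child to terminate */
    return EXIT_SUCCESS;
}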
Most programmers, when first exposed to the idea of fork, are concerned that there are two copies of a program running and that the only distinction between them is which branch of the if statement is executed. In practice, the pattern was generally fork followed by the child making an exec call to load another program. Why not just use exec to load a program without a fork? The answer is that you can do this, but the new program simply overwrites the current program and a new process isn't created.

When you use fork, the new process is fairly well separated from the original. There are ways that the two can communicate, pipes and shared memory for example, but they don't share resources.

A thread, on the other hand, is a process that lives in the same environment as other threads of execution. When a process is started it has a single thread of execution, but there are system calls that allow you to create new threads. Most operating systems do threading in their own way, but the standard POSIX threading specification is fairly well supported and it is generally referred to as Pthreads. You can use fork, and even lower-level system calls such as clone, to implement multi-tasking, but Pthreads is fairly modern and a good choice if you are working with Linux or any POSIX-compliant operating system. It doesn't work under Windows, which has its own threading system, but there are libraries that provide some compatibility.

The advantage of using threads over processes created with fork is that threads involve a smaller overhead when the operating system switches its attention to a new thread. Threads are often described as lightweight processes. It is also usually easier to allow communication between threads via their shared memory space. This is also the big disadvantage of threads – if a thread corrupts the memory in some way then all of the threads could be affected.

A group of threads or processes may be implemented on a single processor. In this case the operating system shares the processor's time by scheduling threads and processes to run for a short time, so only one thread in one process is executing at any one time and all of the others are suspended. This situation is much easier to reason about as there is only one thing happening at any given moment. Today's processors, however, often support multiple cores, each one capable of running a thread. If the processor has n cores there can be n threads running at the same time and this makes the interaction between the threads much more difficult to analyze.
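As a rough illustration of the fork-then-exec pattern described above, the child can replace itself with another program (the ls command is used here purely as an example) while the parent waits for it to finish:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char** argv) {
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return EXIT_FAILURE;
    }
    if (pid == 0) {
        /* child: replace this process image with the ls program */
        execlp("ls", "ls", "-l", (char*) NULL);
        perror("execlp");          /* only reached if exec failed */
        _exit(EXIT_FAILURE);
    }
    /* parent: wait for the child to finish */
    waitpid(pid, NULL, 0);
    return EXIT_SUCCESS;
}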
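To give a flavour of what the Pthreads API itself looks like, here is a minimal sketch that creates a single extra thread in the same address space and waits for it to finish. It needs to be compiled with the -pthread option:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

/* the function the new thread runs; it receives the arg passed to pthread_create */
void* worker(void* arg) {
    printf("Hello from the worker thread\n");
    return NULL;
}

int main(int argc, char** argv) {
    pthread_t tid;
    /* create a new thread running worker, sharing this process's memory */
    if (pthread_create(&tid, NULL, worker, NULL) != 0) {
        fprintf(stderr, "pthread_create failed\n");
        return EXIT_FAILURE;
    }
    printf("Hello from the main thread\n");
    /* wait for the worker thread to terminate before exiting */
    pthread_join(tid, NULL);
    return EXIT_SUCCESS;
}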