Applying C - Cores
Written by Harry Fairhead
Monday, 03 July 2023
Affinity

The operating system tries to keep threads associated with particular cores, but sometimes you need to enforce this. There is no standard POSIX way of determining which core a thread will use, but there is a Linux extension of the Pthreads library that does the job. The setaffinity function:

    int pthread_setaffinity_np(pthread_t thread,
                               size_t cpusetsize,
                               const cpu_set_t *cpuset);

sets the specified thread to run on one of a set of possible CPUs as specified by cpuset, the affinity mask. The getaffinity function returns the affinity mask of the specified thread:

    int pthread_getaffinity_np(pthread_t thread,
                               size_t cpusetsize,
                               cpu_set_t *cpuset);

Notice the thread is specified as a Pthread id. You can also use a Linux process id if you use the alternative get and set functions defined in sched.h:

    int sched_setaffinity(pid_t pid, size_t cpusetsize,
                          const cpu_set_t *mask);
    int sched_getaffinity(pid_t pid, size_t cpusetsize,
                          cpu_set_t *mask);

In practice, the Pthreads functions call the functions defined in sched.h.

The only thing we need to know is how to set the affinity mask. This uses a single bit to control access to each of the physical and logical cores. You can't simply set or reset these bits directly. You have to use the set of macros designed for the job. There are a large number of these, but the ones that you use most often are:

    CPU_ZERO(&cpuset);      sets all bits to 0
    CPU_SET(n, &cpuset);    sets the bit corresponding to core n

How do you find out which core corresponds to which bit in the mask? As long as your system is set up correctly you should be able to get details by reading the /proc/cpuinfo file, or you could use the lstopo tool.

For example, suppose you want to run two threads on separate cores. First we need two functions to run:

    volatile int j;
    volatile int i;

    void * threadA(void *p) {
        for (i = 0;; i++) {
        };
    }

    void * threadB(void *p) {
        for (j = 0;; j++) {
        };
    }

Each function simply runs a for loop that increments a global counter, so that we can see how many times the loop has executed. The global counters have to be marked as volatile to stop the compiler optimizing the empty loops away.
To set the thread affinity we need to use the macros:

    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(1, &cpuset);

This sets the mask to core 1. Next we start the first thread and set its affinity:

    pthread_t pthreadA;
    pthread_create(&pthreadA, NULL, threadA, NULL);
    pthread_setaffinity_np(pthreadA, sizeof (cpu_set_t), &cpuset);

The second thread is to run on core 2, so we need to change the mask and then start the thread:

    CPU_ZERO(&cpuset);
    CPU_SET(2, &cpuset);
    pthread_t pthreadB;
    pthread_create(&pthreadB, NULL, threadB, NULL);
    pthread_setaffinity_np(pthreadB, sizeof (cpu_set_t), &cpuset);

Now we can let the main thread sleep for a few seconds and print the value of the counters to give an indication of how many loops each thread has performed. The complete program is:

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdlib.h>
    #include <pthread.h>
    #include <sched.h>
    #include <unistd.h>

    volatile int j;
    volatile int i;

    void * threadA(void *p) {
        for (i = 0;; i++) {
        };
    }

    void * threadB(void *p) {
        for (j = 0;; j++) {
        };
    }

    int main(int argc, char** argv) {
        cpu_set_t cpuset;
        CPU_ZERO(&cpuset);
        CPU_SET(1, &cpuset);
        pthread_t pthreadA;
        pthread_create(&pthreadA, NULL, threadA, NULL);
        pthread_setaffinity_np(pthreadA, sizeof (cpu_set_t), &cpuset);

        CPU_ZERO(&cpuset);
        CPU_SET(2, &cpuset);
        pthread_t pthreadB;
        pthread_create(&pthreadB, NULL, threadB, NULL);
        pthread_setaffinity_np(pthreadB, sizeof (cpu_set_t), &cpuset);

        /* The lines from here on are reconstructed from the description
           above; the 5-second sleep and the output format are assumptions. */
        sleep(5);
        printf("%d %d\n", i, j);
        return 0;
    }

If you run the program you will find that each thread executes roughly the same number of loops. Now if you set the second thread to run on the same core by changing:

    CPU_SET(2, &cpuset);

to:

    CPU_SET(1, &cpuset);

and run it again, you will discover that each thread now manages only about half its previous loop count. This is what you would expect, as each of the two threads now gets to run on the core for only half of the total time.

If you run the same program without setting affinities you will discover that on a lightly loaded machine the threads are automatically allocated to different cores, and that as the load goes up they eventually share a core.
In the case of a hyperthreaded processor, placing the two threads on two processing units in the same core has the same result as running them both on one core, as neither has any voluntary idle time and so they get to share the core equally. It is instructive to try this program out after assigning different scheduling policies and priorities to the threads.

There are Linux tools that will allow you to discover what core a process is running on and change its affinity. There is also the cpuset facility, which can be used to dynamically change what cores are used. However, if your goal is to allocate a single core to a single important thread, then the best and simplest way of doing this is to first prohibit Linux from using the core by adding:

    isolcpus=core_number

to the kernel command line in the boot loader configuration. You can use a comma-separated list of cores not to use. For example, to isolate core 3 you would edit /etc/default/grub and change the line:

    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"

to:

    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=3"

You also have to run:

    sudo update-grub

and reboot. For the Raspberry Pi the Linux configuration is stored in /boot/cmdline.txt. Simply add isolcpus=3 to the end of the list and reboot.

When the machine starts up, core 3 will not be used by the system. You can, however, still use thread affinity to run a user thread on core 3. There are other ways, such as cpusets, to disable a core dynamically, but these suffer from problems such as not moving any thread that is already running. For fixed tasks the best way to do the job is isolcpus.

You can find out what cores are isolated using:

    cat /sys/devices/system/cpu/isolated

The system will still occasionally interrupt a thread running on an isolated core, but the interference is much less than that encountered in normal scheduling.
You can discover which cores are being used for interrupt handlers using the command:

    cat /proc/interrupts

This gives you a list of interrupt numbers and the cores that have handled them. Some interrupts have names rather than numbers and these are the ones that you can't tamper with. Isolated cores only handle interrupts that are essential, rescheduling interrupts for example.

It is sometimes possible to control which cores are used for particular interrupts, as long as they have an interrupt number and as long as they support IO-APIC. Many don't, and there are none on the Raspberry Pi, for example. To discover which cores a particular interrupt can be handled by, use:

    cat /proc/irq/n/smp_affinity

where n is the interrupt number. This returns a hexadecimal bit mask with the lowest order bit corresponding to core 0.

You can set the bit mask to determine which processors will handle the interrupt using:

    echo m > /proc/irq/n/smp_affinity

where n is the interrupt number and m is the new mask. For example, to have all timer interrupts, irq 17, handled by core 0 you would use:

    echo "1" > /proc/irq/17/smp_affinity

Note that if the interrupt is not IO-APIC compatible you will get a read/write error. You also have to give the entire command as root, for example from a shell started with sudo -i. Prefixing echo with sudo is not enough, because the redirection to the file is performed by your own unprivileged shell.
Now available as a paperback or ebook from Amazon: Applying C For The IoT With Linux
Last Updated ( Wednesday, 05 July 2023 )