Deep C Dives: The Union
Written by Mike James   
Wednesday, 29 January 2025

Color

For a more realistic example of a union, consider the way most graphics hardware stores an ARGB (Alpha, Red, Green, Blue) value in a 32-bit unsigned int. If you want to access each of these bytes, and also treat all four bytes as a single value, then the best way is to define a union:

#include <stdint.h>   /* uint32_t */

union Pixel{
    struct {unsigned char b,g,r,a;};   /* anonymous struct, C11 */
    uint32_t value;
};

The only complication is that the fields are listed in reverse order, b, g, r, a, to allow for the way x86 stores the bytes of a 32-bit word, least significant byte first. Notice that the order is machine-dependent. The one shown is “little endian”, which is what Intel processors use and what Arm processors use in their usual configuration. You can see that the union is of a struct with four byte fields and a single 32-bit word. The four byte fields share the same memory block as the 32-bit word and, as the struct is four bytes in size on mainstream compilers, no padding is needed.
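The dependence on byte order can be checked at run time using the union itself. The following is a minimal sketch rather than anything from the original text: the function name is_little_endian is an assumption made for illustration, and the anonymous struct needs a C11 compiler.

```c
#include <stdint.h>

/* Same layout as the Pixel union in the text (anonymous struct needs C11). */
union Pixel {
    struct { unsigned char b, g, r, a; };
    uint32_t value;
};

/* Hypothetical helper: returns 1 if the lowest-addressed byte holds the
   least significant byte of value, i.e. the machine is little endian. */
int is_little_endian(void) {
    union Pixel p;
    p.value = 0x000000FFu;   /* only the least significant byte is set */
    return p.b == 0xFF;      /* b is the first byte in memory */
}
```

If is_little_endian() returns 0, the field order b, g, r, a would have to be reversed for the fields to line up with the ARGB value.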

Now you can declare a pixel and use it as follows:

union Pixel pixel;
pixel.r=1;
pixel.g=255;
pixel.b=128;
pixel.a=255;
printf("%u\n",(unsigned int)pixel.value);

You can see that how you access the union depends on the fields that you specify. If you specify pixel.r then you simply work with a single byte of the memory block, in this case the third. If you specify pixel.value you work with all four bytes as a single 32-bit unsigned integer. There’s no need to cast, simply use the fields that make up the union and you access the memory according to the type.
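To make the two views of the same memory concrete, here is a short sketch that writes the bytes and reads the word, and then does the reverse. The helper names pack_argb and red_of are assumptions introduced for this example and are not part of the original code.

```c
#include <stdint.h>

/* Same layout as the Pixel union in the text (anonymous struct needs C11). */
union Pixel {
    struct { unsigned char b, g, r, a; };
    uint32_t value;
};

/* Hypothetical helper: build a pixel from its four channels. */
uint32_t pack_argb(unsigned char a, unsigned char r,
                   unsigned char g, unsigned char b) {
    union Pixel p;
    p.a = a; p.r = r; p.g = g; p.b = b;   /* write the individual bytes */
    return p.value;                        /* read them back as one word */
}

/* Hypothetical helper: pull the red channel back out of a packed value. */
unsigned char red_of(uint32_t value) {
    union Pixel p;
    p.value = value;   /* write the whole word */
    return p.r;        /* read a single byte of it */
}
```

On a little-endian machine pack_argb(255, 1, 255, 128), which uses the same channel values as the example above, gives 0xFF01FF80. On a big-endian machine the bytes would appear in the opposite order within the word.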

The type punning version of the pixel union is:

struct Pixel{
    unsigned char b,g,r,a;
};
struct Pixel pixel;
pixel.r=1;
pixel.g=255;
pixel.b=128;
pixel.a=255;
/* a struct cannot be cast to an integer; cast a pointer to it instead */
printf("%u\n",(unsigned int)*(uint32_t *)&pixel);

A union can often be used in place of type punning, but it usually requires a little more pre-planning to set up a suitable union whose fields alias each other. It is more difficult to use a union to extend structs as shown earlier, because a union changes the way fields are named.

Type Punning Unions

It is a little-used fact that you can cast a pointer to a union to a pointer to a struct, or to a pointer to anything else for that matter. Why not? After all, a union is just a block of memory associated with a number of different types. The block of memory is big enough to hold the largest of the types, taking into account any padding that is needed. As a block of memory you can cast it to whatever you want, but the result might not make much sense.
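The point about size and padding can be checked with sizeof. The sketch below uses a hypothetical union, Sample, invented for this illustration. Its largest member is five bytes, but the compiler is free to round the union up so that the uint32_t member stays correctly aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical union: the largest member is 5 bytes, but the uint32_t
   member imposes its alignment on the union as a whole. */
union Sample {
    char     text[5];
    uint32_t word;
};

/* Returns the number of padding bytes added after the largest member. */
size_t sample_padding(void) {
    return sizeof(union Sample) - sizeof(char[5]);
}
```

On mainstream compilers sizeof(union Sample) is typically 8, which means three bytes of padding, but the exact figure is up to the implementation.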

For example:

union Pixel{
    struct {unsigned char b,g,r,a;};
    int value;
};
union Pixel pixel={1,255,128,255};
char (*color)[4];
color=(char *) &pixel;

Assuming a four-byte int, the memory block is four bytes in size and we can alias it to an array using an array pointer, color. Aliasing a union obeys the same rules as aliasing a struct, and if you try this example out you will find that GCC gives a warning that you are aliasing an incompatible type. The result is undefined behavior, which is a shame because it makes perfect sense; see Dive 16 for more information.
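One form of aliasing that the C standard does permit is reading any object through a pointer to a character type. As a hedged sketch of that alternative, the function below copies the bytes of the union out through an unsigned char pointer. The name pixel_bytes and its parameters are assumptions made for this example.

```c
#include <stddef.h>

/* Same union as in the text, with an int as the second member. */
union Pixel {
    struct { unsigned char b, g, r, a; };
    int value;
};

/* Hypothetical helper: copy the first n bytes of the union into out,
   reading them through an unsigned char pointer. Never reads past
   the end of the union. */
void pixel_bytes(const union Pixel *p, unsigned char *out, size_t n) {
    const unsigned char *bytes = (const unsigned char *)p;
    for (size_t i = 0; i < n && i < sizeof *p; i++) {
        out[i] = bytes[i];
    }
}
```

With the pixel initialized to {1,255,128,255} as above, the four bytes come back in memory order b, g, r, a, whatever the byte order of the machine.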


When Unions Don’t Hack It

This said, unions aren’t always suitable alternatives to casting. For example, if you are trying to implement an object-oriented approach to C, then aliasing a base class with its derived class is much more natural to do with a cast than a union. In fact you could argue that casting is the only way to do this job.

Consider the example given in Dive 14:

struct myStruct{
  int a;
  int b;
};
void myFunction(struct myStruct *myPointer){
  printf("%d\n", myPointer->a);
  printf("%d\n", myPointer->b);
}

If we decide to extend the struct in the future to something larger:

struct myStructEx{
   int a;
   int b;
   int c;
};

then we can still use the original function by casting a pointer to myStructEx to a pointer to myStruct, as the two structs agree about the types of their first fields:

myFunction((struct myStruct *) myPointer);

where myPointer is now a pointer to myStructEx.
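Put together, the cast approach might look like the sketch below. It makes two changes to the code in the text so that the result can be checked: myFunction also returns a + b, and a wrapper named callWithEx is introduced. Both are assumptions of this example. It relies on the two structs sharing the same leading fields, as described above, and the aliasing rules discussed in Dive 16 still apply.

```c
#include <stdio.h>

struct myStruct   { int a; int b; };
struct myStructEx { int a; int b; int c; };   /* same first two fields */

/* As in the text, but also returns a + b so the result can be checked. */
int myFunction(struct myStruct *myPointer) {
    printf("%d\n", myPointer->a);
    printf("%d\n", myPointer->b);
    return myPointer->a + myPointer->b;
}

/* Hypothetical wrapper: pass an extended struct to the original function
   by casting the pointer, exactly as in the call shown in the text. */
int callWithEx(struct myStructEx *ex) {
    return myFunction((struct myStruct *)ex);
}
```

The original function is untouched by the extension, which is the property the cast approach is designed to give.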

Now consider how this might be implemented using a union. You would first have to create a union of both structs:

union myUnion{
	struct myStruct{
	  int a;
	  int b;
	} base;   /* each struct needs a member name inside the union */
	struct myStructEx{
	  int a;
	  int b;
	  int c;
	} ex;
};

We can now create a pointer to an instance of myUnion:

union myUnion mytest;
union myUnion *myPointer;
myPointer=&mytest;

How do we call:

void myFunction(struct myStruct *myPointer)

which, as originally written, expected a pointer to a myStruct? The only way is to cast the union pointer to a myStruct pointer:

myFunction((struct myStruct *) myPointer);

and thus we haven’t avoided using a cast. The only alternative is to rewrite the function to accept a union, or design it this way from the start. Lack of such forethought is exactly what the cast approach aims to make up for. The cast approach is extensible in a way that the union approach isn’t.
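For comparison, rewriting the function to accept the union might look like the following sketch. The function name myFunctionU and the member names base and ex are assumptions made for illustration, and the function returns a + b only so that the result can be checked.

```c
#include <stdio.h>

struct myStruct   { int a; int b; };
struct myStructEx { int a; int b; int c; };

/* Member names base and ex are assumptions for this sketch. */
union myUnion {
    struct myStruct   base;
    struct myStructEx ex;
};

/* Hypothetical rewrite of myFunction that takes the union directly.
   The fields are now reached through the base member, which shares
   its leading fields with ex. */
int myFunctionU(const union myUnion *p) {
    printf("%d\n", p->base.a);
    printf("%d\n", p->base.b);
    return p->base.a + p->base.b;
}
```

No cast is needed at the call site, but every caller and the function itself now have to know about the union, which is the pre-planning the cast approach avoids.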

That said, there are many approaches to this problem, and this is not the end of the story if you want to invent more solutions.



Last Updated ( Wednesday, 29 January 2025 )