Deep C Dives: The Union
Written by Mike James   
Wednesday, 29 January 2025

The Union - it looks like a struct but doesn't quite fly like a struct. So what is a Union for exactly? Find out in this extract from my latest book Deep C Dives.

Deep C Dives
Adventures in C

By Mike James

Contents

Preface
Prolog C
Dive

  1. All You Need Are Bits
  2. These aren’t the types you’re looking for
  3. Type Casting
  4. Expressions
  5. Bits and More Bits
  6. The Brilliant But Evil for 
  7. Into the Void 
  8. Blocks, Stacks and Locals
  9. Static Storage
  10. Pointers
  11. The Array and Pointer Arithmetic
  12. Heap, The Third Memory Allocation
  13. First Class Functions
        Extract: First Class Functions
  14. Structs and Objects
  15. The Union ***NEW!
  16. Undefined Behavior
  17. Exceptions and the Long Jump

Dive 15

The Union

“Form follows function - that has been misunderstood.
Form and function should be one, joined in a spiritual union.”

Frank Lloyd Wright

 

The union is a special type of struct that is more or less dedicated to the idea of type punning. Its very existence in the C language is a strong indication that type punning is not an accidental feature.

Union Basics

A union is declared in the same way as a struct, but memory is only allocated for the largest of the union’s fields. That is, all of the fields of a union share the same memory. This sounds like a crazy thing to do until you notice that it gives you a way of working with the same bit pattern with different interpretations. You can treat the same area of memory as different types simply by using the appropriate field name. That is, using unions allows you to do reasonably well-defined type punning in C.

The syntax for a union is the same as for a struct, but you replace struct with union. For example:

union {
	int I;
	float F;
} myUnion;

This allocates four bytes – assuming sizeof(int) and sizeof(float) are both 4 – that can be used to store an int or a float. Which is stored depends on the field name used to access the union, I or F.

Now you can store an int in the union using:

myUnion.I = 42;

or a float using:

myUnion.F = 42.0;

In the first case, the bit pattern that represents 42 is stored in the four bytes and in the second, the bit pattern that represents 42.0 is stored in the same four bytes. Of course, there is nothing stopping you from storing an int and reading back a float, or vice versa. This is how the type punning occurs when you use a union.

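In principle, you can pun any types. In C, it is legal to read a union member other than the one most recently written, as long as it is no larger than the member that was written: the stored bytes are simply reinterpreted as the new type. This is not true of C++, where reading a member other than the one last written is undefined behavior. The size rule is a perfectly reasonable restriction, because any bytes beyond those just written have unspecified values. Even so, I can invent admittedly unlikely examples of reading a larger type when a smaller type has just been written, but they are all very machine-dependent. This means that, in most cases, you can use union type punning to avoid the compiler warning messages and potential undefined behavior that other forms of type punning can cause.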

As with structs and type punning, you have to take padding into account. To revisit the example in the previous chapter:

struct test1 {
	char c;
	int i;
};
struct test2 {
	char c[4];
	char i[4];
};
union {
	struct test1 myStruct1;
	struct test2 myStruct2;
} myUnion;

In this case, on a typical 64-bit machine where int is four bytes and is aligned on a four-byte boundary, struct test1 is padded so that there are three padding bytes between c and i. This means that in the union of the two structs, myUnion, myStruct1.c occupies the same byte as myStruct2.c[0] and myStruct1.i occupies the same bytes as myStruct2.i[0] to myStruct2.i[3].



Last Updated ( Wednesday, 29 January 2025 )