Fundamental C - Pointers, Cast & Type Punning
Written by Harry Fairhead
Monday, 10 September 2018
Casting is a fundamental C technique when you are working at a low level. Type punning, using pointers to reference the same area of memory but using different types, is also a powerful technique. But did you know that it was undefined behavior?
Fundamental C: Getting Closer To The Machine. Now available as a paperback and ebook from Amazon.
Also see the companion volume: Applying C
Expressions, Type & Casting

There is a great deal of misunderstanding about type, and about casting in particular. In C it is better to think of a memory location that happens to be currently in use to store a bit pattern that has a particular interpretation. For example, the bit pattern 01000001 could be the binary representation of 65, or it could be the ASCII code for 'A', or it could signify that switches 0 and 6 are closed and the rest are open. The bit pattern stays the same; the interpretation changes.

In C the type assigned to a variable or pointer only changes how the bit pattern is supposed to be interpreted. It also determines the amount of memory allocated. For example, when you write:

    int myVar;

2 or 4 bytes are allocated, depending on the machine architecture. Declaring this as an int also means you can perform 2-byte or 4-byte arithmetic on the variable. In C it is not so much the operations that change according to type as the number of bytes involved in the operation. In particular, all non-floating-point types are integer types and vary only in the way that the sign bit is treated, i.e. signed versus unsigned, and in the amount of memory allocated.

For example:

    char myChar;

allocates a single byte which is interpreted, mostly by the programmer, as a character. So you can initialize it to a character literal:

    myChar='A';

which stores the bit pattern 01000001 in the byte. However, if you try:

    char myChar;
    myChar='A';
    myChar=myChar+1;
    printf("%c \n",myChar);
    printf("%d \n",myChar);

you will find that it works, and at most you might get a warning that you are using %d to print a char. Notice that the third instruction adds 1 as if myChar was a numeric value, the fourth prints it as a character and the final instruction prints it as a number. So which is it, a char or a number? If you are expecting a firm answer to this question you haven’t yet understood the very loose way that C treats type.
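To make the "one bit pattern, several interpretations" idea concrete, here is a minimal sketch. The helper names switch_closed, next_char and show_readings are hypothetical, invented for this illustration, and the character reading assumes an ASCII system.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if switch n (bit n) is closed in pattern. */
int switch_closed(unsigned char pattern, int n)
{
    return (pattern >> n) & 1;
}

/* Arithmetic on a char works because a char is just a small integer. */
char next_char(char c)
{
    return (char)(c + 1);
}

/* Hypothetical helper: prints one byte as a number, a character and switches. */
void show_readings(unsigned char pattern)
{
    printf("as a number:    %d\n", pattern);
    printf("as a character: %c\n", pattern);
    printf("as switches:   ");
    for (int n = 7; n >= 0; n--)
        printf(" %d:%s", n, switch_closed(pattern, n) ? "closed" : "open");
    printf("\n");
}
```

Calling show_readings(0x41) from main prints the same byte as 65, as A, and as a row of switches in which only 0 and 6 are closed. Nothing is converted along the way; only the reading of the byte differs.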
The best answer is that there is a bit pattern stored in myChar and it’s up to you how to interpret it. The compiler does expect you to treat it as a character and it will warn you if you treat it as something else, but it’s just helpful advice to stop you making a mistake.

Suppose that you really know what you are doing and you want to print a char as if it was a numeric value and don’t want to see even a compiler warning? The answer is that you can use a cast, but this isn’t always necessary and it doesn’t always stop the compiler from flagging a warning.

A cast is just a type enclosed in parentheses and it is written in front of a variable to tell the compiler that you want its bit pattern to be reinterpreted as the new type. Notice that a cast of this kind doesn’t do any work: it doesn’t change the bit pattern, it just tells the compiler that what was a signed int should now be regarded as an unsigned int, say:

    (unsigned int)myVar;

There are two aspects of changing type using a cast: the first is the operations on the type and the second is the amount of memory allocated. If you don’t change the amount of memory allocated then to change how a type is treated you simply use a cast. For example:

    int8_t myVar= (int8_t) myChar;
    printf("%hhd \n", myVar);

This only works in C99 or later, which supports fixed-size types, and only if you have included <stdint.h>.

For a slightly more convincing example try:

    int myVar1 = -1;
    unsigned int myVar2 = (unsigned int) myVar1;
    printf("%d \n", myVar1);
    printf("%u \n", myVar2);

myVar1 is a signed int set to -1, but when it is assigned to myVar2 its bit pattern is interpreted as 4294967295 (assuming a 32-bit int). This is only slightly more convincing because printf will interpret myVar2 as a signed integer if you change the %u to %d. This also indicates that all that matters is how the bit pattern is interpreted.
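One way to check that a signed-to-unsigned cast leaves the stored bits alone is to compare the bytes before and after. This is a minimal sketch and assumes a two's-complement machine, which covers practically all current hardware. The function names same_bits_after_cast and minus_one_as_unsigned are hypothetical, chosen for this illustration.

```c
#include <limits.h>
#include <string.h>

/* Returns 1 if value and its unsigned cast occupy memory as identical bytes.
   Assumes two's-complement representation of int. */
int same_bits_after_cast(int value)
{
    unsigned int reinterpreted = (unsigned int) value;
    return memcmp(&value, &reinterpreted, sizeof value) == 0;
}

/* -1 cast to unsigned int is UINT_MAX, i.e. 4294967295 when int is 32 bits. */
unsigned int minus_one_as_unsigned(void)
{
    int myVar1 = -1;
    return (unsigned int) myVar1;
}
```

Comparing against UINT_MAX from <limits.h> rather than the literal 4294967295 keeps the check valid on machines where int is not 32 bits wide.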
Last Updated ( Monday, 10 September 2018 )