Fundamental C - Simple Strings
Written by Harry Fairhead   
Sunday, 08 December 2019

Buffer Overflow

Usually the discussion of avoiding string overflow, or more generally array or buffer overflow, ends at this point, but in the real world things are more complex.

When programming in C you need to be aware of where array data comes from. In general, the arrays and strings that you create and consume are safe enough because you know their size and can ensure that they are null-terminated where necessary. This means that you can use null-terminated string functions if you want to. This includes any functions you might write: you don’t have to add further protection to a function that you know only you are going to use, and use responsibly.

Where things get dangerous is when data is generated externally: user input, network data, file data, or anything that does not originate in your program. In this case you have to put an upper limit on the number of items of data you are prepared to accept. This usually means using strn functions as opposed to str functions and, more generally, specifying array sizes. It also means that you have to remember to check that every array access is within the array bounds, which is inefficient but safe.
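As a minimal sketch of putting an upper limit on external data, the following helper copies from a source buffer of known length into a fixed-size array and always adds the null terminator. The name copy_bounded and its parameters are hypothetical, chosen only for illustration; strncpy and memchr are the standard library functions. The explicit termination matters because strncpy does not write a terminator when the source is longer than the limit.

```c
#include <string.h>

/* Hypothetical helper: copy at most dst_size - 1 characters from src,
   which holds src_len bytes and may lack a null terminator.
   dst is always null-terminated on return.
   Returns 1 if the whole string fitted, 0 if it was truncated
   or the arguments were unusable. */
int copy_bounded(char *dst, size_t dst_size, const char *src, size_t src_len)
{
    if (dst == NULL || src == NULL || dst_size == 0) {
        return 0;
    }

    /* Find the string length without reading past src_len bytes. */
    const char *end = memchr(src, '\0', src_len);
    size_t len = (end != NULL) ? (size_t)(end - src) : src_len;

    /* Never copy more than the destination can hold plus its terminator. */
    size_t n = (len < dst_size - 1) ? len : dst_size - 1;
    strncpy(dst, src, n);
    dst[n] = '\0';   /* strncpy does not add this for us here */

    return len < dst_size;
}
```

A caller would typically pass sizeof applied to the destination array as dst_size, and the number of bytes actually received as src_len, then decide what to do if the function reports truncation.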

Of course, in the real world implementing such strategies is always much harder. For example, consider the network data problem. You set up an array to accept data from a device or a service which normally sends you 500 bytes. However, you have no guarantee that network problems or exceptional circumstances will not cause it to send 750 bytes or more. In an ideal world you would simply allocate an array so large that overrunning it is unlikely, although you would still have to check that it wasn’t overrun. In the real world you generally can’t afford that sort of memory allocation, especially on small devices. So what can you do?

In most cases the best solution is to divide the transaction into packets of data. Read in the first 500 bytes, process them and see if there are any more. In this way you can safely reuse the 500 bytes you have allocated to the array and still not miss any data that goes beyond this limit. Of course, any processing that you do has to be fast enough that you can carry on reading the next 500 bytes without missing any data or having the connection abort due to a timeout.

The exact details of reusing a small buffer to read in large amounts of data vary according to how the data transfer protocol works and what is to be done with the data, but a for loop and a test for the end of the data are generally what is required. When resources are limited it is often necessary to trade code for memory.
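The idea of reusing one small buffer can be sketched as follows. This is only an illustration under stated assumptions: a standard FILE stream stands in for the network connection, the names read_in_chunks and process_chunk are hypothetical, and the "processing" simply adds up the byte values. A real protocol would need its own end-of-data test and error handling.

```c
#include <stdio.h>
#include <stddef.h>

#define CHUNK_SIZE 500   /* the 500-byte figure used in the text */

/* Hypothetical processing step: add the byte values to a running total. */
static unsigned long process_chunk(const unsigned char *data, size_t n,
                                   unsigned long acc)
{
    for (size_t i = 0; i < n; i++) {
        acc += data[i];
    }
    return acc;
}

/* Read everything from in through a single fixed-size buffer.
   Stores the number of bytes read in *total and returns the running sum. */
unsigned long read_in_chunks(FILE *in, size_t *total)
{
    unsigned char buf[CHUNK_SIZE];
    unsigned long acc = 0;

    *total = 0;
    for (;;) {
        /* fread never writes more than sizeof buf bytes into buf. */
        size_t n = fread(buf, 1, sizeof buf, in);
        if (n == 0) {
            break;   /* end of data, or a read error */
        }
        acc = process_chunk(buf, n, acc);
        *total += n;
    }
    return acc;
}
```

With 750 bytes of input the loop body runs twice, first with 500 bytes and then with 250, and the buffer never needs to be larger than CHUNK_SIZE.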

In the book but not in this extract:

  • Convert to String - sprintf
  • Input – Buffer Problems
  • Low-level I/O
  • A Safe Way To Do Input – String Conversion

Summary

  • Strings are null-terminated char arrays.

  • A char is generally a single byte and it is the smallest of the integer types.

  • The character code used depends on the operating system, but you can generally assume that you are working with UTF-8 restricted to a single byte, which is functionally equivalent to ASCII.

  • C doesn’t currently handle Unicode well.

  • You can initialize a char array using a string literal, but you cannot assign a string literal to the array after it has been declared.

  • Always make sure that the array has enough elements to hold the string and its null terminator. If the string has n characters the array has to have at least n+1 elements.

  • A string variable is a pointer to the first element of the string and behaves like a standard array.

  • There are no native string operators or functions in C but the standard library has a comprehensive set of string functions.

  • All string operations work by using a for loop to scan the string, stopping when they reach the null terminator.

  • If the null terminator is missing then most string operations will overrun the array.

  • There are alternative safe string functions which allow you to specify the maximum number of characters to be processed. Used correctly these protect you against array overrun.

  • String overflow is generally easy to control when all of the strings involved are generated by your program. Things are much more difficult when strings are input from external sources.

  • printf and sprintf can be used to convert integer and floating point types to human-readable string representations.

  • Operating system buffers make interactive I/O using scanf difficult.

  • scanf also has problems in terms of how it applies the format string to the input.

  • In many cases the only solution is to use lower-level I/O functions to control the way the characters are converted into numeric data types.
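A few of the summary points can be sketched in code. The helper names elements_needed, int_to_string and parse_long are hypothetical, and the choices are assumptions rather than the book's own method: snprintf is used here as the size-limited relative of sprintf, and strtol is used as one standard way to convert text that has already been read into a char array.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* A string of n characters needs n + 1 array elements. */
size_t elements_needed(const char *s)
{
    return strlen(s) + 1;
}

/* Convert an int to text without overrunning dst.
   Returns 1 if the full text fitted, 0 if it was truncated. */
int int_to_string(char *dst, size_t dst_size, int value)
{
    int n = snprintf(dst, dst_size, "%d", value);
    return n >= 0 && (size_t)n < dst_size;
}

/* Convert text to a long, rejecting empty input, trailing junk
   and out-of-range values. One trailing newline is allowed.
   Returns 1 on success and stores the result in *out. */
int parse_long(const char *text, long *out)
{
    char *end;

    errno = 0;
    long v = strtol(text, &end, 10);
    if (end == text || errno == ERANGE) {
        return 0;   /* no digits found, or value out of range */
    }
    if (*end == '\n') {
        end++;
    }
    if (*end != '\0') {
        return 0;   /* unexpected characters after the number */
    }
    *out = v;
    return 1;
}
```

For example, char greeting[6] = "Hello"; is a valid initialization because elements_needed("Hello") is 6, whereas a later greeting = "World"; would not compile because arrays cannot be assigned.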

 

Fundamental C: Getting Closer To The Machine

Now available as a paperback and ebook from Amazon.


Also see the companion volume: Applying C


 

Harry Fairhead is the author of Raspberry Pi IoT in C, Micro:bit IoT in C and Fundamental C: Getting Closer to the Machine. His latest book is Applying C For The IoT With Linux.


Last Updated ( Monday, 09 December 2019 )