Fundamental C - String I/O
Written by Harry Fairhead   
Monday, 27 June 2022

Low-level I/O

There are a number of I/O functions that are simpler than printf and scanf:

putchar & getchar

These put and get a single char from the standard I/O streams. If you try:

printf("type a character ");
int c=getchar();
putchar(c);

then the chances are very high that you will discover that buffers are still getting in the way and you don’t see the “type a character” message until after you have typed a character. The simplest fix is to use fflush to force the output buffer to be written. The fflush function is standard, but whether you need it is system-dependent:

printf("type a character ");
fflush(stdout);
int c=getchar();
putchar(c);

However, now you will discover that you have to press return after the character, once again because of the buffer. If you type “abcdef” then nothing happens till you press return, when the buffer is made available to getchar and a single character is removed from the buffer. You can use getchar again to read more of the buffer. Also notice that getchar and putchar work in terms of int rather than char.

The two functions are sometimes useful, but not as a way to dynamically interact with the keyboard as you might expect.
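As a minimal sketch of reading more of the buffer, the function below copies one line a character at a time. The name echo_line is hypothetical, and it uses fgetc and fputc on explicit streams so that it can be tried on a file as well as the keyboard; getchar() is equivalent to fgetc(stdin) and putchar(c) to fputc(c, stdout). It also shows why the character is held in an int: the value EOF has to be distinguishable from every valid character.

```c
#include <stdio.h>

/* Hypothetical helper: copies characters from in to out, up to and
   including the first newline. Returns the number of characters copied,
   or -1 if nothing could be read (end-of-file or a read error). */
int echo_line(FILE *in, FILE *out)
{
    int c;          /* int, not char, so that EOF can be told apart from data */
    int count = 0;

    while ((c = fgetc(in)) != EOF) {
        fputc(c, out);
        count++;
        if (c == '\n')
            break;  /* one complete line has been copied */
    }
    return (count == 0) ? -1 : count;
}
```

Calling echo_line(stdin, stdout) after typing “abcdef” and pressing return should echo the whole line, including the newline, in a single call.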

gets & puts

These two functions work like getchar and putchar but they work with complete C strings. For example:

printf("type a string ");
fflush(stdout);
char s[25];
gets(s);
puts(s);

As in the case of getchar, the input buffer is only made available to gets when the user presses return. At that point gets reads characters into the string until it reaches the newline that ends the line. The string s is null-terminated and does not include the newline, which gets discards.

Notice that gets is dangerous in that it will accept as many characters as the user types and thus an array overflow is very possible. For this reason it was removed from the standard library in C11. To avoid this problem use fgets instead.

fgets

The fgets function is designed to read a string from any data stream but it is the obvious alternative to the dangerous gets because it allows you to specify a maximum for the number of characters read. The safe equivalent of the previous example is:

printf("type a string ");
fflush(stdout);
char s[25];
fgets(s,25,stdin);
puts(s);

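Notice that the 25 in the call to fgets means that at most 24 characters are read, leaving room for the null terminator, so you cannot have a buffer overflow, but the read can stop before the line is complete. Unlike gets, fgets keeps the newline in the string if it fits, and as puts adds a newline of its own you will see an extra blank line in the output. As with reading a general array, the solution to an incomplete read is to repeat the read and process each chunk until all of the data has been processed.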
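The idea of repeating the read can be sketched as follows. The function name line_length and the deliberately tiny buffer are assumptions made for illustration; the "processing" here is just adding up the length of each chunk. A chunk that ends in a newline marks the end of the line.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from in, chunk by chunk, using a
   small fixed buffer. Returns the number of characters in the line, not
   counting the newline, or -1 if there was nothing left to read. */
long line_length(FILE *in)
{
    char chunk[8];      /* deliberately small so long lines need several reads */
    long total = 0;
    int got_any = 0;

    while (fgets(chunk, sizeof chunk, in) != NULL) {
        size_t len = strlen(chunk);
        got_any = 1;
        if (len > 0 && chunk[len - 1] == '\n') {
            /* the newline arrived, so the line is complete */
            return total + (long)(len - 1);
        }
        total += (long)len;     /* partial chunk, go round again */
    }
    /* end-of-file reached without a newline */
    return got_any ? total : -1;
}
```

With an 8-element buffer each call to fgets delivers at most 7 characters, so a 16-character line takes three reads.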

A Safe Way To Do Input – String Conversion

The scanf function is easy to use but both dangerous and unstable. Many C users when presented with this fact have in the past created their own version of scanf – a very complex alternative.

A much better and simpler way to proceed is to use fgets to safely read in a complete line of text and then use string conversion functions to extract the data.

The string conversion functions are all of the same form:

strtod(string, end);
strtol(string, end, base);
strtoul(string, end, base);

which convert to double, long or unsigned long respectively. The string is scanned, skipping any leading whitespace, and the value is built up as legal characters are encountered. The scanning stops when a character that cannot be part of a number is encountered or the end of the string is reached. The end parameter is set to point at the location where the scan stopped so that the rest of the string can be processed. Finally, base is the numeric base to be used for the conversion, usually 10.
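A minimal sketch of how the end and base parameters might be used together is given below. The helper parse_long and its parameter names are hypothetical. It treats "end did not move" as meaning that no digits were found, and it uses errno to detect a value too large for a long, which is how strtol reports a range error.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: converts the start of s to a long in the given base.
   On success returns 1, stores the value in *out and sets *rest to the
   first character that was not converted. Returns 0 if no digits were
   found or the value does not fit in a long. */
int parse_long(const char *s, int base, long *out, const char **rest)
{
    char *end;

    errno = 0;                      /* so a range error can be detected */
    long value = strtol(s, &end, base);

    if (end == s)
        return 0;                   /* the scan stopped at once: no digits */
    if (errno == ERANGE)
        return 0;                   /* value too large or too small for long */

    *out = value;
    *rest = end;
    return 1;
}
```

For example, parse_long("ff rest", 16, &v, &rest) converts the hexadecimal digits to 255 and leaves rest pointing at " rest".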

For example:

char s[]="1234.456 Some data";
char *prt;
int num=strtol(s,&prt,10);
printf("%d",num);

Prints 1234 and leaves prt pointing at the decimal point, i.e. at “.456 Some data”, because a decimal point cannot be part of an integer. For the moment don’t worry about the use of the & in &prt, it is explained in the next chapter.

The strtod and strtoul functions work in the same way, converting legal characters to a value and stopping at the first non-legal character.
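To see how the choice of function changes where the scan stops, here is a small sketch that applies strtod to the same kind of string. The helper name after_double is hypothetical. Because a decimal point is a legal character in a double, the scan carries on past it and stops at the following space, assuming the default "C" locale.

```c
#include <stdlib.h>

/* Hypothetical helper: converts the start of s to a double, stores it in
   *value and returns a pointer to the first character not converted. */
const char *after_double(const char *s, double *value)
{
    char *end;
    *value = strtod(s, &end);   /* '.' is part of a double, so it is consumed */
    return end;
}
```

Given "1234.456 Some data" this stores 1234.456 and returns a pointer to " Some data".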

You might wonder why there is no strtoi or similar?

The simple answer is that there is no need as long can be reduced to int or short if the numeric value is small enough, and the same is true of unsigned long. There are some older functions atoi, atof and atol which convert a string to int, double and long respectively, but don’t use them as they give you no way to detect an error.

The atoi family of functions skip leading whitespace and then convert as many characters as form a valid number, just like the strto functions. That is:

atoi("the number is 123");

returns 0 because the scan stops at once on ‘t’, exactly as strtol does. The problem is that you cannot tell this 0 from the result of converting “0”, and if the value is too large for the result type the behavior is undefined. With strtol you can use the fact that prt is left pointing at the start of the string to test whether any valid characters were found.
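The reduction from long to int described above can be sketched as a small helper. The name parse_int is hypothetical, a stand-in for the missing strtoi. It checks that some digits were found and that the value fits in an int before narrowing it.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts the start of s to an int, base 10.
   Returns 1 and stores the value in *out on success, 0 on failure. */
int parse_int(const char *s, int *out)
{
    char *end;

    errno = 0;
    long value = strtol(s, &end, 10);

    if (end == s)
        return 0;               /* no valid characters were found */
    if (errno == ERANGE)
        return 0;               /* does not even fit in a long */
    if (value < INT_MIN || value > INT_MAX)
        return 0;               /* fits in a long but not in an int */

    *out = (int)value;          /* safe to narrow now */
    return 1;
}
```

Unlike a bare conversion, this lets the caller tell the input "0" apart from input that contains no number at all.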

The best way to do safe input from the keyboard is to use the strto functions on the string returned from fgets.

For example, suppose you want the user to input an integer and a number with a decimal point separated by a comma:

char myString[25];
printf("type a int,double ");
fflush(stdout);
fgets(myString,25,stdin);
char *prt;
int num1=strtol(myString,&prt,10);
printf("%d",num1);
prt++;
double num2=strtod(prt,&prt);     
printf("%f",num2);

The call to fgets reads in a whole line from the user and it allows the user to edit the line before pressing enter. Next we use strtol to extract the integer digits. The scan stops at the comma and this is what prt is pointing at. Adding one to prt moves it past the comma. In a real application we need to check that the comma is there and that the floating value is next. Finally, strtod extracts the floating value.
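The checks that a real application needs might look something like the sketch below. The function parse_pair and its rules are assumptions for illustration: it insists on a comma immediately after the integer, on a floating value after the comma, and on nothing but the newline left by fgets after that.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses a line of the form "int,double".
   Returns 1 and fills in *num1 and *num2 on success, 0 on any error. */
int parse_pair(const char *line, int *num1, double *num2)
{
    char *end;

    errno = 0;
    long n = strtol(line, &end, 10);
    if (end == line || errno == ERANGE || n < INT_MIN || n > INT_MAX)
        return 0;               /* no integer, or it does not fit in an int */

    if (*end != ',')
        return 0;               /* the comma is missing */

    const char *p = end + 1;    /* move past the comma */
    errno = 0;
    double d = strtod(p, &end);
    if (end == p || errno == ERANGE)
        return 0;               /* no floating value, or it is out of range */

    if (*end != '\n' && *end != '\0')
        return 0;               /* unexpected text after the number */

    *num1 = (int)n;
    *num2 = d;
    return 1;
}
```

The string filled in by fgets(myString,25,stdin) can be passed straight to parse_pair, and the prompt repeated if it returns 0.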

All of this is easy and safe and, if you need to get input from the keyboard, is the best approach unless you are using a library or GUI framework.

Summary

  • Strings are null-terminated char arrays.

  • A char is generally a single byte and it is the smallest of the integer types.

  • The character code used depends on the operating system, but you can generally assume that you are working with UTF-8 restricted to a single byte which is functionally equivalent to ASCII.

  • C doesn’t currently handle Unicode well.

  • You can initialize a string using a string literal, but you cannot assign a string literal to a string.

  • Always make sure that the array has enough elements to hold the string and its null terminator. If the string has n characters the array has to have at least n+1 elements.

  • A string variable is a pointer to the first element of the string and behaves like a standard array.

  • There are no native string operators or functions in C but the standard library has a comprehensive set of string functions.

  • All string operations work by using a loop to scan the string, stopping when the null terminator is reached.

  • If the null terminator is missing then most string operations will overrun the array.

  • There are alternative safe string functions which allow you to specify the maximum number of characters to be processed. Used correctly these protect you against array overrun.

  • String overflow is generally easy to control when all of the strings involved are generated by your program. Things are much more difficult when strings are input from external sources.

  • The printf and sprintf functions can be used to convert integer and floating point types to human readable string representations.

  • Operating system buffers make interactive I/O using scanf difficult.

  • The scanf function also has problems in terms of how it applies the format string to the input.

  • In many cases the only solution is to use lower-level I/O functions to control the way the characters are converted into numeric data types.

 

 



 




Last Updated ( Monday, 27 June 2022 )