Applying C - File Descriptors
Written by Harry Fairhead   
Monday, 24 August 2020

The actual permissions that result are complicated by the fact that the process has a default permissions mask - umask. The effective permissions are given by mode & ~umask which means that the bits in umask that are set are unset in the result i.e. umask blocks the setting of some permissions.
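As a rough illustration of mode & ~umask, the following sketch installs a mask, creates a file and reads back the permission bits the file was actually given. The function name created_permissions and the path passed to it are assumptions made for this example, not part of any library.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a new file at path asking for `mode`
   while `mask` is the process umask, and return the permission bits
   the file actually received, or -1 on error. */
int created_permissions(const char *path, mode_t mode, mode_t mask) {
    struct stat st;
    int result = -1;

    unlink(path);                    /* make sure open really creates the file */
    mode_t old = umask(mask);        /* install the mask, remember the old one */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
    umask(old);                      /* restore the original mask */
    if (fd == -1) return -1;

    if (fstat(fd, &st) == 0)
        result = (int)(st.st_mode & 0777);   /* keep only the rwxrwxrwx bits */
    close(fd);
    unlink(path);                    /* tidy up the demo file */
    return result;
}
```

With a mask of 022, asking for 0666 would be expected to give 0666 & ~022, i.e. 0644, on a typical file system. A directory with default ACLs could give a different result.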

For example, to obtain a file descriptor fd:

int fd=open("filename",O_RDWR |O_CREAT, 0644);

opens a file for reading and writing, and creates it with permissions 0644 if it doesn’t exist. Here 0644 means read/write for the owner and read only for everyone else, subject to the umask.

You can open the same file more than once, and each call to open returns a new file descriptor with its own read/write position and status flags. You can change the process's permissions mask using the umask function.
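A minimal sketch of opening one file twice is shown below. It assumes a scratch file whose path you supply, and the function name offsets_are_independent is made up for the example. It writes through the first descriptor and then compares the two file positions.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if two separate open() calls on the
   same file keep separate file positions, 0 if not, -1 on error. */
int offsets_are_independent(const char *path) {
    int fd1 = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd1 == -1) return -1;

    if (write(fd1, "abcdef", 6) != 6) {      /* moves fd1's position to 6 */
        close(fd1);
        return -1;
    }

    int fd2 = open(path, O_RDONLY);          /* second, separate open */
    if (fd2 == -1) {
        close(fd1);
        return -1;
    }

    off_t pos1 = lseek(fd1, 0, SEEK_CUR);    /* position after the write */
    off_t pos2 = lseek(fd2, 0, SEEK_CUR);    /* still at the start */
    close(fd1);
    close(fd2);
    return pos1 == 6 && pos2 == 0;
}
```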

As with C file handling, any file you open has to be closed using:

close(fd);

File descriptors have block read and block write functions:

read(fd, ptrToBuffer, numbytes);

and:

write(fd, ptrToBuffer, numbytes);

These are similar to fread and fwrite, but simpler: instead of an element size and an element count you specify a single number of bytes, and it is up to you to work out what it should be.

All of these functions return -1 if there is an error. On success, read and write return the number of bytes transferred, and read returns 0 at the end of the file. Notice that file descriptors do not support a text mode or formatted I/O, and it is up to you to convert your data to and from the raw bytes that represent it.

Also notice that a call to read or write might return before all of the bytes you specified have been transferred. For example, the call might be interrupted by a signal. It is up to you to check the return value and restart the read or write if you want all of the data you requested.
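One common way to deal with short transfers is to wrap read and write in a loop. The helpers below, read_all and write_all, are hypothetical names used only for this sketch. They retry after EINTR and keep going until the requested count has been transferred, an error occurs or, for read, the end of the file is reached.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write exactly count bytes unless an error occurs.
   Returns count on success, -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t count) {
    const char *p = buf;
    size_t done = 0;
    while (done < count) {
        ssize_t n = write(fd, p + done, count - done);
        if (n == -1) {
            if (errno == EINTR) continue;    /* interrupted: try again */
            return -1;
        }
        done += (size_t)n;                   /* short write: carry on */
    }
    return (ssize_t)done;
}

/* Hypothetical helper: read up to count bytes, stopping early only at
   end of file. Returns the number of bytes read, or -1 on error. */
ssize_t read_all(int fd, void *buf, size_t count) {
    char *p = buf;
    size_t done = 0;
    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n == -1) {
            if (errno == EINTR) continue;    /* interrupted: try again */
            return -1;
        }
        if (n == 0) break;                   /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

You would call these in place of write and read wherever a partial transfer would be a problem, for example when writing a whole record.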

There is also a file positioning function:

lseek(fd,offset,whence);

If whence is SEEK_SET then offset is from the start of the file, if it is SEEK_CUR it is from the current location and if it is SEEK_END it is from the end of the file. The function returns -1 if the seek failed and the offset from the start of the file if it worked.

This means that lseek can be used to return the current position:

off_t currentPos=lseek(fd,0,SEEK_CUR);

where off_t is the type defined to hold an offset. If you position beyond the end of the file and write data then the gap is filled by zero bytes until real data is written to the location.
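The sketch below tries this out under the assumption that you pass it the path of a scratch file it is allowed to overwrite. The name gap_reads_as_zeros is invented for the example. It seeks 100 bytes into an empty file, writes one byte, uses SEEK_END to find the file size and then checks that the skipped bytes read back as zeros.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the gap left by seeking past the
   end of the file reads back as zero bytes, 0 if not, -1 on error. */
int gap_reads_as_zeros(const char *path) {
    char buf[100];
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) return -1;

    /* Position 100 bytes into the empty file and write a single byte. */
    if (lseek(fd, 100, SEEK_SET) == -1 || write(fd, "X", 1) != 1) {
        close(fd);
        return -1;
    }
    off_t size = lseek(fd, 0, SEEK_END);     /* offset of the end = file size */

    /* Go back to the start and read the 100 bytes that were skipped. */
    if (lseek(fd, 0, SEEK_SET) == -1 ||
        read(fd, buf, sizeof buf) != (ssize_t)sizeof buf) {
        close(fd);
        return -1;
    }
    close(fd);

    if (size != 101) return 0;               /* 100 gap bytes plus the 'X' */
    for (size_t i = 0; i < sizeof buf; i++)
        if (buf[i] != 0) return 0;
    return 1;
}
```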

You can probably see how to make use of the open, read, write and lseek functions to do the same job as the file pointer functions of the C standard. An extra facility is provided by the dup and dup2 functions. These allow you to duplicate a file descriptor so that the same open file can be reached through more than one descriptor number. Notice that a duplicate shares its file position and status flags with the original, so you cannot use dup to maintain two different reading positions; for that you have to open the file a second time. A common use is redirection, for example making descriptor 1, standard output, refer to a file. The difference between the two functions is how the new file descriptor is determined:

int fd2 = dup(fd);

returns the lowest-numbered unused file descriptor and sets it to reference the same file as fd and:

dup2(fd,fd2);

sets fd2 to reference the same file as fd, closing fd2 first if it is already open. You should be able to see that these file functions can be used in a very similar way to the file pointer functions.
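A typical use of dup and dup2 together is to redirect standard output for a while and then restore it. The following is a minimal sketch of that idea; write_redirected is a hypothetical name and the path is whatever scratch file you choose.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: send len bytes of text to path by temporarily
   making descriptor 1 (standard output) refer to the file.
   Returns 0 on success, -1 on error. */
int write_redirected(const char *path, const char *text, size_t len) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) return -1;

    int saved = dup(STDOUT_FILENO);          /* keep a copy of the real stdout */
    if (saved == -1) {
        close(fd);
        return -1;
    }

    if (dup2(fd, STDOUT_FILENO) == -1) {     /* descriptor 1 now refers to the file */
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                               /* the file stays open via descriptor 1 */

    ssize_t n = write(STDOUT_FILENO, text, len);

    int restored = dup2(saved, STDOUT_FILENO);   /* put the real stdout back */
    close(saved);
    return (n == (ssize_t)len && restored != -1) ? 0 : -1;
}
```

The sketch uses write rather than printf so that no stdio buffering is involved while descriptor 1 is pointing at the file.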

A Random Access Example

For example, to implement a random access record program you would first have to write some records to the file:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
struct person {
    char name[25];
    int age;
};
int main(int argc, char** argv) {
    struct person me;
    strcpy(me.name, "Harry");
    me.age = 18;
    int fd = open("myFile.bin", O_RDWR | O_CREAT, 0644);
    if (fd == -1) {
        perror("open");
        return EXIT_FAILURE;
    }
    /* write ten records with ages 18 to 27 */
    for (int i = 0; i < 10; i++) {
        write(fd, &me, sizeof (struct person));
        me.age++;
    }

You can see that the only difference from using C file functions in writing out ten records is that the open function has to specify the permissions and the write doesn’t specify the number of records, just the size of the record.

To now read the record at index 5, which is the sixth record in the file because the index starts from zero, you would use:

int record = 5;
lseek(fd, record * sizeof (struct person), SEEK_SET);
struct person me2;
read(fd, &me2, sizeof (struct person));
printf("%s  %d\n", me2.name, me2.age);

No flushing is needed because read and write don’t use buffers in your program, and any buffering done by the operating system is transparent to the program. You can use the same techniques to read a record, modify it and write it back to the file. If you write beyond the end of file then the file is extended with zeros to fill in the gap.
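Reading a record, modifying it and writing it back might be sketched as follows. The struct is the one used above; the function name set_age is an assumption for the example, and a short read or write is simply treated as an error to keep the code brief.

```c
#include <fcntl.h>
#include <unistd.h>

struct person {
    char name[25];
    int age;
};

/* Hypothetical helper: change the age stored in the record at the
   given zero-based index. Returns 0 on success, -1 on error. */
int set_age(int fd, int index, int new_age) {
    struct person p;
    off_t pos = (off_t)index * (off_t)sizeof p;

    if (lseek(fd, pos, SEEK_SET) == -1) return -1;
    if (read(fd, &p, sizeof p) != (ssize_t)sizeof p) return -1;

    p.age = new_age;

    /* The read moved the position on by one record, so seek back. */
    if (lseek(fd, pos, SEEK_SET) == -1) return -1;
    if (write(fd, &p, sizeof p) != (ssize_t)sizeof p) return -1;
    return 0;
}
```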

Descriptors and Streams

File descriptors are the basis of the POSIX low-level file functions, and on a POSIX system the file pointer or stream functions are built on top of them. This means that on a POSIX system, when you use the C standard fopen to open a file, a file descriptor is created behind the scenes.

If you want to work with a file using that file descriptor you can.

The function:

int fd=fileno(fptr);

returns the file descriptor corresponding to the file that the file pointer references. Once you have the file descriptor you can use read, write, lseek and any other function that works with a file descriptor.
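One thing to watch when mixing the two is that the stream has its own buffer, so it is safer to flush it before using the descriptor directly. The sketch below, in which write_both_ways is a hypothetical name, writes part of a line through the stream and the rest through the descriptor obtained from fileno.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write "stream " using stdio and "descriptor\n"
   using the underlying descriptor. Returns 0 on success, -1 on error. */
int write_both_ways(const char *path) {
    FILE *fp = fopen(path, "w");
    if (fp == NULL) return -1;

    int ok = fputs("stream ", fp) != EOF;

    /* Push the stream's buffered data out before bypassing the stream. */
    if (fflush(fp) != 0) ok = 0;

    int fd = fileno(fp);                     /* descriptor behind the stream */
    if (fd == -1 || write(fd, "descriptor\n", 11) != 11) ok = 0;

    if (fclose(fp) != 0) ok = 0;             /* also closes fd */
    return ok ? 0 : -1;
}
```

Without the fflush the two pieces could reach the file in the wrong order, because the stream only writes its buffer when it is flushed or closed.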

In the same way as a stream is associated with a file descriptor, you can open a stream given a file descriptor. The function:

FILE *fptr=fdopen(fd, "mode");

returns a stream for the descriptor fd, where mode is any of the usual file stream opening modes e.g. w for write. Notice that the mode has to be compatible with the mode of the already open file descriptor. You can include b in the mode, but it has no effect as file descriptors are always binary. The key point is that the file isn’t actually opened at this point because it has already been opened as a file descriptor. In particular, using w with fdopen doesn’t truncate the file.

Finally, this ability to switch between file descriptors and file pointers is only possible on POSIX systems and isn’t a part of the C standard.



Last Updated ( Monday, 24 August 2020 )