Basic File I/O Functions of Solaris Platform

 
By Rich Teer, August 2001  

Adapted from the forthcoming book, "Solaris Systems Programming"

Introduction
This article looks at some of the basic, low-level I/O functions provided by the Solaris operating environment. Low-level I/O is sometimes referred to as unbuffered file I/O because the functions do not do any buffering, unlike those in the Standard I/O library. (Although this statement is technically correct, it is possible to use the Standard I/O library, which is normally buffered, in unbuffered mode.)

The sections that follow describe file descriptors, and then the functions that most UNIX® I/O operations rely on: open, close, read, write, and lseek.

File Descriptors

As far as user-mode processes are concerned, file descriptors are the fundamental way of accessing files. A file descriptor, which is a small non-negative integer, is actually an index into the process' file table. Each process has its own file table, and it is this table that provides a mapping between the process' idea of the file and the kernel's. Each of the basic file I/O functions either takes a file descriptor as an argument or returns one.

There are three file descriptors that the shell opens: standard input, standard output, and standard error. These are conventionally assigned to descriptors 0, 1, and 2 respectively. Instead of these magic numbers, newer programs should use the POSIX.1 constants STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO, which are defined in <unistd.h>.
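As a minimal sketch of using the symbolic constants instead of the magic numbers, the helper below writes a string to whichever descriptor it is given. The name put_string is hypothetical and is not part of any library.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write a NUL-terminated string to descriptor fd.
 * Returns the number of bytes written, or -1 on error. */
ssize_t put_string(int fd, const char *msg)
{
    return write(fd, msg, strlen(msg));
}
```

A caller would report a problem with put_string (STDERR_FILENO, "disk full\n") rather than passing a bare 2.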

The open Function

Before you can do any input or output with a file, you must open it for reading or writing (or both). To open a file, use the open function.

int open (const char *path, int oflag, /* mode_t mode */...);

The file you want to open is pointed to by path, and the mode you want to open it for, as well as other flags, is specified in oflag. If the call to open causes a file to be created, it is created with the permissions in mode, modified by the process's umask. If the call to open is successful, the file's file descriptor is returned; otherwise, -1 is returned, and errno is set appropriately. The file descriptor that is returned is guaranteed to be the lowest unused one available.
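The return convention described above can be sketched as follows. The helper name open_readonly is hypothetical, and printing the failure on standard error is just one reasonable way to handle it.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>     /* close, for callers of this helper */

/* Hypothetical helper: open path for reading only, reporting any
 * failure on standard error. Returns the new descriptor, or -1 with
 * errno left as open set it. */
int open_readonly(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        int saved = errno;      /* fprintf may change errno */

        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        errno = saved;
    }
    return fd;
}
```

Because open returns the lowest unused descriptor, closing the descriptor that open_readonly returns and immediately opening another file yields the same number again, provided nothing else was opened in between.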

Open a file for reading, writing, or both, by setting oflag to one of the following mutually exclusive constants.

O_RDONLY Open the file for reading only.
O_WRONLY Open the file for writing only.
O_RDWR Open the file for reading and writing. The result of applying this flag when opening a FIFO is undefined.

You may also specify other options by bitwise-ORing one or more of the following constants with oflag.

O_APPEND Sets the file offset to the end of the file prior to each write. This is useful if, for example, more than one process is updating the file, as the writes won't overwrite each other.
O_CREAT If the file exists, this flag has no effect, unless O_EXCL is also specified. Otherwise the file is created, with the file's owner set to the effective user ID of the process. If the directory in which the file is being created has the S_ISGID bit set, the group ID of the file is set to the directory's, otherwise the file's group ID is set to the process' effective GID. The access permission bits of the file are set by mode, as modified by the process' umask.
O_DSYNC Write I/O operations on the file descriptor do not complete until the data is transferred to the physical storage medium. Normally a call to write returns once the data has been copied to a buffer in the kernel. It has no idea whether or not the data actually got stored on the physical medium.
O_EXCL If this and the O_CREAT flags are set, the call to open will fail if the file to be opened exists. The check, and the subsequent creation of the file if it doesn't exist, are atomic with respect to other processes trying the same operation on the same file. The effect of setting O_EXCL but not O_CREAT is undefined.
O_LARGEFILE Setting this option will set the maximum offset for the file to the largest offset that can be stored in an off64_t. If this flag (see footnote 1) is not set, the maximum allowable offset is restricted to what fits in the 32 bit off_t.
O_NOCTTY If the file to open is a terminal device, setting this flag will prevent the terminal from becoming the controlling terminal for the process.
O_NONBLOCK or O_NDELAY

These flags affect future reads or writes to the file. If the file to be opened is a FIFO, then setting either O_NONBLOCK or O_NDELAY will cause a read-only open to return without delay; a write-only open will fail if no process currently has the FIFO open for reading. If both O_NONBLOCK and O_NDELAY are clear, then a read-only open will block until a process opens the FIFO for writing. Similarly, a write-only open of a FIFO will block until a process opens it for reading.

If the file to be opened is a block or character special file that supports non-blocking opens, then setting either O_NONBLOCK or O_NDELAY will cause open to return without waiting for the device to get ready; the subsequent behaviour is device specific. If both O_NONBLOCK and O_NDELAY are clear, then the open will block until the device is ready before returning.

O_RSYNC If this flag is set, reading the data will block until any pending writes which affect the data are complete. Consider the situation where we want to read a block of data, which another process is updating. If this flag is not set, it is indeterminate whether the data returned will be that which is on the disk, or that which is scheduled to be written.
O_SYNC The result of setting this flag is similar to setting O_DSYNC, except that the write blocks until the data to be transferred is written to the physical medium, and the on-disk file attributes are updated.
O_TRUNC If the file exists, and is successfully opened for writing, setting this flag will cause the file's length to be truncated to 0 bytes. Any data that the file contains is discarded.
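Two of the flag combinations above come up often enough to be worth sketching. The helper names and the 0644 mode are assumptions made for illustration; the first helper uses O_CREAT with O_EXCL to claim a lock file atomically, and the second uses O_APPEND so that several writers can share a log file.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try to claim path as a lock file.
 * Returns 1 if this call created it, 0 if it already existed,
 * and -1 on any other error. */
int claim_lock_file(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);

    if (fd == -1)
        return (errno == EEXIST) ? 0 : -1;
    close(fd);
    return 1;
}

/* Hypothetical helper: open path so that every write lands at the
 * current end of the file, creating the file if necessary.
 * Returns a descriptor, or -1 on error. */
int open_for_append(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
}
```

If two descriptors obtained from open_for_append refer to the same file, writes through either of them are added to the end rather than overwriting each other.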

The creat Function

The creat system call is another way of creating a file.

int creat (const char *path, mode_t mode);

The name of the file to create is pointed to by path. If the file is successfully created, its permission bits are set to mode, as modified by the process' umask, and creat returns a file descriptor that is open for writing only; if the file can't be created, creat returns -1, and errno is set. The creat system call (see footnote 2) is equivalent to

open (path, O_WRONLY | O_CREAT | O_TRUNC, mode);
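Because creat truncates an existing file, it is a convenient way to replace a file's contents. The following is a minimal sketch; the helper name overwrite_file and the 0644 mode are assumptions, and a short write is simply treated as a failure.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: replace the contents of path with the nbyte
 * bytes at buf, creating the file if necessary.
 * Returns 0 on success, -1 on error. */
int overwrite_file(const char *path, const void *buf, size_t nbyte)
{
    int fd = creat(path, 0644);     /* write-only, length now 0 */
    int ok;

    if (fd == -1)
        return -1;
    ok = write(fd, buf, nbyte) == (ssize_t)nbyte;
    if (close(fd) == -1)
        ok = 0;
    return ok ? 0 : -1;
}
```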

The close Function

The close function closes an open file.

int close (int fd);

fd should be a file descriptor that was previously returned by open, creat, dup (or dup2), or pipe. When a process closes a file, any locks that it may have on the file are released.

When a process exits, all of its files are automatically closed, a fact that many programmers take advantage of. Nevertheless, it is considered good programming practice to close a file when you are finished with it, because file descriptors are a finite resource.
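Like the other functions described here, close returns -1 and sets errno if it fails, for example when fd is not an open descriptor. A minimal sketch of checking that result follows; the helper name close_checked is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: close fd and report any failure on standard
 * error. Returns 0 on success, -1 on failure with errno preserved. */
int close_checked(int fd)
{
    if (close(fd) == -1) {
        int saved = errno;      /* fprintf may change errno */

        fprintf(stderr, "close(%d): %s\n", fd, strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

Closing the same descriptor twice makes the second call fail with errno set to EBADF.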

The lseek and llseek Functions

Every open file has an associated file offset, which determines where the next read or write operation will start. The file offset is set to 0 when a file has been opened, and is automatically increased after each successful read or write. Reads from a file descriptor start from the current file offset, as do writes, unless the O_APPEND flag was set when the file was opened (in which case the file offset gets set to the end of the file at the start of every write).

You can change the file offset for an open file by using either the lseek or llseek functions.

off_t lseek (int fildes, off_t offset, int whence);
offset_t llseek (int fildes, offset_t offset, int whence);

How the value of offset is interpreted depends on the value of the whence argument.

  • If whence is SEEK_SET, the file pointer is set to offset bytes from the start of the file.
  • If whence is set to SEEK_CUR, the file pointer is set to its current location plus offset.
  • If whence is SEEK_END, the file pointer is set to the end of the file plus offset.

The constants SEEK_SET, SEEK_CUR, and SEEK_END are defined in <unistd.h>, and have the values of 0, 1, and 2 respectively, for compatibility with older code (see footnote 3).

The llseek system call is the 64 bit API version of lseek, which uses the 64 bit offset_t rather than the 32 bit off_t.

You can use the following code to position the file pointer at the beginning of a file:

lseek (fildes, 0, SEEK_SET);

Similarly, you can use the following code to position the file pointer at the end of a file:

lseek (fildes, 0, SEEK_END);

Notice that off_t and offset_t are signed quantities. This means that negative offsets may be specified. Attempts to seek before the start of a file result in an error.
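The two ideas above, seeking to the end of a file and seeking with a negative offset, can be sketched as follows. Both helper names are hypothetical. When a negative offset would place the file pointer before the start of a regular file, lseek returns -1, sets errno to EINVAL, and leaves the file pointer where it was.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the size in bytes of the file at path,
 * found by seeking to its end. Returns -1 on error. */
off_t size_by_seeking(const char *path)
{
    int fd = open(path, O_RDONLY);
    off_t end;

    if (fd == -1)
        return -1;
    end = lseek(fd, 0, SEEK_END);   /* offset of the end == size */
    close(fd);
    return end;
}

/* Hypothetical helper: move the file pointer of fd back by n bytes
 * from its current position. Returns the new offset, or -1 if the
 * move would land before the start of the file. */
off_t seek_back(int fd, off_t n)
{
    return lseek(fd, -n, SEEK_CUR);
}
```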

It is possible to use lseek to seek beyond the end of a file. When you next write to the file, it gets extended, creating a hole in the file. Bytes in a hole read back as zeros. A file with holes in it is called a sparse file.
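A minimal sketch of creating such a file follows. The helper name make_sparse, the single byte 'x' written after the hole, and the 0644 mode are all assumptions chosen for illustration.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create path as a file consisting of a hole of
 * gap bytes followed by the single byte 'x'.
 * Returns 0 on success, -1 on error. */
int make_sparse(const char *path, off_t gap)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    int ok;

    if (fd == -1)
        return -1;
    /* Seek past the end of the (empty) file, then write one byte. */
    ok = lseek(fd, gap, SEEK_SET) != (off_t)-1 &&
         write(fd, "x", 1) == 1;
    if (close(fd) == -1)
        ok = 0;
    return ok ? 0 : -1;
}
```

After make_sparse ("demo", 4096) the file is 4097 bytes long, and reading any of the first 4096 bytes returns zero. How much disk space the hole actually occupies depends on the file system.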

The tell Function

The tell function is used to get the current file offset for a file descriptor.

off_t tell (int fd);

Notice that the return type is an off_t (which is a 32 bit quantity), rather than an offset_t (which is a 64 bit quantity). This means that you cannot safely use it for files that were opened with the O_LARGEFILE flag specified, because the file's offset may be too large to fit into an off_t.
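The tell function is not available on every UNIX system. A portable way to get the same information, shown here as a sketch and not as a description of how Solaris implements tell, is to ask lseek to move zero bytes from the current position. The helper name current_offset is hypothetical.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: report the current file offset of fd without
 * moving it. Returns (off_t)-1 on error, for example when fd refers
 * to a pipe, which cannot seek. */
off_t current_offset(int fd)
{
    return lseek(fd, 0, SEEK_CUR);
}
```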

The read and pread Functions

Use the read and pread functions to read data from an open file.

ssize_t read (int fildes, void *buf, size_t nbyte); 

ssize_t pread (int fildes, void *buf, size_t nbyte, off_t offset);

The read function reads up to nbyte bytes from the open file referred to by fildes into the buffer pointed to by buf. If the read is successful, the number of bytes read is returned, unless you are at the end of file, in which case 0 is returned. If the read fails, -1 is returned, and errno is set.

It's possible that read will read fewer bytes than you requested with nbyte. There are several reasons why this could happen.

  • You are reading from a regular file and encounter the end of file before reading the number of requested bytes. For example, if there are only 64 bytes remaining until the end of file and you request 128 bytes, read will return 64. The next time you try to read from the file, 0 will be returned (assuming that no other process has written to the file in the meantime).
  • Reading from a terminal usually happens one line at a time.

The read operation starts at the file's current offset, which gets incremented by the number of bytes read before a successful return.
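Because read may return fewer bytes than requested, code that needs a specific amount of data usually calls it in a loop. The following is a minimal sketch; the helper name read_fully is hypothetical, and retrying after EINTR (a read interrupted by a signal) is an addition that this article does not otherwise discuss.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read until nbyte bytes have
 * arrived, end of file is reached, or an error occurs.
 * Returns the number of bytes actually read (which is less than
 * nbyte only at end of file), or -1 on error. */
ssize_t read_fully(int fd, void *buf, size_t nbyte)
{
    char *p = buf;
    size_t total = 0;

    while (total < nbyte) {
        ssize_t n = read(fd, p + total, nbyte - total);

        if (n == 0)                 /* end of file */
            break;
        if (n == -1) {
            if (errno == EINTR)     /* interrupted by a signal: retry */
                continue;
            return -1;
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```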

The pread function is identical to read, except that pread read operations start at the specified offset, without changing the file pointer. Attempting to perform a pread on a file that is incapable of seeking will result in an error.
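One use for pread is fetching fixed-size records by number without disturbing a file pointer that other code relies on. The sketch below assumes a made-up layout of 16-byte records; RECORD_SIZE and read_record are hypothetical names.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fixed record size, chosen only for illustration. */
#define RECORD_SIZE 16

/* Hypothetical helper: copy record number recno from fd into buf,
 * which must hold RECORD_SIZE bytes. The file pointer of fd is not
 * moved. Returns 0 on success, -1 on error or if the whole record
 * could not be read. */
int read_record(int fd, long recno, char *buf)
{
    off_t where = (off_t)recno * RECORD_SIZE;

    return pread(fd, buf, RECORD_SIZE, where) == RECORD_SIZE ? 0 : -1;
}
```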

The write and pwrite Functions

Use the write and pwrite functions to write data to an open file.

ssize_t write (int fildes, const void *buf, size_t nbyte);
ssize_t pwrite (int fildes, const void *buf, size_t nbyte, off_t offset);

The write function writes up to nbyte bytes to the open file referred to by fildes from the buffer pointed to by buf. If the write is successful, the number of bytes written is returned; this can be less than nbyte. If the write fails, -1 is returned, and errno is set.
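Since write may accept only part of the buffer, a caller that must deliver every byte typically loops until the whole buffer has been written. The following is a minimal sketch; the helper name write_fully is hypothetical, and retrying after EINTR (a write interrupted by a signal) is an addition that this article does not otherwise discuss.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write until all nbyte bytes at
 * buf have been written or an error occurs.
 * Returns nbyte on success, -1 on error. */
ssize_t write_fully(int fd, const void *buf, size_t nbyte)
{
    const char *p = buf;
    size_t total = 0;

    while (total < nbyte) {
        ssize_t n = write(fd, p + total, nbyte - total);

        if (n == -1) {
            if (errno == EINTR)     /* interrupted by a signal: retry */
                continue;
            return -1;
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```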

Normally a write operation starts at the file's current offset, which gets incremented by the number of bytes written before a successful return. However, if the O_APPEND flag was specified when the file was opened, the file pointer is set to the end of the file before the buffer is written. The moving of the file pointer and the writing of the data are performed atomically.

The pwrite function is identical to write, except pwrite write operations start at the given offset, without changing the file pointer. As with pread, attempting to perform a pwrite on a file that is incapable of seeking results in an error.
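As a sketch of pwrite in use, the helper below overwrites a few bytes in the middle of a file while leaving the file pointer alone, so that sequential writes elsewhere in the program carry on where they left off. The name patch_bytes is hypothetical, and a short write is simply treated as a failure.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite len bytes of fd, starting at offset
 * where, with the bytes at buf. The file pointer of fd is not moved.
 * Returns 0 on success, -1 on error or short write. */
int patch_bytes(int fd, off_t where, const void *buf, size_t len)
{
    return pwrite(fd, buf, len, where) == (ssize_t)len ? 0 : -1;
}
```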

Footnotes

  1. This flag was introduced in Solaris 2.6, as part of the large files feature set. In releases of Solaris prior to 2.6, files were limited to 2 GB.

  2. In early versions of UNIX, the oflag option to open only allowed the values 0, 1, and 2 (for read-only, write-only, and read-write respectively -- in Solaris, these are the values for O_RDONLY, O_WRONLY, and O_RDWR), hence the need for creat. Since support for O_CREAT and O_TRUNC was added to open, the use of creat has become less necessary.

    The decision to call this function creat instead of create was arbitrary. Ken Thompson, one of the original authors of UNIX, has been quoted as saying that if he could change anything in UNIX, he would call this function create rather than creat.

  3. The l in lseek's name stands for long integer. The lseek system call was added to UNIX in Version 7, the same time that the long data type was introduced to C. Similarly, the ll in llseek's name stands for long long integer.

About the Author

Rich Teer has more than 10 years of industry experience with UNIX systems and C programming. He runs his own Solaris consultancy and web hosting company, and is currently writing a book, Solaris Systems Programming, to be published by Addison-Wesley in 2002. Rich lives in Kelowna, BC, with his wife, Jenny, and their dog, Judge.
