 
White Paper

Endianness in the Solaris Operating Environment

 
 
Introduction
This paper focuses on the programming-related aspects of endianness. Its primary goal is to present mechanisms for producing endian-independent source code that can be compiled for either big-endian or little-endian environments from one source.
 

Readers unfamiliar with endianness as applied to programming (as opposed to eggs) will find some needed background in the next section. The "Overview" section outlines the remainder of this white paper.

 
What Is Endianness?
Nearly all computers today address their memory in units of 8-bit "bytes." They address larger storage units--typically, but not always, called halfwords (16 bits), words (32 bits), and doublewords (64 bits)--by giving the address of the byte at one end or the other of the storage unit. Unfortunately, some computers, the "little-endians," use the address of the numerically least significant byte, while others, the "big-endians," use the address of the numerically most significant byte for the address of the multibyte storage unit[1]. The Intel x86 family and Digital Equipment Corporation architectures (PDP-11, VAX, Alpha) are representative little-endians, while the Sun SPARC, IBM 360/370, and Motorola 68000 and 88000 architectures are big-endians. Still other architectures (PowerPC, MIPS, and Intel's IA-64[2]) are capable of operating in either big-endian or little-endian mode.

[1] The opening quotation was written by Swift in 1726, so the "endian" term predates the computer era by more than two centuries.

[2] IA-64 is Intel's 64-bit successor to the x86 (now called IA-32) architecture. "Merced" is the better known name of the first implementation of the IA-64 architecture.
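The two addressing conventions can be made concrete with a short sketch. It is illustrative only: the test value 0x0A0B0C0D and the helper name host_byte_order are assumptions introduced here, not part of the original paper. The sketch lays out one 32-bit value in both byte orders and then checks which order the host running it uses.

```python
import struct
import sys

VALUE = 0x0A0B0C0D  # hypothetical 32-bit test value; 0x0A is the most significant byte

# Byte sequences as they would sit in memory at increasing addresses.
little = VALUE.to_bytes(4, "little")
big = VALUE.to_bytes(4, "big")

print("little-endian layout:", little.hex(" "))  # 0d 0c 0b 0a
print("big-endian layout:   ", big.hex(" "))     # 0a 0b 0c 0d

# The byte at the word's own address (offset 0) differs between the conventions.
assert little[0] == 0x0D   # least significant byte comes first
assert big[0] == 0x0A      # most significant byte comes first


def host_byte_order() -> str:
    """Return "little" or "big" by inspecting how the host stores a 16-bit 1."""
    first_byte = struct.pack("=H", 1)[0]   # "=" selects the native byte order
    return "little" if first_byte == 1 else "big"


print("this host is", host_byte_order(), "endian; sys.byteorder says", sys.byteorder)
```

On an x86 host the last line should report "little"; on a SPARC host it should report "big".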

Consider a little-endian and a big-endian computer, each holding in every one of its first 16 bytes of memory the address of that byte. Table 1-1 illustrates this situation. Regarded as an array of bytes, memory looks the same on both computers.

 

Table 1-1. Memory as an Array of Bytes

Byte Addr   00  01  02  03  04  05  06  07  08  09  10  11  12  13  14  15
Contents    00  01  02  03  04  05  06  07  08  09  10  11  12  13  14  15

 
If, instead of considering the memory as an array of bytes, we consider the same memory contents as four 32-bit words, the two processors see different views, as shown in Table 1-2 and Table 1-3. In these tables, the bytes are grouped into four-byte words, which are shown in Arabic numeral form with the most significant byte ("digit") on the left. In Table 1-2 we put the word address column on the right ("little end") because the computer uses the address of the least significant byte, the byte on the right, to address the word. In Table 1-3, the address column is on the left ("big end"), showing that the computer addresses the most significant byte in word operations. As a result, the little-endian processor loading the 32-bit word at word address 0 would obtain a different value (03.02.01.00) than the big-endian processor (00.01.02.03). [3]

[3] The periods denote concatenation of the byte values, so that 00.01.02.03 represents:

$$00 \times 256^3 + 01 \times 256^2 + 02 \times 256^1 + 03 \times 256^0$$

 

Table 1-2. Little-Endian Memory as an Array of Words

Contents          Word Addr
03  02  01  00    00
07  06  05  04    04
11  10  09  08    08
15  14  13  12    12

 

Table 1-3. Big-Endian Memory as an Array of Words

Word Addr    Contents
00           00  01  02  03
04           04  05  06  07
08           08  09  10  11
12           12  13  14  15
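The word views in Table 1-2 and Table 1-3 can be reproduced with a small sketch that builds the 16-byte memory of Table 1-1 and loads each word under both conventions. The helper name load_word is an assumption made for illustration. The tables show byte values in decimal, while the sketch prints words in hexadecimal, so bytes 10 through 15 appear as 0a through 0f.

```python
# The 16 bytes of Table 1-1: each byte holds its own address.
memory = bytes(range(16))


def load_word(mem: bytes, addr: int, byteorder: str) -> int:
    """Load the 32-bit word at byte address addr using the given byte order."""
    return int.from_bytes(mem[addr:addr + 4], byteorder)


for addr in range(0, 16, 4):
    le = load_word(memory, addr, "little")
    be = load_word(memory, addr, "big")
    print(f"addr {addr:02d}: little-endian 0x{le:08x}   big-endian 0x{be:08x}")

# Word at address 0, matching 03.02.01.00 and 00.01.02.03 in the text.
assert load_word(memory, 0, "little") == 0x03020100
assert load_word(memory, 0, "big") == 0x00010203
```

Both loads read the same four bytes; only the significance assigned to each byte differs.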

 
Overview
There are two fundamental sources of endian-related problems:

  • Programmer assumptions about the endianness of the processor on which a program will run

  • Endianness collision: a program running on a machine of one endianness having to deal with structured data (that is, not simply an array of bytes) exported from an environment of the other endianness
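An endianness collision of the second kind can be sketched as follows. The record layout (a 16-bit type code followed by a 32-bit length) and the field values are hypothetical, chosen only to show what happens when structured data exported by a big-endian producer is read under the wrong assumption.

```python
import struct

# Hypothetical record: a 16-bit type code followed by a 32-bit length.
RECORD_BE = ">HI"   # big-endian, no padding
RECORD_LE = "<HI"   # little-endian, no padding

# A big-endian producer exports the record.
exported = struct.pack(RECORD_BE, 1, 512)

# A consumer that assumes little-endian data misreads both fields.
wrong_type, wrong_length = struct.unpack(RECORD_LE, exported)

# A consumer that honors the producer's byte order recovers them.
right_type, right_length = struct.unpack(RECORD_BE, exported)

print("raw bytes:    ", exported.hex(" "))
print("misread as:   ", wrong_type, wrong_length)
print("correctly as: ", right_type, right_length)
```

The bytes themselves are unchanged in transit; the damage comes entirely from interpreting them with the wrong byte order, which is why an array of single bytes is immune to the problem.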

 

Most programmers will never encounter endianness problems. Some will never even write programs that will have to run on machines of different endianness. This paper is intended to help those who may have to deal with endianness issues either in developing a new product or porting an existing product. It offers advice on how to avoid the problems where possible and how to recognize and overcome the more difficult ones. Although most of the material is not operating system specific, it highlights the solutions that the Solaris operating environment offers. It also points out areas in which hardware assistance may be brought to bear in dealing with endianness problems. The paper is organized into three main sections as follows:

 
  • Avoiding Endianness Assumptions

    The most common, and yet easiest to solve, endian-related problems arise from programmers being unaware of endianness issues. This section describes some typical cases and shows how to recognize them and to write endian-independent code. Both applications and systems programmers may find this section useful.

  • Application Problems

    This section deals with endianness problems encountered in applications programming.

  • Device Drivers and Endianness

    This section covers endianness issues in systems programming such as those facing device driver writers in mixed-endian environments.
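As a taste of what endian-independent code can look like, the sketch below shows one common idiom: building and decoding a multibyte value with shifts and masks rather than by reinterpreting memory. Shifts operate on values, not on storage layout, so the same source produces the same bytes on a host of either endianness. The function names put_u32_be and get_u32_be are assumptions made for illustration, and this is not necessarily the mechanism the later sections present.

```python
def put_u32_be(value: int) -> bytes:
    """Encode a 32-bit unsigned value most significant byte first, using shifts."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits")
    return bytes([
        (value >> 24) & 0xFF,
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    ])


def get_u32_be(data: bytes) -> int:
    """Decode four bytes, most significant byte first, using shifts."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 bytes")
    return (data[0] << 24) | (data[1] << 16) | (data[2] << 8) | data[3]


encoded = put_u32_be(0x00010203)
print(encoded.hex(" "))
print(hex(get_u32_be(encoded)))
```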

 
Finally, the last section in this white paper, "Operating System and Bi-Endian Architectures," discusses issues involved in choosing the endianness for an operating system on a bi-endian system, one with hardware that can support either endianness.
 
Notes appearing throughout this paper highlight the practical application of the techniques and mechanisms discussed.
 