Migrating to Solaris 64-bit: 32-bit Applications and Data Model

 
 
The 32-bit Application Programming Interfaces (APIs) supported in the 64-bit operating environment are the same as the APIs supported in the 32-bit operating environment. Thus, no changes are required for 32-bit applications between the 32-bit and 64-bit environments. However, recompiling as a 64-bit application can require cleanup.

This document discusses the following topics:

  • Basic Issues
  • The Data Model
  • Tips for Converting Applications

Basic Issues

Two basic issues arise for applications developers regarding converting 32-bit applications into 64-bit applications:
  • Data type consistency and the different data model.
  • Interoperation between applications using different data models.
Maintaining a single source with as few #ifdefs as possible is usually better than maintaining multiple source trees. It's best to write code that works correctly in both 32-bit and 64-bit environments. At best, the conversion of current code might require only a recompilation and relinking with the 64-bit libraries. For those cases where code changes are required, Solaris 7 includes tools that help make conversion easier.

The Data Model

The 64-bit environment uses a different data type model than the 32-bit environment. The C data-type model used for 32-bit applications is the ILP32 model, so named because ints, longs, and pointers are 32-bit. The LP64 data model is the C data-type model for 64-bit applications. This model was agreed upon by a consortium of companies across the industry. It is so named because longs and pointers grow to 64-bit quantities. The remaining C types int, short, and char are the same as in the ILP32 model. The standard relationship between C integral types still holds true.

sizeof (char) <= sizeof (short) <= sizeof (int) <= sizeof (long)

Table 1 lists the basic C types, and their corresponding sizes in bits for both the ILP32 and LP64 data type models.

Table 1    Data Type Sizes in Bits

C Data Type    ILP32    LP64
char           8        unchanged
short          16       unchanged
int            32       unchanged
long           32       64
long long      64       unchanged
pointer        32       64
enum           32       unchanged
float          32       unchanged
double         64       unchanged
long double    128      unchanged
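
The sizes in Table 1 can be checked on a given compiler with a few sizeof expressions. The following is a minimal sketch, not part of the Solaris tools; the function names data_model_name and print_type_sizes are hypothetical, and the model is judged only from the sizes of int, long, and pointers.

```c
#include <limits.h>
#include <stdio.h>

/* Names the data model in effect, judged only from type sizes. */
const char *
data_model_name(void)
{
    if (sizeof (int) == 4 && sizeof (long) == 8 && sizeof (void *) == 8)
        return ("LP64");
    if (sizeof (int) == 4 && sizeof (long) == 4 && sizeof (void *) == 4)
        return ("ILP32");
    return ("other");
}

/* Prints one row: a type name and its width in bits. */
static void
print_row(const char *name, size_t bytes)
{
    (void) printf("%-12s %3lu\n", name, (unsigned long)(bytes * CHAR_BIT));
}

/* Prints the width of each basic C type listed in Table 1. */
void
print_type_sizes(void)
{
    (void) printf("data model: %s\n", data_model_name());
    print_row("char", sizeof (char));
    print_row("short", sizeof (short));
    print_row("int", sizeof (int));
    print_row("long", sizeof (long));
    print_row("long long", sizeof (long long));
    print_row("pointer", sizeof (void *));
    print_row("float", sizeof (float));
    print_row("double", sizeof (double));
    print_row("long double", sizeof (long double));
}
```

Calling print_type_sizes() from a program built once in each compilation mode shows which rows differ between the two models.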
 

Tips for Converting Applications

The following topics describe useful tips and techniques for the 64-bit environment.

Data Type Differences

It is not unusual for current 32-bit applications to assume that ints, pointers, and longs are the same size. Because the sizes of longs and pointers change in the LP64 data model, this change alone can cause many 32-bit to 64-bit conversion problems.

In addition, declarations and casts become very important in showing what is intended. The evaluation of expressions can be affected when the types change. The effects of standard C conversion rules are influenced by the change in data-type sizes. To adequately show what is intended, you might need to declare the types of constants. Casts might also be needed in expressions to make certain that the expression is evaluated correctly. This is particularly true in the case of sign extension, where explicit casting might be essential to show intent.

Implementing Single Source Code

The following topics describe some of the resources available to application developers that help you write a piece of single-source code that supports 32-bit and 64-bit compilation.

Derived Types

Using the system derived types helps make code 32-bit and 64-bit safe, since the derived types themselves must be safe for both the ILP32 and LP64 data models. In general, using derived types to allow for change is good programming practice. Should the data model change in the future, or when porting to a different platform, only the system derived types need to change rather than the application.

The system include files <sys/types.h> and <inttypes.h> contain constants, macros, and derived types that are helpful in making applications 32-bit and 64-bit safe.

<sys/types.h>

An application source file that includes <sys/types.h> makes the definitions of _LP64 and _ILP32 available through inclusion of <sys/isa_defs.h>. This header also contains a number of basic derived types that should be used whenever appropriate. In particular, the following are of special interest:

Type         Purpose
clock_t      Represents the system times in clock ticks.
dev_t        Used for device numbers.
off_t        Used for file sizes and offsets.
ptrdiff_t    The signed integral type for the result of subtracting two pointers.
size_t       The size, in bytes, of objects in memory.
ssize_t      Used by functions that return a count of bytes or an error indication.
time_t       Used for time in seconds.
 

All of these types remain 32-bit quantities in the ILP32 compilation environment and grow to 64-bit quantities in the LP64 compilation environment.

<inttypes.h>

The include file <inttypes.h> was added to the Solaris 2.6 release to provide constants, macros, and derived types that help programmers make their code compatible with explicitly sized data items, independent of the compilation environment. It contains mechanisms for manipulating 8-bit, 16-bit, 32-bit, and 64-bit objects. The file is part of an ANSI C proposal and tracks the ISO/JTC1/SC22/WG14 C committee's working draft for the revision of the current ISO C standard, ISO/IEC 9899:1990 Programming language - C.

The basic features provided by <inttypes.h> are:

  • A set of fixed-width integer types
  • uintptr_t and other helpful types
  • Constant macros
  • Limits
  • Format string macros

Fixed-Width Integer Types

The fixed-width integer types provided by <inttypes.h> include both signed and unsigned integer types, such as int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t, uint32_t, and uint64_t. Derived types defined as the smallest integer types that can hold the specified number of bits, include int_least8_t,..., int_least64_t, uint_least8_t,..., uint_least64_t.

These fixed-width types should not be used indiscriminately. For example, int can continue to be used for such things as loop counters and file descriptors, and long can be used for array indexes. On the other hand, use fixed-width types for explicit binary representations of:

  • On-disk data
  • Over-the-wire data
  • Hardware registers
  • Binary interface specifications
  • Binary data structures
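
As an illustration of fixed-width types in an on-disk format, the sketch below encodes a record header byte by byte so that the stored layout is the same under ILP32 and LP64. The struct rec_header layout, the field names, and the big-endian byte order are all assumptions made for this example; they do not come from any Solaris interface.

```c
#include <inttypes.h>
#include <stddef.h>

/* Hypothetical on-disk record header; every field has an explicit width. */
struct rec_header {
    uint32_t magic;
    uint16_t version;
    uint16_t flags;
    uint64_t length;
};

#define REC_HEADER_BYTES 16   /* 4 + 2 + 2 + 8, with no padding on disk */

/* Stores the low nbytes bytes of value at out, most significant first. */
static void
put_be(uint8_t *out, uint64_t value, size_t nbytes)
{
    size_t i;

    for (i = 0; i < nbytes; i++)
        out[i] = (uint8_t)(value >> (8 * (nbytes - 1 - i)));
}

/* Reads nbytes bytes at in as a big-endian unsigned integer. */
static uint64_t
get_be(const uint8_t *in, size_t nbytes)
{
    uint64_t value = 0;
    size_t i;

    for (i = 0; i < nbytes; i++)
        value = (value << 8) | in[i];
    return (value);
}

/* Writes h into out, which must hold REC_HEADER_BYTES bytes. */
void
rec_header_encode(const struct rec_header *h, uint8_t *out)
{
    put_be(out, h->magic, 4);
    put_be(out + 4, h->version, 2);
    put_be(out + 6, h->flags, 2);
    put_be(out + 8, h->length, 8);
}

/* Fills h from REC_HEADER_BYTES bytes at in. */
void
rec_header_decode(const uint8_t *in, struct rec_header *h)
{
    h->magic = (uint32_t)get_be(in, 4);
    h->version = (uint16_t)get_be(in + 4, 2);
    h->flags = (uint16_t)get_be(in + 6, 2);
    h->length = get_be(in + 8, 8);
}
```

Encoding field by field, rather than writing the struct directly, also keeps compiler padding and host byte order out of the stored format.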

uintptr_t and Other Helpful Types

Other useful types provided by <inttypes.h> include signed and unsigned integer types large enough to hold a pointer. These are given as intptr_t and uintptr_t. In addition, intmax_t and uintmax_t are defined to be the longest (in bits) signed and unsigned integer types available.

Using the uintptr_t type as the integral type for pointers is a better option than using a fundamental type such as unsigned long. Even though an unsigned long is the same size as a pointer in both the ILP32 and LP64 data models, the use of the uintptr_t requires only the definition of uintptr_t to change when a different data model is used. This makes it portable to many other systems. It is also a clearer way to express your intentions in C.

The intptr_t and uintptr_t types are extremely useful for casting pointers when performing address arithmetic. They should be used instead of long or unsigned long for this purpose.
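
A common case of address arithmetic is rounding a pointer up to an alignment boundary. The sketch below does this through uintptr_t; the helper names align_up and is_aligned are hypothetical, and both assume that align is a power of two.

```c
#include <inttypes.h>
#include <stddef.h>

/* Rounds p up to the next multiple of align (align must be a power of two). */
void *
align_up(void *p, size_t align)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t mask = (uintptr_t)align - 1;

    return ((void *)((addr + mask) & ~mask));
}

/* Returns nonzero if p is a multiple of align (a power of two). */
int
is_aligned(const void *p, size_t align)
{
    return (((uintptr_t)p & ((uintptr_t)align - 1)) == 0);
}
```

Because the mask is built as a uintptr_t, it is as wide as the pointer in either data model, so no address bits are lost.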

Limits

The limits defined by <inttypes.h> are constants specifying the minimum and maximum values of various integer types. This includes minimum and maximum values of each of the fixed-width types, such as INT8_MIN,..., INT64_MIN, INT8_MAX,..., INT64_MAX, and their unsigned counterparts.

The minimum and maximum for each of the least-sized types are given, too. These include INT_LEAST8_MIN,..., INT_LEAST64_MIN, INT_LEAST8_MAX,..., INT_LEAST64_MAX, and their unsigned counterparts.

Finally, the minimum and maximum value of the largest supported integer types are defined. These include INTMAX_MIN and INTMAX_MAX and their corresponding unsigned versions.
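
One use for these limits is to check a value before narrowing it. The following sketch assumes a hypothetical helper named narrow_to_int32 that refuses to truncate a 64-bit value that does not fit in 32 bits.

```c
#include <inttypes.h>

/*
 * Stores value in *out and returns 1 if it fits in an int32_t.
 * Otherwise returns 0 and leaves *out unchanged.
 */
int
narrow_to_int32(int64_t value, int32_t *out)
{
    if (value < INT32_MIN || value > INT32_MAX)
        return (0);
    *out = (int32_t)value;
    return (1);
}
```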

Format String Macros

Macros for specifying the printf(3S) and scanf(3S) format specifiers are also provided in <inttypes.h>. Essentially, these macros prepend the format specifier with an l or ll to specify the argument as a long or long long, given that the number of bits in the argument is built into the name of the macro.

Macros for printf(3S) format specifiers exist for printing 8-bit, 16-bit, 32-bit, and 64-bit integers, the smallest integer types, and the largest integer types, in decimal, octal, unsigned, and hexadecimal. See the examples that follow:

Example 1:
int64_t i;
printf("i =%" PRIx64 "\n", i);

Similarly, there are macros for scanf(3S) format specifiers for reading 8-bit, 16-bit, 32-bit, and 64-bit integers and the largest integer type in decimal, octal, unsigned, and hexadecimal.

Example 2:
uint64_t u;
scanf("%" SCNu64 "\n", &u);
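
The same macros work with the string variants of these functions. The sketch below formats and parses a 64-bit value without naming long or long long in the format string. The helper names are hypothetical, and snprintf(3S) is assumed to be available; it is a later addition to the C library than sprintf(3S).

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Formats value as "0x" followed by lowercase hex digits.
 * Returns the number of characters needed, excluding the terminator.
 */
int
format_u64_hex(char *buf, size_t buflen, uint64_t value)
{
    return (snprintf(buf, buflen, "0x%" PRIx64, value));
}

/* Parses an unsigned decimal number; returns 1 on success, 0 otherwise. */
int
parse_u64(const char *text, uint64_t *out)
{
    return (sscanf(text, "%" SCNu64, out) == 1);
}
```

The format strings are built by string literal concatenation, so the length modifier that matches uint64_t is chosen by the header for each compilation mode.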
 

The lint(1B) Tool

The lint(1B) program has been enhanced to detect potential 64-bit problems. This tool is useful in making code 64-bit safe. In addition, the -v option to the Sun C Compiler can be very helpful. It helps the compiler perform additional and stricter semantic checks. It also enables certain lint-like checks on the named files.

When you clean up code to be 64-bit safe, use header files that have the correct definition of the derived types and data structures for the 64-bit environment. In other words, use the header files present in the Solaris 7 operating environment.

When using lint(1B), remember that not all problems result in lint(1B) warnings, nor do all lint(1B) warnings indicate that a change is required. The examples that follow describe some of the more common challenges you are likely to encounter when converting code. Where appropriate, the corresponding lint(1B) warnings are shown after the example.

Pointer Sizes

Do not assume that an int and a pointer are the same size. Because ints and pointers are the same size in the ILP32 environment, a lot of code relies on this assumption. Pointers are often cast to int or unsigned int for address arithmetic. Instead, pointers could be cast to long because long and pointers are the same size in both ILP32 and LP64 worlds. Rather than explicitly using unsigned long, use uintptr_t instead because it expresses the intent more closely and makes the code more portable, insulating it against future changes.

Example 3:
char *p;
p = (char *) ((int)p & PAGEOFFSET);

%
warning: conversion of pointer loses bits

Suggested Use:
char *p;
p = (char *) ((uintptr_t)p & PAGEOFFSET);

int and long Data Type Sizes

Because ints and longs were never really distinguished in ILP32, a lot of existing code uses them indiscriminately while implicitly or explicitly assuming that they are interchangeable. Any code that makes this assumption must be modified to work for both ILP32 and LP64. While an int and a long are both 32 bits in the ILP32 data model, in the LP64 data model, a long is 64 bits.

Example 4:
int waiting;
long w_io;
long w_swap;
...
waiting = w_io + w_swap;

%
warning: assignment of 64-bit integer to 32-bit integer
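
Example 4 can be repaired in more than one way. The sketch below shows two possibilities under hypothetical function names: keep the result as a long, or, where an int is required, narrow it only after an explicit range check. Neither function guards against overflow of the long addition itself.

```c
#include <limits.h>

/* Keeps the sum as a long, so nothing is truncated in LP64. */
long
total_waiting(long w_io, long w_swap)
{
    return (w_io + w_swap);
}

/* Returns the sum as an int, clamped to the range of int. */
int
total_waiting_as_int(long w_io, long w_swap)
{
    long sum = w_io + w_swap;

    if (sum > INT_MAX)
        return (INT_MAX);
    if (sum < INT_MIN)
        return (INT_MIN);
    return ((int)sum);   /* the cast states that narrowing is intended */
}
```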

Sign Extension

Sign extension is a common problem when converting to 64 bits. It is hard to detect before the problem actually occurs because lint(1B) does not warn you about it. Furthermore, the type conversion and promotion rules are somewhat obscure. To fix sign extension problems, you must use explicit casting to achieve the intended results. When compiled as a 64-bit program, the addr variable in the following example becomes sign-extended, even though both addr and a.base are unsigned types. The reason is that the 19-bit field a.base is promoted to a signed int before the shift, the shift sets the sign bit of that int, and the negative int is then sign-extended when it is converted to the 64-bit unsigned long.

Example 5:
% cat test.c
#include <stdio.h>

struct foo {
  unsigned int base:19, rehash:13;
};

int
main(int argc, char *argv[])
{
  struct foo a;
  unsigned long addr;

  a.base = 0x40000;
  addr = a.base << 13;  /* Sign extension here! */
  (void) printf("addr 0x%lx\n", addr);

  addr = (unsigned int)(a.base << 13); /* No sign extension here! */
  (void) printf("addr 0x%lx\n", addr);
  return (0);
}
 

Pointer Arithmetic

In general, using pointer arithmetic is preferable to using address arithmetic because pointer arithmetic is independent of the data model, whereas address arithmetic may not be. Pointer arithmetic usually leads to simpler code.

Example 6:
int *end;
int *p;
p = malloc(4 * NUM_ELEMENTS);
end = (int *)((unsigned int)p + 4 * NUM_ELEMENTS);

%
warning: conversion of pointer loses bits

Suggested Use:
int *end;
int *p;
p = malloc(sizeof (*p) * NUM_ELEMENTS);
end = p + NUM_ELEMENTS;

Repack Structures

Internal data structures in applications should be checked for holes. The compiler can insert extra padding between fields to meet alignment requirements, and that padding can increase because any long or pointer fields grow to 64 bits in LP64. In the 64-bit environment on SPARC platforms, all types of structures are aligned to at least the size of the largest quantity within them. A simple rule for repacking a structure is to move the long and pointer fields to the beginning of the structure.

Example 7:
struct bar {
  int i;
  long j;
  int k;
  char *p;
};   /* sizeof (struct bar) = 32 */

Suggested Use:
struct bar {
  char *p;
  long j;
  int i;
  int k;
};   /* sizeof (struct bar) = 24 */
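
The holes can be measured rather than counted by hand. The sketch below compares the two layouts from Example 7 by subtracting the member sizes from sizeof, and prints member offsets with offsetof. The struct and function names are hypothetical; the sizes in the Example 7 comments apply to LP64.

```c
#include <stddef.h>
#include <stdio.h>

struct bar_original { int i; long j; int k; char *p; };
struct bar_repacked { char *p; long j; int i; int k; };

/* Total size of the four members, ignoring any padding. */
#define BAR_MEMBER_BYTES (2 * sizeof (int) + sizeof (long) + sizeof (char *))

/* Bytes of padding the compiler added to the original layout. */
size_t
bar_original_padding(void)
{
    return (sizeof (struct bar_original) - BAR_MEMBER_BYTES);
}

/* Bytes of padding the compiler added to the repacked layout. */
size_t
bar_repacked_padding(void)
{
    return (sizeof (struct bar_repacked) - BAR_MEMBER_BYTES);
}

/* Prints where each member of the original layout starts. */
void
print_bar_original_layout(void)
{
    (void) printf("i at %lu, j at %lu, k at %lu, p at %lu, size %lu\n",
        (unsigned long)offsetof(struct bar_original, i),
        (unsigned long)offsetof(struct bar_original, j),
        (unsigned long)offsetof(struct bar_original, k),
        (unsigned long)offsetof(struct bar_original, p),
        (unsigned long)sizeof (struct bar_original));
}
```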

Unions

Be sure to check unions because their fields might have changed sizes between ILP32 and LP64.

Example 8:
typedef union {
       double   _d;
       long _l[2];
} llx_t;

Suggested Use:
typedef  union {
   double _d;
   int _l[2];
} llx_t;
 

Constants

A loss of data can occur in some constant expressions because of lack of precision. These types of problems are very hard to find. Be explicit about specifying the type(s) in your constant expressions. Add some combination of {u,U,l,L} to the end of each integer constant to specify its type. You might also use casts to specify the type of a constant expression.

Example 9:
int i = 32;
long j = 1 << i;  /* j will get 0 because RHS is integer */
                     /* expression */

Suggested Use:
int i = 32;
long j = 1L << i;

Implicit Declaration

The C compiler assumes a type int for any function or variable that is used in a module and not defined or declared externally. Any longs and pointers used in this way are truncated by the compiler's implicit int declaration. The appropriate extern declaration for the function or variable should be placed in a header and not in the C module. This header should then be included by any C module that uses the function or variable. If this is a function or variable defined by the system headers, the proper header should still be included in the code.

Example 10:
int
main(int argc, char *argv[])
{
  char *name = getlogin();
  printf("login = %s\n", name);
  return (0);
}
 
%
warning: improper pointer/integer combination: op "="
warning: cast to pointer from 32-bit integer
implicitly declared to return int
getlogin        printf

Suggested Use:
#include <unistd.h>
#include <stdio.h>
 
int
main(int argc, char *argv[])
{
  char *name = getlogin();
  (void) printf("login = %s\n", name);
  return (0);
}

sizeof( ) is an unsigned long

In LP64, sizeof() has the effective type of an unsigned long. Occasionally sizeof() is passed to a function expecting an argument of type int, or assigned or cast to an int. In some cases, this truncation might cause loss of data.

Example 11:
long a[50];
unsigned char size = sizeof (a);

%
warning: 64-bit constant truncated to 8 bits by assignment
warning: initializer does not fit or is out of range: 0x190
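
A sketch of one way to handle Example 11: keep the result of sizeof in a size_t, and check the range explicitly at any point where an int is required. The function names are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Holds the size of the array in a size_t, which matches sizeof's type. */
size_t
long_array_bytes(void)
{
    long a[50];
    size_t size = sizeof (a);

    return (size);
}

/* Returns nonzero if n can be passed to an interface that takes an int. */
int
size_fits_int(size_t n)
{
    return (n <= (size_t)INT_MAX);
}
```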

Use of Casts

Relational expressions can be tricky because of conversion rules. You should be very explicit about how you want the expression to be evaluated by adding casts wherever necessary.
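
As a sketch of why this matters, the first comparison below gives different answers in the two data models. In ILP32 both operands are converted to unsigned long, so -1 becomes a very large value. In LP64 the unsigned int is converted to long, so -1 stays negative. The function names are hypothetical, and some compilers warn about the first function, which is the point of the example.

```c
/* Result depends on the data model: 0 for (-1, 1) in ILP32, 1 in LP64. */
int
less_implicit(long l, unsigned int u)
{
    return (l < u);
}

/* Same answer in both models: a negative l is less than any u. */
int
less_explicit(long l, unsigned int u)
{
    return (l < 0 || (unsigned long)l < (unsigned long)u);
}
```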

Format String Conversion

The format strings for printf(3S), sprintf(3S), scanf(3S), and sscanf(3S) might need to be changed for long or pointer arguments. For pointer arguments, the conversion operation given in the format string should be %p to work in both the 32-bit and 64-bit environments.

Example 12:
char *buf;
struct dev_info *devi;
...
(void) sprintf(buf, "di%x", (void *)devi);

%
warning: function argument (number) type inconsistent with format
  sprintf (arg 3)     void *: (format) int

Note:  For long arguments, the long size specification, l, should be prepended to the conversion operation character in the format string. Furthermore, check to be sure that the storage pointed to by buf is large enough to contain 16 digits.
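
A possible repair for Example 12, sketched with a hypothetical helper name. It uses %p for the pointer and swaps sprintf(3S) for snprintf(3S) so that the buffer size is checked; snprintf(3S) is assumed to be available. The text that %p produces is implementation-defined.

```c
#include <stdio.h>

/*
 * Formats "di" followed by the pointer value into buf.
 * Returns the number of characters needed, excluding the terminator.
 */
int
format_devi_name(char *buf, size_t buflen, const void *devi)
{
    /* %p expects a void * argument in both ILP32 and LP64. */
    return (snprintf(buf, buflen, "di%p", (void *)devi));
}
```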

Example 13:
size_t nbytes;
u_long align, addr, raddr, alloc;
printf("kalloca:%d%%%d from heap got%x.%x returns%x\n",
nbytes, align, (int)raddr, (int)(raddr + alloc), (int)addr);

%
warning: cast of 64-bit integer to 32-bit integer
warning: cast of 64-bit integer to 32-bit integer
warning: cast of 64-bit integer to 32-bit integer

Suggested Use:
size_t nbytes;
u_long align, addr, raddr, alloc;
printf("kalloca:%lu%%%lu from heap got%lx.%lx returns%lx\n",
nbytes, align, raddr, raddr + alloc, addr);
 

Derived Types That Have Grown in Size

A number of derived types have changed to now represent 64-bit quantities in the 64-bit application environment. This change does not affect 32-bit applications. However, any 64-bit applications that consume or export data described by these types need to be re-evaluated for correctness. An example of this is in applications that directly manipulate the utmp(4) or utmpx(4) files. For correct operation in the 64-bit application environment, you should not attempt to directly access these files. Instead, you should use the getutxent(3C) and the related family of functions.
 

#ifdef for Explicit 32-bit Versus 64-bit Prototypes

In some cases, specific 32-bit and 64-bit versions of an interface are unavoidable. In the headers, these would be distinguishable by the use of the _LP64 or _ILP32 feature test macros. Similarly, code that is to work in 32-bit and 64-bit environments might also need to utilize the appropriate #ifdefs, depending on the compilation mode.
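
A minimal sketch of selecting code by compilation mode. It assumes that _LP64 is defined for 64-bit compilation, either through <sys/types.h> as described earlier or by the compiler itself; the function name is hypothetical.

```c
#include <sys/types.h>

/* Returns the width of long, in bits, as selected by the _LP64 macro. */
int
long_bits_by_macro(void)
{
#ifdef _LP64
    return (64);   /* LP64: longs and pointers are 64-bit */
#else
    return (32);   /* assumed to be ILP32 when _LP64 is not defined */
#endif
}
```

Keeping such blocks small and confined to a few functions or headers follows the earlier advice to maintain a single source with as few #ifdefs as possible.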

Calling Convention Sizes

When passing structures by value and compiling the code for SPARC V9, if the structure is small enough, it is passed in registers rather than as a pointer to a copy. This can cause problems when passing structures between C code and hand-written assembly code.

Algorithmic Changes

After code has been made 64-bit safe, review it again to verify that the algorithms and data structures still make sense. The data types are larger, so data structures might use more space. The performance of your code might change as well. Given these concerns, you might need to modify your code accordingly.