The 32-bit Application Programming Interfaces (APIs) supported in the 64-bit
operating environment are the same as the APIs supported in the 32-bit
operating environment. Thus, no changes are required for 32-bit applications
between the 32-bit and 64-bit environments. However, recompiling as a 64-bit
application can require cleanup.
This document discusses the following topics:
Basic Issues
Two basic issues arise for applications developers regarding converting
32-bit applications into 64-bit applications:
- Data type consistency and the different data model.
- Interoperation between applications using different data models.
Maintaining a single source with as few #ifdefs
as possible is usually better than maintaining multiple source trees. It's
best to write code that works correctly in both 32-bit and 64-bit environments.
At best, the conversion of current code might require only a recompilation
and relinking with the 64-bit libraries. For those cases where code changes
are required, Solaris 7 includes tools that help make conversion easier.
The Data Model
The 64-bit environment uses a different data type model than the 32-bit
environment. The C data-type model used for 32-bit applications is the
ILP32 model, so named because ints,
longs, and pointers are 32-bit.
The LP64 data model is the C data-type model for 64-bit applications. This
model was agreed upon by a consortium of companies across the industry.
It is so named because longs and pointers grow to 64-bit quantities. The
remaining C types int, short,
and char are the same as in the
ILP32 model. The standard relationship between C integral types still holds
true.
sizeof (char) <= sizeof (short) <= sizeof (int) <= sizeof (long)
Table 1 lists the basic C types, and their corresponding sizes in bits
for both the ILP32 and LP64 data type models.
Table 1 Data Type Sizes in Bits
| C Data Type | ILP32 | LP64 |
| char | 8 | unchanged |
| short | 16 | unchanged |
| int | 32 | unchanged |
| long | 32 | 64 |
| long long | 64 | unchanged |
| pointer | 32 | 64 |
| enum | 32 | unchanged |
| float | 32 | unchanged |
| double | 64 | unchanged |
| long double | 128 | unchanged |
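The sizes in Table 1 can be checked on any given compiler with a few lines of C. The following is a minimal sketch; the macro BITS and the function names are hypothetical and are not part of any system header.

```c
#include <limits.h>
#include <stdio.h>

/* Width of a type in bits. BITS is a hypothetical helper macro. */
#define BITS(type) ((int)(sizeof (type) * CHAR_BIT))

/* Returns 1 if the integral-type size ordering holds, 0 otherwise. */
int
size_ordering_holds(void)
{
        return (sizeof (char) <= sizeof (short) &&
            sizeof (short) <= sizeof (int) &&
            sizeof (int) <= sizeof (long));
}

/* Prints the widths of int, long, and pointer for the current model. */
void
print_data_model(void)
{
        (void) printf("int     %d\n", BITS(int));
        (void) printf("long    %d\n", BITS(long));
        (void) printf("pointer %d\n", BITS(void *));
}
```

Compiling the same file once in each mode and calling print_data_model() shows which rows of the table change.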
Tips for Converting Applications
The following topics describe useful tips and techniques for the 64-bit
environment.
Data Type Differences
It is not unusual for current 32-bit applications to assume that ints,
pointers, and longs
are the same size. Because the sizes of longs and pointers change
in the LP64 data model, you need to be aware that this change alone can
cause many 32-bit to 64-bit conversion problems.
In addition, declarations and casts become very important in showing
what is intended. The evaluation of expressions can be affected when the
types change. The effects of standard C conversion rules are influenced
by the change in data-type sizes. To adequately show what is intended,
you might need to declare the types of constants. Casts might also be needed
in expressions to make certain that the expression is evaluated correctly.
This is particularly true in the case of sign extension, where explicit
casting might be essential to show intent.
Implementing Single Source Code
The following topics describe some of the resources available to application
developers that help you write a piece of single-source code that supports
32-bit and 64-bit compilation.
Derived Types
Using the system derived types helps make code 32-bit and 64-bit safe,
since the derived types themselves must be safe for both the ILP32 and
LP64 data models. In general, using derived types to allow for change is
good programming practice. Should the data model change in the future,
or when porting to a different platform, only the system derived types
need to change rather than the application.
The system include files <sys/types.h>
and <inttypes.h> contain
constants, macros, and derived types that are helpful in making applications
32-bit and 64-bit safe.
<sys/types.h>
An application source file that includes <sys/types.h>
makes the definitions of _LP64 and _ILP32 available through inclusion of
<sys/isa_defs.h>. This header
also contains a number of basic derived types that should be used whenever
appropriate. In particular, the following are of special interest:
| Type | Purpose |
| clock_t | Represents the system times in clock ticks. |
| dev_t | Used for device numbers. |
| off_t | Used for file sizes and offsets. |
| ptrdiff_t | The signed integral type for the result of subtracting two pointers. |
| size_t | The size, in bytes, of objects in memory. |
| ssize_t | Used by functions that return a count of bytes or an error indication. |
| time_t | Used for time in seconds. |
All of these types remain 32-bit quantities in the ILP32 compilation
environment and grow to 64-bit quantities in the LP64 compilation environment.
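As an illustration of how these derived types keep code independent of the data model, here is a short sketch. The two helper functions are hypothetical; only the types and strchr(3C) and strlen(3C) come from the system.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns the offset of the first occurrence of c
 * in the string s, or -1 if c is not present.
 */
ssize_t
offset_of_char(const char *s, int c)
{
        const char *hit = strchr(s, c);
        ptrdiff_t delta;

        if (hit == NULL)
                return (-1);
        delta = hit - s;        /* a pointer difference is a ptrdiff_t */
        return ((ssize_t)delta);
}

/* Hypothetical helper: total length of two strings, kept in a size_t. */
size_t
total_length(const char *a, const char *b)
{
        return (strlen(a) + strlen(b));
}
```

Neither function mentions int or long, so the same source compiles unchanged in both environments.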
<inttypes.h>
The include file <inttypes.h>
was added to the Solaris 2.6 release to provide constants, macros, and
derived types that help programmers make their code compatible with explicitly
sized data items, independent of the compilation environment. It contains
mechanisms for manipulating 8-bit, 16-bit, 32-bit, and 64-bit objects.
The file is part of an ANSI C proposal and tracks the ISO/JTC1/SC22/WG14
C committee's working draft for the revision of the current ISO
C standard, ISO/IEC 9899:1990 Programming language - C.
The basic features provided by <inttypes.h>
are:
- A set of fixed-width integer types
- uintptr_t and other helpful types
- Constant macros
- Limits
- Format string macros
Fixed-Width Integer Types
The fixed-width integer types provided by <inttypes.h>
include both signed and unsigned integer types, such as int8_t,
int16_t, int32_t, int64_t, uint8_t, uint16_t, uint32_t, and uint64_t.
Derived types defined as the smallest integer types that can hold the specified
number of bits include int_least8_t,...,
int_least64_t, uint_least8_t,..., uint_least64_t.
These fixed-width types should not be used indiscriminately. For example,
int can continue to be used for
such things as loop counters and file descriptors, and long
can be used for array indexes. On the other hand, use fixed-width
types for explicit binary representations of:
- On-disk data
- Over-the-wire data
- Hardware registers
- Binary interface specifications
- Binary data structures
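For instance, an on-disk record header might be declared as in the following sketch. The structure, its field names, and the helper function are hypothetical and are shown only to illustrate the technique. Byte order is a separate concern that this sketch does not address.

```c
#include <inttypes.h>
#include <stddef.h>

/*
 * Hypothetical on-disk record header. Every field has an explicit width,
 * and the fields are ordered so that no padding is needed.
 */
struct disk_header {
        uint32_t magic;         /* format identifier */
        uint16_t version;       /* format version */
        uint16_t flags;         /* option bits */
        uint64_t length;        /* payload length in bytes */
};

/* Size of the header as seen by the compiler. */
size_t
disk_header_size(void)
{
        return (sizeof (struct disk_header));
}
```

Because no field is an int, a long, or a pointer, the layout does not depend on whether the code is compiled as ILP32 or LP64.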
uintptr_t and Other Helpful Types
Other useful types provided by <inttypes.h>
include signed and unsigned integer types large enough to hold a pointer.
These are given as intptr_t and
uintptr_t. In addition, intmax_t
and uintmax_t are defined to be
the longest (in bits) signed and unsigned integer types available.
Using the uintptr_t type as
the integral type for pointers is a better option than using a fundamental
type such as unsigned long. Even
though an unsigned long is the
same size as a pointer in both the ILP32 and LP64 data models, using
uintptr_t means that only
the definition of uintptr_t needs to change when a different data model is used.
This makes it portable to many other systems. It is also a clearer way
to express your intentions in C.
The intptr_t and uintptr_t
types are extremely useful for casting pointers when performing
address arithmetic. They should be used instead of long
or unsigned long for this
purpose.
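A common case of address arithmetic is rounding a pointer up to an alignment boundary. The following is a minimal sketch; align_up is a hypothetical helper and assumes that align is a power of two.

```c
#include <inttypes.h>

/*
 * Hypothetical helper: rounds p up to the next multiple of align,
 * where align must be a power of two.
 */
void *
align_up(void *p, uintptr_t align)
{
        uintptr_t addr = (uintptr_t)p;

        addr = (addr + align - 1) & ~(align - 1);
        return ((void *)addr);
}
```

Because the mask is computed in uintptr_t, no address bits are lost in either data model.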
Limits
The limits defined by <inttypes.h>
are constants specifying the minimum and maximum values of various integer
types. This includes minimum and maximum values of each of the fixed-width
types, such as INT8_MIN,..., INT64_MIN,
INT8_MAX,..., INT64_MAX, and their unsigned counterparts.
The minimum and maximum for each of the least-sized types are given,
too. These include INT_LEAST8_MIN,...,
INT_LEAST64_MIN, INT_LEAST8_MAX,..., INT_LEAST64_MAX, and their
unsigned counterparts.
Finally, the minimum and maximum value of the largest supported integer
types are defined. These include INTMAX_MIN
and INTMAX_MAX and their corresponding
unsigned versions.
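These limits are useful for range checks before narrowing a value. A minimal sketch follows; fits_in_int32 is a hypothetical helper name.

```c
#include <inttypes.h>

/* Hypothetical helper: returns 1 if v can be stored in an int32_t. */
int
fits_in_int32(int64_t v)
{
        return (v >= INT32_MIN && v <= INT32_MAX);
}
```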
Format String Macros
Macros for specifying the printf(3S) and
scanf(3S) format specifiers are
also provided in <inttypes.h>.
Essentially, these macros prepend the format specifier with an
l or ll to specify the argument
as a long or long
long, given that the number of bits in the argument is built into
the name of the macro.
Macros for printf(3S) format
specifiers exist for printing 8-bit, 16-bit, 32-bit, and 64-bit integers,
the smallest integer types, and the largest integer types, in decimal,
octal, unsigned, and hexadecimal. See the examples that follow:
Example 1:
int64_t i;
printf("i =%" PRIx64 "\n", i);
Similarly, there are macros for scanf(3S)
format specifiers for reading 8-bit, 16-bit, 32-bit, and 64-bit integers
and the largest integer type in decimal, octal, unsigned, and hexadecimal.
Example 2:
uint64_t u;
scanf("%" SCNu64 "\n", &u);
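The same macros work with the buffer-oriented functions. The sketch below formats and parses 64-bit values; the two function names are hypothetical, and snprintf is assumed to be available in the C library.

```c
#include <inttypes.h>
#include <stdio.h>

/* Formats v as hexadecimal into buf; returns the result of snprintf(). */
int
format_hex64(char *buf, size_t len, uint64_t v)
{
        return (snprintf(buf, len, "0x%" PRIx64, v));
}

/* Parses an unsigned decimal value from s; returns 1 on success. */
int
parse_u64(const char *s, uint64_t *out)
{
        return (sscanf(s, "%" SCNu64, out) == 1);
}
```

The format strings contain no l or ll length modifier, so they stay correct whichever fundamental type underlies uint64_t.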
The lint(1B) Tool
The lint(1B) program has been enhanced
to detect potential 64-bit problems. This tool is useful in making code
64-bit safe. In addition, the -v
option to the Sun C Compiler can be very helpful. It helps the compiler
perform additional and stricter semantic checks. It also enables certain
lint-like checks on the named files.
When you clean up code to be 64-bit safe, use header files that have
the correct definition of the derived types and data structures for the
64-bit environment. In other words, use the header files present in the
Solaris 7 operating environment.
When using lint(1B), remember that not all problems result in lint(1B)
warnings, nor do all lint(1B) warnings
indicate that a change is required. The examples that follow describe some
of the more common challenges you are likely to encounter when converting
code. Where appropriate, the corresponding lint(1B)
warnings follow the example code.
Pointer Sizes
Do not assume that an int and a
pointer are the same size. Because ints
and pointers are the same size in the ILP32 environment, a lot of
code relies on this assumption. Pointers are often cast to int
or unsigned int for address
arithmetic. Instead, pointers could be cast to long
because long and pointers
are the same size in both ILP32 and LP64 worlds. Rather than explicitly
using unsigned long, use uintptr_t
instead because it expresses the intent more closely and makes the
code more portable, insulating it against future changes.
Example 3:
char *p;
p = (char *) ((int)p & PAGEOFFSET);
%
warning: conversion of pointer loses bits
Suggested Use:
char *p;
p = (char *) ((uintptr_t)p & PAGEOFFSET);
int and
long Data Type Sizes
Because ints and longs
were never really distinguished in ILP32, a lot of existing code
uses them indiscriminately while implicitly or explicitly assuming that
they are interchangeable. Any code that makes this assumption must be modified
to work for both ILP32 and LP64. While an int
and a long are both 32 bits
in the ILP32 data model, in the LP64 data model, a long
is 64 bits.
Example 4:
int waiting;
long w_io;
long w_swap;
...
waiting = w_io + w_swap;
%
warning: assignment of 64-bit integer to 32-bit integer
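One possible way to address Example 4 is to keep the sum in a long, or to check the range explicitly when an int is really required. The sketch below assumes either approach is acceptable to the caller; both function names are hypothetical.

```c
#include <limits.h>

/*
 * Hypothetical fix for Example 4: keep the sum in a long so that
 * no bits are lost under LP64.
 */
long
total_waiting(long w_io, long w_swap)
{
        return (w_io + w_swap);
}

/*
 * If an int result is required, check the range first.
 * Returns 1 and stores the value on success, 0 if it does not fit.
 */
int
long_to_int(long v, int *out)
{
        if (v < INT_MIN || v > INT_MAX)
                return (0);
        *out = (int)v;
        return (1);
}
```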
Sign Extension
Sign extension is a common problem when converting to 64 bits. It is hard
to detect before the problem actually occurs because lint(1B)
does not warn you about it. Furthermore, the type conversion and promotion
rules are somewhat obscure. To fix sign extension problems, you must use
explicit casting to achieve the intended results. When compiled as a 64-bit
program, the addr variable in the
following example becomes sign-extended, even though both addr
and a.base are unsigned types. The 19-bit field a.base is promoted to int
before the shift, so the result is a signed int that is sign-extended when
it is converted to unsigned long.
Example 5:
% cat test.c
#include <stdio.h>

struct foo {
        unsigned int base:19, rehash:13;
};

int
main(int argc, char *argv[])
{
        struct foo a;
        unsigned long addr;

        a.base = 0x40000;
        addr = a.base << 13;            /* Sign extension here! */
        printf("addr 0x%lx\n", addr);

        addr = (unsigned int)(a.base << 13);    /* No sign extension here! */
        printf("addr 0x%lx\n", addr);
        return (0);
}
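The widening step itself can be isolated with the fixed-width types from <inttypes.h>. The following sketch is not taken from the example above; the two function names are hypothetical. It shows that converting a negative 32-bit value directly to a 64-bit unsigned type sign-extends, whereas converting through an unsigned 32-bit type first zero-extends.

```c
#include <inttypes.h>

/* Converts a 32-bit signed value directly to 64 bits: sign-extends. */
uint64_t
widen_signed(int32_t v)
{
        return ((uint64_t)v);
}

/* Converts through an unsigned 32-bit type first: zero-extends. */
uint64_t
widen_unsigned(int32_t v)
{
        return ((uint64_t)(uint32_t)v);
}
```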
Pointer Arithmetic
In general, using pointer arithmetic is preferable to using address arithmetic
because pointer arithmetic is independent of the data model, whereas address
arithmetic may not be. Pointer arithmetic usually leads to simpler code.
Example 6:
int *end;
int *p;
p = malloc(4 * NUM_ELEMENTS);
end = (int *)((unsigned int)p + 4 * NUM_ELEMENTS);
%
warning: conversion of pointer loses bits
Suggested Use:
int *end;
int *p;
p = malloc(sizeof (*p) * NUM_ELEMENTS);
end = p + NUM_ELEMENTS;
Repack Structures
Internal data structures in applications should be checked for holes. Extra
padding might be inserted between fields in the structure to meet alignment
requirements, because any long or
pointer fields grow to 64 bits for LP64. In the 64-bit environment
on SPARC platforms, all types of structures are aligned to at least the
size of the largest quantity within them. A simple rule for repacking the
structure is to move the long and pointer fields to the beginning of the
structure.
Example 7:
struct bar {
int i;
long j;
int k;
char *p;
}; /* sizeof (struct bar) = 32 */
Suggested Use:
struct bar {
char *p;
long j;
int i;
int k;
}; /* sizeof (struct bar) = 24 */
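The effect of the reordering can be measured rather than assumed. The sketch below declares both layouts from Example 7 under hypothetical names (bar_original and bar_repacked) so that they can coexist in one file.

```c
#include <stddef.h>

/* Layout from Example 7 before repacking. */
struct bar_original {
        int i;
        long j;
        int k;
        char *p;
};

/* Layout from Example 7 after repacking. */
struct bar_repacked {
        char *p;
        long j;
        int i;
        int k;
};

/* Bytes saved by the reordering; 0 when both layouts are the same size. */
size_t
bytes_saved(void)
{
        return (sizeof (struct bar_original) - sizeof (struct bar_repacked));
}
```

Under ILP32 all four fields are 32 bits wide, so no padding is needed in either layout. The saving appears only when long and pointers are 64 bits.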
Unions
Be sure to check unions because their fields might have changed sizes between
ILP32 and LP64.
Example 8:
typedef union {
double _d;
long _l[2];
} llx_t;
Suggested Use:
typedef union {
double _d;
int _l[2];
} llx_t;
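When the intent is two 32-bit halves of a double, a fixed-width type states that intent directly. The following variant is a sketch only; the names llx32_t and llx_members_match are hypothetical.

```c
#include <inttypes.h>

/* Variant of the union above that names the 32-bit width explicitly. */
typedef union {
        double          _d;
        int32_t         _l[2];
} llx32_t;

/* Returns 1 if both members cover exactly the same bytes. */
int
llx_members_match(void)
{
        return (sizeof (double) == sizeof (int32_t [2]) &&
            sizeof (llx32_t) == sizeof (double));
}
```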
Constants
A loss of data can occur in some constant expressions because of lack of
precision. These types of problems are very hard to find. Be explicit about
specifying the type(s) in your constant expressions. Add some combination
of {u,U,l,L} to the end of each
integer constant to specify its type. You might also use casts to specify
the type of a constant expression.
Example 9:
int i = 32;
long j = 1 << i; /* j will get 0 because RHS is integer expression */
Suggested Use:
int i = 32;
long j = 1L << i;
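The L suffix is sufficient only where long is 64 bits. If a 64-bit result is needed in both data models, the constant macros in <inttypes.h> are one option, as in this sketch. The helper name bit64 is hypothetical.

```c
#include <inttypes.h>

/*
 * Hypothetical helper: returns a 64-bit value with only bit n set,
 * for n in the range 0 to 63.
 */
uint64_t
bit64(unsigned int n)
{
        return (UINT64_C(1) << n);      /* the constant is 64 bits wide */
}
```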
Implicit Declaration
The C compiler assumes a type int for
any function or variable that is used in a module and not defined or declared
externally. Any longs and pointers
used in this way are truncated by the compiler's implicit int
declaration. The appropriate extern
declaration for the function or variable should be placed in a header
and not in the C module. This header should then be included by any C module
that uses the function or variable. If this is a function or variable defined
by the system headers, the proper header should still be included in the
code.
Example 10:
int
main(int argc, char *argv[])
{
        char *name = getlogin();
        printf("login = %s\n", name);
        return (0);
}
%
warning: improper pointer/integer combination: op "="
warning: cast to pointer from 32-bit integer
implicitly declared to return int
    getlogin        printf
Suggested Use:
#include <unistd.h>
#include <stdio.h>
int
main(int argc, char *argv[])
{
        char *name = getlogin();
        (void) printf("login = %s\n", name);
        return (0);
}
sizeof( ) is an unsigned long
In LP64, sizeof() has the effective
type of an unsigned long. Occasionally
sizeof() is passed to a function
expecting an argument of type int,
or assigned or cast to an int.
In some cases, this truncation might cause loss of data.
Example 11:
long a[50];
unsigned char size = sizeof (a);
%
warning: 64-bit constant truncated to 8 bits by assignment
warning: initializer does not fit or is out of range: 0x190
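A straightforward way to avoid the truncation is to store the result in a size_t. The sketch below uses a hypothetical array name, samples, in place of a.

```c
#include <stddef.h>

long samples[50];       /* hypothetical array standing in for a[] */

/* Keeps the result of sizeof in a size_t, which is wide enough for it. */
size_t
array_bytes(void)
{
        return (sizeof (samples));
}

/* Element count computed without assuming the size of long. */
size_t
array_elements(void)
{
        return (sizeof (samples) / sizeof (samples[0]));
}
```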
Use of Casts
Relational expressions can be tricky because of conversion rules. You should
be very explicit about how you want the expression to be evaluated by adding
casts wherever necessary.
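The following sketch shows how the result of a comparison can depend on operand widths. It uses fixed-width types so that both cases can be seen in one compilation, and it assumes that int is 32 bits wide. The 32-bit case behaves like comparing a long with an unsigned int under ILP32, and the 64-bit case behaves like the same comparison under LP64. All function names are hypothetical.

```c
#include <inttypes.h>

/* 32-bit signed against 32-bit unsigned: a is converted to unsigned. */
int
less_than_32(int32_t a, uint32_t u)
{
        return (a < u);
}

/* 64-bit signed against 32-bit unsigned: u is converted to signed. */
int
less_than_64(int64_t a, uint32_t u)
{
        return (a < u);
}

/* Explicit casts give the same answer whatever the operand widths. */
int
less_than_explicit(int32_t a, uint32_t u)
{
        return ((int64_t)a < (int64_t)u);
}
```

With a equal to -1 and u equal to 1, the first function reports that a is not less than u, while the other two report that it is.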
Format String Conversion
The format strings for printf(3S),
sprintf(3S), scanf(3S), and sscanf(3S)
might need to be changed for long or pointer arguments. For pointer arguments,
the conversion operation given in the format string should be %p
to work in both the 32-bit and 64-bit environments.
Example 12:
char *buf;
struct dev_info *devi;
...
(void) sprintf(buf, "di%x", (void *)devi);
%
warning: function argument (number) type inconsistent with format
sprintf (arg 3) void *: (format) int
Note: For long arguments,
the long size specification, l,
should be prepended to the conversion operation character in the format
string. Furthermore, check to be sure that the storage pointed to by buf
is large enough to contain 16 digits.
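A version of Example 12 that uses %p might look like the following sketch. The function format_devi is hypothetical, snprintf is assumed to be available so that the buffer size is passed explicitly, and struct dev_info is left opaque because only its address is printed.

```c
#include <stdio.h>

struct dev_info;        /* opaque here; only its address is printed */

/*
 * Formats the address of devi into buf using %p.
 * Returns the value returned by snprintf().
 */
int
format_devi(char *buf, size_t len, struct dev_info *devi)
{
        return (snprintf(buf, len, "di%p", (void *)devi));
}
```

The exact text produced by %p is implementation-defined, so code should not depend on its appearance.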
Example 13:
size_t nbytes;
u_long align, addr, raddr, alloc;
printf("kalloca:%d%%%d from heap got%x.%x returns%x\n",
    nbytes, align, (int)raddr, (int)(raddr + alloc), (int)addr);
%
warning: cast of 64-bit integer to 32-bit integer
warning: cast of 64-bit integer to 32-bit integer
warning: cast of 64-bit integer to 32-bit integer
Suggested Use:
size_t nbytes;
u_long align, addr, raddr, alloc;
printf("kalloca:%lu%%%lu from heap got%lx.%lx returns%lx\n",
    nbytes, align, raddr, raddr + alloc, addr);
Derived Types That Have Grown in Size
A number of derived types have changed to now represent 64-bit quantities
in the 64-bit application environment. This change does not affect 32-bit
applications. However, any 64-bit applications that consume or export data
described by these types need to be re-evaluated for correctness. An example
of this is in applications that directly manipulate the utmp(4)
or utmpx(4) files. For correct
operation in the 64-bit application environment, you should not attempt
to directly access these files. Instead, you should use the getutxent(3C)
and the related family of functions.
#ifdef for Explicit 32-bit Versus 64-bit Prototypes
In some cases, specific 32-bit and 64-bit versions of an interface are
unavoidable. In the headers, these would be distinguishable by the use
of the _LP64 or _ILP32 feature test macros. Similarly, code that is to
work in 32-bit and 64-bit environments might also need to utilize the appropriate
#ifdefs, depending on the compilation
mode.
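A minimal sketch of such conditional code follows. The function data_model_name is hypothetical; it relies only on the _LP64 and _ILP32 macros described earlier, and it falls back to a neutral answer when neither is defined.

```c
#include <sys/types.h>

/*
 * Hypothetical helper: reports the data model selected at compile time.
 */
const char *
data_model_name(void)
{
#if defined(_LP64)
        return ("LP64");
#elif defined(_ILP32)
        return ("ILP32");
#else
        return ("unknown");
#endif
}
```

Keeping such blocks small and confined to a few places is consistent with the goal of a single source with as few #ifdefs as possible.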
Calling Convention Sizes
When passing structures by value and compiling the code for SPARC V9, if
the structure is small enough, it is passed in registers rather than as
a pointer to a copy. This can cause problems when passing structures between
C code and hand-written assembly code.
Algorithmic Changes
After code has been made 64-bit safe, review it again to verify that the
algorithms and data structures still make sense. The data types are larger,
so data structures might use more space. The performance of your code might
change as well. Given these concerns, you might need to modify your code
accordingly.