By Cecilia Hu, December, 2004
Abstract: This document describes how to modify 32-bit device drivers that run on the Solaris Operating System (OS) to be compatible with the 64-bit Solaris 10 OS on x86 platforms.
1 Introduction
The capabilities of the Solaris platform continue to expand to meet customer needs. The Solaris 10 release is designed to fully support both 32-bit and 64-bit architectures. The Solaris OS supports machines based on both 32-bit and 64-bit SPARC processors as well as 32-bit and 64-bit x86 platforms.
The primary difference between the 32-bit and 64-bit development environments is that 32-bit applications are based on the ILP32 data model, while 64-bit applications are based on the LP64 model. The primary difference between applications for SPARC and x86-based systems, from the driver developer's point of view, is big-endian versus little-endian translation.
To write a common device driver for the Solaris OS, developers need to understand and consider these differences.
Note: This document addresses topics related to x86 platforms only. In this document, references to 64-bit operating systems refer to the Solaris OS on machines with AMD Opteron processors.
The Solaris OS runs in 64-bit mode on appropriate hardware, and provides a 64-bit kernel with a 64-bit address space for applications. The 64-bit kernel extends the capabilities of the 32-bit kernel by addressing more than 4 Gbyte of physical memory, by mapping up to 16 Tbyte of virtual address space for 64-bit application programs, and by allowing 32-bit and 64-bit applications to coexist on the same system.
This document discusses the differences between 32-bit and 64-bit data models, provides guidelines for cleaning 32-bit device drivers in preparation for the 64-bit Solaris OS kernel, and addresses driver-specific issues with the 64-bit Solaris OS kernel.
1.1 Audience and Organization
This information is intended for device driver developers who want to deliver 32-bit and 64-bit clean device drivers for the Solaris OS on x86 platforms. The article provides guidance on how to write code that is portable between the 32-bit environment and the 64-bit environment.
This document is organized as follows:
Section 2, "Basic Information," explains some basic problem areas that a developer writing device drivers for Solaris systems may encounter when writing 32-bit and 64-bit clean device drivers.
Section 3, "General Conversion Guidelines," explains in detail the steps that should be taken in writing a common device driver for Solaris systems. At the end of this section is a short checklist for conversion. These guidelines can help you provide clean code for a 64-bit driver for the Solaris OS on x86 platforms.
Section 4, "Advanced Issues and Guidelines," addresses some advanced issues for developers writing device drivers for the 64-bit Solaris OS on x86 platforms, with a focus on enhancing performance.
Section 5, "Porting Example," presents an example that illustrates the 32-bit to 64-bit conversion process.
Section 6, "Conclusion," summarizes the issues involved in writing 32-bit and 64-bit common device drivers for the Solaris OS on x86 platforms.
1.2 Terms
ILP32
C language data model where int, long, and pointer data types are 32 bits in size.
LP64
C language data model where the int data type is 32 bits wide, but long and pointer data types are 64 bits wide.
32-bit program
Program compiled to run in 32-bit mode. For example, programs compiled for IA32 and 32-bit SPARC platforms.
64-bit program
Program compiled to run in 64-bit mode. For example, programs compiled
for the AMD64 and 64-bit SPARC platforms. Programs that have been successfully converted to run in 64-bit mode are also referred to as being 64-bit clean or 64-bit safe.
32-bit and 64-bit common device driver
A device driver with portable code that can be built and run in either a 32-bit or 64-bit environment.
2 Basic Information
Before you start to write a 64-bit clean device driver, it is useful to understand some of the differences between 32-bit and 64-bit operating systems. Most of these differences are similar to those you would encounter if you ported a driver from a 32-bit SPARC processor-based machine to a 64-bit SPARC processor-based machine.
2.1 Different Data Models
ILP32 is the C language data model for the 32-bit Solaris OS. ILP32 defines the int, long and pointer data types as 32 bits wide, type short as 16 bits, and type char as 8 bits.
LP64 is the C language data model for the 64-bit Solaris OS. LP64 defines the long and pointer data types as 64 bits wide. The LP64 data model also has a larger address space and a larger scalar arithmetic range.
The following table shows the basic language differences that exist both in driver programming and in application programming.
C data type | ILP32 (bits) | LP64 (bits)
char | 8 | 8
short | 16 | 16
int | 32 | 32
long | 32 | 64
long long | 64 | 64
float | 32 | 32
double | 64 | 64
long double | 96 | 128
pointer | 32 | 64
It is not unusual for 32-bit applications to assume that types int, long and pointers are the same size. Drivers that run in a 64-bit environment may need to be converted to use the 64-bit data model. Because the size of type long and pointer changes in the LP64 data model, several potential problems could occur:
- Source code that assumes that types int, long, and pointer are the same size is incorrect on 64-bit Solaris systems. Type casts need updating if the underlying data types have changed.
- Data structures that contain long and pointer data types must be checked for offset values that differ from those expected. Incorrect offset values are caused by alignment differences that occur when long and pointer fields grow to 64 bits.
- Implicitly declared or unprototyped functions may cause unintended type conversion of arguments and return values.
For more detailed information, see Section 3.1.1, "Converting Driver Code to Be 64-Bit Clean."
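The size assumptions listed above can be checked directly with sizeof. The following is a minimal sketch that is not part of the original article; the helper names are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: width of long in bits (32 under ILP32, 64 under LP64). */
size_t long_bits(void)
{
    return sizeof(long) * CHAR_BIT;
}

/* Hypothetical helper: width of a data pointer in bits. */
size_t pointer_bits(void)
{
    return sizeof(void *) * CHAR_BIT;
}

/*
 * Returns 1 when int, long, and pointer all have the same size, which is
 * the ILP32 assumption that breaks under LP64, and 0 otherwise.
 */
int int_long_pointer_same_size(void)
{
    return sizeof(int) == sizeof(long) && sizeof(long) == sizeof(void *);
}
```

Code that only works when int_long_pointer_same_size() returns 1 is a candidate for the cleanup described in Section 3.1.1.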
2.2 Driver-Specific Issues
In addition to general code cleanup to support the data model changes for LP64, device driver writers have these driver-specific issues to consider:
- In the 64-bit environment, new common access functions that use fixed-width data types are provided so that drivers can clearly specify the size of the data they are requesting. Drivers that use the old common access routines must be changed to use the fixed-width equivalent. For example, ddi_getw(9F) needs to be changed to ddi_get16(9F).
- A driver may need to be updated to support data sharing between 64-bit drivers and 32-bit applications. The ioctl(9E), devmap(9E), and mmap(9E) entry points must be written so that the driver can determine whether the data model of the application is the same as the data model of the kernel. If the data models differ, data structures may need to be adjusted. This usually means adding support to a 64-bit driver to accept a 32-bit application structure.
These two topics are discussed in Section 3.1.2, "Driver-Specific DDI Interfaces," and Section 3.2, "Other Driver Issues."
2.3 Advanced Issues
Advanced issues concerning DMA and performance are discussed in Section 4, "Advanced Issues and Guidelines."
3 General Conversion Guidelines
The 64-bit Solaris OS requires 64-bit driver objects; 32-bit device drivers cannot be used with 64-bit operating systems. Conversion from 32-bit to 64-bit code requires at minimum recompilation and re-linking with 64-bit libraries. For cases in which source code changes are required, Section 3 provides guidelines for writing clean driver code that works correctly in both 32-bit and 64-bit environments.
You may need to implement one or more of the suggestions discussed in this section to convert your code. These recommendations can help you maintain a single source and minimize use of #ifdef constructs.
3.1 Basic Steps for Converting Drivers
The principal work in converting your driver is to clean up the code for the 64-bit environment. The basic steps are similar to porting from machines based on 32-bit SPARC technology to machines based on 64-bit SPARC technology. Specific concerns related to systems built on AMD64 architecture are highlighted:
- Converting Driver Code to Be 64-Bit Clean:
  - Check the code for the use of multiple data type models.
  - Check for the use of the system-derived types that change size between ILP32 and LP64.
- Driver-Specific DDI Interfaces:
  - Check for problems due to DDI typedef changes.
  - Use new fixed-width DDI common access functions.
  - Check changed fields in DDI data structures.
  - Check changed arguments of DDI interfaces.
  - Check the newly added and removed DDI interfaces.
- Other Driver Issues:
  - Convert ioctl routines to be 64-bit clean.
  - Modify the driver entry points that handle data sharing.
3.1.1 Converting Driver Code to Be 64-Bit Clean
Check the code for the use of multiple data type models.
Note the following when converting to LP64:
- Use system-derived types such as size_t for type declarations whenever possible. Using system-derived types for definitions allows for future change.
- Use fixed-width types such as uint32_t where appropriate to clearly specify type declarations. To enable source code to be both 32-bit and 64-bit clean, Solaris systems provide fixed-width integer types, derived types, constants, and macros in the headers <sys/types.h> and <sys/inttypes.h>. The fixed-width types include both signed and unsigned integer types, such as int8_t, uint8_t, uint32_t, and uint64_t, as well as constants that specify their limits.
- Use the new derived types uintptr_t and intptr_t as the integral types for pointers.
- Update data structures by replacing long with int32_t or uint32_t. This approach preserves the binary layout of 32-bit data structures and makes a driver 64-bit safe. These types are defined in <sys/inttypes.h>.
You may want to run the lint utility in the Sun Studio 10 C5.7 compiler on your driver code to help check data model conversion problems.
The following guidelines and examples explain some potential problem areas that you
may encounter when porting from the ILP32 data model to the LP64 data model. All
samples include a recommended solution that runs correctly in both 32-bit and 64-bit
environments.
Example 1: 64-bit values should not be assigned to smaller types. The code below is incorrect for a 64-bit environment:
int int_a, int_b;
long long_a, long_b;
int_a = long_a;
int_b = long_a + long_b;
|
This code does not cause any issues in the ILP32 environment, but it can silently truncate values in the LP64 environment because long is a 64-bit type. If such assignments are intentional, use explicit casts to tell the compiler and lint(1B) that the narrowing is deliberate. Cast the whole expression, not just its first operand:
int_a = (int) long_a;
int_b = (int) (long_a + long_b);
|
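When a narrowing assignment is intentional but the value is not known to fit, a range check can accompany the cast. The following is a minimal sketch under that assumption; long_to_int_checked() is a hypothetical helper, not a Solaris interface.

```c
#include <limits.h>

/*
 * Hypothetical helper: stores value in *out and returns 0 when it fits in
 * an int; returns -1 and leaves *out untouched when it would be truncated.
 */
int long_to_int_checked(long value, int *out)
{
    if (value > INT_MAX || value < INT_MIN)
        return -1;
    *out = (int)value;   /* explicit cast: the narrowing is intentional */
    return 0;
}
```

In the ILP32 model the range check can never fail, so the same source behaves correctly in both environments.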
Example 2: Improperly applied explicit casts can give unintended results.
int int_a, va;
long long_a;
va = (int)long_a/int_a;
va = (int)(long_a/int_a);
|
The first assignment in this code converts the 64-bit long_a to a 32-bit integer and then divides by the 32-bit int_a. The second assignment in this code divides long_a by int_a and then converts the result into a 32-bit integer.
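The difference between the two cast placements can be made concrete with numbers. The sketch below is an illustration only: it uses unsigned fixed-width types instead of int and long so that the narrowing is well defined and behaves the same in ILP32 and LP64, and the function names are hypothetical.

```c
#include <stdint.h>

/* Narrow first, then divide: the upper 32 bits of a are discarded before the division. */
uint32_t narrow_then_divide(uint64_t a, uint32_t b)
{
    return (uint32_t)a / b;
}

/* Divide at full width, then narrow the quotient. */
uint32_t divide_then_narrow(uint64_t a, uint32_t b)
{
    return (uint32_t)(a / b);
}
```

For a = 0x200000004 and b = 4, the first function divides 4 by 4, while the second divides the full 64-bit value and then keeps the low 32 bits of the quotient.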
Example 3: A pointer to an int is not compatible with a pointer to a long. Even the use of explicit casting is not correct.
int *int_a_p, *int_b_p;
long *long_a_p, *long_b_p;
long_a_p = int_a_p;
long_a_p = (long *)int_a_p;
int_b_p = long_b_p;
int_b_p = (int *)long_b_p;
|
This code results in alignment errors or wrong values on the 64-bit SPARC platform.
An int has 4-byte alignment, and a long has 8-byte alignment. On the AMD64 architecture, the hardware does not enforce data alignment for these accesses. However, for maximum performance and portability between x86 and SPARC platforms, avoid misaligned memory accesses.
Example 4: You cannot correct a potential overflow problem by casting to a
larger data type.
long long_a;
int int_a, int_b;
long_a = int_a * int_b;
long_a = (long) (int_a * int_b );
|
The result of both multiplications is type int, which is then
converted to type long before being assigned to long_a.
Instead, cast either operand to a long prior to the multiplication as shown in the following line of code. Then the result of the multiplication will be type long and correctly assigned to long_a.
long_a = (long)int_a * int_b;
|
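The widening rule can be exercised with values whose product exceeds the 32-bit range. This is a minimal sketch; it uses int64_t rather than long (an assumption made here so that the result is the same in ILP32, where long is only 32 bits), and multiply_wide() is a hypothetical name.

```c
#include <stdint.h>

/* Widens one operand first, so the multiplication is done in 64 bits. */
int64_t multiply_wide(int32_t a, int32_t b)
{
    return (int64_t)a * b;
}
```

Casting the product instead, as in (int64_t)(a * b), would perform the multiplication in 32 bits and overflow before the conversion takes place.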
Example 5: Untyped integral constants are int by default.
60000000 * 40000000 is 32-bit multiplication.
60000000L * 40000000 is 64-bit multiplication.
|
Example 6: When you perform arithmetic operations on pointers, converting a
pointer to a 32-bit integer (int or unsigned int) can give
unintended results in an LP64 environment.
int diff, base;
int *start, *end;
int pad;
base = start;
diff = end - start;
pad = (int) end % 16;
|
Instead, convert the pointer to intptr_t or uintptr_t before you perform any arithmetic operations on pointers. Use ptrdiff_t to hold the difference between two pointers.
ptrdiff_t diff;
intptr_t base;
int *start, *end;
uintptr_t pad;
base = (intptr_t) start;
diff = (ptrdiff_t) ((intptr_t) end - base);
pad = (uintptr_t) end % 16;
|
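The portable form of Example 6 can be wrapped in small functions and checked on either data model. This is a sketch only; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of int elements between start and end (both in the same array). */
ptrdiff_t element_distance(const int *start, const int *end)
{
    return end - start;
}

/* Distance in bytes, computed on pointer-sized integers as in the text. */
ptrdiff_t byte_distance(const int *start, const int *end)
{
    intptr_t base = (intptr_t)start;
    return (ptrdiff_t)((intptr_t)end - base);
}

/* Offset of an address within a 16-byte block. */
uintptr_t offset_in_16(const void *p)
{
    return (uintptr_t)p % 16;
}
```

Note that subtracting two int pointers yields a count of elements, while subtracting their intptr_t values yields a count of bytes.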
Example 7: The sizes of pointers and integer types are different depending on
the arithmetic context. A pointer in the kernel is explicitly cast as an integral type when performing arithmetic operations, such as shift and AND operations, in order to determine which memory segment contains a particular address. These explicit casts should be either to intptr_t or to uintptr_t. The casts preserve the 64-bit values in LP64 mode and the 32-bit values in ILP32 mode.
struct pagetable *p, *addr_item;
#define ADDR_OFFSET 03
addr_item = (struct pagetable *) (((int)p)|ADDR_OFFSET );
|
The pagetable structure is used to manage buffer pages in a module. The address pointer ORed with ADDR_OFFSET is the address of the data block. In a 64-bit environment, the cast to int discards the upper 32 bits of the pointer, so the address should be cast to a pointer-sized integer before the OR with ADDR_OFFSET.
addr_item = (struct pagetable *) (((uintptr_t)p) | ADDR_OFFSET);
|
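The uintptr_t form of Example 7 can be tried outside the kernel. In this sketch the member of struct pagetable and the two function names are hypothetical; only ADDR_OFFSET and the cast come from the example.

```c
#include <stdint.h>

#define ADDR_OFFSET 03   /* low-order bits, as in the example above */

struct pagetable {
    long entry;          /* hypothetical member; gives at least 4-byte alignment */
};

/* OR the low-order bits into the address using a pointer-sized integer. */
struct pagetable *tag_address(struct pagetable *p)
{
    return (struct pagetable *)((uintptr_t)p | ADDR_OFFSET);
}

/* Clear the low-order bits again to recover the original address. */
struct pagetable *untag_address(struct pagetable *p)
{
    return (struct pagetable *)((uintptr_t)p & ~(uintptr_t)ADDR_OFFSET);
}
```

Because every intermediate value is a uintptr_t, no address bits are lost in either ILP32 or LP64.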
Example 8: Inadequate function prototypes.
extern func_a(int), func_b(void);
long long_a, long_b;
long_a = func_a (long_b);
int_a = func_b ();
|
The return types of func_a() and func_b() are implicitly
declared as int. Type conversion may occur in parameter passing. The
return values of pointer or long may be truncated to 32-bit.
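One way to correct Example 8 is to give each function a full prototype. The sketch below assumes that func_a() is meant to take and return a long and that func_b() is meant to return a pointer; the original example does not state the intended types, and the function bodies are placeholders.

```c
/* Assumed intent: func_a takes and returns a long. */
long func_a(long value);

/* Assumed intent: func_b returns a pointer. */
void *func_b(void);

static int storage;     /* hypothetical object for func_b to point at */

long func_a(long value)
{
    return value;       /* nothing is narrowed to int on the way in or out */
}

void *func_b(void)
{
    return &storage;
}
```

With these declarations in scope, the compiler passes and returns full-width values, and lint can report callers that assign the results to an int.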
Example 9: The size of data objects changes.
The size of long and pointer types in a 64-bit environment changes the
size of the data structure. The alignment padding also changes the size of the data
structure.
struct device_regs{
ulong_t addr;
uint_t count;
};
|
This structure occupies 8 bytes in the 32-bit model, but 16 bytes in the 64-bit model: 8 bytes for addr, 4 bytes for count, and 4 bytes of trailing padding, because a long has 8-byte alignment. If count is placed before addr, the padding moves between the two members and the offset of addr changes. Do not use a fixed offset to access member fields, as in the following code. Instead, access data members by referencing the names of the corresponding members.
struct device_regs{
ulong_t addr;
uint_t count;
};
struct device_regs r;
/* Wrong: offset 4 is the location of count only in the ILP32 model */
uint_t *p = (uint_t *) ((char *) &r +4);
|
Instead, the code should be written to access the member count as
follows.
struct device_regs {
ulong_t addr;
uint_t count;
};
struct device_regs r;
uint_t *p = &r.count;
|
Use fixed-width structures if this is a desired case. For example, use a
fixed-width structure for a protocol header definition and device hardware register
definition.
struct header {
uint32_t type;
uint32_t length;
};
|
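The layout rules in Example 9 can be confirmed with offsetof and sizeof. This is a minimal sketch: ulong_t and uint_t are spelled out as unsigned long and unsigned int so that it builds outside Solaris, and the two function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Same shape as device_regs above; the size depends on the data model. */
struct device_regs {
    unsigned long addr;
    unsigned int  count;
};

/* Fixed-width layout: identical in ILP32 and LP64. */
struct header {
    uint32_t type;
    uint32_t length;
};

/* Offset of count as the compiler lays it out: 4 in ILP32, 8 in LP64. */
size_t count_offset(void)
{
    return offsetof(struct device_regs, count);
}

/* Access by member name, never by a hard-coded byte offset. */
unsigned int *count_field(struct device_regs *r)
{
    return &r->count;
}
```

If a byte offset is truly needed, offsetof yields the correct value for whichever data model the code is compiled for.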
Check the use of system-derived types that change size in ILP32 and
LP64.
Some system derived types represent 32-bit quantities on a 32-bit system but
represent 64-bit quantities on a 64-bit system. For example:
clock_t: relative time in specified resolution
daddr_t: disk block address
ino_t: inode
intptr_t: integral pointer type
off_t: file offset
size_t: size of an object
ssize_t: size of an object or -1
time_t: time of day in seconds
timeout_id_t: timeout() handler id
uintptr_t: unsigned integral pointer type
Pay particular attention to the use of these derived types, especially when the
variables that use these types are assigned with the value from another derived type,
such as a fixed-width type.
Example 10:
size_t page_addr, v_addr;
page_addr = v_addr & 0xfffff000;
|
In this example, the second line should be page_addr = v_addr & ~0x0fffL or a similar expression. The constant 0xfffff000 is a 32-bit value of type unsigned int, so in a 64-bit environment it is zero-extended before the AND. The result retains only bits 12 through 31 of v_addr, and the upper 32 bits are lost.
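The effect of the mask width can be shown with two small functions. This is a sketch only; the function names are hypothetical, and the mask is written as ~(size_t)0x0fff, which is one possible "similar expression" that follows the width of size_t in both data models.

```c
#include <stddef.h>

/* Buggy on LP64: the 32-bit mask also clears bits 32 through 63. */
size_t page_base_narrow_mask(size_t v_addr)
{
    return v_addr & 0xfffff000;
}

/* Portable: the mask is built at the full width of size_t. */
size_t page_base(size_t v_addr)
{
    return v_addr & ~(size_t)0x0fff;
}
```

Both functions agree for addresses below 4 Gbyte, which is why the narrow mask goes unnoticed until the driver handles a 64-bit address.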
Example 11: This example shows the difference between
system-derived types in ILP32 and LP64 data models in reading or writing to a large
file. The example shows an error caused by the second argument of the following
function:
int fseeko(FILE *stream, off_t offset, int whence).
|
This function is identical to fseek(3C) except for the second argument,
offset, which is a long type in fseek(3C). In the following example, record_pos[] is used to record the position of accessed pointers in a large file:
int record_pos[MAX_RECORD_NUM];
off_t offset;
while ( !feof (fp) ) {
...
/* calculate the offset; */
fseeko ( fp, offset, SEEK_SET);
if ( condition ){
record_pos[i] = (int)offset;
}
}
|
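In a 32-bit environment without large-file support, casting offset to type int loses nothing, because off_t is 32 bits wide and a file offset cannot exceed 2 Gbyte - 1, which is the maximum value of an int variable. In a 64-bit environment, however, off_t is 64 bits wide, so record_pos[] cannot record a position beyond 2 Gbyte in a large file. Declaring record_pos[] as an array of off_t removes the limitation.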
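A version of the record table that keeps the full width of off_t might look like the following sketch. The capacity MAX_RECORD_NUM and the two accessor functions are hypothetical; the point is only that the array element type matches the type of the offset.

```c
#include <sys/types.h>   /* off_t */
#include <stddef.h>

#define MAX_RECORD_NUM 16            /* hypothetical capacity */

static off_t record_pos[MAX_RECORD_NUM];   /* off_t, not int */

/* Records offset in slot i; returns 0 on success, -1 if i is out of range. */
int record_position(size_t i, off_t offset)
{
    if (i >= MAX_RECORD_NUM)
        return -1;
    record_pos[i] = offset;          /* no narrowing cast needed */
    return 0;
}

/* Returns the recorded offset, or -1 if i is out of range. */
off_t recorded_position(size_t i)
{
    return (i < MAX_RECORD_NUM) ? record_pos[i] : (off_t)-1;
}
```

Because the element type follows off_t, the same source stores 32-bit offsets in ILP32 and 64-bit offsets in LP64 without a cast.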
3.1.2 Driver-Specific DDI Interfaces
Check for potential problems due to DDI typedef changes.
In the Solaris OS on 64-bit x86-based systems, the kernel redefines the DDI data
types to allow the compiler to check that the correct items are being passed. The
following type definitions are in <sys/dditypes.h> in the 32-bit Solaris kernel:
typedef void *ddi_dma_handle_t;
typedef void *ddi_dma_win_t;
typedef void *ddi_dma_seg_t;
typedef void *ddi_iblock_cookie_t;
typedef void *ddi_regspec_t;
typedef void *ddi_intrspec_t;
typedef void *ddi_softintr_t;
typedef void *dev_info_t;
typedef void *ddi_devmap_data_t;
typedef struct ddi_devid *ddi_devid_t;
typedef void *ddi_acc_handle_t;
|
The following type definitions are in <sys/dditypes.h> in
the 64-bit Solaris kernel:
typedef struct __ddi_dma_handle *ddi_dma_handle_t;
typedef struct __ddi_dma_win *ddi_dma_win_t;
typedef struct __ddi_dma_seg *ddi_dma_seg_t;
typedef struct __ddi_iblock_cookie *ddi_iblock_cookie_t;
typedef struct __ddi_regspec *ddi_regspec_t;
typedef struct __ddi_intrspec *ddi_intrspec_t;
typedef struct __ddi_softintr *ddi_softintr_t;
typedef struct __dev_info *dev_info_t;
typedef struct __ddi_devmap_data *ddi_devmap_data_t;
typedef struct __ddi_devid *ddi_devid_t;
typedef struct __ddi_acc_handle *ddi_acc_handle_t;
|
There is no impact on C binaries and correct C sources. Compilation errors occur in C sources that use these types incorrectly.
One way to avoid passing incorrect argument types to functions is to define the
structure pointers with specific structure tags. For example, notice the arguments to
the following two DDI functions that are declared in sunddi.h:
int ddi_add_softintr(dev_info_t *dip, int preference,
ddi_softintr_t *idp,
ddi_iblock_cookie_t *iblock_cookiep,
ddi_idevice_cookie_t *idevice_cookiep,
uint_t (*int_handler)(caddr_t int_handler_arg),
caddr_t int_handler_arg);
void ddi_remove_softintr(ddi_softintr_t id);
|
The third argument of function ddi_add_softintr() is a pointer to ddi_softintr_t. In the 64-bit Solaris kernel, this is a pointer to a pointer. In the partner function ddi_remove_softintr(), the argument is a ddi_softintr_t, which is a pointer to a __ddi_softintr structure. The interrupt cannot be removed if you make the following call:
ddi_remove_softintr (&id);
|
It may be difficult for developers to catch this kind of error in the program, but
the compiler can catch these errors if you use the ddi_softintr_t type.
Use fixed-width DDI common access functions.
Functions that use symbolic names to specify their data access size are obsolete. These functions include ddi_getb(9F), ddi_getw(9F), ddi_getl(9F), and ddi_getll(9F). The new function names specify a fixed-width data size, such as ddi_get8(9F), ddi_get16(9F), ddi_get32(9F), and ddi_get64(9F).
To port drivers to the 64-bit Solaris OS on x86 platforms, replace the obsolete
non-fixed-width DDI functions with fixed-width DDI common access functions, as shown
in the following table.
Obsolete function | Fixed-width replacement | Description
ddi_getb(9F) | ddi_get8(9F) | reads 8 bits from device address
ddi_getw(9F) | ddi_get16(9F) | reads 16 bits from device address
ddi_getl(9F) | ddi_get32(9F) | reads 32 bits from device address
ddi_getll(9F) | ddi_get64(9F) | reads 64 bits from device address
ddi_putb(9F) | ddi_put8(9F) | writes 8 bits to device address
ddi_putw(9F) | ddi_put16(9F) | writes 16 bits to device address
ddi_putl(9F) | ddi_put32(9F) | writes 32 bits to device address
ddi_putll(9F) | ddi_put64(9F) | writes 64 bits to device address
ddi_rep_getb(9F) | ddi_rep_get8(9F) | reads 8 bits from device address repeatedly
ddi_rep_getw(9F) | ddi_rep_get16(9F) | reads 16 bits from device address repeatedly
ddi_rep_getl(9F) | ddi_rep_get32(9F) | reads 32 bits from device address repeatedly
ddi_rep_getll(9F) | ddi_rep_get64(9F) | reads 64 bits from device address repeatedly
ddi_rep_putb(9F) | ddi_rep_put8(9F) | writes 8 bits to device address repeatedly
ddi_rep_putw(9F) | ddi_rep_put16(9F) | writes 16 bits to device address repeatedly
ddi_rep_putl(9F) | ddi_rep_put32(9F) | writes 32 bits to device address repeatedly
ddi_rep_putll(9F) | ddi_rep_put64(9F) | writes 64 bits to device address repeatedly
pci_config_getb(9F) | pci_config_get8(9F) | reads 8 bits from PCI configuration space
pci_config_getw(9F) | pci_config_get16(9F) | reads 16 bits from PCI configuration space
pci_config_getl(9F) | pci_config_get32(9F) | reads 32 bits from PCI configuration space
pci_config_getll(9F) | pci_config_get64(9F) | reads 64 bits from PCI configuration space
pci_config_putb(9F) | pci_config_put8(9F) | writes 8 bits to PCI configuration space
pci_config_putw(9F) | pci_config_put16(9F) | writes 16 bits to PCI configuration space
pci_config_putl(9F) | pci_config_put32(9F) | writes 32 bits to PCI configuration space
pci_config_putll(9F) | pci_config_put64(9F) | writes 64 bits to PCI configuration space
Example: A driver that uses ddi_getl(9F) to access
32-bit data reads 64-bit data in a 64-bit environment. Drivers must use
ddi_get32(9F) to access 32-bit data in the 64-bit environment:
uint32_t ddi_get32(ddi_acc_handle_t hdl, uint32_t *dev_addr);
|
In addition to function names, certain function parameter types and return types are different in the 64-bit Solaris OS. For example, unsigned char, unsigned short, and unsigned long change to uint8_t, uint16_t, and uint32_t, respectively.
Those functions with changed parameter types and return values are shown in the
following table.
32-bit Solaris OS | 64-bit Solaris OS | Description
unsigned char inb(int port); | uint8_t inb(int port); | reads 8 bits from an I/O port
unsigned short inw(int port); | uint16_t inw(int port); | reads 16 bits from an I/O port
unsigned long inl(int port); | uint32_t inl(int port); | reads 32 bits from an I/O port
void repinsb(int port, unsigned char *addr, int count); | void repinsb(int port, uint8_t *addr, int count); | reads multiple 8-bit values from an I/O port
void repinsw(int port, unsigned short *addr, int count); | void repinsw(int port, uint16_t *addr, int count); | reads multiple 16-bit values from an I/O port
void repinsd(int port, unsigned long *addr, int count); | void repinsd(int port, uint32_t *addr, int count); | reads multiple 32-bit values from an I/O port
void outb(int port, unsigned char value); | void outb(int port, uint8_t value); | writes 8 bits to an I/O port
void outw(int port, unsigned short value); | void outw(int port, uint16_t value); | writes 16 bits to an I/O port
void outl(int port, unsigned long value); | void outl(int port, uint32_t value); | writes 32 bits to an I/O port
void repoutsb(int port, unsigned char *addr, int count); | void repoutsb(int port, uint8_t *addr, int count); | writes multiple 8-bit values to an I/O port
void repoutsw(int port, unsigned short *addr, int count); | void repoutsw(int port, uint16_t *addr, int count); | writes multiple 16-bit values to an I/O port
void repoutsd(int port, unsigned long *addr, int count); | void repoutsd(int port, uint32_t *addr, int count); | writes multiple 32-bit values to an I/O port
Check changed fields in DDI data structures.
The data types of some of the fields in DDI data structures are changed in the 64-bit Solaris OS. Drivers that use these data structures should make sure that these fields are used appropriately.
The following listings show the changed fields in DDI data structures. For each structure, the 32-bit definition is followed by the 64-bit definition.
struct buf (buf.h), 32-bit Solaris OS:
struct buf {
    ...
    unsigned int b_bcount;
    unsigned int b_resid;
    int b_bufsize;
    ...
}
struct buf (buf.h), 64-bit Solaris OS:
struct buf {
    ...
    size_t b_bcount;
    size_t b_resid;
    size_t b_bufsize;
    ...
}
ddi_dma_cookie_t (dditypes.h), 32-bit Solaris OS:
typedef struct {
    ...
    unsigned long dmac_address;
    unsigned int dmac_size;
    unsigned int dmac_type;
    ...
} ddi_dma_cookie_t;
ddi_dma_cookie_t (dditypes.h), 64-bit Solaris OS:
typedef struct {
    ...
    union {
        uint64_t _dmac_ll;
        uint32_t _dmac_la[2];
    } _dmu;
    size_t dmac_size;
    uint_t dmac_type;
    ...
} ddi_dma_cookie_t;
In the 64-bit definition, dmac_address is a macro whose expansion depends on _LONG_LONG_HTOL. In the 64-bit Solaris OS on x86 platforms:
#define dmac_laddress _dmu._dmac_ll
#define dmac_address _dmu._dmac_la[0]
struct scsi_pkt (scsi/scsi_pkt.h), 32-bit Solaris OS:
struct scsi_pkt {
    ...
    unsigned long pkt_flags;
    long pkt_time;
    long pkt_resid;
    unsigned long pkt_state;
    unsigned long pkt_statistics;
    ...
}
struct scsi_pkt (scsi/scsi_pkt.h), 64-bit Solaris OS:
struct scsi_pkt {
    ...
    uint_t pkt_flags;
    int pkt_time;
    ssize_t pkt_resid;
    uint_t pkt_state;
    uint_t pkt_statistics;
    ...
}
ddi_dma_attr_t (ddidmareq.h), 32-bit Solaris OS. This structure defines attributes of the DMA engine and the device.
typedef struct ddi_dma_attr {
    unsigned int dma_attr_version;
    unsigned long dma_attr_addr_lo;
    unsigned long dma_attr_addr_hi;
    unsigned long dma_attr_count_max;
    unsigned long dma_attr_align;
    unsigned int dma_attr_burstsizes;
    unsigned int dma_attr_minxfer;
    unsigned long dma_attr_maxxfer;
    unsigned long dma_attr_seg;
    int dma_attr_sgllen;
    unsigned int dma_attr_granular;
    unsigned int dma_attr_flags;
} ddi_dma_attr_t;
ddi_dma_attr_t (ddidmareq.h), 64-bit Solaris OS:
typedef struct ddi_dma_attr {
    uint_t dma_attr_version;
    uint64_t dma_attr_addr_lo;
    uint64_t dma_attr_addr_hi;
    uint64_t dma_attr_count_max;
    uint64_t dma_attr_align;
    uint_t dma_attr_burstsizes;
    uint32_t dma_attr_minxfer;
    uint64_t dma_attr_maxxfer;
    uint64_t dma_attr_seg;
    int dma_attr_sgllen;
    uint32_t dma_attr_granular;
    uint_t dma_attr_flags;
} ddi_dma_attr_t;
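The reason that the 32-bit address member of the DMA cookie depends on byte order can be seen with a stand-in union. The sketch below is not the DDI type: union dma_addr and both functions are hypothetical, and the byte order is detected at run time here, whereas the kernel header makes the choice at compile time.

```c
#include <stdint.h>

/* Stand-in with the same shape as the _dmu union; not the DDI type. */
union dma_addr {
    uint64_t ll;       /* full 64-bit address */
    uint32_t la[2];    /* the same storage viewed as two 32-bit words */
};

/* Returns 1 on a little-endian machine such as x86, 0 on big-endian. */
int is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Reads the low 32 bits through the array view, choosing the index by byte order. */
uint32_t low_address_word(uint64_t address)
{
    union dma_addr a;
    a.ll = address;
    return is_little_endian() ? a.la[0] : a.la[1];
}
```

On x86, index 0 holds the low-order word, which matches the dmac_address definition for that platform.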
Check changed arguments of DDI interfaces.
The DDI function argument types in the following table have been changed in the
64-bit Solaris OS.
32-bit Solaris OS | 64-bit Solaris OS
void ddi_set_driver_private(dev_info_t *devi, caddr_t data); | void ddi_set_driver_private(dev_info_t *devi, void *data);
caddr_t ddi_get_driver_private(dev_info_t *devi); | void *ddi_get_driver_private(dev_info_t *devi);
struct buf *getrbuf(long sleepflag); | struct buf *getrbuf(int sleepflag);
void delay(long ticks); | void delay(clock_t ticks);
timeout_id_t timeout(void (*func)(caddr_t), caddr_t arg, long ticks); | timeout_id_t timeout(void (*func)(caddr_t), caddr_t arg, clock_t ticks);
struct map *rmallocmap(ulong_t mapsize); | struct map *rmallocmap(size_t mapsize);
struct map *rmallocmap_wait(ulong_t mapsize); | struct map *rmallocmap_wait(size_t mapsize);
struct buf *scsi_alloc_consistent_buf(struct scsi_address *ap, struct buf *bp, int datalen, ulong_t bflags, int (*callback)(caddr_t), caddr_t arg); | struct buf *scsi_alloc_consistent_buf(struct scsi_address *ap, struct buf *bp, size_t datalen, uint_t bflags, int (*callback)(caddr_t), caddr_t arg);
int uiomove(caddr_t address, long nbytes, enum uio_rw rwflag, uio_t *uio_p); | int uiomove(caddr_t address, size_t nbytes, enum uio_rw rwflag, uio_t *uio_p);
int cv_timedwait(kcondvar_t *cvp, kmutex_t *mp, long timeout); | int cv_timedwait(kcondvar_t *cvp, kmutex_t *mp, clock_t timeout);
int cv_timedwait_sig(kcondvar_t *cvp, kmutex_t *mp, long timeout); | int cv_timedwait_sig(kcondvar_t *cvp, kmutex_t *mp, clock_t timeout);
int ddi_device_copy(ddi_acc_handle_t src_handle, caddr_t src_addr, long src_advcnt, ddi_acc_handle_t dest_handle, caddr_t dest_addr, long dest_advcnt, size_t bytecount, ulong_t dev_datasz); | int ddi_device_copy(ddi_acc_handle_t src_handle, caddr_t src_addr, ssize_t src_advcnt, ddi_acc_handle_t dest_handle, caddr_t dest_addr, ssize_t dest_advcnt, size_t bytecount, uint_t dev_datasz);
int ddi_device_zero(ddi_acc_handle_t handle, caddr_t dev_addr, size_t bytecount, long dev_advcnt, ulong_t dev_datasz); | int ddi_device_zero(ddi_acc_handle_t handle, caddr_t dev_addr, size_t bytecount, ssize_t dev_advcnt, uint_t dev_datasz);
int ddi_dma_mem_alloc(ddi_dma_handle_t handle, uint_t length, ddi_device_acc_attr_t *accattrp, ulong_t flags, int (*waitfp)(caddr_t), caddr_t arg, caddr_t *kaddrp, uint_t *real_length, ddi_acc_handle_t *handlep); | int ddi_dma_mem_alloc(ddi_dma_handle_t handle, size_t length, ddi_device_acc_attr_t *accattrp, uint_t flags, int (*waitfp)(caddr_t), caddr_t arg, caddr_t *kaddrp, size_t *real_length, ddi_acc_handle_t *handlep);
int drv_getparm(unsigned int parm, unsigned long *value_p); | int drv_getparm(unsigned int parm, void *value_p);
In the 64-bit kernel, drv_getparm() can be used to fetch both 32-bit and 64-bit quantities. However, the interface does not define the data type of the value pointed to by value_p. This can lead to programming errors. You
should not use drv_getparm(). Use the following new routines instead:
clock_t ddi_get_lbolt(void);
time_t ddi_get_time(void);
cred_t *ddi_get_cred(void);
pid_t ddi_get_pid(void);
|
Changes to the DDI functions
The following DDI and libc functions are removed or added in the Solaris 10 OS:
- DDI functions removed. The driver interface ddi_dma_segtocookie(9F) is obsolete and has been omitted from the kernel. Use ddi_dma_nextcookie(9F) instead.
- DDI functions added. The following functions are added to the DDI as a porting aid: memcpy, memset, memmove, memcmp, strncat, strlcat, strlcpy, and strspn. This enables gcc-compiled drivers to work more easily, since gcc often generates references to these functions.
3.2 Other Driver Issues
3.2.1 Converting ioctl Routines to Be 64-Bit Clean
Many ioctl operations are common to device drivers in the same class. Many of these interfaces copy in or copy out data structures to or from the kernel. Some of these data structure members are changed in size in the 64-bit data model. The following table lists ioctl structures that you must convert explicitly in
64-bit driver ioctl routines for dkio, fdio, fbio, cdio, mtio, and scsi:
ioctl command | Affected data structure | ioctl class
DKIOCGAPART, DKIOCSAPART, DKIOCGVTOC, DKIOCSVTOC | struct dk_map, struct dk_allmap, struct partition, struct vtoc | dkio
FBIOPUTCMAP, FBIOGETCMAP | struct fbcmap | fbio
FBIOPUTCMAPI, FBIOGETCMAPI | struct fbcmap_i | fbio
FBIOSCURSOR, FBIOGCURSOR | struct fbcursor | fbio
CDROMREADMODE1, CDROMREADMODE2 | struct cdrom_read | cdio
CDROMCDDA, CDROMCDXA, CDROMSUBCODE | struct cdrom_cdda, struct cdrom_cdxa, struct cdrom_subcode | cdio
FDIOCMD, FDRAW | struct fd_cmd, struct fd_raw | fdio
MTIOCTOP, MTIOCGET, MTIOCGETDRIVETYPE | struct mtop, struct mtget, struct mtdrivetype_request | mtio
USCSICMD | struct uscsi_cmd | scsi
The nblocks property, the number of blocks each device contains, is
defined as a signed 32-bit integer. The nblocks property therefore limits
the maximum device size to 1 Tbyte. A new property, Nblocks, is
defined as an unsigned 64-bit integer to remove this limitation.
For more information, see Appendix C, "Making a Device Driver 64-Bit
Ready," in the Writing Device Drivers manual on the Sun Product Documentation site.
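The 1-Tbyte limit follows from simple arithmetic on the block count. The sketch below assumes a block size of 512 bytes (the value of DEV_BSIZE on Solaris systems); the macro BLOCK_SIZE and both function names are hypothetical.

```c
#include <stdint.h>

#define BLOCK_SIZE 512u   /* assumed block size in bytes */

/* Largest byte count that a signed 32-bit block count can describe. */
uint64_t max_bytes_signed32_blocks(void)
{
    return (uint64_t)INT32_MAX * BLOCK_SIZE;
}

/* Returns 1 when a device of device_bytes fits in a signed 32-bit block count. */
int fits_in_nblocks(uint64_t device_bytes)
{
    return device_bytes / BLOCK_SIZE <= (uint64_t)INT32_MAX;
}
```

A signed 32-bit count holds at most 2^31 - 1 blocks, which is 512 bytes short of 2^40 bytes, or 1 Tbyte. Larger devices need the 64-bit Nblocks property.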
3.2.2 Modifying the Routines That Handle Data Sharing
If a 64-bit device driver uses ioctl(9E), devmap(9E), or mmap(9E) to share data structures with a 32-bit application, check whether those data structures contain long or pointer types. The binary layout of such data structures differs between the two data models.
To handle potential data model differences, driver entry point routines that receive
arguments from user applications must determine whether the argument came from an
application that uses the same data type model as the kernel. The new DDI function
ddi_model_convert_from(9F) enables drivers to determine this.
The argument for ddi_model_convert_from(9F) is the data type model of the current thread. If conversion to or from ILP32 is necessary, the return value is DDI_MODEL_ILP32. If no conversion is needed, the return value is DDI_MODEL_NONE. Typically, the _MULTI_DATAMODEL macro is defined by the system when the driver supports multiple data models, so the conversion code is wrapped in #ifdef _MULTI_DATAMODEL.
Example: The xxdevmap() function from devmap(9E) provides a simple example. The devmap() function maps memory from a device into the address space of a process. The range of mapped memory in a device is from offset to offset+len. In this function, struct data is shared with both 32-bit and 64-bit applications. For a 32-bit application, the member addr in struct data contains a 32-bit user process address, but for a 64-bit application, addr is a 64-bit address.
struct data {
    int len;
    caddr_t addr;
};

int
xxdevmap(dev_t dev, devmap_cookie_t dhp, offset_t offset, size_t len,
    size_t *maplen, uint_t model)
{
    struct data dtc;    /* local copy for clash resolution */
    struct data *dp = (struct data *)shared_area;

#ifdef _MULTI_DATAMODEL
    switch (ddi_model_convert_from(model)) {
    case DDI_MODEL_ILP32:
    {
        /* layout of struct data as a 32-bit application sees it */
        struct data32 {
            int len;
            uint32_t addr;
        } *da32p;

        /* widen the 32-bit fields in shared_area into the local copy */
        da32p = (struct data32 *)shared_area;
        dp = &dtc;
        dp->len = da32p->len;
        dp->addr = (caddr_t)(uintptr_t)da32p->addr;
        break;
    }
    case DDI_MODEL_NONE:
        break;
    }
#endif /* _MULTI_DATAMODEL */
    /* continues along using dp */
    ...
}
To support 64-bit clean code, the ioctl(9E) and mmap(9E) routines need to consider the macro _MULTI_DATAMODEL as well.
3.3 How to Verify
The simplest and most straightforward procedure to verify a 64-bit driver ported from 32-bit source code is to run the 64-bit driver on a 64-bit kernel. If you do not have a 64-bit Solaris system, other source-level verification methods are available for
you to use. The compiler and lint are practical tools you can use to check
for the use of constructs that impact 32-bit and 64-bit portability.
lint
The lint tool is a C-program checker. The lint tool
in the Sun Studio C 5.7 compiler can be used to help find potential 64-bit problems
in your code. You can use the special -errchk=longptr64 option to request that lint notify you whenever you try to put something big into something small, such as casting a 64-bit pointer into a 32-bit int.
The lint tool prints the line number of the offending code, issues a warning message that describes the problem, and informs you that a pointer was involved or gives the sizes of types involved. This information (the fact that a pointer is
involved, and the sizes of the types) can be useful in finding only the 64-bit
problems and avoiding the pre-existing problems between 32-bit and smaller types.
The following sample shows how the lint output appears. The file hello.c contains constructs that are not portable to the LP64 model. The file hello64.c cleans up those problems by re-declaring the variable types and casting explicitly where a narrowing conversion is intended.
sh$ cat -n hello.c
     1  /* hello.c */
     2  #include <stdio.h>
     3  #include <stdlib.h>
     4
     5  static int func1(int pass1);
     6  static long func2(long pass2);
     7
     8  void
     9  main(void)
    10  {
    11      int i_a = 0, *i_ptr = 0;
    12      long l_b = 0;
    13
    14      i_a = (int) i_ptr;
    15      i_ptr = (void *)i_a;
    16      i_a = l_b;
    17      i_a = (int) l_b;
    18      i_a = 0xffffaabbcc;
    19      i_a = (int) 0xffffaabbcc;
    20      i_a = func1(l_b);
    21      i_a = func2(0xffffaabbcc);
    22      printf("output 32-bit int %d\n", l_b);
    23      scanf("input 32-bit int %d\n", &l_b);
    24  }
    25
    26  static int
    27  func1(int pass1)
    28  {
    29      return (pass1);
    30  }
    31
    32  static long
    33  func2(long pass2)
    34  {
    35      return (pass2);
    36  }
sh$ /opt/SUNWspro/bin/lint hello.c -errchk=longptr64 |more
(14) warning: conversion of pointer loses bits
(15) warning: cast to pointer from 32-bit integer
(16) warning: assignment of 64-bit integer to 32-bit integer
(17) warning: cast from 64-bit integer to 32-bit integer
(18) warning: 64-bit constant truncated to 32 bits by assignment
(19) warning: cast from 64-bit integer constant expression to 32-bit integer
(20) warning: passing 64-bit integer arg, expecting 32-bit integer: func1(arg 1)
(21) warning: assignment of 64-bit integer to 32-bit integer
function returns value which is always ignored
    printf              scanf
function argument ( number ) type inconsistent with format
    printf (arg 2)      long :: (format) int        hello.c(22)
    scanf (arg 2)       long * :: (format) int *    hello.c(23)
sh$
sh$ cat -n hello64.c
     1  /* hello64.c */
     2  #include <stdio.h>
     3  #include <stdlib.h>
     4
     5  static int func1(int pass1);
     6  static long func2(long pass2);
     7  void
     8  main(void)
     9  {
    10      int i_a = 0, i_b = 0, *i_ptr = 0;
    11      long l_a = 0, l_b = 0;
    12
    13      l_a = (unsigned long) i_ptr;
    14      i_ptr = (void *)l_a;
    15      l_a = l_b;
    16      i_a = (int) l_b; /* intended narrow conversion */
    17      l_a = 0xffffaabbcc;
    18      i_a = (int) 0xffffaabbcc; /* intended narrow conversion */
    19      i_a = func1(i_b);
    20      l_a = func2(0xffffaabbcc);
    21      printf("output 32-bit int %ld\n", l_b);
    22      scanf("input 32-bit int %ld\n", &l_b);
    23  }
    24
    25  static int func1(int pass1)
    26  {
    27      return (pass1);
    28  }
    29
    30  static long func2(long pass2)
    31  {
    32      return (pass2);
    33  }
sh$ /opt/SUNWspro/SOS8/bin/lint hello64.c -errchk=longptr64 |more
(16) warning: cast from 64-bit integer to 32-bit integer
(18) warning: cast from 64-bit integer constant expression to 32-bit integer
set but not used in function
    (10) i_a in main
function returns value which is always ignored
    printf              scanf
sh$
Compiler
The following example shows the error messages output by gcc from compiling the helloworld.c program on a 64-bit x86-based
system. The errors are eliminated in helloworld64.c by narrowing the
constants, padding the structures, and re-declaring the variable and function
types.
sh$ cat -n helloworld.c
     1  #include <stdio.h>
     2  #include <stdlib.h>
     3
     4  #define MAX_HEAD_COUNT 0xfffffffffffL
     5  #define MAX_FORMAL_EMPLOYEE 0x0fffL
     6
     7  struct private_info {
     8      long id;
     9      int stat;
    10      char *other_info;
    11  } employee[MAX_HEAD_COUNT];
    12
    13  struct {
    14      int outside_id;
    15      unsigned int priv_info_addr;
    16  } filo_index[MAX_FORMAL_EMPLOYEE]; /* all people who are recorded now */
    17
    18  long
    19  encryp(long input)
    20  {
    21      return (input);
    22  }
    23
    24  int
    25  main(void)
    26  {
    27      int i, key; long tmp_id;
    28      void *priv_p;
    29
    30      do {
    31          scanf("%d%d%d", &i, &employee[i].stat, &employee[i].id);
    32          if (i < 0) break;
    33          if (i > MAX_HEAD_COUNT)
    34              i = MAX_HEAD_COUNT;
    35          if (employee[i].stat) {
    36              filo_index[key].outside_id = employee[i].id;
    37              key = encryp(filo_index[key].outside_id);
    38              filo_index[key].priv_info_addr = (int)(employee + i);
    39          }
    40      } while (i < sizeof (employee));
    41
    42      while (1) {
    43          scanf("%d", (int *)&tmp_id);
    44          key = (int)encryp(tmp_id);
    45          if (key == -1)
    46              key = (int)MAX_FORMAL_EMPLOYEE;
    47          priv_p = (void *)filo_index[key].priv_info_addr;
    48          printf("%d\t%d\n", ((struct private_info *)priv_p)->id,
    49              ((struct private_info *)priv_p)->stat);
    50      }
    51  }
sh$ gcc -fsyntax-only -Wall -Wcast-qual -Wconversion -Wmissing-format-attribute \
-Wpadded -Werror -g3 -o helloworld helloworld.c
cc1: warnings being treated as errors
helloworld.c:10: warning: padding struct to align `other_info'
helloworld.c: In function `main':
helloworld.c:31: warning: int format, different type arg (arg 4)
helloworld.c:33: warning: comparison is always false due to limited range of
data type
helloworld.c:34: warning: overflow in implicit constant conversion
helloworld.c:37: warning: passing arg 1 of `encryp' with different width due
to prototype
helloworld.c:38: warning: cast from pointer to integer of different size
helloworld.c:46: warning: cast to pointer from integer of different size
helloworld.c:48: warning: int format, different type arg (arg 2)
sh$
sh$ cat -n helloworld64.c
     1  #include <stdio.h>
     2  #include <stdlib.h>
     3
     4  #define MAX_HEAD_COUNT 0xffffL
     5  #define MAX_FORMAL_EMPLOYEE 0x0fffL
     6
     7  struct private_info {
     8      long id;
     9      int stat;
    10      char padding[4];
    11      char *other_info;
    12  } employee[MAX_HEAD_COUNT]; /* all people who have been employees */
    13
    14  struct {
    15      int outside_id;
    16      char padding[4];
    17      unsigned long priv_info_addr;
    18  } filo_index[MAX_FORMAL_EMPLOYEE]; /* all people who are employees now */
    19
    20  long
    21  encryp(long input)
    22  {
    23      return (input);
    24  }
    25
    26  int
    27  main(void)
    28  {
    29      int i, key;
    30      long tmp_id;
    31      void *priv_p;
    32
    33      do {
    34          scanf("%d%d%ld", &i, &employee[i].stat, &employee[i].id);
    35          if (i < 0) break;
    36          if (i > MAX_HEAD_COUNT)
    37              i = MAX_HEAD_COUNT;
    38          if (employee[i].stat) {
    39              filo_index[key].outside_id = employee[i].id;
    40              key = encryp((long)filo_index[key].outside_id);
    41              filo_index[key].priv_info_addr =
    42                  (unsigned long)(employee + i);
    43          }
    44      } while (i < sizeof (employee));
    45
    46      while (1) {
    47          scanf("%d", (int *)&tmp_id);
    48          key = (int)encryp(tmp_id);
    49          if (key == -1)
    50              key = (int)MAX_FORMAL_EMPLOYEE;
    51          priv_p = (void *)filo_index[key].priv_info_addr;
    52          printf("%ld\t%d\n", ((struct private_info *)priv_p)->id,
    53              ((struct private_info *)priv_p)->stat);
    54      }
    55  }
sh$ ./gcc -fsyntax-only -Wall -Wcast-qual -Wconversion -Wmissing-format-attribute \
-Wpadded -Werror -g3 -o helloworld64 helloworld64.c
sh$
3.4 Checklist for Getting Started
Use the following checklist to convert your driver code to the Solaris OS on 64-bit
x86-based systems:
- Read this entire document with emphasis on Section 3, "General
Conversion Guidelines."
- Include <sys/types.h> (or at a minimum, <sys/isa_defs.h>) in your code to pull in the _ILP32 or _LP64 definitions as well as many basic derived types.
- Move function prototypes and external declarations with non-local scope to headers and include these headers in your code.
- Review all data structures and interfaces to verify that these are still valid in the 64-bit environment.
- Carefully check your use of long types and sized types (int32_t, int64_t). If you need a fixed-width quantity, for example to match a hardware register layout, use the appropriately sized type. If you need to hold an address, for example in an opaque handle, use uintptr_t.
- Carefully check structures that use 64-bit types, for example, uint64_t. The alignment and size of the structure can differ between code compiled in 32-bit mode and code compiled in 64-bit mode. See the example below:
#include <stdio.h>
#include <sys/types.h>

struct misalign {
    uint32_t foo;
    uint64_t bar;
};

int
main(void)
{
    struct misalign a;

    /* sizeof yields size_t, so cast before printing with %lu */
    printf("sizeof struct is: %lu\n", (unsigned long)sizeof (a));
    printf("offset of bar is: %lu\n",
        (unsigned long)((uintptr_t)&a.bar - (uintptr_t)&a));
    return (0);
}
On a 32-bit system based on the i386 architecture, this code produces:
sizeof struct is: 12
offset of bar is: 4
On a 64-bit system based on the AMD64 architecture, this code produces:
sizeof struct is: 16
offset of bar is: 8
As a result, attempts to make structures work in both 32-bit and 64-bit environments
by using 64-bit fields may not succeed. This occurs fairly often, especially in
structures that are passed into and out of the kernel through ioctl
calls. A 32-bit application sees the structure differently than the 64-bit kernel
sees it.
- Replace any use of obsolete DDI interfaces with newer DDI interfaces that allow better compiler type checking. Note especially that the hat_getkpfnum() function is removed. You must use the proper ddi_dma_* memory functions instead.
- Compile with either gcc or the Sun Studio 10 C 5.7 compiler in both 32-bit and 64-bit modes and eliminate all compiler warnings, unless the application is being provided only as 64-bit. Note especially that the gcc compiler behaves differently than the Sun C compiler in 32-bit mode when assigning pointers into 64-bit entities. The gcc compiler sign extends, and the Sun C compiler does not.
- Run lint(1) using the -errchk=longptr64 flag and review each warning individually. Note that not all warnings require a change to the code. Depending on the resulting changes, you might also want to run lint(1) again, both in the 32-bit environment and in the 64-bit environment.
- Test the application by executing the 32-bit version on the 32-bit OS, and the 64-bit version on the 64-bit OS. It is not necessary to test the 32-bit version on the 64-bit OS.
- See the next section for advanced topics specific to drivers and the
Solaris OS on 64-bit x86 platforms.
4 Advanced Issues and Guidelines
Following the general conversion guidelines from the previous section should help you produce clean 64-bit code in device drivers for the Solaris OS on x86 platforms.
However, that does not mean that the code is portable or tuned for performance. This
section can help you take full advantage of the features of the AMD Opteron
processor.
This section discusses some advanced topics and offers guidelines for addressing
these topics.
4.1 DMA Issues
The DMA framework in the Solaris OS hides the hardware details of a platform, such as I/O MMU, I/O cache, data alignment, data order and so on. When writing a driver to
work on multiple platforms, you need to consider the following issues:
- Be careful with I/O MMU translations.
- Maximum burst size can change.
- A device might have no 64-bit addressing capability although the driver
is 64-bit.
- You should check changes to DMA DDI functions and DMA DDI structures.
You might also encounter some performance-related problems with the DDI functions.
You may have issues specific to the 64-bit Solaris OS as well.
4.1.1 Be Careful With I/O MMU Translations
The CPU uses the MMU to translate a virtual address (CPU view) to a physical address (main
bus view). A device uses the I/O MMU to translate I/O bus addresses (PCI addresses) to
physical addresses (main bus view). The I/O MMU can be very convenient for performing
DMA transfers.
The I/O MMU provides a device with the ability to perform DVMA. DVMA enables you to
program a DMA engine with one large block of virtually contiguous addresses. That relieves
the CPU from programming the DMA engine with many small blocks of physical
addresses. SPARC technology offers an I/O MMU and can perform DVMA. In some cases,
only a single DMA window with a few DMA cookies needs to be programmed into the DMA
engine.
Some platforms, IA for example, have no I/O MMU. On these platforms, a DMA binding can
return multiple DMA windows and multiple DMA cookies, and each DMA cookie might
cover only a few pages. Fortunately, some devices provide scatter/gather (S/G)
capability to perform highly efficient DMA transfers. So when writing a device
driver to support both the SPARC and IA (AMD64 included) platforms, developers should
add support for multiple DMA windows and multiple DMA cookies. If the device has
S/G capability, the device driver should make good use of it to enhance DMA
performance.
Example:
struct sglentry {
    size_t dma_addr;
    uint32_t dma_size;
} sglist[SGLLEN];
In this example, each DMA cookie should be filled in sglist as an S/G
element.
Notes:
1. When using DVMA, cookie.dmac_address is the virtual address that
appears on the PCI bus. It is the responsibility of the I/O MMU to translate the virtual
addresses into physical addresses.
2. AMD64-compatible processors can take advantage of the AGP GART, which is quite similar to the I/O MMU address translation table, to serve other PCI devices.
See the end of this section for more details.
4.1.2 Max Burst Size Can Change
Drivers specify the DMA burst sizes that their device supports in the dma_attr_burstsizes field of the ddi_dma_attr structure. This field is a bitmap of the supported burst sizes. When you write a driver that is 32-bit and 64-bit compatible, you may need to change this structure to optimize performance.
When DMA resources are allocated, the system can impose further restrictions on the
burst sizes that the device can use. Rather than assuming that every burst size listed in the attributes is available, call the ddi_dma_burstsizes(9F) routine after the DMA resources are allocated. This routine returns the burst size bitmap that is actually allowed for the device, and the driver can choose a burst size for its DMA engine from that bitmap.
Example: Determining Burst Size
The following pseudocode shows a correct way to determine the burst size:
#define BEST_BURST_SIZE 0x20    /* 32 bytes */

if (ddi_dma_buf_bind_handle(xsp->handle, xsp->bp, flags, xxstart,
    (caddr_t)xsp, &cookie, &ccount) != DDI_DMA_MAPPED) {
    /* error handling */
}
burst = ddi_dma_burstsizes(xsp->handle);
/* check which bit is set and choose one burstsize to */
/* program the DMA engine */
if (burst & BEST_BURST_SIZE) {
    /* program DMA engine to use this burst size */
} else {
    /* other cases */
}
4.1.3 Device With No 64-Bit Addressing Uses 64-Bit Driver
Some devices are only capable of 32-bit addressing. Others are capable of both 32-bit and 64-bit addressing. Knowing which addressing capability your device supports is very important. Note that some 32-bit PCI devices can achieve 64-bit addressing through a Dual
Address Cycle (DAC). A DAC completes 64-bit addressing within two PCI clock
periods. For example, a device may possess two registers: DMADAC0 and DMADAC1. A DMA
engine can perform a DAC within two PCI clock periods, where the first PCI address is the
Lo-Addr DMADAC0 with the PCI command (C/BE[3:0]#) D, and the second PCI address is
the Hi-Addr DMADAC1 with the PCI command (C/BE[3:0]#) 6 for a read or 7 for a
write.
If the device has 64-bit addressing capability, then its performance should be
greatly enhanced in a 64-bit OS. But what if the device has no 64-bit addressing
capability?
A DMA engine might have limited addressing capability. For example, a PCI master device
that can only perform a Single Address Cycle (SAC) is capable of only 32-bit
addressing. Its ddi_dma_attr_t should be described as follows:
static ddi_dma_attr_t attributes =
{
    DMA_ATTR_V0,    /* Version number */
    0x00000000,     /* low address */
    0xFFFFFFFF,     /* high address */
    0xFFFFFFFF,     /* counter register max */
    .....
};
This example tells the DDI DMA framework that your device can address only the
low 4 Gbyte of memory, by assigning "0x00000000" to the low address and "0xFFFFFFFF" to the high address.
If the driver uses the ddi_dma_mem_alloc(9F) routine to allocate a piece of kernel virtual memory, ddi_dma_buf_bind_handle(9F) or ddi_dma_addr_bind_handle(9F) assures that DMA cookies are allocated in the range of the low and high addresses assigned by ddi_dma_attr_t. Therefore, you do not need to worry about issues such as whether the physical memory is greater than 4 Gbyte.
If this device needs to process a DMA request coming from other devices, you can
perform a device-to-device DMA transfer. In this situation, the local memory of
another device may be mapped into a segment beyond 4 Gbyte, so that you
cannot perform a DMA transfer directly. Fortunately, device-to-device DMA transfer is
only used rarely.
For those devices that support both 32-bit and 64-bit addressing, 64-bit drivers
should take advantage of the device's 64-bit addressing capability. The following
example describes the DMA engine for this case:
static ddi_dma_attr_t attributes =
{
    DMA_ATTR_V0,            /* Version number */
    0x0000000000000000,     /* low address */
    0xFFFFFFFFFFFFFFFF,     /* high address */
    0xFFFFFFFF,             /* counter register max */
    .....
};
Then you can use the dmac_laddress member of the ddi_dma_cookie_t structure and program it into the DMA engine.
4.1.4 Changes in DMA DDI Functions and DMA DDI Structures
The primary changes to DMA DDI functions and structures are in ddi_dma_cookie_t and ddi_dma_mem_alloc(9F). Note that the size of some arguments returned from these functions may change, so the declarations of the variables that receive them must change accordingly. For more details, refer to Section 3, "General Conversion Guidelines."
4.1.5 Be Careful With the NUMA System
NUMA has a single OS image, single address space view, nonuniform memory, nonuniform
I/O, and non-coherent cache. After DMA transfers are done, ddi_dma_sync(9F) should be explicitly called to ensure that the caches are successfully flushed.
4.2 Other Related Issues
The following issues concern the performance of DDI functions:
- The ddi_dma_mem_alloc(9F) function has not yet been optimized for performance. A more efficient memory allocation function for the 64-bit environment is under investigation.
- No proper DDI functions are available to manage physical memory directly. This results in inefficient use of kernel virtual memory for the construction of AGP video drivers.
The AMD64 platform can offer some architectural advantages. For example, its AGP
Aperture can be shared by both AGP and PCI devices. That is, you can use the AGP
Aperture as an I/O MMU component. This is a good way to enhance DMA performance. You
can also enable cache coherency for the AGP aperture by setting one bit in the GART
entry. With this approach, the AGP master can read the data from the processor caches
faster than it can read data from the DDR memory.
5 Porting Example
This section uses sample code to explain how to port a driver from 32-bit to 64-bit.
The driver in this example manages a RAM space such as a RAM disk and uses
programmed I/O to drive a device with a 32-bit CSR register and a 32-bit data
register. This sample modifies a 32-bit driver to be a 64-bit safe driver. The
sample defines the macro VERSION_64-bit while compiling in a 64-bit
environment.
The original source code and the header file, pio.h, are provided in the
appendix. A link to the source code for the converted version appears below. The changes that
are made in this example can be summarized as follows:
- The obsolete functions, ddi_putl() and ddi_getl(), have been replaced by ddi_put32() and ddi_get32(), respectively.
- The declarations of size and *addr as type int are changed to size_t and caddr_t, respectively, to make them 32-bit and 64-bit clean. See lines 37 and 38.
- The type of the return value from getminor() has been changed from int to minor_t. See lines 189-202.
- The type for pio_p->addr needs to be changed from int to caddr_t. The function min() has a return value of type int. The type needs to be changed to size_t and a separate version of the min() function needs to be written for 64-bit cases. See lines 341-375.
- A variable named tmp needs to be shared with the user application and must be safe for both 32-bit and 64-bit cases. The buffer to be copied needs to have a size appropriate to the case. See lines 406-446.
Note that these changes are also documented by comments.
Porting Example
6 Conclusion
This document describes issues that you need to be aware of when you write 32-bit and
64-bit safe drivers for the Solaris OS on x86 platforms. These issues include
multiple C language data models, the use of system-derived types that have changed,
and changes to some of the DDI interfaces. Also, you need to address some
driver-specific issues. Finally, you need to consider performance issues such as the
use of DMA.
This article lists and describes these issues, and it provides solutions and
recommendations for these issues. This guide should help you write clean code for
32-bit and 64-bit device drivers for the Solaris OS on x86 platforms.
For further information on device drivers in the Solaris OS, see
Writing Device Drivers. To see examples of some basic device drivers, see Device Driver Tutorial (PN 817-5789, Sun Microsystems). If you are new to development in the Solaris OS or are unfamiliar with
the range of information on the Solaris OS, see the Introduction to the Solaris
Development Environment.
7 References
- Writing Device Drivers, Appendix C: "Making a Device Driver 64-Bit Ready," PN 816-4854, Sun Microsystems, 2004
- Solaris 64-Bit Developer's Guide, PN 816-5138, Sun Microsystems, 2004
- STREAMS Programming Guide, PN 816-4855, Sun Microsystems, 2004
- Software Optimization Guide for AMD Athlon 64 and AMD Opteron Processors, PN 25112, Advanced Micro Devices, 2004
Appendix
This appendix lists the pio.h header file and the source code before
conversion, pio_32.c.