Porting Device Drivers for the Solaris Operating System to 64-Bit x86 Architectures

 
By Cecilia Hu, December, 2004  

Abstract: This document describes how to modify 32-bit device drivers that run on the Solaris Operating System (OS) to be compatible with the 64-bit Solaris 10 OS on x86 platforms.

1 Introduction

The capabilities of the Solaris platform continue to expand to meet customer needs. The Solaris 10 release is designed to fully support both 32-bit and 64-bit architectures. The Solaris OS supports machines based on both 32-bit and 64-bit SPARC processors as well as 32-bit and 64-bit x86 platforms.

The primary difference between the 32-bit and 64-bit development environments is that 32-bit applications are based on the ILP32 data model, while 64-bit applications are based on the LP64 model. The primary difference between applications for SPARC and x86-based systems, from the driver developer's point of view, is big-endian versus little-endian translation.

To write a common device driver for the Solaris OS, developers need to understand and consider these differences.

Note: This document addresses topics related to x86 platforms only. In this document, references to 64-bit operating systems refer to the Solaris OS on machines with AMD Opteron processors.

The Solaris OS runs in 64-bit mode on appropriate hardware, and provides a 64-bit kernel with a 64-bit address space for applications. The 64-bit kernel extends the capabilities of the 32-bit kernel by addressing more than 4 Gbyte of physical memory, by mapping up to 16 Tbyte of virtual address space for 64-bit application programs, and by allowing 32-bit and 64-bit applications to coexist on the same system.

This document discusses the differences between 32-bit and 64-bit data models, provides guidelines for cleaning 32-bit device drivers in preparation for the 64-bit Solaris OS kernel, and addresses driver-specific issues with the 64-bit Solaris OS kernel.

1.1 Audience and Organization

This information is intended for device driver developers who want to deliver 32-bit and 64-bit clean device drivers for the Solaris OS on x86 platforms. The article provides guidance on how to write code that is portable between the 32-bit environment and the 64-bit environment.

This document is organized as follows:

Section 2, "Basic Information," explains some basic problem areas that a developer writing device drivers for Solaris systems may encounter when writing 32-bit and 64-bit clean device drivers.

Section 3, "General Conversion Guidelines," explains in detail the steps that should be taken in writing a common device driver for Solaris systems. At the end of this section is a short checklist for conversion. These guidelines can help you provide clean code for a 64-bit driver for the Solaris OS on x86 platforms.

Section 4, "Advanced Issues and Guidelines," addresses some advanced issues for developers writing device drivers for the 64-bit Solaris OS on x86 platforms, with a focus on enhancing performance.

Section 5, "Porting Example," presents an example that illustrates the 32-bit to 64-bit conversion process.

Section 6, "Conclusion," summarizes the issues involved in writing 32-bit and 64-bit common device drivers for the Solaris OS on x86 platforms.

1.2 Terms

ILP32

C language data model where int, long, and pointer data types are 32 bits in size.

LP64

C language data model where the int data type is 32 bits wide, but long and pointer data types are 64 bits wide.

32-bit program

Program compiled to run in 32-bit mode. For example, programs compiled for IA32 and 32-bit SPARC platforms.

64-bit program

Program compiled to run in 64-bit mode. For example, programs compiled for the AMD64 and 64-bit SPARC platforms. Programs that have been successfully converted to run in 64-bit mode are also referred to as being 64-bit clean or 64-bit safe.

32-bit and 64-bit common device driver

A device driver with portable code that can be built and run in either a 32-bit or 64-bit environment.

2 Basic Information

Before you start to write a 64-bit clean device driver, it is useful to understand some of the differences between 32-bit and 64-bit operating systems. Most of these differences are similar to those you would encounter if you ported a driver from a 32-bit SPARC processor-based machine to a 64-bit SPARC processor-based machine.

2.1 Different Data Models

ILP32 is the C language data model for the 32-bit Solaris OS. ILP32 defines the int, long and pointer data types as 32 bits wide, type short as 16 bits, and type char as 8 bits.

LP64 is the C language data model for the 64-bit Solaris OS. LP64 defines the long and pointer data types as 64 bits wide. The LP64 data model also has a larger address space and a larger scalar arithmetic range.

The following table shows the basic language differences that exist both in driver programming and in application programming.

Table 1: Basic Language Differences (Sizes in Bits)

C Type         ILP32    LP64
char           8        8
short          16       16
int            32       32
long           32       64
long long      64       64
float          32       32
double         64       64
long double    96       128
pointer        32       64

It is not unusual for 32-bit applications to assume that types int, long and pointers are the same size. Drivers that run in a 64-bit environment may need to be converted to use the 64-bit data model. Because the size of type long and pointer changes in the LP64 data model, several potential problems could occur:

  • Source code that assumes that types int and long and pointers are the same size: This is incorrect for 64-bit Solaris systems. Type casts need updating if the underlying data types have changed.
  • Data structures that contain the long and pointer data types must be checked for different offset values than expected. Incorrect offset values are caused by alignment differences that occur when long and pointer fields grow to 64 bits.
  • Implicitly declared functions (functions that are called without a prototype in scope) may cause unintended type conversion of arguments and return values.

For more detailed information, see Section 3.1.1, "Converting Driver Code to Be 64-Bit Clean."
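The sizes in Table 1 can be checked with sizeof. The following user-space sketch is illustrative only: the names data_model and current_data_model() are hypothetical and are not part of any Solaris interface.

```c
#include <assert.h>
#include <limits.h>

/* Table 1 expresses sizes in bits and assumes 8-bit bytes. */
_Static_assert(CHAR_BIT == 8, "Table 1 assumes 8-bit bytes");

/* Hypothetical names, used only for this illustration. */
enum data_model { MODEL_ILP32, MODEL_LP64, MODEL_OTHER };

/*
 * Report the C data model of the compilation environment by
 * comparing the sizes of int, long, and pointer with Table 1.
 */
enum data_model
current_data_model(void)
{
	if (sizeof (int) == 4 && sizeof (long) == 4 && sizeof (void *) == 4)
		return (MODEL_ILP32);
	if (sizeof (int) == 4 && sizeof (long) == 8 && sizeof (void *) == 8)
		return (MODEL_LP64);
	return (MODEL_OTHER);
}
```

Code that needs such a check at run time usually has a hidden size assumption. In most cases the better fix is to remove the assumption, as described in Section 3.1.1.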

2.2 Driver-Specific Issues

In addition to general code cleanup to support the data model changes for LP64, device driver writers have these driver-specific issues to consider:

  • In the 64-bit environment, new common access functions that use fixed-width data types are provided so that drivers can clearly specify the size of the data they are requesting. Drivers that use the old common access routines must be changed to use the fixed-width equivalent. For example, ddi_getw(9F) needs to be changed to ddi_get16(9F).
  • A driver may need to be updated to support data sharing between 64-bit drivers and 32-bit applications. The ioctl(9E), devmap(9E), and mmap(9E) entry points must be written so that the driver can determine whether the data model of the application is the same as the data model of the kernel. If the data models differ, data structures may need to be adjusted. This usually means adding support to a 64-bit driver to accept a 32-bit application structure.

These two topics are discussed in Section 3.1.2, "Driver-Specific DDI Interfaces," and Section 3.2, "Other Driver Issues."

2.3 Advanced Issues

Advanced issues concerning DMA and performance are discussed in Section 4, "Advanced Issues and Guidelines."

3 General Conversion Guidelines

The 64-bit Solaris OS requires 64-bit driver objects; 32-bit device drivers cannot be used with 64-bit operating systems. Conversion from 32-bit to 64-bit code requires at minimum recompilation and re-linking with 64-bit libraries. For cases in which source code changes are required, Section 3 provides guidelines for writing clean driver code that works correctly in both 32-bit and 64-bit environments.

You may need to implement one or more of the suggestions discussed in this section to convert your code. These recommendations can help you maintain a single source and minimize use of #ifdef constructs.

3.1 Basic Steps for Converting Drivers

The principal work in converting your driver is to clean up the code for the 64-bit environment. The basic steps are similar to porting from machines based on 32-bit SPARC technology to machines based on 64-bit SPARC technology. Specific concerns related to systems built on AMD64 architecture are highlighted:

  • Converting Driver Code to Be 64-Bit Clean:
    • Check the code for the use of multiple data type models.
    • Check for the use of the system-derived types that change size between ILP32 and LP64.
  • Driver Specific DDI Interfaces:
    • Check for problems due to DDI typedef changes.
    • Use new fixed-width DDI common access functions.
    • Check changed fields in DDI data structures.
    • Check changed arguments of DDI interfaces.
    • Check the newly added and removed DDI interfaces.
  • Other Driver Issues:
    • Converting ioctl routines to be 64-bit clean.
    • Modifying the driver entry points that handle data sharing.

3.1.1 Converting Driver Code to Be 64-Bit Clean

Check the code for the use of multiple data type models.

Note the following when converting to LP64:

  • Use system-derived types such as size_t for type declarations whenever possible. Using system-derived types for definitions allows for future change.
  • Use fixed-width types such as uint32_t where appropriate to clearly specify type declarations. To enable source code to be both 32-bit and 64-bit clean, Solaris systems provide fixed-width integer types, derived types, constants, and macros in the headers <sys/types.h> and <sys/inttypes.h>. The fixed-width types include both signed and unsigned integer types, such as int8_t, uint8_t, uint32_t, and uint64_t, as well as constants that specify their limits.
  • Use the new derived types uintptr_t and intptr_t as the integral types for pointers.
  • Update data structures by replacing long with int32_t or uint32_t. This approach preserves the binary layout of 32-bit data structures and makes a driver 64-bit safe. These types are defined in <sys/inttypes.h>.

You may want to run the lint utility in the Sun Studio 10 C5.7 compiler on your driver code to help check data model conversion problems.

The following guidelines and examples explain some potential problem areas that you may encounter when porting from the ILP32 data model to the LP64 data model. All samples include a recommended solution that runs correctly in both 32-bit and 64-bit environments.

Example 1: 64-bit values should not be assigned to smaller types. The code below is incorrect for a 64-bit environment:

int  int_a,  int_b;
long long_a, long_b;
int_a = long_a;
int_b = long_a + long_b;

This code does not cause any issues in the ILP32 environment, but it does have potential for overflow in the LP64 environment because long is a 64-bit type. If such assignments are intentional, use explicit casts to tell the compiler and lint(1B).

int_a = (int) long_a;
int_b = (int) (long_a + long_b);
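An explicit cast documents the intent but still discards the upper bits. Where the value might not fit, a range check before the cast is one option. The following is a minimal sketch, using int64_t and int32_t as stand-ins for long and int in the LP64 model; the helper name narrow_to_int32() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Narrow a 64-bit value to 32 bits only when it fits.
 * Returns 0 and stores the result on success, or -1 when the
 * value is out of range. Hypothetical helper for illustration.
 */
int
narrow_to_int32(int64_t wide, int32_t *out)
{
	assert(out != NULL);
	if (wide < INT32_MIN || wide > INT32_MAX)
		return (-1);
	*out = (int32_t)wide;	/* explicit cast: narrowing is intentional */
	return (0);
}
```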

Example 2: Improperly applied explicit casts can give unintended results.

int  int_a, va;
long long_a;
va = (int)long_a/int_a;
va = (int)(long_a/int_a);

The first assignment in this code converts the 64-bit long_a to a 32-bit integer and then divides by the 32-bit int_a. The second assignment in this code divides long_a by int_a and then converts the result into a 32-bit integer.
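The difference between the two assignments can be made visible with a value that does not fit in 32 bits. This sketch uses int64_t and int32_t as stand-ins for long and int in the LP64 model, and the function names are hypothetical. It assumes that out-of-range conversion to int32_t keeps the low 32 bits, which is implementation-defined in C but is the behavior of the common compilers.

```c
#include <assert.h>
#include <stdint.h>

/* The cast binds tighter than division: truncate first, then divide. */
int32_t
cast_then_divide(int64_t a, int32_t b)
{
	assert(b != 0);
	return ((int32_t)a / b);
}

/* Divide in 64 bits first, then convert the quotient to 32 bits. */
int32_t
divide_then_cast(int64_t a, int32_t b)
{
	assert(b != 0);
	return ((int32_t)(a / b));
}
```

For a = 0x100000008 and b = 8, the first function discards the upper bits before dividing and returns 1, while the second returns 536870913.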

Example 3: A pointer to an int is not compatible with a pointer to a long. Even the use of explicit casting is not correct.

int  *int_a_p,  *int_b_p;
long *long_a_p, *long_b_p;
long_a_p = int_a_p;
long_a_p = (long *)int_a_p;
int_b_p  = long_b_p;
int_b_p  = (int *)long_b_p;

This code results in alignment errors or wrong values on the 64-bit SPARC platform. A pointer to an int has 4-byte alignment, and a pointer to long has 8-byte alignment. In the AMD64 architecture, data alignment is not imposed. However, for maximum performance and portability between x86 and SPARC platforms, avoid misaligned memory accesses.

Example 4: You cannot correct a potential overflow problem by casting to a larger data type.

long long_a;
int  int_a, int_b;
long_a = int_a * int_b;
long_a = (long) (int_a * int_b );

The result of both multiplications is type int, which is then converted to type long before being assigned to long_a. Instead, cast either operand to a long prior to the multiplication as shown in the following line of code. Then the result of the multiplication will be type long and correctly assigned to long_a.

long_a = (long)int_a * int_b;

Example 5: Untyped integral constants are int by default.

60000000  * 40000000	is 32-bit multiplication.
60000000L * 40000000	is 64-bit multiplication.
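Examples 4 and 5 can be combined in one short sketch. It uses int64_t and int32_t as stand-ins for long and int in the LP64 model, and INT64_C() in place of the L suffix so that the constant is 64 bits wide in either data model. The names mul_wide() and BIG_PRODUCT are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Widen one operand first so the multiplication is done in 64 bits. */
int64_t
mul_wide(int32_t a, int32_t b)
{
	return ((int64_t)a * b);
}

/* INT64_C() gives the first constant a 64-bit type. */
#define	BIG_PRODUCT	(INT64_C(60000000) * 40000000)

_Static_assert(BIG_PRODUCT == INT64_C(2400000000000000),
    "constant product is computed in 64 bits");
```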

Example 6: When you perform arithmetic operations on pointers, converting a pointer to a 32-bit integer (int or unsigned int) can give unintended results in an LP64 environment.

int diff, base;
int *start, *end;
int pad;
base = start;
diff = end - start;
pad = (int) end % 16;

Instead, convert the pointer to intptr_t or uintptr_t before you perform any arithmetic operations on pointers. Use ptrdiff_t to hold the difference between two pointers.

ptrdiff_t diff;
intptr_t base;
int *start, *end;
uintptr_t pad;
base = (intptr_t) start;
diff = (ptrdiff_t) ((intptr_t) end - base);
pad = (uintptr_t) end % 16;
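As a self-contained variant of the corrected code, the following sketch wraps the two operations in small functions. The function names are hypothetical. Note that subtracting two int pointers directly yields a distance in elements, while subtracting their integer values yields a distance in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to round ptr up to the next 16-byte boundary. */
size_t
pad_to_16(const void *ptr)
{
	uintptr_t addr = (uintptr_t)ptr;	/* wide enough in ILP32 and LP64 */

	return ((size_t)((16 - (addr % 16)) % 16));
}

/* Distance in elements between two pointers into the same array. */
ptrdiff_t
element_distance(const int *start, const int *end)
{
	assert(start != NULL && end != NULL);
	return (end - start);
}
```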

Example 7: The sizes of pointers and integer types are different depending on the arithmetic context. A pointer in the kernel is explicitly cast as an integral type when performing arithmetic operations, such as shift and AND operations, in order to determine which memory segment contains a particular address. These explicit casts should be either to intptr_t or to uintptr_t. The casts preserve the 64-bit values in LP64 mode and the 32-bit values in ILP32 mode.

struct pagetable *p, *addr_item;
#define ADDR_OFFSET 03
addr_item = (struct pagetable *) (((int)p)|ADDR_OFFSET );

The pagetable structure is used to manage buffer pages in a module. The address pointer plus ADDR_OFFSET is the address of the data block. In a 64-bit environment, the address should be cast to a 64-bit integer before the OR with ADDR_OFFSET.

addr_item = (struct pagetable *) (((uintptr_t)p) | ADDR_OFFSET);
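The same technique can be tried in user space. The sketch below stores a small tag in the low bits of an aligned pointer and recovers both parts, with every conversion going through uintptr_t. The names TAG_MASK, tag_pointer(), untag_pointer(), and pointer_tag() are hypothetical, and the code assumes that the pointer is at least 4-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

#define	TAG_MASK	((uintptr_t)0x3)	/* low two bits hold the tag */

/* Combine an aligned pointer with a tag in its low bits. */
void *
tag_pointer(void *p, uintptr_t tag)
{
	assert(((uintptr_t)p & TAG_MASK) == 0);	/* pointer must be aligned */
	assert(tag <= TAG_MASK);
	return ((void *)((uintptr_t)p | tag));
}

/* Strip the tag and return the original pointer. */
void *
untag_pointer(void *p)
{
	return ((void *)((uintptr_t)p & ~TAG_MASK));
}

/* Extract the tag from a tagged pointer. */
uintptr_t
pointer_tag(const void *p)
{
	return ((uintptr_t)p & TAG_MASK);
}
```

Because TAG_MASK has type uintptr_t, its complement is as wide as a pointer in both data models, so no address bits are lost.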

Example 8: Inadequate function prototypes.

extern func_a(int), func_b(void);
long   long_a, long_b;
long_a = func_a (long_b);
int_a  = func_b ();

The return types of func_a() and func_b() are implicitly declared as int. Type conversion may occur in parameter passing. The return values of pointer or long may be truncated to 32-bit.
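A sketch of the remedy is to declare each function with a complete prototype before it is called. The function bodies here are placeholders invented for the illustration, and int64_t and int32_t stand in for long and int in the LP64 model.

```c
#include <assert.h>
#include <stdint.h>

/* Complete prototypes: parameter and return types are explicit. */
int64_t	func_a(int64_t value);
int32_t	func_b(void);

/* Placeholder body: returns a 64-bit result without truncation. */
int64_t
func_a(int64_t value)
{
	return (value + 1);
}

/* Placeholder body: returns a 32-bit result. */
int32_t
func_b(void)
{
	return (42);
}
```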

Example 9: The size of data objects changes.

The size of long and pointer types in a 64-bit environment changes the size of the data structure. The alignment padding also changes the size of the data structure.

struct device_regs{
    ulong_t	addr;
    uint_t	count;
};

This data type occupies 8 bytes in the 32-bit model, but occupies 16 bytes in the 64-bit model: 8 bytes for addr, 4 bytes for count, and 4 bytes of trailing padding, because a long has 8-byte alignment. If count is placed before addr, the padding moves between the two members and the offset of addr changes. Do not use a fixed offset to access the member fields. Instead, access data members by referencing the names of corresponding members.

struct device_regs{
    ulong_t	addr;
    uint_t	count;
};
struct device_regs r;
uint_t *p = (uint_t *) ((char *) &r +4);

Instead, the code should be written to access the member count as follows.

struct device_regs {
    ulong_t addr;
    uint_t  count;
};
struct device_regs r;
uint_t *p = &r.count;

Use fixed-width structures if this is a desired case. For example, use a fixed-width structure for a protocol header definition and device hardware register definition.

struct header {
    uint32_t	type;
    uint32_t	length;	
};
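When a layout must stay fixed, it can be verified at compile time. The following sketch applies offsetof and compile-time assertions to the header structure above. The function header_length_offset() is hypothetical and is included only to show the offset at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct header {
	uint32_t	type;
	uint32_t	length;
};

/* The build fails if the layout ever differs between data models. */
_Static_assert(sizeof (struct header) == 8, "header must be 8 bytes");
_Static_assert(offsetof(struct header, length) == 4,
    "length must be at offset 4");

/* Offset of the length member, computed by the compiler. */
size_t
header_length_offset(void)
{
	return (offsetof(struct header, length));
}
```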

Check the use of system-derived types that change size between ILP32 and LP64.

Some system derived types represent 32-bit quantities on a 32-bit system but represent 64-bit quantities on a 64-bit system. For example:

clock_t:  relative time in specified resolution
daddr_t:  disk block address
ino_t:  inode
intptr_t:  integral pointer type
off_t:  file offset
size_t:  size of an object
ssize_t:  size of an object or -1
time_t:  time of day in seconds
timeout_id_t:  timeout() handler id
uintptr_t:  unsigned integral pointer type

Pay particular attention to the use of these derived types, especially when the variables that use these types are assigned with the value from another derived type, such as a fixed-width type.

Example 10:

size_t page_addr, v_addr;
page_addr = v_addr & 0xfffff000;

In this example, the second line should be page_addr = v_addr & ~0x0fffL or a similar expression. The constant 0xfffff000 is a 32-bit value (type unsigned int), so in a 64-bit environment it is zero-extended to 0x00000000fffff000 before the AND operation. The result therefore keeps only bits 12 through 31 of v_addr, the 20 bits in the middle, and clears the upper 32 bits.
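The effect of the mask width can be checked in user space. This sketch uses uint64_t as a stand-in for a 64-bit size_t and assumes 4-Kbyte pages. The names PAGE_OFFSET_MASK, page_base(), and page_base_truncated() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define	PAGE_OFFSET_MASK	((uint64_t)0xfff)	/* assumes 4-Kbyte pages */

_Static_assert((PAGE_OFFSET_MASK & (PAGE_OFFSET_MASK + 1)) == 0,
    "mask must be a power of two minus one");

/* Correct: the complement is taken on a 64-bit mask. */
uint64_t
page_base(uint64_t v_addr)
{
	return (v_addr & ~PAGE_OFFSET_MASK);
}

/* Incorrect in LP64: the 32-bit constant clears the upper 32 bits. */
uint64_t
page_base_truncated(uint64_t v_addr)
{
	return (v_addr & 0xfffff000U);
}
```

For the address 0x123456789abc, page_base() returns 0x123456789000, while page_base_truncated() returns 0x56789000.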

Example 11: This example shows the difference between system-derived types in ILP32 and LP64 data models in reading or writing to a large file. The example shows an error caused by the second argument of the following function:

int fseeko(FILE *stream, off_t offset, int whence);

This function is identical to fseek(3C) except for the second argument, offset, which is a long type in fseek(3C). In the following example, record_pos[] is used to record the position of accessed pointers in a large file:

int record_pos[MAX_RECORD_NUM];
off_t offset;
while ( !feof (fp) ) {
	...
	/* calculate the offset; */
	fseeko ( fp, offset, SEEK_SET);
	if ( condition ){
	    record_pos[i] = (int)offset;
	}
}

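In a 32-bit environment, casting offset to type int is harmless because off_t is a 32-bit type, so an offset cannot exceed 2 Gbyte - 1, which is the maximum value of an int variable. In a 64-bit environment, however, off_t is a 64-bit type, and record_pos[] cannot record a position beyond that limit. Declare record_pos[] as an array of off_t so that the position is preserved in both environments.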
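One way to avoid the truncation is to store each position in the same type that fseeko() accepts. The following is a minimal sketch rather than the article's original code: the value of MAX_RECORD_NUM and the helper names record_position() and recorded_position() are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

#define	MAX_RECORD_NUM	16	/* hypothetical table size */

/* Positions are stored as off_t, so no cast or truncation is needed. */
static off_t record_pos[MAX_RECORD_NUM];

/* Store a file position. Returns 0 on success, -1 on a bad argument. */
int
record_position(size_t index, off_t offset)
{
	if (index >= MAX_RECORD_NUM || offset < 0)
		return (-1);
	record_pos[index] = offset;
	return (0);
}

/* Return a previously stored file position. */
off_t
recorded_position(size_t index)
{
	assert(index < MAX_RECORD_NUM);
	return (record_pos[index]);
}
```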

3.1.2 Driver-Specific DDI Interfaces

Check for potential problems due to DDI typedef changes.

In the Solaris OS on 64-bit x86-based systems, the kernel redefines the DDI data types to allow the compiler to check that the correct items are being passed. The following type definitions are in <sys/dditypes.h> in the 32-bit Solaris kernel:

typedef void *ddi_dma_handle_t;
typedef void *ddi_dma_win_t;
typedef void *ddi_dma_seg_t;
typedef void *ddi_iblock_cookie_t;
typedef void *ddi_regspec_t;
typedef void *ddi_intrspec_t;
typedef void *ddi_softintr_t;
typedef void *dev_info_t;
typedef void *ddi_devmap_data_t;
typedef struct ddi_devid *ddi_devid_t;
typedef void *ddi_acc_handle_t;

The following type definitions are in <sys/dditypes.h> in the 64-bit Solaris kernel:

typedef struct __ddi_dma_handle *ddi_dma_handle_t;
typedef struct __ddi_dma_win *ddi_dma_win_t;
typedef struct __ddi_dma_seg *ddi_dma_seg_t;
typedef struct __ddi_iblock_cookie *ddi_iblock_cookie_t;
typedef struct __ddi_regspec *ddi_regspec_t;
typedef struct __ddi_intrspec *ddi_intrspec_t;
typedef struct __ddi_softintr *ddi_softintr_t;
typedef struct __dev_info *dev_info_t;
typedef struct __ddi_devmap_data *ddi_devmap_data_t;
typedef struct __ddi_devid *ddi_devid_t;
typedef struct __ddi_acc_handle *ddi_acc_handle_t;

There is no impact on C binaries and correct C sources. Compilation errors occur in C sources that use these types incorrectly.

One way to avoid passing incorrect argument types to functions is to define the structure pointers with specific structure tags. For example, notice the arguments to the following two DDI functions that are declared in sunddi.h:

int ddi_add_softintr(dev_info_t *dip, int preference,
	ddi_softintr_t *idp,
	ddi_iblock_cookie_t *iblock_cookiep,
	ddi_idevice_cookie_t *idevice_cookiep,
	uint_t (*int_handler)(caddr_t int_handler_arg),
	caddr_t int_handler_arg);
void ddi_remove_softintr(ddi_softintr_t id);

The third argument of function ddi_add_softintr() is a pointer to ddi_softintr_t. In the 64-bit Solaris kernel, this is a pointer to a pointer. In the partner function ddi_remove_softintr(), the argument is a ddi_softintr_t, which is a pointer to a __ddi_softintr structure. The interrupt cannot be removed if you make the following call:

ddi_remove_softintr(&id);

It may be difficult for developers to catch this kind of error in the program, but the compiler can catch these errors if you use the ddi_softintr_t type.
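The benefit of the structure-tagged typedefs can be seen in a user-space analogy. All names in this sketch (demo_softintr, demo_softintr_t, and the demo_ functions) are hypothetical stand-ins and are not DDI interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Handle type in the style of the 64-bit kernel typedefs. */
struct demo_softintr {
	int	registered;
};
typedef struct demo_softintr *demo_softintr_t;

static struct demo_softintr demo_slot;

/* Takes a pointer to the handle, like ddi_add_softintr(). */
int
demo_add_softintr(demo_softintr_t *idp)
{
	assert(idp != NULL);
	demo_slot.registered = 1;
	*idp = &demo_slot;
	return (0);
}

/* Takes the handle itself, like ddi_remove_softintr(). */
void
demo_remove_softintr(demo_softintr_t id)
{
	assert(id != NULL);
	id->registered = 0;
}

/* Reports whether the demo handler is currently registered. */
int
demo_is_registered(void)
{
	return (demo_slot.registered);
}
```

With this typedef, a call such as demo_remove_softintr(&id) passes a struct demo_softintr ** where a struct demo_softintr * is expected, and the compiler reports an incompatible pointer type. If the handle were declared as void *, the same call would compile without a diagnostic.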

Use fixed-width DDI common access functions.

Functions that use symbolic names to specify their data access size are obsolete. These functions include ddi_getb(9F), ddi_getw(9F), ddi_getl(9F), and ddi_getll(9F). The new function names specify a fixed-width data size, such as ddi_get8(9F), ddi_get16(9F), ddi_get32(9F), and ddi_get64(9F).

To port drivers to the 64-bit Solaris OS on x86 platforms, replace the obsolete non-fixed-width DDI functions with fixed-width DDI common access functions, as shown in the following table.

Table 2: Obsolete Non-Fixed-Width DDI Functions and Fixed-Width DDI Common Access Functions

Obsolete Function       Replacement Function    Description
ddi_getb(9F)            ddi_get8(9F)            reads 8 bits from device address
ddi_getw(9F)            ddi_get16(9F)           reads 16 bits from device address
ddi_getl(9F)            ddi_get32(9F)           reads 32 bits from device address
ddi_getll(9F)           ddi_get64(9F)           reads 64 bits from device address
ddi_putb(9F)            ddi_put8(9F)            writes 8 bits to device address
ddi_putw(9F)            ddi_put16(9F)           writes 16 bits to device address
ddi_putl(9F)            ddi_put32(9F)           writes 32 bits to device address
ddi_putll(9F)           ddi_put64(9F)           writes 64 bits to device address
ddi_rep_getb(9F)        ddi_rep_get8(9F)        reads 8 bits from device address repeatedly
ddi_rep_getw(9F)        ddi_rep_get16(9F)       reads 16 bits from device address repeatedly
ddi_rep_getl(9F)        ddi_rep_get32(9F)       reads 32 bits from device address repeatedly
ddi_rep_getll(9F)       ddi_rep_get64(9F)       reads 64 bits from device address repeatedly
ddi_rep_putb(9F)        ddi_rep_put8(9F)        writes 8 bits to device address repeatedly
ddi_rep_putw(9F)        ddi_rep_put16(9F)       writes 16 bits to device address repeatedly
ddi_rep_putl(9F)        ddi_rep_put32(9F)       writes 32 bits to device address repeatedly
ddi_rep_putll(9F)       ddi_rep_put64(9F)       writes 64 bits to device address repeatedly
pci_config_getb(9F)     pci_config_get8(9F)     reads 8 bits from PCI configuration space
pci_config_getw(9F)     pci_config_get16(9F)    reads 16 bits from PCI configuration space
pci_config_getl(9F)     pci_config_get32(9F)    reads 32 bits from PCI configuration space
pci_config_getll(9F)    pci_config_get64(9F)    reads 64 bits from PCI configuration space
pci_config_putb(9F)     pci_config_put8(9F)     writes 8 bits to PCI configuration space
pci_config_putw(9F)     pci_config_put16(9F)    writes 16 bits to PCI configuration space
pci_config_putl(9F)     pci_config_put32(9F)    writes 32 bits to PCI configuration space
pci_config_putll(9F)    pci_config_put64(9F)    writes 64 bits to PCI configuration space

Example: A driver that uses ddi_getl(9F) to access 32-bit data reads 64-bit data in a 64-bit environment. Drivers must use ddi_get32(9F) to access 32-bit data in the 64-bit environment:

uint32_t ddi_get32(ddi_acc_handle_t hdl, uint32_t *dev_addr);

In addition to function names, certain function parameter types and function return values are different in the 64-bit Solaris OS. For example, unsigned char, unsigned short, and unsigned long change to uint8_t, uint16_t, and uint32_t, respectively.

Those functions with changed parameter types and return values are shown in the following table.

Table 3: Functions With Changed Parameter Types and Return Values

Obsolete:    unsigned char inb(int port);
Replacement: uint8_t inb(int port);
Description: reads 8 bits from an I/O port

Obsolete:    unsigned short inw(int port);
Replacement: uint16_t inw(int port);
Description: reads 16 bits from an I/O port

Obsolete:    unsigned long inl(int port);
Replacement: uint32_t inl(int port);
Description: reads 32 bits from an I/O port

Obsolete:    void repinsb(int port, unsigned char *addr, int count);
Replacement: void repinsb(int port, uint8_t *addr, int count);
Description: reads multiple 8-bit values from an I/O port

Obsolete:    void repinsw(int port, unsigned short *addr, int count);
Replacement: void repinsw(int port, uint16_t *addr, int count);
Description: reads multiple 16-bit values from an I/O port

Obsolete:    void repinsd(int port, unsigned long *addr, int count);
Replacement: void repinsd(int port, uint32_t *addr, int count);
Description: reads multiple 32-bit values from an I/O port

Obsolete:    void outb(int port, unsigned char value);
Replacement: void outb(int port, uint8_t value);
Description: writes 8 bits to an I/O port

Obsolete:    void outw(int port, unsigned short value);
Replacement: void outw(int port, uint16_t value);
Description: writes 16 bits to an I/O port

Obsolete:    void outl(int port, unsigned long value);
Replacement: void outl(int port, uint32_t value);
Description: writes 32 bits to an I/O port

Obsolete:    void repoutsb(int port, unsigned char *addr, int count);
Replacement: void repoutsb(int port, uint8_t *addr, int count);
Description: writes multiple 8-bit values to an I/O port

Obsolete:    void repoutsw(int port, unsigned short *addr, int count);
Replacement: void repoutsw(int port, uint16_t *addr, int count);
Description: writes multiple 16-bit values to an I/O port

Obsolete:    void repoutsd(int port, unsigned long *addr, int count);
Replacement: void repoutsd(int port, uint32_t *addr, int count);
Description: writes multiple 32-bit values to an I/O port

Check changed fields in DDI data structures.

The data types of some of the fields in DDI data structures are changed in the 64-bit Solaris OS. Drivers that use these data structures should make sure that these fields are used appropriately.

The following table shows changed fields in DDI data structures.

Table 4: Changed Fields in DDI Data Structures

struct buf (buf.h)

Definition in 32-bit environment:

struct buf {
    ...
    unsigned int b_bcount;
    unsigned int b_resid;
    int b_bufsize;
    ...
};

Definition in 64-bit environment:

struct buf {
    ...
    size_t b_bcount;
    size_t b_resid;
    size_t b_bufsize;
    ...
};

ddi_dma_cookie_t (dditypes.h)

Definition in 32-bit environment:

typedef struct {
    ...
    unsigned long dmac_address;
    unsigned int  dmac_size;
    unsigned int  dmac_type;
    ...
} ddi_dma_cookie_t;

Definition in 64-bit environment:

typedef struct {
    ...
    union {
        uint64_t _dmac_ll;
        uint32_t _dmac_la[2];
    } _dmu;
    size_t dmac_size;
    uint_t dmac_type;
    ...
} ddi_dma_cookie_t;

Comments: dmac_address is defined by a macro and depends on _LONG_LONG_HTOL. In the 64-bit Solaris OS on x86 platforms:

#define dmac_laddress _dmu._dmac_ll
#define dmac_address _dmu._dmac_la[0]

struct scsi_pkt (scsi/scsi_pkt.h)

Definition in 32-bit environment:

struct scsi_pkt {
    ...
    unsigned long  pkt_flags;
    long           pkt_time;
    long           pkt_resid;
    unsigned long  pkt_state;
    unsigned long  pkt_statistics;
    ...
};

Definition in 64-bit environment:

struct scsi_pkt {
    ...
    uint_t   pkt_flags;
    int      pkt_time;
    ssize_t  pkt_resid;
    uint_t   pkt_state;
    uint_t   pkt_statistics;
    ...
};

ddi_dma_attr_t (ddidmareq.h)

Definition in 32-bit environment:

typedef struct ddi_dma_attr {
    unsigned int  dma_attr_version;
    unsigned long dma_attr_addr_lo;
    unsigned long dma_attr_addr_hi;
    unsigned long dma_attr_count_max;
    unsigned long dma_attr_align;
    unsigned int  dma_attr_burstsizes;
    unsigned int  dma_attr_minxfer;
    unsigned long dma_attr_maxxfer;
    unsigned long dma_attr_seg;
    int           dma_attr_sgllen;
    unsigned int  dma_attr_granular;
    unsigned int  dma_attr_flags;
} ddi_dma_attr_t;

Definition in 64-bit environment:

typedef struct ddi_dma_attr {
    uint_t   dma_attr_version;
    uint64_t dma_attr_addr_lo;
    uint64_t dma_attr_addr_hi;
    uint64_t dma_attr_count_max;
    uint64_t dma_attr_align;
    uint_t   dma_attr_burstsizes;
    uint32_t dma_attr_minxfer;
    uint64_t dma_attr_maxxfer;
    uint64_t dma_attr_seg;
    int      dma_attr_sgllen;
    uint32_t dma_attr_granular;
    uint_t   dma_attr_flags;
} ddi_dma_attr_t;

Comments: This structure defines attributes of the DMA engine and the device.

Check changed arguments of DDI interfaces.

The DDI function argument types in the following table have been changed in the 64-bit Solaris OS.

Table 5: Changes in DDI Function Argument Types

Previous: void ddi_set_driver_private(dev_info_t *devi, caddr_t data);
64-bit:   void ddi_set_driver_private(dev_info_t *devi, void *data);

Previous: caddr_t ddi_get_driver_private(dev_info_t *devi);
64-bit:   void *ddi_get_driver_private(dev_info_t *devi);

Previous: struct buf *getrbuf(long sleepflag);
64-bit:   struct buf *getrbuf(int sleepflag);

Previous: void delay(long ticks);
64-bit:   void delay(clock_t ticks);

Previous: timeout_id_t timeout(void (*func)(caddr_t), caddr_t arg, long ticks);
64-bit:   timeout_id_t timeout(void (*func)(caddr_t), caddr_t arg, clock_t ticks);

Previous: struct map *rmallocmap(ulong_t mapsize);
64-bit:   struct map *rmallocmap(size_t mapsize);

Previous: struct map *rmallocmap_wait(ulong_t mapsize);
64-bit:   struct map *rmallocmap_wait(size_t mapsize);

Previous: struct buf *scsi_alloc_consistent_buf(struct scsi_address *ap, struct buf *bp, int datalen, ulong_t bflags, int (*callback)(caddr_t), caddr_t arg);
64-bit:   struct buf *scsi_alloc_consistent_buf(struct scsi_address *ap, struct buf *bp, size_t datalen, uint_t bflags, int (*callback)(caddr_t), caddr_t arg);

Previous: int uiomove(caddr_t address, long nbytes, enum uio_rw rwflag, uio_t *uio_p);
64-bit:   int uiomove(caddr_t address, size_t nbytes, enum uio_rw rwflag, uio_t *uio_p);

Previous: int cv_timedwait(kcondvar_t *cvp, kmutex_t *mp, long timeout);
64-bit:   int cv_timedwait(kcondvar_t *cvp, kmutex_t *mp, clock_t timeout);

Previous: int cv_timedwait_sig(kcondvar_t *cvp, kmutex_t *mp, long timeout);
64-bit:   int cv_timedwait_sig(kcondvar_t *cvp, kmutex_t *mp, clock_t timeout);

Previous: int ddi_device_copy(ddi_acc_handle_t src_handle, caddr_t src_addr, long src_advcnt, ddi_acc_handle_t dest_handle, caddr_t dest_addr, long dest_advcnt, size_t bytecount, ulong_t dev_datasz);
64-bit:   int ddi_device_copy(ddi_acc_handle_t src_handle, caddr_t src_addr, ssize_t src_advcnt, ddi_acc_handle_t dest_handle, caddr_t dest_addr, ssize_t dest_advcnt, size_t bytecount, uint_t dev_datasz);

Previous: int ddi_device_zero(ddi_acc_handle_t handle, caddr_t dev_addr, size_t bytecount, long dev_advcnt, ulong_t dev_datasz);
64-bit:   int ddi_device_zero(ddi_acc_handle_t handle, caddr_t dev_addr, size_t bytecount, ssize_t dev_advcnt, uint_t dev_datasz);

Previous: int ddi_dma_mem_alloc(ddi_dma_handle_t handle, uint_t length, ddi_device_acc_attr_t *accattrp, ulong_t flags, int (*waitfp)(caddr_t), caddr_t arg, caddr_t *kaddrp, uint_t *real_length, ddi_acc_handle_t *handlep);
64-bit:   int ddi_dma_mem_alloc(ddi_dma_handle_t handle, size_t length, ddi_device_acc_attr_t *accattrp, uint_t flags, int (*waitfp)(caddr_t), caddr_t arg, caddr_t *kaddrp, size_t *real_length, ddi_acc_handle_t *handlep);

Previous: int drv_getparm(unsigned int parm, unsigned long *value_p);
64-bit:   int drv_getparm(unsigned int parm, void *value_p);

In the 64-bit kernel, drv_getparm() can be used to fetch both 32-bit and 64-bit quantities. However, the interface does not define the data type of the value pointed to by value_p. This can lead to programming errors. You should not use drv_getparm(). Use the following new routines instead:

clock_t ddi_get_lbolt(void);
time_t ddi_get_time(void);
cred_t *ddi_get_cred(void);
pid_t ddi_get_pid(void);

Changes to the DDI functions

The following DDI and libc functions are removed or added in the Solaris 10 OS:

  • DDI functions removed. The driver interface ddi_dma_segtocookie(9F) is obsolete and has been omitted from the kernel. Use ddi_dma_nextcookie(9F) instead.
  • DDI functions added. The following functions are added to the DDI as a porting aid: memcpy, memset, memmove, memcmp, strncat, strlcat, strlcpy, and strspn. This enables gcc-compiled drivers to work more easily, since gcc often generates references to these functions.

3.2 Other Driver Issues

3.2.1 Converting ioctl Routines to Be 64-Bit Clean

Many ioctl operations are common to device drivers in the same class. Many of these interfaces copy in or copy out data structures to or from the kernel. Some of these data structure members are changed in size in the 64-bit data model. The following table lists ioctl structures that you must convert explicitly in 64-bit driver ioctl routines for dkio, fdio, fbio, cdio, mtio, and scsi:

Table 6: ioctl Structures That Must Be Converted

ioctl Command       Affected Data Structure       Reference

DKIOCGAPART         struct dk_map                 dkio
DKIOCSAPART         struct dk_allmap
DKIOCGVTOC          struct partition
DKIOCSVTOC          struct vtoc

FBIOPUTCMAP         struct fbcmap                 fbio
FBIOGETCMAP

FBIOPUTCMAPI        struct fbcmap_i               fbio
FBIOGETCMAPI

FBIOSCURSOR         struct fbcursor               fbio
FBIOGCURSOR

CDROMREADMODE1      struct cdrom_read             cdio
CDROMREADMODE2

CDROMCDDA           struct cdrom_cdda             cdio
CDROMCDXA           struct cdrom_cdxa
CDROMSUBCODE        struct cdrom_subcode

FDIOCMD             struct fd_cmd                 fdio
FDRAW               struct fd_raw

MTIOCTOP            struct mtop                   mtio
MTIOCGET            struct mtget
MTIOCGETDRIVETYPE   struct mtdrivetype_request

USCSICMD            struct uscsi_cmd              scsi

The nblocks property, the number of blocks each device contains, is defined as a signed 32-bit integer. The nblocks property therefore limits the maximum device size to 1 Tbyte. A new property, Nblocks, is defined as an unsigned 64-bit integer to remove this limitation.
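The 1-Tbyte figure can be checked with a little arithmetic. The following is a minimal sketch that assumes the conventional 512-byte block size; the macro and function names are illustrative only and are not part of the DDI.

```c
#include <stdint.h>

#define	XX_BLOCK_SIZE	512ULL	/* assumed block size in bytes */

/* Largest device size, in bytes, that a signed 32-bit block count can describe. */
static uint64_t
xx_max_bytes_nblocks32(void)
{
	return ((uint64_t)INT32_MAX * XX_BLOCK_SIZE);
}

/* Returns 1 if a device of the given size needs the 64-bit Nblocks property. */
static int
xx_needs_Nblocks(uint64_t device_bytes)
{
	return (device_bytes / XX_BLOCK_SIZE > (uint64_t)INT32_MAX);
}
```

Under that assumption the limit is 512 bytes short of 1 Tbyte, so a device of exactly 1 Tbyte already requires Nblocks.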

For more information, see Appendix C, "Making a Device Driver 64-Bit Ready," in the Writing Device Drivers manual on the Sun Product Documentation site.

3.2.2 Modifying the Routines That Handle Data Sharing

If a 64-bit device driver uses ioctl(9E), devmap(9E), or mmap(9E) to share data structures with a 32-bit application, check whether those data structures contain long or pointer types. The binary layout of such data structures is incompatible.

To handle potential data model differences, driver entry point routines that receive arguments from user applications must determine whether the argument came from an application that uses the same data type model as the kernel. The new DDI function ddi_model_convert_from(9F) enables drivers to determine this. The argument for ddi_model_convert_from(9F) is the data type model of the current thread. If conversion to or from ILP32 is necessary, the return value is DDI_MODEL_ILP32. If no conversion is needed, the return value is DDI_MODEL_NONE. Typically, the _MULTI_DATAMODEL macro is defined by the system when the driver supports multiple data models.

Example: The xxdevmap() function from devmap(9E) provides a simple example. The devmap() function maps memory from a device into the address space of a process. The range of mapped memory in a device is from offset to offset+len. In this function, struct data describes an area that is shared with both 32-bit and 64-bit applications. For a 32-bit application, the member addr in struct data contains a 32-bit user process address, but for a 64-bit application, addr is a 64-bit address.

struct data {
	int len;
	caddr_t addr;
};

int
xxdevmap(dev_t dev, devmap_cookie_t dhp,
	offset_t offset, size_t len, size_t *maplen,
	uint_t model)
{
	struct data dtc;
	/* local copy for clash resolution */
	struct data *dp = (struct data *)shared_area;
#ifdef _MULTI_DATAMODEL
	switch (ddi_model_convert_from(model)) {
	case DDI_MODEL_ILP32:
	{
		struct data32 {
			int len;
			uint32_t addr;
		} *da32p;
		/* read shared_area using the ILP32 layout and widen addr */
		da32p = (struct data32 *)shared_area;
		dp = &dtc;
		dp->len = da32p->len;
		dp->addr = (caddr_t)(uintptr_t)da32p->addr;
		break;
	}
	case DDI_MODEL_NONE:
		break;
	}
#endif /* _MULTI_DATAMODEL */

	/* continues along using dp */
	...
}

To support 64-bit clean code, the ioctl(9E) and mmap(9E) routines need to consider the macro _MULTI_DATAMODEL as well.
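For an ioctl(9E) routine, the same conversion might be factored into a small helper. The following is a user-level sketch rather than driver code: struct xx_args, struct xx_args32, and enum xx_model are hypothetical stand-ins, with the enum taking the place of the DDI_MODEL_* values returned by ddi_model_convert_from(9F). A real driver would first obtain the caller's bytes with ddi_copyin(9F).

```c
#include <stddef.h>
#include <stdint.h>

/* Layout seen by a 32-bit (ILP32) caller: both members are 4 bytes. */
struct xx_args32 {
	int32_t		len;
	uint32_t	addr;
};

/* Native layout used inside the driver. */
struct xx_args {
	int		len;
	void		*addr;
};

/* Hypothetical stand-ins for DDI_MODEL_NONE and DDI_MODEL_ILP32. */
enum xx_model { XX_MODEL_NONE, XX_MODEL_ILP32 };

/* Convert the caller's argument block into the native layout. */
static void
xx_args_from_user(enum xx_model model, const void *raw, struct xx_args *out)
{
	switch (model) {
	case XX_MODEL_ILP32: {
		const struct xx_args32 *a32 = raw;

		out->len = a32->len;
		/* widen the 32-bit user address without sign extension */
		out->addr = (void *)(uintptr_t)a32->addr;
		break;
	}
	case XX_MODEL_NONE:
	default:
		*out = *(const struct xx_args *)raw;
		break;
	}
}
```

Keeping the 32-bit layout in its own structure with fixed-width members means the rest of the routine only ever sees the native form.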

3.3 How to Verify

The simplest and most straightforward procedure to verify a 64-bit driver ported from 32-bit source code is to run the 64-bit driver on a 64-bit kernel. If you do not have a 64-bit Solaris system, other source-level verification methods are available for you to use. The compiler and lint are practical tools you can use to check for the use of constructs that impact 32-bit and 64-bit portability.

lint

The lint tool is a C-program checker. The lint tool in the Sun Studio C 5.7 compiler can be used to help find potential 64-bit problems in your code. You can use the special -errchk=longptr64 option to request that lint notify you whenever you try to put something big into something small, such as casting a 64-bit pointer into a 32-bit int.

The lint tool prints the line number of the offending code, issues a warning message that describes the problem, and informs you that a pointer was involved or gives the sizes of types involved. This information (the fact that a pointer is involved, and the sizes of the types) can be useful in finding only the 64-bit problems and avoiding the pre-existing problems between 32-bit and smaller types.

The following sample shows how the lint output will appear. The hello.c program contains constructs that are not portable to the LP64 model. The hello64.c program cleans up those problems by re-declaring the variable types and explicitly casting them.

sh$ cat -n hello.c
     1	/* hello.c */
     2	#include	<stdio.h>
     3	#include	<stdlib.h>
     4	
     5	static int func1(int pass1);
     6	static long func2(long pass2);
     7	
     8	void
     9	main(void)
    10	{
    11		int i_a = 0, *i_ptr = 0;
    12		long l_b = 0;
    13	
    14		i_a = (int) i_ptr;
    15		i_ptr = (void *)i_a;
    16		i_a = l_b;
    17		i_a = (int) l_b;
    18		i_a = 0xffffaabbcc;
    19		i_a = (int) 0xffffaabbcc;
    20		i_a = func1(l_b);
    21		i_a = func2(0xffffaabbcc);
    22		printf("output 32-bit int %d\n", l_b);
    23		scanf("input 32-bit int %d\n", &l_b);
    24	}
    25	
    26	static int
    27	func1(int pass1)
    28	{
    29		return (pass1);
    30	}
    31	
    32	static long
    33	func2(long pass2)
    34	{
    35		return (pass2);
    36	}

sh$ /opt/SUNWspro/bin/lint hello.c  -errchk=longptr64 |more
(14) warning: conversion of pointer loses bits
(15) warning: cast to pointer from 32-bit integer
(16) warning: assignment of 64-bit integer to 32-bit integer
(17) warning: cast from 64-bit integer to 32-bit integer
(18) warning: 64-bit constant truncated to 32 bits by assignment
(19) warning: cast from 64-bit integer constant expression to 32-bit integer
(20) warning: passing 64-bit integer arg, expecting 32-bit integer: func1(arg 1)
(21) warning: assignment of 64-bit integer to 32-bit integer

function returns value which is always ignored
    printf              scanf           

function argument ( number ) type inconsistent with format
    printf (arg 2) 	long  :: (format) int 	hello.c(22)
    scanf (arg 2) 	long * :: (format) int *	hello.c(23)
sh$

sh$ cat -n hello64.c
     1	/* hello64.c */
     2	#include <stdio.h>
     3	#include <stdlib.h>
     4	
     5	static int func1(int pass1);
     6	static long func2(long pass2);
     7	void
     8	main(void)
     9	{
    10		int  i_a = 0, i_b = 0, *i_ptr = 0;
    11		long l_a = 0, l_b = 0;
    12	
    13		l_a = (unsigned long) i_ptr;
    14		i_ptr = (void *)l_a;
    15		l_a = l_b;
    16		i_a = (int) l_b; /* intended narrow conversion  */
    17		l_a = 0xffffaabbcc;
    18		i_a = (int) 0xffffaabbcc; /* intended narrow conversion */
    19		i_a = func1(i_b);
    20		l_a = func2(0xffffaabbcc);
    21		printf("output 32-bit int %ld\n", l_b);
    22		scanf("input 32-bit int %ld\n", &l_b);
    23	}
    24	
    25	static int func1(int pass1)
    26	{
    27		return (pass1);
    28	}
    29	
    30	static long func2(long pass2)
    31	{
    32		return (pass2);
    33	}

sh$ /opt/SUNWspro/SOS8/bin/lint hello64.c -errchk=longptr64 |more
(16) warning: cast from 64-bit integer to 32-bit integer
(18) warning: cast from 64-bit integer constant expression to 32-bit integer

set but not used in function
    (10) i_a in main

function returns value which is always ignored
    printf              scanf           
sh$

Compiler

The following example shows the error messages output by gcc from compiling the helloworld.c program on a 64-bit x86-based system. The errors are eliminated in helloworld64.c by narrowing the constants, padding the structures, and re-declaring the variable and function types.

sh$ cat -n helloworld.c 
     1	#include <stdio.h>
     2	#include <stdlib.h>
     3	
     4	#define	MAX_HEAD_COUNT 0xfffffffffffL
     5	#define	MAX_FORMAL_EMPLOYEE 0x0fffL
     6	
     7	struct private_info {
     8		long id;
     9		int stat;
    10		char *other_info;
    11	} employee[MAX_HEAD_COUNT];
    12	
    13	struct {
    14		int outside_id;
    15		unsigned int priv_info_addr;
    16	} filo_index[MAX_FORMAL_EMPLOYEE]; /* all people who are recorded now */
    17	
    18	long
    19	encryp(long input)
    20	{
    21		return (input);
    22	}
    23	
    24	int
    25	main(void)
    26	{
    27		int i, key; long tmp_id;
    28		void *priv_p;
    29	
    30		do {
    31			scanf("%d%d%d", &i, &employee[i].stat, &employee[i].id);
    32			if (i < 0) break;
    33			if (i > MAX_HEAD_COUNT)
    34				i = MAX_HEAD_COUNT;
    35			if (employee[i].stat) {
    36				filo_index[key].outside_id = employee[i].id;
    37				key = encryp(filo_index[key].outside_id);
    38				filo_index[key].priv_info_addr = (int)(employee + i);
    39			}
    40		} while (i < sizeof (employee));
    41	
    42		while (1) {
    43			scanf("%d", (int *)&tmp_id);
    44			key = (int)encryp(tmp_id);
    45			if (key == -1)
    46				key = (int)MAX_FORMAL_EMPLOYEE;
    47			priv_p = (void *)filo_index[key].priv_info_addr;
    48			printf("%d\t%d\n", ((struct private_info *)priv_p)->id,
    49			    ((struct private_info *)priv_p)->stat);
    50		}
    51	}
    
sh$ gcc -fsyntax-only -Wall -Wcast-qual -Wconversion -Wmissing-format-attribute \
    -Wpadded -Werror -g3 -o helloworld helloworld.c 
cc1: warnings being treated as errors
helloworld.c:10: warning: padding struct to align `other_info'
helloworld.c: In function `main':
helloworld.c:31: warning: int format, different type arg (arg 4)
helloworld.c:33: warning: comparison is always false due to limited range of 
    data type
helloworld.c:34: warning: overflow in implicit constant conversion
helloworld.c:37: warning: passing arg 1 of `encryp' with different width due 
    to prototype
helloworld.c:38: warning: cast from pointer to integer of different size
helloworld.c:46: warning: cast to pointer from integer of different size
helloworld.c:48: warning: int format, different type arg (arg 2)
sh$

sh$ cat -n helloworld64.c
     1	#include <stdio.h>
     2	#include <stdlib.h>
     3	
     4	#define	MAX_HEAD_COUNT 0xffffL
     5	#define	MAX_FORMAL_EMPLOYEE 0x0fffL
     6	
     7	struct private_info {
     8		long id;
     9		int stat;
    10		char padding[4];
    11		char *other_info;
    12	} employee[MAX_HEAD_COUNT]; /* all people who have been employees */
    13	
    14	struct {
    15		int outside_id;
    16		char padding[4];
    17		unsigned long priv_info_addr;
    18	} filo_index[MAX_FORMAL_EMPLOYEE]; /* all people who are employees now */
    19	
    20	long
    21	encryp(long input)
    22	{
    23		return (input);
    24	}
    25	
    26	int
    27	main(void)
    28	{
    29		int i, key;
    30		long tmp_id;
    31		void *priv_p;
    32	
    33		do {
    34			scanf("%d%d%ld", &i, &employee[i].stat, &employee[i].id);
    35			if (i < 0) break;
    36			if (i > MAX_HEAD_COUNT)
    37				i = MAX_HEAD_COUNT;
    38			if (employee[i].stat) {
    39				filo_index[key].outside_id = employee[i].id;
    40				key =  encryp((long)filo_index[key].outside_id);
    41				filo_index[key].priv_info_addr =
    42				    (unsigned long)(employee + i);
    43			}
    44		} while (i < sizeof (employee));
    45	
    46		while (1) {
    47			scanf("%d", (int *)&tmp_id);
    48			key = (int)encryp(tmp_id);
    49			if (key == -1)
    50			    key = (int)MAX_FORMAL_EMPLOYEE;
    51			priv_p = (void *)filo_index[key].priv_info_addr;
    52			printf("%ld\t%d\n", ((struct private_info *)priv_p)->id,
    53			    ((struct private_info *)priv_p)->stat);
    54		}
    55	}

sh$ ./gcc -fsyntax-only -Wall -Wcast-qual -Wconversion -Wmissing-format-attribute \
    -Wpadded -Werror -g3 -o helloworld64 helloworld64.c
sh$ 

3.4 Checklist for Getting Started

Use the following checklist to convert your driver code to the Solaris OS on 64-bit x86-based systems:

  • Read this entire document with emphasis on Section 3, "General Conversion Guidelines."
  • Include <sys/types.h> (or at a minimum, <sys/isa_defs.h>) in your code to pull in the _ILP32 or _LP64 definitions as well as many basic derived types.
  • Move function prototypes and external declarations with non-local scope to headers and include these headers in your code.
  • Review all data structures and interfaces to verify that these are still valid in the 64-bit environment.
  • Carefully check your use of long types and sized types (int32_t, int64_t). If a fixed-length quantity is needed, for example, to match a hardware register structure, use the appropriately sized type. If an address is needed, for example, in an opaque handle that holds an address, use uintptr_t.
  • Carefully check structures that contain 64-bit types, for example, uint64_t. The alignment and size of such a structure can differ between code compiled in 32-bit mode and code compiled in 64-bit mode. See the example below:

#include <stdio.h>
#include <sys/types.h>

struct misalign {
        uint32_t        foo;
        uint64_t        bar;
};

int
main(void)
{
        struct misalign a;

        printf("sizeof struct is: %lu\n", (unsigned long)sizeof (a));
        printf("offset of bar is: %lu\n",
            (unsigned long)((uintptr_t)&a.bar - (uintptr_t)&a));
        return (0);
}

On a 32-bit system based on 386 architecture, this code produces:

sizeof struct is: 12
offset of bar is: 4

On a 64-bit system based on AMD64 architecture, this code produces:

sizeof struct is: 16
offset of bar is: 8

As a result, attempts to make structures work in both 32-bit and 64-bit environments by using 64-bit fields may not succeed. This occurs fairly often, especially in structures that have been passed into and out of the kernel through ioctl calls. A 32-bit application sees the structure differently than the 64-bit kernel sees it.
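One common way to avoid the mismatch is to make the padding explicit so that both data models agree on the layout. The following is a sketch of that approach; struct xx_shared and the typedef names are illustrative, and the checks assume the usual i386 and AMD64 alignment rules described above.

```c
#include <stddef.h>
#include <stdint.h>

struct xx_shared {
	uint32_t	foo;
	uint32_t	pad;	/* explicit padding keeps bar at offset 8 */
	uint64_t	bar;
};

/*
 * Compile-time layout checks: the array size becomes negative, and the
 * build fails, if the structure does not have the expected layout.
 */
typedef char xx_check_size[(sizeof (struct xx_shared) == 16) ? 1 : -1];
typedef char xx_check_bar[(offsetof(struct xx_shared, bar) == 8) ? 1 : -1];
```

With the pad member in place, the structure is 16 bytes with bar at offset 8 whether it is compiled in 32-bit or 64-bit mode, so a 32-bit application and the 64-bit kernel see the same bytes.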

  • Replace any use of obsolete DDI interfaces with newer DDI interfaces that allow better compiler type checking. Especially note that the hat_getkpfnum() function is removed. You must use the proper ddi_dma_* memory functions instead.
  • Compile with either gcc or the Sun Studio 10 C 5.7 compiler in both 32-bit and 64-bit modes, and eliminate all compiler warnings, unless the application is being provided only as 64-bit. Note especially that the gcc compiler behaves differently than the Sun C compiler in 32-bit mode when assigning pointers into 64-bit entities. The gcc compiler sign extends, and the Sun C compiler does not.
  • Run lint(1) using the -errchk=longptr64 flag and review each warning individually. Note that not all warnings require a change to the code. Depending on the resulting changes, you might also want to run lint(1) again, both in the 32-bit environment and 64-bit environment.
  • Test the application by executing the 32-bit version on the 32-bit OS, and the 64-bit version on the 64-bit OS. It is not necessary to test the 32-bit version on the 64-bit OS.
  • See the next section for advanced topics specific to drivers and the Solaris OS on 64-bit x86 platforms.
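The type choices in the checklist can be illustrated with a short sketch. The names below (struct xx_regs, xx_handle_t, and the helper functions) are hypothetical. Widening a pointer by way of uintptr_t is one way to get a zero-extended result from either compiler, which sidesteps the sign-extension difference noted above.

```c
#include <stdint.h>

/* Hardware register block: fixed-width types match the device layout. */
struct xx_regs {
	uint32_t	csr;
	uint32_t	data;
};

/* Opaque handle wide enough to hold a pointer in either data model. */
typedef uintptr_t xx_handle_t;

static xx_handle_t
xx_handle_from_ptr(void *p)
{
	return ((xx_handle_t)p);
}

static void *
xx_ptr_from_handle(xx_handle_t h)
{
	return ((void *)h);
}

/* Widen a pointer to 64 bits through uintptr_t so it is zero-extended. */
static uint64_t
xx_ptr_to_u64(const void *p)
{
	return ((uint64_t)(uintptr_t)p);
}
```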

4 Advanced Issues and Guidelines

Following the general conversion guidelines from the previous section should help you produce clean 64-bit code in device drivers for the Solaris OS on x86 platforms. However, that does not mean that the code is portable or tuned for performance. This section can help you take full advantage of the features of the AMD Opteron processor.

This section discusses some advanced topics and offers guidelines for addressing these topics.

4.1 DMA Issues

The DMA framework in the Solaris OS hides the hardware details of a platform, such as I/O MMU, I/O cache, data alignment, data order and so on. When writing a driver to work on multiple platforms, you need to consider the following issues:

  • Be careful with I/O MMU translations.
  • Maximum burst size can change.
  • A device might have no 64-bit addressing capability although the driver is 64-bit.
  • You should check changes to DMA DDI functions and DMA DDI structures.

You might also encounter some performance-related problems with the DDI functions. You may have issues specific to the 64-bit Solaris OS as well.

4.1.1 Be Careful With I/O MMU Translations

The CPU uses the MMU to translate a virtual address (CPU view) to a physical address (main bus view). The device uses the I/O MMU to translate I/O bus addresses (PCI addresses) to physical addresses (main bus view). The I/O MMU can be very convenient for performing DMA transfers.

The I/O MMU provides a device with the ability to perform direct virtual memory access (DVMA). DVMA enables you to program a DMA engine with a large block of virtually contiguous addresses. That relieves the CPU from programming the DMA engine with many small blocks of physical addresses. SPARC technology offers an I/O MMU and can perform DVMA. In some cases, only a single DMA window with a few DMA cookies needs to be programmed to the DMA engine.

Some platforms, IA for example, have no I/O MMU. This means that a single transfer can be split into multiple DMA windows and multiple DMA cookies, and each DMA cookie may contain only a few pages. Fortunately, some devices provide scatter/gather (S/G) capability to perform highly efficient DMA transfers. So when writing a device driver to support both the SPARC and IA (AMD64 included) platforms, developers should add support for multiple DMA windows and multiple DMA cookies. If the device has S/G capability, the device driver should make good use of it to enhance DMA performance.

Example:
struct sglentry {
	size_t        dma_addr;
	uint32_t      dma_size;
} sglist[SGLLEN];

In this example, each DMA cookie should be filled in sglist as an S/G element.
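The fill step might look like the following user-level sketch. struct xx_cookie is a hypothetical stand-in for the address and length carried by a ddi_dma_cookie_t, and SGLLEN is an assumed table capacity. A real driver would walk the cookies with ddi_dma_nextcookie(9F) after binding the handle. The sketch also uses a fixed 64-bit address field so that an entry has the same layout in both data models; that is a choice made here, not something the example above requires.

```c
#include <stddef.h>
#include <stdint.h>

#define	SGLLEN	8	/* assumed capacity of the device's S/G table */

/* Stand-in for the address/length pair carried by a DMA cookie. */
struct xx_cookie {
	uint64_t	addr;
	size_t		size;
};

struct xx_sglentry {
	uint64_t	dma_addr;
	uint32_t	dma_size;
};

/*
 * Copy ccount cookies into the S/G list. Returns the number of entries
 * written, or -1 if the cookies do not fit the table or a length does
 * not fit the 32-bit size field.
 */
static int
xx_fill_sglist(const struct xx_cookie *cookies, size_t ccount,
    struct xx_sglentry *sgl)
{
	size_t i;

	if (ccount > SGLLEN)
		return (-1);	/* caller must split the transfer */
	for (i = 0; i < ccount; i++) {
		if (cookies[i].size > UINT32_MAX)
			return (-1);
		sgl[i].dma_addr = cookies[i].addr;
		sgl[i].dma_size = (uint32_t)cookies[i].size;
	}
	return ((int)ccount);
}
```

When the cookie count exceeds the table capacity, the driver would typically fall back to processing the transfer one DMA window at a time.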

Notes:

1. When using DVMA, cookie.dmac_address is the virtual address that appears on the PCI bus. It is the responsibility of the I/O MMU to translate the virtual addresses into physical addresses.

2. AMD64 compatible processors can take advantage of the so-called AGP GART, which is quite similar to the I/O MMU address translation table, to serve other PCI devices. See the end of this section for more details.

4.1.2 Max Burst Size Can Change

Drivers specify the DMA burst sizes that their device supports in the dma_attr_burstsizes field of the ddi_dma_attr structure. This field is a bitmap of the supported burst sizes. When you write a driver that is 32-bit and 64-bit compatible, you may need to change this structure to optimize performance. When DMA resources are allocated, the system can impose further restrictions on the burst sizes that the device can use. A better approach is therefore to call the ddi_dma_burstsizes(9F) routine after the resources are allocated. This routine returns the burst size bitmap that is actually allowed for the device, and the driver can then choose one of those sizes for its DMA engine.

Example: Determining Burst Size

The following pseudocode explains a correct way to determine burst size:

#define BEST_BURST_SIZE 0x20 /* 32 bytes */
	if (ddi_dma_buf_bind_handle(xsp->handle,xsp->bp, flags, xxstart,
	    (caddr_t)xsp, &cookie, &ccount) != DDI_DMA_MAPPED) {
		/* error handling */
	}
	burst = ddi_dma_burstsizes(xsp->handle);
	/* check which bit is set and choose one burstsize to */
	/* program the DMA engine */
	if (burst & BEST_BURST_SIZE) {
		/* program DMA engine to use this burst size */
	} else {
		/* other cases */
	}
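The "other cases" branch usually means falling back to the next smaller burst that the system allows. The following is a sketch of that selection, assuming the bitmap encoding implied by the pseudocode above, in which each set bit stands for a burst of that many bytes (0x20 for 32 bytes). The function name xx_pick_burst is hypothetical.

```c
#include <stdint.h>

/*
 * Return the largest burst size, in bytes, that is set in the bitmap
 * and does not exceed limit. Return 0 if no such burst size exists.
 */
static uint32_t
xx_pick_burst(uint32_t bitmap, uint32_t limit)
{
	uint32_t size;

	/* scan from the largest power of two down to 1 */
	for (size = 0x80000000U; size != 0; size >>= 1) {
		if (size <= limit && (bitmap & size) != 0)
			return (size);
	}
	return (0);
}
```

A driver might pass the value returned by ddi_dma_burstsizes(9F) as the bitmap and the largest burst its DMA engine supports as the limit.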

4.1.3 Device With No 64-Bit Addressing Uses 64-Bit Driver

Some devices are capable only of 32-bit addressing. Others are capable of both 32-bit and 64-bit addressing. Knowing which addressing capability your device supports is very important. Note that some 32-bit PCI devices can achieve 64-bit addressing through a DAC (Dual Address Cycle) approach, which completes a 64-bit address in two PCI clock periods. For example, a device may possess two registers: DMADAC0 and DMADAC1. The DMA engine drives the low address from DMADAC0 in the first period with the PCI command (C/BE[3:0]#) D, and the high address from DMADAC1 in the second period with the PCI command (C/BE[3:0]#) 6 or 7 (depending on whether the transfer is a read or a write).

If the device has 64-bit addressing capability, then its performance should be greatly enhanced in a 64-bit OS. But what if the device has no 64-bit addressing capability?

In that case, the DMA engine has a limited addressing capability. For example, a PCI master device that can perform only SAC (Single Address Cycle) is capable only of 32-bit addressing. Its ddi_dma_attr_t should be described as follows:

static ddi_dma_attr_t attributes = 
{
	DMA_ATTR_V0, /* Version number */
	0x00000000,  /* low address */ 
	0xFFFFFFFF,  /* high address */ 
	0xFFFFFFFF,  /* counter register max */ 
	.....
};

This example tells the DDI DMA framework that your device can address only the low 4 Gbyte of memory (32-bit addresses) by assigning "0x00000000" to the low address and "0xFFFFFFFF" to the high address.

If the driver uses the ddi_dma_mem_alloc(9F) routine to allocate a piece of kernel virtual memory, ddi_dma_buf_bind_handle(9F) or ddi_dma_addr_bind_handle(9F) assures that DMA cookies are allocated in the range of the low and high addresses assigned by ddi_dma_attr_t. Therefore, you do not need to worry about issues such as whether the physical memory is greater than 4 Gbyte.

If this device needs to process a DMA request coming from other devices, you can perform a device-to-device DMA transfer. In this situation, the local memory of another device may be mapped into a segment beyond 4 Gbyte, so that you cannot perform a DMA transfer directly. Fortunately, device-to-device DMA transfer is only used rarely.

For those devices that support both 32-bit and 64-bit addressing, 64-bit drivers should take advantage of the device's 64-bit addressing capability. The following example describes the DMA engine for this case:

static ddi_dma_attr_t attributes = 
{
	DMA_ATTR_V0,        /* Version number */
	0x0000000000000000, /* low address */ 
	0xFFFFFFFFFFFFFFFF, /* high address */ 
	0xFFFFFFFF,         /* counter register max */ 
	.....
};

Then you can use the dmac_laddress member of the ddi_dma_cookie_t structure and program it into the DMA engine.
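For a device that uses DAC with a pair of 32-bit address registers, such as the DMADAC0 and DMADAC1 registers mentioned earlier, the 64-bit cookie address has to be split into two halves. The following sketch shows only the arithmetic. The function names are hypothetical, and a real driver would write each half to the device with ddi_put32(9F).

```c
#include <stdint.h>

/* Split a 64-bit DMA address into low and high 32-bit register values. */
static void
xx_split_dma_addr(uint64_t addr, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)(addr & 0xFFFFFFFFU);
	*hi = (uint32_t)(addr >> 32);
}

/* Returns 1 if the address can be issued as a single address cycle (SAC). */
static int
xx_fits_sac(uint64_t addr)
{
	return ((addr >> 32) == 0);
}
```

A check such as xx_fits_sac() can also serve as a sanity test in a driver for a 32-bit-only device, where every cookie address is expected to fall below 4 Gbyte.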

4.1.4 Changes in DMA DDI Functions and DMA DDI Structures

The primary changes to DMA DDI functions and structures are in ddi_dma_cookie_t and ddi_dma_mem_alloc. Note that the size of some arguments returned from the function may change, so the definitions of those arguments should be changed accordingly. For more details, please refer to Section 3, "General Conversion Guidelines."

4.1.5 Be Careful With the NUMA System

A NUMA (nonuniform memory access) system has a single OS image and a single address space view, but nonuniform memory, nonuniform I/O, and non-coherent caches. After a DMA transfer completes, call ddi_dma_sync(9F) explicitly to ensure that the caches are flushed.

4.2 Other Related Issues

The following issues concern the performance of DDI functions:

  • The ddi_dma_mem_alloc(9F) function has not yet been optimized for performance. A more efficient memory allocation function for the 64-bit kernel is under investigation.
  • No proper DDI functions are available to manage physical memory directly. This results in inefficient use of kernel virtual memory for the construction of AGP video drivers.

The AMD64 platform can offer some architectural advantages. For example, its AGP Aperture can be shared by both AGP and PCI devices. That is, you can use the AGP Aperture as an I/O MMU component. This is a good way to enhance DMA performance. You can also enable cache coherency for the AGP aperture by setting one bit in the GART entry. With this approach, the AGP master can read the data from the processor caches faster than it can read data from the DDR memory.

5 Porting Example

This section uses sample code to explain how to port a driver from 32-bit to 64-bit. The driver in this example manages a RAM space such as a RAM disk and uses programmed I/O to drive a device with a 32-bit CSR register and a 32-bit data register. This sample modifies a 32-bit driver to be a 64-bit safe driver. The sample defines the macro VERSION_64-bit when it is compiled in a 64-bit environment.

The original source code and the header file, pio.h, are provided in the appendix. A link to the source code for the converted version appears below. The changes that are made in this example can be summarized as follows:

  • The obsolete functions, ddi_putl() and ddi_getl(), have been replaced by ddi_put32() and ddi_get32(), respectively.
  • The declarations of size and *addr as type int are changed to size_t and caddr_t, respectively, to make them safe for both 32-bit and 64-bit environments. See lines 37 and 38.
  • The type of the return value from getminor() has been changed from int to minor_t. See lines 189-202.
  • The type for pio_p->addr needs to be changed from int to caddr_t. The function min() has a return value of type int. The type needs to be changed to size_t and a separate version of the min() function needs to be written for 64-bit cases. See lines 341-375.
  • A variable named tmp needs to be shared with the user application and must be safe for both 32-bit and 64-bit cases. The buffer to be copied needs to have a size appropriate to the case. See lines 406-446.

Note that these changes are also documented by comments.
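As an illustration of the min() change, a size_t version might look like the following sketch. The name xx_min is hypothetical and this is not the code from the converted driver.

```c
#include <stddef.h>

/*
 * Return the smaller of two sizes. Using size_t for the arguments and
 * the return value avoids truncating lengths that do not fit in an int.
 */
static size_t
xx_min(size_t a, size_t b)
{
	return (a < b ? a : b);
}
```

With an int return type, a length above 2 Gbyte would not survive the call, which matters once transfer sizes are held in size_t in the 64-bit kernel.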

Porting Example

6 Conclusion

This document describes issues that you need to be aware of when you write 32-bit and 64-bit safe drivers for the Solaris OS on x86 platforms. These issues include multiple C language data models, the use of system-derived types that have changed, and changes to some of the DDI interfaces. Also, you need to address some driver-specific issues. Finally, you need to consider performance issues such as the use of DMA.

This article lists and describes these issues, and it provides solutions and recommendations for these issues. This guide should help you write clean code for 32-bit and 64-bit device drivers for the Solaris OS on x86 platforms.

For further information on device drivers in the Solaris OS, see Writing Device Drivers. To see examples of some basic device drivers, see Device Driver Tutorial (PN 817-5789, Sun Microsystems). If you are new to development in the Solaris OS or are unfamiliar with the range of information on the Solaris OS, see the Introduction to the Solaris Development Environment.

Appendix

This appendix lists the pio.h header file and the source code before conversion, pio_32.c.
