
Physical Address Extension (PAE) Mode Notes

Caution: Some device drivers have been found to have bugs when they are used on machines with more than 4 Gbytes of memory. PCI device drivers written by Sun have been tested on IA (Intel Architecture) compatible machines with more than 4 Gbytes of memory. Sun's OEM partners intend to test their machines with devices they supply on IA compatible machines with more than 4 Gbytes of memory. In some cases, however, if you add a third-party device driver to your system, it might become unstable. Panics and data corruption might result if this third-party driver is configured with a DMA address range larger than its device can handle. If your system becomes unstable and you need that driver, you must disable PAE mode.

PAE Mode Driver Requirements

PAE mode does not impose any new requirements for device drivers, though current requirements need to be emphasized. Device drivers need to be aware that PAE mode allows for physical addresses above 4 Gbytes, which means that device driver use of the DDI DMA interface and configuration of the DDI DMA address ranges (see ddi_dma_attr(9S)) need to be, more than ever, correct and accurate. Misbehaving drivers might not exhibit any problems running without PAE mode, but they can cause memory and data corruption when PAE mode is enabled. For example, if a device that can only handle physical locations below 4 Gbytes (32-bit addresses) is configured incorrectly with a DDI DMA address limit (dma_attr_addr_hi) above 4 Gbytes, DMA writes to addresses above 4 Gbytes would corrupt the memory locations at those addresses modulo 4 Gbytes. On the other hand, drivers might not realize the full performance benefits of PAE mode if the device can handle an address range larger than what is configured for the device.
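The "modulo 4 Gbytes" corruption described above can be illustrated with a small user-space sketch. It is only a model of the arithmetic, not driver code: the function names and the FOUR_GBYTES constant are hypothetical, and it assumes the 32-bit device simply drops the upper address bits.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of address space reachable with 32 address bits (4 Gbytes). */
#define FOUR_GBYTES (1ULL << 32)

/*
 * Hypothetical model of a 32-bit-only DMA engine that is handed a
 * physical address: the upper bits are dropped, so the transfer lands
 * at the address modulo 4 Gbytes.
 */
uint64_t
dma_target_on_32bit_device(uint64_t phys_addr)
{
    uint64_t target = phys_addr % FOUR_GBYTES;

    /* A 32-bit engine can never reach 4 Gbytes or above. */
    assert(target < FOUR_GBYTES);
    return (target);
}

/* Nonzero if the device would write somewhere other than intended. */
int
dma_would_corrupt(uint64_t phys_addr)
{
    return (dma_target_on_32bit_device(phys_addr) != phys_addr);
}
```

For instance, a buffer at physical address 0x100001000 would be written at 0x1000 in this model, which is memory that belongs to something else.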

Device drivers that use DMA memory that is not directly allocated from the DDI DMA routines (for example, memory from kmem_alloc() or a user buffer) should verify that these memory buffers conform to the DMA requirements of the device, including its physical address range (see ddi_dma_setup(9F)). Enough such problems have been found with kmem memory that a 'restricted_kmemalloc' flag was created to force kmem_alloc memory to be below 4 Gbytes. Device drivers should be tested with this flag set to 0 so that the flag can be removed in the future.
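The kind of range check meant here might look like the following sketch. It is a user-space model under stated assumptions: dma_range_t is a hypothetical stand-in for the address-range fields of ddi_dma_attr(9S), buffer_fits_dma_range() is an invented name, and the buffer is assumed to be one physically contiguous piece (a real buffer may need the check applied to each piece).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the address-range fields of ddi_dma_attr(9S). */
typedef struct dma_range {
    uint64_t addr_lo;   /* lowest usable physical address */
    uint64_t addr_hi;   /* highest usable physical address (inclusive) */
} dma_range_t;

/*
 * Return 1 if every byte of the buffer [phys, phys + len) lies inside
 * the range the device can address, 0 otherwise.
 */
int
buffer_fits_dma_range(const dma_range_t *range, uint64_t phys, size_t len)
{
    uint64_t last;

    assert(range != NULL && range->addr_lo <= range->addr_hi);
    if (len == 0)
        return (1);     /* nothing to transfer */
    if (phys > UINT64_MAX - (len - 1))
        return (0);     /* end address would wrap around */
    last = phys + (len - 1);
    return (phys >= range->addr_lo && last <= range->addr_hi);
}
```

Note that a buffer that starts below 4 Gbytes can still fail the check if it extends past the limit, which is why the last byte is tested and not only the start address.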

When a DMA engine cannot access all of physical memory, the Solaris DDI routines copy the buffer to/from memory that is accessible to the engine. DMA actually uses the extra copy, which is located at a suitable address for the engine. Between setting up DMA and completing it, the driver must not read or write the buffer without using ddi_dma_sync(9F). If it does, data corruption can occur because the driver is accessing the wrong copy of the data.
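The two-copy behavior can be modeled in user space as follows. This is a sketch only, not the DDI implementation: bounce_binding_t and the bounce_* functions are hypothetical, and the two sync functions merely mimic the directions that ddi_dma_sync(9F) distinguishes with DDI_DMA_SYNC_FORDEV and DDI_DMA_SYNC_FORCPU.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BOUNCE_MAX 64   /* arbitrary size for this illustration */

/* Hypothetical model of one DMA binding that needed an extra copy. */
typedef struct bounce_binding {
    unsigned char *driver_buf;          /* the buffer the driver sees */
    unsigned char bounce[BOUNCE_MAX];   /* the copy the DMA engine uses */
    size_t len;
} bounce_binding_t;

/* Set up the binding: the data is copied to the engine-accessible copy. */
void
bounce_bind(bounce_binding_t *b, unsigned char *driver_buf, size_t len)
{
    assert(b != NULL && driver_buf != NULL && len <= BOUNCE_MAX);
    b->driver_buf = driver_buf;
    b->len = len;
    memcpy(b->bounce, driver_buf, len);
}

/* Sync "for device": push driver-side changes to the engine's copy. */
void
bounce_sync_for_device(bounce_binding_t *b)
{
    memcpy(b->bounce, b->driver_buf, b->len);
}

/* Sync "for CPU": pull the engine's results into the driver's buffer. */
void
bounce_sync_for_cpu(bounce_binding_t *b)
{
    memcpy(b->driver_buf, b->bounce, b->len);
}
```

In this model, a driver that reads driver_buf after the engine has written to bounce sees stale data until bounce_sync_for_cpu() runs, which is the "wrong copy" problem described above.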



How to Get a Driver to Work in PAE Mode

There are two ways for a driver to work in PAE mode:

  • Let the framework handle it. The ddi_dma_attr structure specifies that DMA must use addresses less than 2^32. If any buffer is passed to the framework outside the range specified by ddi_dma_attr, the framework will copy the data to/from a buffer created by the framework at a suitable address for the DMA.
  • Have the driver handle it. The driver must be prepared to accept cookies that contain addresses of 2^32 or higher.

With the first method above, there is a performance disadvantage due to the extra copying. Also, the driver must not use the data between the time that the DMA is set up and when it is complete because it will be using the wrong copy of the data. If the buffer must be used during this interval, ddi_dma_sync(9F) can be used to force the buffer to be copied again.



Testing That a Driver Works in PAE Mode

PAE mode can be tested by running the driver under stress, assuming that at least some of the DMA transfers will involve memory at high addresses. You can also instrument the driver to have it confirm that it has encountered high addresses, or you can use kadb as described below to check for high addresses.
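One way to instrument a driver is to count how many cookies touch memory at or above 4 Gbytes. The following is a minimal user-space sketch of such a counter; cookie_stats_t and cookie_stats_record() are hypothetical names, and a real driver would call something like this for each cookie it receives during testing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counters a driver could keep while under test. */
typedef struct cookie_stats {
    unsigned long total;    /* cookies seen */
    unsigned long high;     /* cookies touching memory at or above 4 Gbytes */
} cookie_stats_t;

/* Record one cookie given its 64-bit start address and size in bytes. */
void
cookie_stats_record(cookie_stats_t *st, uint64_t addr, size_t size)
{
    assert(st != NULL && size > 0);
    st->total++;
    /*
     * The cookie is "high" if its last byte lies beyond 0xffffffff.
     * The comparison is arranged so that it cannot overflow.
     */
    if (addr > 0xffffffffULL || (size - 1) > 0xffffffffULL - addr)
        st->high++;
}
```

If the high counter is still zero after a stress run, the test has not yet shown that the driver handles addresses above 32 bits.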

To test that the driver is handling greater than 32-bit addresses, do the following:

1. Boot the kernel with kadb:

Select (b)oot or (i)nterpreter: b kadb

2. If the driver is for a network interface card (NIC), the DMA buffers are allocated when the network interface is initialized. Therefore, for a NIC, close the interface and unload the driver module before setting the kadb breakpoint, then load the module again.

ifconfig <interface> #Note the IP address, broadcast address and netmask for the interface
ifconfig <interface> unplumb
modunload -i <module_id>
modload <driver filename>

3. Enter kadb by pressing Ctrl-Alt-D or ~# on a tip line.

4. Set a breakpoint in ddi_dma_alloc_handle(9F):

kadb[0]: ddi_dma_alloc_handle:b

5. Continue program execution:

kadb[0]: :c

6. At the system prompt, type a command that will use your driver (for example, df -k or mount for a storage driver, or ifconfig for a NIC driver):

# df -k or
# ifconfig <interface> plumb ....

# breakpoint at:
ddi_dma_alloc_handle:           pushl  %ebp



Generate the stack trace:

kadb[0]:$c
ddi_dma_alloc_handle(0xe062b4c8,0xfece41d4,0x1,0x0,0xe0a43ed8) + 0
cadp_phys_buf_alloc(0xe05bb000,0xe0a43ea0,0x14,0x54,0x3f,0x1) + 9f
cadp_osmiob_common(0xe05bb000,0xe0a43ea0,0x6,0x14,0x1,0xe04f8908) + 60
cadp_osmioballoc(0xe04f8908,0xe0a43e40,0x6,0x20,0x10,0x50) + 66
ghd_pktalloc(0xe05bd19c,0xe07028d0,0x6,0x20,0x10,0x0) + 7b
ghd_tran_init_pkt_attr(0xe05bd19c,0xe07028d0,0x0,0xe0b26820,0x6,0x20,0x10,0x40000,0x0,0x0,0x50,0xe04f8938) + 4f
cadp_tran_init_pkt(0xe07028d0,0x0,0xe0b26820,0x6,0x20,0x10,0x40000,0x0,0x0) + ce

7. Look at the second argument to ddi_dma_alloc_handle, which is 0xfece41d4. Type the following to display the ddi_dma_attr structure:

kadb[0]:0xfece41d4$<ddi_dma_attr
 
The output should look like the following. The field of interest is addr_high, which shows whether the driver accepts high addresses:

cadp_dma_attr_nosg:
  version   addr_lo   addr_high
  0   0   ffffffffffffffff
cadp_dma_attr_nosg+0x14:
  count_max   align   burst
  ffffff   40   7f
cadp_dma_attr_nosg+0x28:
  minxfer   maxxfer   seg
  1   fffffe   ffffffffffffffff
cadp_dma_attr_nosg+0x3c:
  sgllen   granular   flags
  1   1   0
kadb[0]:

Here addr_high is ffffffffffffffff, confirming that the driver supports addresses greater than 32 bits. If addr_high were ffffffff, the driver would support only 32-bit addresses. You still have to verify that the driver actually uses addresses greater than 32 bits. The remaining steps show how.



8. Remove the breakpoint from ddi_dma_alloc_handle:

kadb[0]: ddi_dma_alloc_handle:d

9. Set a breakpoint in ddi_dma_addr_bind_handle to obtain the address of the first returned cookie and check whether it is an address greater than 32 bits:

kadb[0]: ddi_dma_addr_bind_handle:b

10. Continue execution:

kadb[0]: :c

# breakpoint at:
ddi_dma_addr_bind_handle:         pushl  %ebp

Generate the stack trace:

kadb[0]: $c
ddi_dma_addr_bind_handle(0xe130bda0,0x0,0xe1120840,0x3c0,0x13,0x1,0x0,0xe1235c68,0xef6bbb18) + 0
cadp_phys_buf_alloc(0xe10f8000,0xe1235c28,0x14,0x54,0x3f,0x1) + db
cadp_osmiob_common(0xe10f8000,0xe1235c28,0x6,0x14,0x1) + 4c
cadp_osmioballoc(0xe1118f28,0xe1235bc8,0x6,0x20,0x10,0x50) + 54
ghd_pktalloc(0xe10fa19c,0xe0f9b2b0,0x6,0x20,0x10,0x0,0x0,0x50) + 80
ghd_tran_init_pkt_attr(0xe10fa19c,0xe0f9b2b0,0x0,0x0,0x6,0x20,0x10,0x40000,0x0,0x0,0x50,0xe1118f58) + 3c
cadp_tran_init_pkt(0xe0f9b2b0,0x0,0x0,0x6,0x20,0x10,0x40000,0x0,0x0) + be
scsi_init_pkt(0xe0f9b2b0,0x0,0x0,0x6,0x20,0x10,0x40000,0xfeaddde5,0xe091d910) + 4e
make_sd_cmd(0xe091d910,0xe1118028,0xfeaddde5) + e2
sdstart(0xe091d910) + 108
sdstrategy(0xe1118028) + 4e9
sdioctl_cmd(0x7405c0,0xef6bbd30,0x1,0x1,0x1) + 299
sd_unit_ready(0x7405c0) + 7d
sd_ready_and_valid(0x7405c0,0xe091d910) + 7a
sdopen(0xef6bbe28,0x3,0x0,0xe06ced88) + 1fc
dev_open(0xef6bbe28,0x3,0x0,0xe06ced88) + 24
spec_open(0xef6bbe6c,0x3,0xe06ced88) + 8e6
mountfs(0xe1311ec0,0x0,0x7405c0,0xe10454a0,0xe06ced88,0x0,0xef6bbebc,0x4) + 81
ufs_mount(0xe1311ec0,0xe1312738,0xef6bbf90,0xe06ced88) + 223
domount(0x0,0xef6bbf90,0xe1312738,0xe06ced88,0xef6bbf64) + 4ec
mount(0xef6bbf90,0xef6bbf80) + c2
syscall_ap(0x8047f5a,0x8047f6c,0x104,0x8064248,0x8047e4c,0x4) + 4e
_sys_call() + e5

Execute the ddi_dma_addr_bind_handle subroutine you halted in and return to the routine that called it:

kadb[0]: :u
stopped at:
cadp_phys_buf_alloc+0xdb:       addl   $0x24,%esp
[mutex_exit_critical_size+0xb,-]
 /|\
  |
  |_______ This might not be called cadp_phys_..... 
             it may be something else.

The eighth argument in ddi_dma_addr_bind_handle (0xe1235c68) is a DMA cookie. This is what should contain an address greater than 32 bits.

11. Check this by typing:

kadb[0]: 0xe1235c68/2X
0xe1235c68:     6a8f840         0 (Just a 32-bit address
                                   as this is 0.)

If the far right value is nonzero, the driver handled a greater than 32-bit address. If the value is zero, it is necessary to examine more cookies. The ninth argument (0xef6bbb18) is a pointer to the number of cookies.

12. Check this by typing:

kadb[0]: 0xef6bbb18/X
0xef6bbb18:     1

13. If the number of cookies is greater than 1, set a breakpoint in ddi_dma_nextcookie to check whether one of the returned cookies has an address greater than 32 bits:

kadb[0]: ddi_dma_nextcookie:b

14. Continue execution:

kadb[0]: :c
# breakpoint at:
ddi_dma_nextcookie:                  pushl  %ebp

Generate the stack trace:

kadb[0]: $c

ddi_dma_nextcookie(0xe0a00a98,0xe3ca5c7c) + 0 ....

Execute the ddi_dma_nextcookie subroutine you halted in and return to the routine that called it:

kadb[0]: :u

stopped at:
cadp_phys_buf_alloc+0x.....
 /|\
  |
  |_______ This might not be called cadp_phys_..... 
             it may be something else

The second argument in ddi_dma_nextcookie (0xe3ca5c7c) is a DMA cookie. This is what should contain an address greater than 32 bits.

15. Check this by typing:

kadb[0]: 0xe3ca5c7c/2X

0xe3ca5c7c:     1c63000      0 (Just a 32-bit address
                                   as this is 0.)

If the far right value is nonzero, the driver handled a greater than 32-bit address. You might have to perform this sequence a number of times before getting a greater than 32-bit address.
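The reading of the two words printed by /2X can be expressed as a small sketch. It assumes, as the steps above do, that the first word holds the low 32 bits of the cookie address and the second (far right) word holds the high 32 bits; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the two words printed by "/2X" (low word first) into one address. */
uint64_t
cookie_addr_from_words(uint32_t first_word, uint32_t second_word)
{
    return (((uint64_t)second_word << 32) | first_word);
}

/* Nonzero if the address cannot be expressed in 32 bits. */
int
cookie_addr_is_high(uint32_t first_word, uint32_t second_word)
{
    uint64_t addr = cookie_addr_from_words(first_word, second_word);

    /* The upper half of the address is exactly the far right word. */
    assert((addr >> 32) == second_word);
    return (addr > 0xffffffffULL);
}
```

With the values shown above (1c63000 and 0) the address is not high. A hypothetical second word of 1 would instead give the address 0x101c63000, which is above 4 Gbytes.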
