
Converting Device Drivers to Support PCI Hotplugging

 

This paper provides the information device driver writers need to convert device drivers to be hotplug-capable. It is applicable to the Solaris 2.6 operating environment.

Introduction

The Solaris 2.6 environment currently supports Dynamic Reconfiguration (DR) (see "References"), which places similar requirements on device drivers. The information in this paper is being provided in advance of PCI hotplugging support in the Solaris product so that driver writers will have sufficient time to develop and test hotplug-capable PCI device drivers.

The focus of this paper is on bus interfaces, such as those described in the PCI hotplugging specification (see the PCI Special Interest Group Web site), which provide a controlled environment for hot removal and insertion of cards.

Before you can convert a driver to be hotplug-capable, you need to understand hotplug scenarios and how they affect device drivers. Section 2 describes controlled PCI hotplugging as outlined in the PCI hotplug specification. Section 3 describes the requirements on device drivers and the flow of control during hot removal and insertion. Section 4 outlines the steps required to make a device driver hotplug-capable. Section 5 provides guidelines for testing hotplug-capable device drivers in Solaris 2.6.

For more information, download Sample Solaris Drivers. This package contains a number of sample drivers that are hotplug-capable and that provide greater insight into the process.

The document Writing Device Drivers describes in detail how to write a Solaris device driver. This paper complements that document, which is a prerequisite to understanding this paper. (Note: On docs.sun.com, find the version of Writing Device Drivers that matches the release number you are working with.)

Writing Device Drivers does not describe the use of the suspend/resume feature, which is also used in Dynamic Reconfiguration, because it does not affect device drivers. However, suspend/resume requirements should be considered when testing the driver.

Overview of PCI Hotplugging

This section summarizes the sequence of events in hot removal and hot insertion that are relevant to the device driver. For more details and terminology definitions, refer to the PCI hotplug specification.

Hotplugging typically involves:

1. Preparation

Indicate to the system that a hotplug operation will take place on particular hardware.

2. Isolation

Specify that the hardware be isolated in preparation for a safe insertion or removal.

3. Removal, insertion, or both

Remove, insert, or replace the hardware.

4. Verification

Verify that the card is accessible and perform minimal sanity checks.

5. Integration

Re-integrate the hardware back into the system.

6. Configuration

Use kernel functions to configure the system, loading and attaching device drivers for newly installed hardware, and deconfiguring the system for removed hardware.

Hot Removal With No Replacement

This operation is used to remove a card. You may want to remove a card because, for example, its failure mode affects system operation. To perform a hotplug removal:

1. Indicate, using an administrative application, that a card needs to be removed or replaced.

2. Through the administrative application, use kernel functions to take the appropriate driver offline.

If the driver instance is busy or open, this operation may fail and you will first have to terminate the processes that are accessing the device.

3. Through the administrative application, use kernel functions to turn off the appropriate slot; an optional slot-state indicator will show that the slot is off and it is now safe to remove the card.

4. Remove the card.

5. Inform the administrative application that the card has been removed.

Hot Insertion

This procedure is used to insert a card in a previously unused slot.

1. Indicate, using an administrative application, that a card is to be inserted in a particular slot.

2. Through the administrative application, use kernel functions to turn off the appropriate slot (if it isn't off already).

An optional slot-state indicator will show that the slot is off.

3. Insert the card.

4. Indicate to the administrative application that the card has been inserted.

5. Through the administrative application, use kernel functions to turn on the slot and the slot-state indicator.

6. Through the kernel, configure the system, and load and attach the device driver for the newly installed hardware.

Hot Removal Followed by Insertion

This procedure is used to replace a card with either an identical or a different card.

1. Indicate, using an administrative application, that a card is to be removed or replaced in a given slot.

2. Through the administrative application, use kernel functions to place the appropriate driver offline.

If the driver instance is busy or open, this operation may fail and you will first have to terminate any processes that are accessing the device.

3. Use the administrative application to disable the appropriate slot, and turn on an optional slot-state indicator to indicate the physical slot and verify that it is now safe to remove the card.

4. Remove the card.

5. Insert a replacement card.

6. Indicate to the administrative application that a card has been inserted.

7. Through the administrative application, use kernel functions to turn on the slot, and, optionally, turn off the slot-state indicator.

8. Through the kernel, configure the system, and load and attach the device driver for the newly installed hardware.

Solaris Hotplugging Driver Issues

This section describes the flow of control from a device driver perspective. Most device drivers can support hotplugging with minimal changes.

Hot Removal

The kernel checks the driver state, and if the driver instance is not open, the kernel can invoke the driver's detach(9E) entry point with the command DDI_DETACH. The driver instance checks its internal state and verifies whether it is safe to detach. If so, the driver frees resources, cancels timeouts, and returns success. At this point, the kernel can remove power from the slot and the hardware can be removed.

Hot Insertion

Hot insertion uses the normal probe(9E) and attach(9E) entry points of a driver after the hardware has been inserted into the slot in a controlled manner.

Hotplug-Capable Device Driver Development

This section briefly discusses how to make Solaris device drivers compatible with Dynamic Reconfiguration and with future Solaris hotplug technologies.

D_HOTPLUG Flag

Hotplug-capable drivers should set D_HOTPLUG in their cb_ops cb_flag.

Detach Entry Point

The driver's detach entry point can be called with these commands:

  • DDI_DETACH
    The system attempts to unload the driver module. This may be because memory is low or the hardware will be removed. There is no information available to the driver to distinguish between these two cases.
  • DDI_PM_SUSPEND
    This command is issued when the device is being suspended after a period of inactivity. Refer to Writing Device Drivers for more details on device power management.
  • DDI_SUSPEND
    This command is issued when the entire system is suspended before power is (possibly) removed or for Dynamic Reconfiguration. All incoming or queued requests must be blocked until the system is resumed. Refer to the bst or sst sample drivers for example code, and to Writing Device Drivers for documentation.

It is strongly recommended to fully implement support for all detach commands.

Note the following uses are possible:

  • DDI_SUSPEND may be followed by DDI_RESUME with no power interruption.
  • DDI_PM_SUSPEND may be followed by power interruption, and is followed by DDI_PM_RESUME.
  • DDI_PM_SUSPEND may be followed by DDI_SUSPEND, DDI_RESUME and DDI_PM_RESUME.
  • DDI_DETACH may be followed by power interruption; any further references to the device will need to be preceded by a DDI_ATTACH.
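The sequences listed above can be pictured as a small state machine. The following is a user-space sketch for illustration only, not kernel code: the xx_state and xx_cmd enums and the xx_next_state() function are hypothetical names that stand in for the DDI_* commands, and treating every unlisted transition as invalid is an assumption of the model rather than a statement about the framework.

```c
#include <assert.h>

/* Hypothetical states of one driver instance in this model. */
enum xx_state {
	XX_DETACHED,		/* before DDI_ATTACH or after DDI_DETACH */
	XX_ATTACHED,		/* attached and running */
	XX_SUSPENDED,		/* after DDI_SUSPEND */
	XX_PM_SUSPENDED,	/* after DDI_PM_SUSPEND */
	XX_PM_AND_SUSPENDED,	/* DDI_PM_SUSPEND followed by DDI_SUSPEND */
	XX_INVALID		/* sequence not listed in the text */
};

/* Stand-ins for the attach(9E) and detach(9E) commands. */
enum xx_cmd {
	XX_ATTACH, XX_DETACH,
	XX_SUSPEND, XX_RESUME,
	XX_PM_SUSPEND, XX_PM_RESUME
};

/* Return the state reached when command c arrives in state s. */
enum xx_state
xx_next_state(enum xx_state s, enum xx_cmd c)
{
	switch (s) {
	case XX_DETACHED:
		return (c == XX_ATTACH ? XX_ATTACHED : XX_INVALID);
	case XX_ATTACHED:
		if (c == XX_DETACH)
			return (XX_DETACHED);
		if (c == XX_SUSPEND)
			return (XX_SUSPENDED);
		if (c == XX_PM_SUSPEND)
			return (XX_PM_SUSPENDED);
		return (XX_INVALID);
	case XX_SUSPENDED:
		return (c == XX_RESUME ? XX_ATTACHED : XX_INVALID);
	case XX_PM_SUSPENDED:
		if (c == XX_PM_RESUME)
			return (XX_ATTACHED);
		if (c == XX_SUSPEND)
			return (XX_PM_AND_SUSPENDED);
		return (XX_INVALID);
	case XX_PM_AND_SUSPENDED:
		/* DDI_RESUME arrives first; DDI_PM_RESUME is still pending */
		return (c == XX_RESUME ? XX_PM_SUSPENDED : XX_INVALID);
	default:
		return (XX_INVALID);
	}
}
```

Walking the third sequence through the model (PM suspend, suspend, resume, PM resume) returns the instance to the attached state, which is the nesting a driver's own state flags need to track.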

During Dynamic Reconfiguration, there are two possible detach scenarios.

1. All the devices on the detaching system board receive a DDI_DETACH, and all other devices keep running; or

2. All devices (other than those on the detaching system board) receive a DDI_SUSPEND. All the devices on the detaching system board receive a DDI_DETACH. All devices (other than those on the detaching system board) receive a DDI_RESUME.

DDI_DETACH and DDI_SUSPEND are discussed in more detail below, with emphasis on common errors in device drivers that support these commands. For more details, refer to Writing Device Drivers.

DDI_DETACH Command

The DDI_DETACH operation is the inverse of DDI_ATTACH. The DDI_DETACH handling should only deallocate the data structures for the specified instance. Driver global resources must only be deallocated in _fini(). Since instances will be assigned in arbitrary order, the driver must be able to handle out-of-sequence presentation of instances. In fact, no assumption should be made about the instance number.
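One way to avoid assumptions about instance numbers is to look up per-instance state by number instead of indexing a fixed array. The following user-space sketch illustrates the idea with a simple linked list; all names (xx_inst, xx_lookup, xx_model_attach, xx_model_detach) are hypothetical. In a real driver the ddi_soft_state(9F) family of routines would normally fill this role.

```c
#include <assert.h>
#include <stdlib.h>

/* Per-instance state in the model; a real driver holds much more. */
struct xx_inst {
	int		instance;
	struct xx_inst	*next;
};

/* Driver-global list head; in a driver this is torn down in _fini(). */
static struct xx_inst *xx_list;

/* Find the state for an instance, or NULL if it is not attached. */
struct xx_inst *
xx_lookup(int instance)
{
	struct xx_inst *p;

	for (p = xx_list; p != NULL; p = p->next) {
		if (p->instance == instance)
			return (p);
	}
	return (NULL);
}

/* Allocate state for one instance; instance numbers may arrive in any order. */
int
xx_model_attach(int instance)
{
	struct xx_inst *p;

	if (xx_lookup(instance) != NULL)
		return (-1);
	p = calloc(1, sizeof (*p));
	if (p == NULL)
		return (-1);
	p->instance = instance;
	p->next = xx_list;
	xx_list = p;
	return (0);
}

/* Free only the named instance; other instances are left untouched. */
int
xx_model_detach(int instance)
{
	struct xx_inst **pp;

	for (pp = &xx_list; *pp != NULL; pp = &(*pp)->next) {
		if ((*pp)->instance == instance) {
			struct xx_inst *dead = *pp;

			*pp = dead->next;
			free(dead);
			return (0);
		}
	}
	return (-1);
}
```

Attaching instance 7 before instance 2 and then detaching 7 leaves instance 2 intact, which is the behavior the paragraph above asks for.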

If the driver fails the DDI_DETACH, the driver should clearly indicate, using console messages, what the user should do to ensure that the next DDI_DETACH will succeed. However, flooding the console with messages should be avoided.
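A simple way to keep such messages useful without flooding the console is to cap how many times the driver reports the same condition. The sketch below is a user-space illustration: the limit, the structure, and the function name are hypothetical, and fprintf() stands in for the cmn_err(9F) call a driver would use.

```c
#include <assert.h>
#include <stdio.h>

#define	XX_MAX_DETACH_WARNINGS	3	/* hypothetical cap per instance */

struct xx_warn_state {
	unsigned int	detach_failures;	/* failures seen so far */
};

/*
 * Report a failed detach together with the remedy the user should apply.
 * Returns 1 if a message was printed, 0 if it was suppressed.
 */
int
xx_warn_detach_failed(struct xx_warn_state *ws, int instance,
    const char *remedy)
{
	ws->detach_failures++;
	if (ws->detach_failures > XX_MAX_DETACH_WARNINGS)
		return (0);

	/* A driver would call cmn_err(CE_WARN, ...) here. */
	(void) fprintf(stderr, "xx%d: detach failed: %s\n", instance, remedy);
	return (1);
}
```

The counter would typically be reset after a successful detach or attach so that a later, unrelated failure is reported again.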

The DDI_DETACH command code should perform the following actions.

1. Check whether DDI_DETACH is safe.

The driver can assume that the device has been closed before DDI_DETACH is issued. However, there may be outstanding callbacks that cannot be cancelled, or the device may not currently be in a state that permits it to be reliably shut down and restarted later. While timeouts or callbacks are still active, proper locking must be enforced.

The driver should not block while waiting for callback completion or for the device to become idle.

Devices that maintain some state after a close operation must be carefully analyzed. When a driver not currently in use is automatically unloaded (for example, because the system memory is low) and later automatically reloaded when the user opens the device, this might cause undesirable operation.

For example, a tape driver that supports non-rewinding tape access might fail the detach operation when the tape head is not at the beginning of the tape. If the drive is powered down, the head position will be lost.

2. Shut down the device and disable interrupts.

A device needs enough hardware information/support to be able to shut off and restart interrupts. This may already be coded in the driver as a function of the existing detach routines.

3. Remove any interrupts registered with the system.

4. Cancel any outstanding timeouts and callbacks.

5. Quiesce or remove any driver threads.

6. Deallocate memory resources.

The driver should be unloadable without memory leaks.

7. Unmap any mapped device registers.

8. Execute ddi_set_driver_private(dip, NULL).

9. Free the softstate structure for this instance.

When there is failure during detach, the driver must decide whether to continue the detach and return success or undo the detach actions completed to that point and return failure. Undoing might be risky and it is usually preferable to continue the detach operation.
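The ordering of the steps above can be sketched as follows. This is a user-space model, not driver code: the xx_model structure and its flags are hypothetical stand-ins for real resources, and the comments name the DDI routines that a driver would typically call at each step. The point of the sketch is that the safety check comes first and does not block, so a refused detach leaves nothing to undo.

```c
#include <assert.h>

#define	XX_SUCCESS	0	/* stands in for DDI_SUCCESS */
#define	XX_FAILURE	(-1)	/* stands in for DDI_FAILURE */

/* Each flag models one resource acquired during DDI_ATTACH. */
struct xx_model {
	int	callbacks_pending;	/* callbacks that cannot be cancelled */
	int	intr_enabled;
	int	intr_registered;
	int	timeout_active;
	int	threads_running;
	int	buf_allocated;
	int	regs_mapped;
	int	private_set;
	int	softstate_allocated;
};

int
xx_model_detach(struct xx_model *m)
{
	/* 1. Check whether detach is safe; fail at once instead of blocking. */
	if (m->callbacks_pending != 0)
		return (XX_FAILURE);

	m->intr_enabled = 0;		/* 2. shut down device, mask interrupts */
	m->intr_registered = 0;		/* 3. e.g. ddi_remove_intr(9F) */
	m->timeout_active = 0;		/* 4. e.g. untimeout(9F) */
	m->threads_running = 0;		/* 5. quiesce or remove threads */
	m->buf_allocated = 0;		/* 6. e.g. kmem_free(9F) */
	m->regs_mapped = 0;		/* 7. unmap device registers */
	m->private_set = 0;		/* 8. ddi_set_driver_private(dip, NULL) */
	m->softstate_allocated = 0;	/* 9. e.g. ddi_soft_state_free(9F) */

	return (XX_SUCCESS);
}
```

Because the only failure exit is taken before any resource is released, the model never has to decide between undoing and continuing; a real driver can aim for the same property by doing all of its checks in step 1.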

DDI_DETACH may be followed by power interruption; any further references to the device will need to be preceded by a DDI_ATTACH.

Note the following when using timeout() routines:

  • Avoid using them if at all possible; too many are generally an indication of a poorly designed driver.
  • Be careful that you do not have multiple instances of the same routine running; for example, a second call to un->un_tid = timeout(XX_to, arg, ticks); will cause un_tid to be overwritten, removing the ability to untimeout() the first timeout() routine.
  • Self-rescheduling timeout() routines are those routines that contain a call to timeout() to reschedule themselves. These need particular care to kill.

The timeout routine should take the form of:

static void XX_start_timeout(struct xx *un);

static void
XX_timeout(caddr_t arg)
{
	struct xx *un = (struct xx *)arg;

	mutex_enter(&un->un_lock);

	/* this timeout has fired, so its id is no longer valid */
	un->un_tid = 0;

	.....

	/* reschedule, unless a stop is in progress */
	XX_start_timeout(un);

	mutex_exit(&un->un_lock);
}

static void
XX_start_timeout(struct xx *un)
{
	ASSERT(MUTEX_HELD(&un->un_lock));

	/* never overwrite the id of a timeout that is still pending */
	if ((un->un_tid == 0) && ((un->un_flags & XXSTOP) == 0)) {
		un->un_tid = timeout(XX_timeout, (caddr_t)un, ticks);
	}
}

static void
XX_stop_timeout(struct xx *un)
{
	int	tid;

	mutex_enter(&un->un_lock);

	if ((tid = un->un_tid) != 0) {
		/* do not reschedule timeout */
		un->un_flags |= XXSTOP;

		/* do not hold across untimeout() */
		mutex_exit(&un->un_lock);
		(void) untimeout(tid);
		mutex_enter(&un->un_lock);

		un->un_tid = 0;
		un->un_flags &= ~XXSTOP;
	}

	mutex_exit(&un->un_lock);
}

When deallocating memory, always verify first that the pointer is valid:

#if NONONONO
	kmem_free(un->un_buf, un->un_buf_len);
#else
	if (un->un_buf) {
		kmem_free(un->un_buf, un->un_buf_len);
		un->un_buf = NULL;
	}
#endif

DDI_SUSPEND Command

System power management and the Dynamic Reconfiguration framework pass the command DDI_SUSPEND to the detach(9E) driver entry point to request that the driver save the device hardware state. The driver may fail the suspend operation if outstanding operations cannot be completed soon or aborted, or if non-cancellable callbacks are outstanding, in which case the system will abort the suspend operation. Note that the driver instance may already have been power managed using DDI_PM_SUSPEND.

To process the DDI_SUSPEND command, the driver must:

1. Set a suspended flag to block new operations.

2. Wait until outstanding operations have completed, or abort them if they can be restarted.

3. Block further operations from being initiated until the device is resumed (except for dump(9E) requests). Refer to sample code in Writing Device Drivers and the bst sample driver.

4. Cancel pending callouts such as timeout callbacks, and quiesce or destroy other driver threads.

5. Save any volatile hardware state in memory. This state includes the contents of device registers, and can also include downloaded firmware.

Power Management in Writing Device Drivers describes in more detail some special power management considerations.

DDI_SUSPEND will always be followed by DDI_RESUME. There may or may not be power interruption.
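The suspend steps above can be modelled in user space as follows. This is only a sketch: the xx_dev structure, the register array, and the function names are hypothetical, and where a real driver would block new requests (for example on a condition variable) and wait a bounded time for outstanding ones, the model simply rejects them.

```c
#include <assert.h>
#include <string.h>

#define	XX_NREGS	4	/* hypothetical number of device registers */

struct xx_dev {
	int		suspended;		/* step 1: gate for new I/O */
	int		outstanding;		/* requests in flight */
	unsigned int	hw_regs[XX_NREGS];	/* stands in for the hardware */
	unsigned int	saved_regs[XX_NREGS];	/* copy kept across suspend */
};

/* Start a request; refused while suspended (a driver would block instead). */
int
xx_model_start_io(struct xx_dev *d)
{
	if (d->suspended)
		return (-1);
	d->outstanding++;
	return (0);
}

/* Complete a request that xx_model_start_io() accepted earlier. */
void
xx_model_io_done(struct xx_dev *d)
{
	assert(d->outstanding > 0);
	d->outstanding--;
}

/* Model of the DDI_SUSPEND handling; returns 0 on success, -1 on failure. */
int
xx_model_suspend(struct xx_dev *d)
{
	d->suspended = 1;		/* steps 1 and 3: block new operations */

	if (d->outstanding != 0) {	/* step 2: cannot finish, so fail */
		d->suspended = 0;	/* let the system abort the suspend */
		return (-1);
	}

	/* Step 4 (cancel timeouts, quiesce threads) is omitted in the model. */

	/* Step 5: save volatile hardware state in memory. */
	memcpy(d->saved_regs, d->hw_regs, sizeof (d->saved_regs));
	return (0);
}
```

Note that a failed suspend clears the flag again, so the device keeps working normally after the system aborts the operation.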

Attach Entry Point

The system calls attach(9E) to attach a device instance to the system or to resume operation after power has been suspended. The driver's attach(9E) entry point should handle these commands:

  • DDI_ATTACH
    Initialize the device instance. Refer to Writing Device Drivers for a detailed discussion of this command.
  • DDI_PM_RESUME
    Restore the hardware state of a device after the device has been suspended. Refer to Writing Device Drivers.
  • DDI_RESUME
    Restore the hardware state of a device after the system has been suspended. Refer to Writing Device Drivers.

DDI_PM_RESUME Command

When power is restored to the device that was suspended with DDI_PM_SUSPEND, that device will be resumed using the DDI_PM_RESUME command. The device driver should restore the hardware state, set up timeouts again if necessary, and enable interrupts again.

The DDI_PM_RESUME code should make no assumptions about the state of the hardware, which may or may not have lost power.

DDI_RESUME Command

When power is restored to the system or the system is unquiesced, each device that was suspended will be resumed using the DDI_RESUME command. The device driver should restore the hardware state, set up timeouts again if necessary, and enable interrupts again.

If the device is still suspended by DDI_PM_SUSPEND, the driver has to enter a state where it will call ddi_dev_is_needed(9F) for any new or pending requests, since an attach() with DDI_PM_RESUME is still forthcoming.

The resume code should make no assumptions about the state of the hardware, which may or may not have lost power.
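A minimal user-space sketch of that rule follows, with hypothetical names throughout (xx_dev, xx_model_resume, the register arrays). Rather than testing whether power was lost, the resume path reprograms every register from the copy saved at suspend time, which is correct in both cases.

```c
#include <assert.h>
#include <string.h>

#define	XX_NREGS	4	/* hypothetical number of device registers */

struct xx_dev {
	int		suspended;
	int		intr_enabled;
	unsigned int	hw_regs[XX_NREGS];	/* stands in for the hardware */
	unsigned int	saved_regs[XX_NREGS];	/* saved during suspend */
};

/* Model of the DDI_RESUME handling; returns 0 on success, -1 on failure. */
int
xx_model_resume(struct xx_dev *d)
{
	if (!d->suspended)
		return (-1);

	/*
	 * Restore everything unconditionally: the hardware may or may not
	 * have lost power, and the driver cannot tell which.
	 */
	memcpy(d->hw_regs, d->saved_regs, sizeof (d->hw_regs));

	/* A driver would also set up its timeouts again at this point. */
	d->intr_enabled = 1;
	d->suspended = 0;	/* allow blocked requests to proceed */
	return (0);
}
```

The same function gives the right result whether hw_regs still holds its old contents or was cleared by a power cycle.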

Special Issues With Nexus Drivers

Nexus drivers usually do not have a cb_ops(9S) structure, so to enable hotplugging, a minimal cb_ops(9S) structure must be created and exported in the dev_ops(9S).

static struct cb_ops xx_cb_ops = {
	nodev,			/* open */
	nodev,			/* close */
	nodev,			/* strategy */
	nodev,			/* print */
	nodev,			/* dump */
	nodev,			/* read */
	nodev,			/* write */
	nodev,			/* ioctl */
	nodev,			/* devmap */
	nodev,			/* mmap */
	nodev,			/* segmap */
	nochpoll,		/* poll */
	nodev,			/* cb_prop_op */
	0,			/* streamtab */
	D_MP | D_HOTPLUG,	/* Driver compatibility flag */
	CB_REV,			/* cb_rev */
	nodev,			/* async I/O read entry point */
	nodev			/* async I/O write entry point */
};

/*
 * autoconfiguration routines.
 */
static struct dev_ops xx_dev_ops = {
	....
	&xx_cb_ops,		/* devo_cb_ops */
	....
};

The DDI_DETACH and DDI_SUSPEND/RESUME requirements are very similar to those of leaf drivers described above.

Device Driver Testing

This section provides some hints for driver testing.

Unloading

The driver should be able to sustain heavy driver activity, followed by module unload, without errors or memory leaks. This can be loop tested as follows (using C shell):

% while (1)
	<run driver test>
	modunload -i <module id>
end

Note that modunload takes the module ID reported by modinfo, not the driver name.

Alternatively:

% while (1)
	add_drv <driver name>
	<run driver test>
	rem_drv <driver name>
end

Suspend/Resume Testing

Start up a test on your driver and simultaneously run:

uadmin 3 8

in a loop to test Suspend/Resume. The driver test should check for data corruption or unexpected loss of state.

You can also test Suspend/Resume by generating I/O to your device, and then pressing the top-right button on the keyboard to suspend the machine. Then resume the machine; there should be no panics, error messages, or data corruption.

If you have access to a server system that supports Dynamic Reconfiguration, you may also want to test your driver on this configuration.

Conclusion

The Solaris 2.6 driver framework provides the necessary hooks to make drivers hotplug-capable with minimal changes to the driver code. If a driver already supports the detach commands DDI_DETACH and DDI_SUSPEND and the attach command DDI_RESUME, simply adding the D_HOTPLUG flag to the cb_ops cb_flag will make the driver hotplug-capable.

References

Refer to these materials for additional information on this topic.