

SCSI DISK FMA Project Part 1: SCSI Device Drivers as FMA Telemetry Detectors

 
By Chris Horne, Grant Zhang, David Zhang, and Ti Liu, December 2008  

This is the first article in a series about the SCSI DISK FMA project.

Overview
What Is an FMA Error Detector?

An error "detector" is responsible for observing and reporting hardware errors. Any device that has the ability to report status can be an error detector. The driver for this device is responsible for sending out status reports. An example is the NIC (bge) driver for PCI Ethernet cards.

To detect as many hardware errors as possible, Solaris drivers must support the FMA framework. A driver needs to send error report events (ereports) to the FMA daemon (fmd), which uses them to diagnose FMA faults or upsets. End users can use fmdump to retrieve ereports from the existing error log and fmadm to manage system faults.

Why Use SD to Report Status?

The device can be an internal disk or a device located in an external enclosure. In both cases, the device detects problems and reports them using the SCSI transport and protocol defined by the T10 standards. On Solaris, the endpoint that detects these errors is the leaf driver: sd(7D) for disks or st(7D) for tapes.

Note: The putback for PSARC 2008/558 is limited to the disk leaf driver, sd(7D).

How to Use SD to Report Status

To report a device error, the driver needs a device path and an error reason. The sd driver obtains the error reason by checking the sense data returned for any failed SCSI command. If MPxIO is enabled, the sd driver can also discover the real device path by using the devid and the instance number.
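As a rough illustration of how an error reason can be read from sense data, the following Python sketch pulls the sense key, ASC, and ASCQ out of a fixed-format sense buffer, using the byte layout defined by the T10 SPC standard. It is not the sd driver's code (the driver is a C kernel module). The function name decode_fixed_sense is hypothetical, and the sample bytes are the sense-data values from the sample ereport shown later in this article.

```python
def decode_fixed_sense(sense):
    """Decode key/asc/ascq from a fixed-format SCSI sense buffer.

    Hypothetical helper for illustration only; not part of sd or FMA.
    """
    sense = bytes(sense)
    if len(sense) < 14:
        raise ValueError("need at least 14 bytes of fixed-format sense data")
    response_code = sense[0] & 0x7F          # bit 7 of byte 0 is the VALID bit
    if response_code not in (0x70, 0x71):    # current / deferred fixed format
        raise ValueError("not fixed-format sense data")
    return {
        "valid": bool(sense[0] & 0x80),
        "key": sense[2] & 0x0F,              # low nibble of byte 2
        "information": int.from_bytes(sense[3:7], "big"),
        "asc": sense[12],
        "ascq": sense[13],
    }


# Sense bytes copied from the sample ereport in this article.
sample = [0xf0, 0xdd, 0xb3, 0xfe, 0xca, 0xdd, 0xba, 0xfe, 0xca, 0xdd,
          0xba, 0xfe, 0x11, 0x00, 0xba, 0xfe, 0xca, 0xdd, 0xba, 0xfe]
result = decode_fixed_sense(sample)
print("key=0x%x asc=0x%x ascq=0x%x" % (result["key"], result["asc"], result["ascq"]))
```

For the sample buffer, the decoded key, asc, and ascq match the corresponding properties in that ereport (0x3, 0x11, and 0x0).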

Previously, the sd driver could only print error messages to the syslog. With the SCSI FMA project integrated, any detected disk error can be recorded in the structured error log (viewable with fmdump -e) and can trigger automatic diagnosis that reports system faults if necessary. Users can refer to the article describing predictive self-healing to see detailed steps for resolving system faults.

Original SCSI DISK FMA

Before the SCSI DISK FMA putback, a user-land module for the fault manager daemon (fmd) was already deployed. Running fmadm config lists it:

    disk-transport             1.0     active  Disk Transport Agent
 

The original SCSI DISK FMA reports only three types of disk faults via a user-land daemon detector (disk-transport). The disk faults are:

  • DISK-8000-0X -- fault.io.disk.predictive-failure
  • DISK-8000-12 -- fault.io.disk.over-temperature
  • DISK-8000-2J -- fault.io.disk.self-test-failure

Browse the article for Message ID DISK-8000-0X for detailed information.

The original SCSI FMA is limited to the user-land level. It is useful only for disks, and it lacks the device status reported through the T10 standards-defined SCSI protocol.

Current SCSI DISK FMA

The current project (SCSI FMA phase III) introduces two new disk faults:

  • DISK-8000-3E -- fault.io.scsi.cmd.disk.dev.rqs.derr
  • DISK-8000-4Q -- fault.io.scsi.cmd.disk.dev.rqs.merr

The leaf driver (sd) is responsible for converting SCSI-protocol-defined telemetry into FMA ereport form. By the time it posts an ereport, the driver has already performed its own low-level error handling, which initiates and coordinates command-level retry and recovery procedures within the driver itself. In this phase, every ereport carries a driver-assessment property that indicates the outcome of this low-level error handling. The following driver-assessment values are used:

fatal, retry, recovered, fail, info
 

In this phase, Solaris systems record all failed SCSI commands in the FMA error log. Use fmdump -eV to retrieve the full contents of the existing logs. For example, if a driver fails an operation because it received medium error sense data, the ereport looks like this:

    ...
    class = ereport.io.scsi.cmd.disk.dev.rqs.merr
        device-path = /pci@0,0/pci1022,7458@1/pci11ab,11ab@1/disk@0,0
        devid = id1,sd@SATA_____HITACHI_HDS7250S______KRVN67ZBG8VSTF

        driver-assessment = retry
        op-code = 0x8
        cdb = 0x8 0x0 0x3e 0xc1 0x1 0x0
        pkt-reason = 0x0
        pkt-state = 0x0
        pkt-stats = 0x0
        stat-code = 0x2
        key = 0x3
        asc = 0x11
        ascq = 0x0
        sense-data = 0xf0 0xdd 0xb3 0xfe 0xca 0xdd 0xba 0xfe 0xca 0xdd 0xba 0xfe 0x11 0x0 0xba 0xfe 0xca 0xdd 0xba 0xfe
        lba = 0x12345678
    ...
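To show how the properties in such an ereport can be analyzed, the following Python sketch parses "name = value" lines into a dictionary and then decodes the cdb property. Because op-code 0x8 is READ(6), the sketch uses the READ(6) layout from the T10 standards (a 21-bit LBA in bytes 1 to 3 and a transfer length in byte 4). The helper names parse_ereport, hex_bytes, and decode_read6 are hypothetical and are not part of any FMA tool.

```python
def parse_ereport(text):
    """Parse 'name = value' lines of fmdump-style text into a dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" = ")
        if sep:                              # skip lines without ' = '
            fields[name.strip()] = value.strip()
    return fields


def hex_bytes(value):
    """Convert a string such as '0x8 0x0 0x3e' into bytes."""
    return bytes(int(token, 16) for token in value.split())


def decode_read6(cdb):
    """Decode a 6-byte READ(6) command descriptor block."""
    if len(cdb) != 6 or cdb[0] != 0x08:
        raise ValueError("not a READ(6) CDB")
    lba = ((cdb[1] & 0x1F) << 16) | (cdb[2] << 8) | cdb[3]
    blocks = cdb[4] or 256                   # a length of 0 means 256 blocks
    return {"lba": lba, "blocks": blocks, "control": cdb[5]}


# A few lines copied from the sample ereport above.
SAMPLE = """
    class = ereport.io.scsi.cmd.disk.dev.rqs.merr
        driver-assessment = retry
        op-code = 0x8
        cdb = 0x8 0x0 0x3e 0xc1 0x1 0x0
        key = 0x3
"""

fields = parse_ereport(SAMPLE)
request = decode_read6(hex_bytes(fields["cdb"]))
print(fields["class"], fields["driver-assessment"])
print("READ(6) lba=0x%x blocks=%d" % (request["lba"], request["blocks"]))
```

For the sample cdb, this decodes to a one-block read at LBA 0x3ec1.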
 

If the sense key is hardware error (0x4) or medium error (0x3), messages also appear in the syslog and on the console, and you can use fmadm faulty to see the diagnosed fault along with the recommended steps for fixing the hardware error. See the following example.

--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Sep 25 14:06:18 1707f2f9-af9c-c76e-a166-bdb5fcc4cad6  DISK-8000-4Q   Critical 

Fault class : fault.io.scsi.cmd.disk.dev.rqs.merr
Affects     : dev:///:devid=id1,sd@SATA_____HITACHI_HDS7250S______KRVN63ZAJLP44D//pci@0,0/pci1022,7458@1/pci11ab,11ab@1/disk@5,0
                  faulted and taken out of service
FRU         : "HD_ID_23" (hc://:product-id=Sun-Fire-X4500:chassis-id=00-14-4F-20-E3-08:server-id=icecube:serial=KRVN63ZAJLP44D:part=HITACHI-HDS7250SASUN500G-0633KLP44D:revision=K2AOAJ0A/chassis=0/bay=23/disk=0)
                  faulty

Description : The command was terminated with a non-recovered error condition
              that may have been caused by a flaw in the media or an error in
              the recorded data. 
              Refer to http://sun.com/msg/DISK-8000-4Q for more information.

Response    : The device may be offlined or degraded.

Impact      : It is likely that continued operation will result in data
              corruption, which may eventually cause the loss of service or the
              service degradation.

Action      : Schedule a repair procedure to replace the affected device. Use
              'fmadm faulty' to find the affected disk.
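The relationship between sense keys, ereport classes, and the two new faults can be summarized in a small lookup table. The following Python sketch is only an illustration. The merr names and both message IDs come from this article; the pairing of derr with the hardware error sense key (0x4) and the derr ereport class name are assumptions inferred from the fault class names, and classify_sense_key is a hypothetical helper.

```python
# Only the two fault classes named in this article are modeled here.
CLASS_STEM = "io.scsi.cmd.disk.dev.rqs"

SENSE_KEY_INFO = {
    0x3: ("merr", "DISK-8000-4Q"),   # medium error
    0x4: ("derr", "DISK-8000-3E"),   # hardware error (pairing is assumed)
}


def classify_sense_key(key):
    """Return (ereport class, fault class, message ID), or None."""
    info = SENSE_KEY_INFO.get(key)
    if info is None:
        return None                  # other sense keys are not modeled
    suffix, msg_id = info
    return ("ereport.%s.%s" % (CLASS_STEM, suffix),
            "fault.%s.%s" % (CLASS_STEM, suffix),
            msg_id)


print(classify_sense_key(0x3))
```

Sense keys other than 0x3 and 0x4 return None in this sketch, which says nothing about how the driver or the diagnosis engine actually treats them.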
 

More detailed steps for handling SCSI FMA faults and analyzing ereports are addressed in Part 2 and Part 3 of this series. Enjoy debugging.

Future SCSI DISK FMA

Planned enhancements to the SCSI DISK FMA include:

  • SCSI FMA for tape devices.
  • A human-readable library that can translate hexadecimal values such as the op-code, asc/ascq, and sense key into string messages, for example, for an op-code:

         0x00 -> TestUnitReady

  • Diagnosis of transport layer errors and correct representation of the device path.
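To give a sense of what such a translation library might offer, the following Python sketch maps a few hexadecimal codes to their T10 names. It is not the planned library. The function names describe_sense and describe_opcode are hypothetical, and the asc/ascq and op-code tables are deliberately limited to the values that appear in this article.

```python
# Sense key names as defined by the T10 SPC standard.
SENSE_KEY_NAMES = {
    0x0: "NO SENSE", 0x1: "RECOVERED ERROR", 0x2: "NOT READY",
    0x3: "MEDIUM ERROR", 0x4: "HARDWARE ERROR", 0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION", 0x7: "DATA PROTECT", 0x8: "BLANK CHECK",
    0x9: "VENDOR SPECIFIC", 0xA: "COPY ABORTED", 0xB: "ABORTED COMMAND",
    0xD: "VOLUME OVERFLOW", 0xE: "MISCOMPARE",
}

# Tiny excerpts; the full T10 tables are far larger.
ASC_ASCQ_NAMES = {(0x11, 0x00): "UNRECOVERED READ ERROR"}
OPCODE_NAMES = {0x00: "TEST UNIT READY", 0x08: "READ(6)"}


def describe_sense(key, asc, ascq):
    """Return a readable string for a sense key and asc/ascq pair."""
    key_name = SENSE_KEY_NAMES.get(key, "UNKNOWN SENSE KEY")
    detail = ASC_ASCQ_NAMES.get((asc, ascq),
                                "asc=0x%02x ascq=0x%02x" % (asc, ascq))
    return "%s: %s" % (key_name, detail)


def describe_opcode(op_code):
    """Return the command name for an op-code, or the raw value."""
    return OPCODE_NAMES.get(op_code, "op-code 0x%02x" % op_code)


print(describe_opcode(0x00))
print(describe_sense(0x3, 0x11, 0x00))
```

Codes missing from the tables fall back to their hexadecimal form, so an unknown value is still reported rather than dropped.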
About the Authors

Chris Horne is a Sun Senior Staff Engineer. His research interests include Solaris IO, the Storage software stack, and any innovations in operating systems.

Grant Zhang is a Software Engineering Manager in the Solaris Storage Group at Sun Microsystems Inc. Grant has an M.S. in Computer Engineering from Queen's University in Canada and is currently pursuing an M.B.A. Degree at Peking University.

David Zhang is a Sun Software Engineer. His SCSI FMA team is working on disk/tape fault management projects based on the SCSI protocol. He has an M.S. in Computer Science from Harbin Institute of Technology.

Ti Liu is a Sun Software Engineer. She has an M.S. in Computer Science.
