
Sun SPARC Implementation for the PCI Bus

The Sun Host to PCI Bridge (HPB) is a PCI-based I/O subsystem that connects the system bus (Ultra Port Architecture (UPA)) with PCI buses. The HPB includes:

  • Two physically separate PCI bus segments, each with full master and target support
  • Two separate streaming caches, one for each bus segment, for accelerating PCI DMA activity
  • An I/O Memory Management Unit (IOMMU) that maps Direct Memory Access (DMA) addresses for both buses
  • An interrupt dispatch unit for delivering interrupt requests to the CPU module

The following block diagram of the Sun Host to PCI Bridge (HPB) shows only the functional blocks pertaining to PCI.

[Figure: block diagram of the Sun Host to PCI Bridge (HPB)]

A summary of the characteristics of the HPB is presented in the following table.

Characteristics | Bus A | Bus B
Data Transfer Width | 64-bit | 64-bit
Clock Frequency | 33/66 MHz capable | 33 MHz capable
Burst Size | 64 bytes | 64 bytes
Number of Read Buffers | One 64-byte for DMA | One 64-byte for DMA
Number of Write Buffers | Two 64-byte for DMA; One 64-byte for PIO | Two 64-byte for DMA; One 64-byte for PIO
Dual Address Cycles | Bypass DMA only | Bypass DMA only
Fast Back-to-Back Device | In target mode only | In target mode only
Byte Twisting | Yes for DMA | Yes for DMA
Interrupt Latency | 6 cycles inside the IDU | 6 cycles inside the IDU
Cache Line Size | 64 bytes | 64 bytes
Number of Cache Lines | 16 | 16
Disconnected on Cache Line | Target mode (DMA) only | Target mode (DMA) only
Configuration Mechanism (per section 3.7.4 of PCI 2.1 specification) | Configuration Mechanism #2 | Configuration Mechanism #2
Configuration Space | 256 bytes, starting at physical address 1FE.0101.0000 | 256 bytes, starting at physical address 1FE.0100.0000
I/O Space | 8 Kbytes. At physical address 1FE.0200.0000 | 8 Kbytes. At physical address 1FE.0201.0000
Memory Space | 2 Gbytes. At physical address 1FF.0000.0000 | 2 Gbytes. At physical address 1FF.8000.0000
Configuration Cycles | Master mode only | Master mode only
Special Cycle | Master mode only | Master mode only
Arbitrary Byte Enables | Consistent DMA only | Consistent DMA only
Peer-to-Peer DMA | On a single segment | On a single segment
Interrupt | 4 interrupt lines shared among PCI devices | 4 interrupt lines shared among PCI devices
IOMMU Page Size | 8 Kbytes and 64 Kbytes. Only 8 Kbytes page size used in the STC | 8 Kbytes and 64 Kbytes. Only 8 Kbytes page size used in the STC
DVMA Addressing Space (set by PCI nexus driver) | 64 Mbytes in Solaris 2.5.1. May change in later release of Solaris | 64 Mbytes in Solaris 2.5.1. May change in later release of Solaris
PIO Read Size | 1, 2, 4, 8, 16, 64 bytes for memory cycles; 1, 2, and 4 bytes for I/O or configuration cycles | 1, 2, 4, 8, 16, 64 bytes for memory cycles; 1, 2, and 4 bytes for I/O or configuration cycles
PIO Write Size | 0-16 arbitrary byte enables and 64-byte aligned for memory cycles; 0-4 arbitrary byte enables for I/O or configuration cycles | 0-16 arbitrary byte enables and 64-byte aligned for memory cycles; 0-4 arbitrary byte enables for I/O or configuration cycles
Cache-line Wrap Addressing Mode | Not Supported | Not Supported
Local (on-PCI) Cache | Not Supported | Not Supported
Exclusive Access to Main Memory | LOCK# signal not connected | LOCK# signal not connected
Address/Data Stepping | Not Supported | Not Supported
DOS Compatibility Hole | Not Supported | Not Supported
External Arbiter | Not Supported | Not Supported
Subtractive Decode | Not Supported | Not Supported
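
The address assignments in the table can be read as a simple decode map. The following Python sketch is illustrative only and is not HPB hardware logic: the names REGIONS and decode are hypothetical, and the dotted addresses from the table are assumed to be hexadecimal physical addresses (for example, 1FE.0101.0000 is taken as 0x1FE01010000).

```python
# Hypothetical decode map built from the characteristics table.
KB, GB = 1 << 10, 1 << 30

# (bus, space, base physical address, size in bytes)
REGIONS = [
    ("A", "config", 0x1FE_0101_0000, 256),
    ("B", "config", 0x1FE_0100_0000, 256),
    ("A", "io",     0x1FE_0200_0000, 8 * KB),
    ("B", "io",     0x1FE_0201_0000, 8 * KB),
    ("A", "memory", 0x1FF_0000_0000, 2 * GB),
    ("B", "memory", 0x1FF_8000_0000, 2 * GB),
]


def decode(paddr):
    """Return (bus, space, offset) for a physical address, or None."""
    for bus, space, base, size in REGIONS:
        if base <= paddr < base + size:
            return bus, space, paddr - base
    return None
```

Under these assumptions, the two 2-Gbyte memory spaces together end exactly at the top of a 41-bit physical address range.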
 

PCI Bus Interface

The PCI Bus Module (PBM) implements a complete PCI master and target interface. The HPB contains two nearly identical copies of this module. Each module implements all of the required host bridge functions as contained in the PCI Revision 2.1 specification except the interrupt logic. The interrupt function is implemented in the Interrupt Dispatch Unit (see "Interrupt Dispatch Unit" for details).

Each PBM also handles the big-to-little-endian byte twisting required for correct operation of Programmed I/O (PIO) and Direct Memory Access (DMA) data paths. The byte at address 0 on the big-endian side is directly wired to the byte at address 0 on the little-endian side (for example, bits 63:56 map to bits 7:0, bits 55:48 map to bits 15:8, and so forth). Because byte lanes at the same address are connected, DMA of byte streams works correctly without further intervention.
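
The lane wiring described above can be modeled as a reversal of the eight byte lanes of a 64-bit word. This is a minimal Python sketch of that mapping, not a description of the PBM data path; the function name twist_lanes is hypothetical.

```python
def twist_lanes(word64):
    """Reverse the eight byte lanes of a 64-bit word.

    Bits 63:56 (byte address 0 on the big-endian side) land on
    bits 7:0 (byte address 0 on the little-endian side), bits 55:48
    land on bits 15:8, and so forth.
    """
    return int.from_bytes(word64.to_bytes(8, "big"), "little")
```

Because each byte keeps its address, a byte stream read from the twisted word in little-endian address order is the same stream that was stored in big-endian address order.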

For PIO accesses larger than a byte, byte twisting is not sufficient. For example, if the 32-bit value 0x12345678 is written to a 32-bit register on a PCI device, the PCI device sees the value 0x78563412 instead. The SPARC-V9 Architecture Manual defines special support for little-endian access to correct this discrepancy. By either marking the page containing the PCI register as little-endian in the processor's MMU, or by using one of the little-endian Address Space Identifiers (ASIs), the CPU alters its ordering of the bytes so that the PCI device correctly sees 0x12345678. Solaris 2.5 software provides a set of DDI functions to address the endian issue. Writing Portable DDI-Compliant PCI Drivers (a Sun white paper) provides a list of DDI functions for writing endian-neutral device drivers.
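
The 0x12345678 example can be reproduced with a small Python model. It assumes only what the paragraph states: the CPU emits the bytes of the value in big-endian order by default, or in little-endian order when the page or ASI selects little-endian access, and the PCI device interprets the bytes it receives as little-endian. The function name pci_view_of_store is hypothetical.

```python
import struct


def pci_view_of_store(value, little_endian_access=False):
    """Model a 32-bit CPU store as seen by a little-endian PCI register."""
    # Byte order chosen by the CPU: big-endian by default, little-endian
    # when the page or ASI is marked little-endian.
    cpu_format = "<I" if little_endian_access else ">I"
    byte_stream = struct.pack(cpu_format, value)
    # Byte twisting keeps every byte at its own address, so the device
    # receives the same byte stream and reads it as little-endian.
    return struct.unpack("<I", byte_stream)[0]
```

In driver code, the DDI access functions mentioned above hide this detail; the model only illustrates why they are needed.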

As a target on the PCI bus, the PBM only responds to PCI memory space commands. All other PCI commands are ignored. The following table describes the PCI commands that the PBM is able to generate as a master, as well as how it responds as a target.

C/BE# | PCI Command Description | Generated in Master Mode | Response in Target Mode
0000 | Interrupt Acknowledge | No | Ignored
0001 | Special Cycle | Yes | Ignored
0010 | I/O Read (IOR) | Yes | Ignored
0011 | I/O Write (IOW) | Yes | Ignored
0100 | Reserved | --- | ---
0101 | Reserved | --- | ---
0110 | Memory Read (MR) | Yes | Read Access
0111 | Memory Write (MW) | Yes | Write Access
1000 | Reserved | --- | ---
1001 | Reserved | --- | ---
1010 | Configuration Read (CR) | Yes | Ignored
1011 | Configuration Write (CW) | Yes | Ignored
1100 | Memory Read Multiple (MRM) | No | Read (with prefetch if streamable)
1101 | Dual Address Cycle (DAC) | No | Bypass access
1110 | Memory Read Line (MRL) | Yes | Read (with prefetch if streamable)
1111 | Memory Write and Invalidate (MWI) | No | Equivalent to Memory Write
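
The command table can also be expressed as a lookup keyed by the 4-bit C/BE# command code. The Python sketch below simply transcribes the table; the names PCI_COMMANDS, RESERVED_CODES, master_can_generate, and target_response are hypothetical.

```python
# C/BE# code -> (command, generated in master mode, response in target mode)
PCI_COMMANDS = {
    0b0000: ("Interrupt Acknowledge", False, "Ignored"),
    0b0001: ("Special Cycle", True, "Ignored"),
    0b0010: ("I/O Read", True, "Ignored"),
    0b0011: ("I/O Write", True, "Ignored"),
    0b0110: ("Memory Read", True, "Read Access"),
    0b0111: ("Memory Write", True, "Write Access"),
    0b1010: ("Configuration Read", True, "Ignored"),
    0b1011: ("Configuration Write", True, "Ignored"),
    0b1100: ("Memory Read Multiple", False, "Read (with prefetch if streamable)"),
    0b1101: ("Dual Address Cycle", False, "Bypass access"),
    0b1110: ("Memory Read Line", True, "Read (with prefetch if streamable)"),
    0b1111: ("Memory Write and Invalidate", False, "Equivalent to Memory Write"),
}
RESERVED_CODES = {0b0100, 0b0101, 0b1000, 0b1001}


def master_can_generate(code):
    """True if the table lists the command as generated in master mode."""
    return code not in RESERVED_CODES and PCI_COMMANDS[code][1]


def target_response(code):
    """Return the target-mode response listed in the table."""
    if code in RESERVED_CODES:
        return "Reserved"
    return PCI_COMMANDS[code][2]
```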
 

Supported PCI Features

  • 64-bit bus extension (as a target only for DMA)
  • 64-bit addressing (Dual Address Cycle) for IOMMU bypass
  • Required adapter and host-bridge configuration registers
  • Fast back-to-back cycles as a target
  • Arbitrary byte enables (consistent mode only)
  • Ability to generate memory, I/O, and configuration read and write cycles as a master
  • Ability to generate special cycles as a master
  • Ability to respond to memory space accesses as a target
  • Peer-to-peer DMA on a single segment

Unsupported PCI Features

  • Exclusive access to main memory (LOCK# signal not connected)
  • Peer-to-peer DMA between different PCI bus segments
  • Local (on-PCI) cache support
  • External arbiter
  • Cache-line wrap addressing mode
  • Fast back-to-back cycles as PIO master
  • Address/data stepping
  • Subtractive decode
  • Any DOS compatibility feature

Streaming Cache

The Streaming Cache (STC) is used to accelerate PCI DMA to and from system memory. For DMA reads, the STC speculatively prefetches 64-byte cache lines. For DMA writes, the STC buffers up 64-byte lines before sending them to the UPA bus. The STC also acts as a local cache for PCI read accesses to the same block. There are two separate STC blocks in the HPB, one associated with each PBM. Each STC contains storage for sixteen 64-byte lines, which are allocated during DMA on a least recently used basis.
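
The allocation policy (sixteen 64-byte lines, replaced on a least recently used basis) can be illustrated with a toy Python model. It tracks only which lines are resident; prefetching, write buffering, and flushing are not modeled, and the class name StreamingCacheModel is hypothetical.

```python
from collections import OrderedDict

LINE_SIZE = 64   # bytes per cache line
NUM_LINES = 16   # lines per STC


class StreamingCacheModel:
    """Toy residency model: 16 lines of 64 bytes with LRU replacement."""

    def __init__(self):
        # Keys are line base addresses, ordered from least to most
        # recently used.
        self.lines = OrderedDict()

    def access(self, addr):
        """Touch the line that holds addr; return True on a hit."""
        line = addr & ~(LINE_SIZE - 1)
        if line in self.lines:
            self.lines.move_to_end(line)
            return True
        if len(self.lines) == NUM_LINES:
            self.lines.popitem(last=False)  # evict least recently used
        self.lines[line] = True
        return False
```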

The STC resides outside of the coherent memory domain; therefore, it must rely on software to maintain data correctness to the PCI devices. The STC includes registers to invalidate and to flush the STC entries for maintaining memory coherency. Device drivers call ddi_dma_sync(9F) to write to the STC registers after the DMA transfer is completed.

Interrupt Dispatch Unit

In the Sun Ultra architecture, interrupts to a processor are sent as packets on the UPA bus. The Interrupt Dispatch Unit (IDU) is the main resource for generating such packets. The IDU accepts interrupt requests from the UPA, PCI buses, graphics slots, and internal HPB sources, and dispatches interrupt packets to the UPA.

The PCI devices on the slots of each PCI bus share the four PCI interrupt lines (INTA#, INTB#, INTC#, INTD#). An interrupt concentrator encodes each interrupt source as a 6-bit Interrupt Number Offset (INO). The IDU includes the INO in the packet that it sends to the UPA.

Before the IDU sends any PCI related interrupt to the UPA, it checks with the appropriate PBM to see if it has any posted consistent DMA write data (see "Consistent vs. Streaming DMA" for details). If so, the IDU waits until the PBM indicates that the write data has been sent to the UPA.

IOMMU

The IOMMU performs virtual-to-physical address translation during DMA. It maps a 32-bit PCI virtual address to a 41-bit system bus physical address. The IOMMU consists of a 16-entry Translation Lookaside Buffer (TLB), which caches recently used translations, and a Translation Storage Buffer (TSB), which is a software-managed data structure. The number of TSB entries is software configurable and is typically set to 8K, which gives a total DMA space of 64 Mbytes (8K * IOMMU_PAGESIZE, see "DVMA Resources and IOMMU Translations" for details).
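
The sizing arithmetic and the page-based lookup can be sketched in Python. This is a simplified model under stated assumptions: only the 8-Kbyte page size is handled, the TSB is represented as a plain list of physical page base addresses (real TSB entries carry more than an address), the TLB is omitted, and dvma_base is a hypothetical parameter for the start of the DVMA window.

```python
IOMMU_PAGESIZE = 8 * 1024   # 8-Kbyte page
PAGE_SHIFT = 13             # log2(IOMMU_PAGESIZE)


def dvma_space_bytes(tsb_entries, page_size=IOMMU_PAGESIZE):
    """Total DMA space covered by a TSB with the given number of entries."""
    return tsb_entries * page_size


def translate(dvma_addr, dvma_base, tsb):
    """Translate a 32-bit DVMA address into a physical address.

    tsb is a list indexed by page number; each element is a page-aligned
    physical address, or None when no mapping is loaded.
    """
    index = (dvma_addr - dvma_base) >> PAGE_SHIFT
    if not 0 <= index < len(tsb) or tsb[index] is None:
        raise LookupError("no valid translation for this DVMA address")
    offset = dvma_addr & (IOMMU_PAGESIZE - 1)
    return tsb[index] | offset
```

With 8K entries, dvma_space_bytes returns 8192 * 8192 bytes, which is the 64 Mbytes quoted above.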

Merge Buffer

The DMA merge buffer is used for servicing DMA writes of less than 64 bytes (subline writes). This is required on Sun4u systems, where memory is only accessible in cached 64-byte quantities. The merge buffer contains a 64-byte buffer for storing the partial line. There are valid bits for each byte in the buffer, so it is able to handle completely arbitrary byte enables on a consistent write from a PCI device.
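
The role of the per-byte valid bits can be shown with a short Python model. It is a sketch, not the HPB implementation: the class name MergeBufferModel is hypothetical, and the merge step, which overlays the valid bytes on a full 64-byte line obtained from memory, is an assumption about how a partial line becomes a full-line write.

```python
LINE_SIZE = 64


class MergeBufferModel:
    """64-byte buffer with one valid bit per byte."""

    def __init__(self):
        self.data = bytearray(LINE_SIZE)
        self.valid = [False] * LINE_SIZE

    def write(self, offset, data, byte_enables=None):
        """Record a subline write; byte_enables[i] gates data[i]."""
        if byte_enables is None:
            byte_enables = [True] * len(data)
        for i, (byte, enabled) in enumerate(zip(data, byte_enables)):
            if enabled:
                self.data[offset + i] = byte
                self.valid[offset + i] = True

    def merge(self, memory_line):
        """Overlay the valid bytes on a 64-byte line read from memory."""
        return bytes(
            new if is_valid else old
            for new, is_valid, old in zip(self.data, self.valid, memory_line)
        )
```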

Bus Master Operation (PIO)

Each PBM implements all of the required host bridge functions for PCI, and also acts as the central resource for arbitration, reset, and system error (SERR#) monitoring. The PBM handles the timing of PIO requests to the PCI bus, which includes target disconnects, retries, and various error conditions during the PIO.

The PBM is capable of generating aligned PIO reads of 1, 2, 4, 8, 16, and 64 bytes to memory space and 1-, 2-, and 4-byte accesses to I/O and configuration space. It is also capable of generating PIO writes with arbitrary byte enables with 0-16 bytes transferred to memory space, and 0-4 bytes transferred to I/O or configuration space. In addition, 64-byte PIO writes with all bytes enabled can be generated to memory space.
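
The size rules in the preceding paragraph can be captured as two predicates. The Python sketch below encodes only those rules; the function names and the space labels "memory", "io", and "config" are hypothetical, and the alignment requirement for 64-byte writes is taken from the characteristics table.

```python
MEMORY_READ_SIZES = {1, 2, 4, 8, 16, 64}
IO_CONFIG_SIZES = {1, 2, 4}


def valid_pio_read(space, addr, size):
    """True if an aligned PIO read of this size can be generated."""
    if space == "memory":
        sizes = MEMORY_READ_SIZES
    elif space in ("io", "config"):
        sizes = IO_CONFIG_SIZES
    else:
        raise ValueError("unknown address space: " + space)
    return size in sizes and addr % size == 0


def valid_pio_write(space, addr, nbytes):
    """True if a PIO write that transfers nbytes can be generated."""
    if space == "memory":
        full_line = nbytes == 64 and addr % 64 == 0
        return full_line or 0 <= nbytes <= 16
    if space in ("io", "config"):
        return 0 <= nbytes <= 4
    raise ValueError("unknown address space: " + space)
```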

PIO write data is posted to the 64-byte write buffer to be dispatched by the PBM, which handles target retries and disconnects transparently to other control blocks. PIO read data is loaded directly from the PCI bus to the HPB's internal bus in 64-bit quantities.

Target Operation (DMA)

As a target, the PBM only responds to PCI memory space commands (MR, MW, MRL, MRM, MWI) and the Dual Address Cycle (DAC). All other PCI commands are ignored. Typically, the transaction address of a memory command is treated as a virtual address and translated to a physical address by the IOMMU. These transactions are referred to as DMA transactions. The PBM supports both consistent and streaming mapped DMA.

DMA writes are posted to dual 64-byte buffers. One buffer can drain to the HPB's internal data bus while the other is filling from the PCI bus. Byte enables are stored with the data. DMA reads are single buffered in the PBM to insulate the internal data bus from disconnects and pauses due to master wait states, master latency time-out, slower bus speed, and narrower bus width.

When a DMA burst transfer attempts to go past a cache line (64 bytes) boundary, the HPB generates a disconnect. This should cause the master device to attempt the transaction again beginning at the address of the next untransferred data.
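
From the master's point of view, the disconnect behavior splits a long burst into transactions that each stay within one 64-byte line. The Python sketch below computes that split, assuming the master resumes at the next untransferred byte as described; the function name split_at_line_boundaries is hypothetical.

```python
LINE_SIZE = 64


def split_at_line_boundaries(addr, length):
    """Split a burst into (address, byte count) pieces, one per cache line."""
    pieces = []
    while length > 0:
        room = LINE_SIZE - (addr % LINE_SIZE)  # bytes left in this line
        count = min(room, length)
        pieces.append((addr, count))
        addr += count
        length -= count
    return pieces
```

For example, a 64-byte burst that starts at address 0x30 is carried out as 16 bytes at 0x30 followed by 48 bytes at 0x40.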

Under certain conditions, the PBM issues a retry command for an incoming PCI transaction. These conditions include:

  • The PBM requests the IOMMU to do a TSB tablewalk to get the mapping for this transaction.
  • The STC indicates that it is initiating a request to get the desired read data.
  • Due to congestion, there are not enough buffers to accept a transaction.

Arbitrary byte enables are supported for consistent DMA transactions. In streaming mode, all data must be contiguous bytes within a single PCI transaction. Gaps between the end address of one PCI transaction and the start address of the next are allowed in streaming DMA transactions.
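
The difference between the two modes can be expressed as a check on the byte enables of a transaction. In this Python sketch the enables are modeled as an integer bit mask with one bit per byte, which is an illustrative simplification; the function names are hypothetical, and treating an empty mask as contiguous is an assumption.

```python
def is_contiguous(enable_mask):
    """True if the set bits of enable_mask form one unbroken run."""
    if enable_mask == 0:
        return True  # assumption: no enabled bytes means no gap
    while (enable_mask & 1) == 0:
        enable_mask >>= 1          # drop disabled bytes below the run
    # A single run of ones plus one carries into a higher, clear bit.
    return (enable_mask & (enable_mask + 1)) == 0


def byte_enables_allowed(mode, enable_mask):
    """Apply the rule for 'consistent' or 'streaming' DMA."""
    if mode == "consistent":
        return True                # arbitrary byte enables are supported
    if mode == "streaming":
        return is_contiguous(enable_mask)
    raise ValueError("unknown DMA mode: " + mode)
```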
