Storage Device Evolution: General-Purpose Peer Processing Arrives

 
By Brian Wong, July 2007  

Once upon a time, storage devices were simple hardware under the direct control of "the computer." The term was unambiguous, because there was only one processing entity. Since those times, storage devices have grown substantially in quantity and in complexity. Today, devices such as virtualization engines, disk arrays, and even disk drives and host bus adapters (HBAs) include programmable components. Units such as enterprise disk arrays and NAS filers are even more sophisticated, often including multiple processors running large amounts of firmware.

From Devices to Small Computers

The large increase in firmware size is directly related to the functions we demand of storage devices. Years ago, storage consisted of a physical device such as a disk or tape – and little else. Over time, those individual devices added buffers, local optimization (such as elevator seek algorithms and command queuing), multi-port/multi-host operation, and a variety of other capabilities.
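As a rough illustration of the kind of local optimization mentioned above, the following is a minimal sketch of an elevator-style seek ordering. The function names, the cylinder numbers, and the starting head position are all hypothetical, and real drive firmware is considerably more involved.

```python
def elevator_order(pending, head, ascending=True):
    """Order pending cylinder requests as one sweep out and one sweep back."""
    if ascending:
        # Serve everything at or beyond the head first, then reverse direction.
        first = sorted(c for c in pending if c >= head)
        second = sorted((c for c in pending if c < head), reverse=True)
    else:
        first = sorted((c for c in pending if c <= head), reverse=True)
        second = sorted(c for c in pending if c > head)
    return first + second


def travel(order, head):
    """Total head movement, in cylinders, needed to serve requests in order."""
    total = 0
    for cylinder in order:
        total += abs(cylinder - head)
        head = cylinder
    return total


# Hypothetical request queue and starting head position.
pending = [98, 183, 37, 122, 14, 124, 65, 67]
start = 53

swept = elevator_order(pending, start)
print("arrival order travel:", travel(pending, start))
print("elevator order travel:", travel(swept, start))
```

The sketch reverses direction at the last pending request rather than at the physical edge of the disk, which is one of several common variants.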

The realization that devices were getting truly sophisticated dawned on me in about 1995 when I noticed that a disk drive's embedded target controller included an Intel 80186 processor, 640KB of RAM – and 384KB of ROM for firmware storage. The disk drive had a PC/XT stuck to the bottom of it! Perhaps more importantly, the device also included an operating system and application combination of comparable complexity to MS-DOS 5.0 and Lotus 1-2-3 [1]! By 1995, a "mere" disk drive was about ten years behind recognizable applications in complexity.

As if that weren't enough, that was about the same time that RAID made the aggregation of devices commonplace. Of course, RAID too is implemented as more software [2]. A typical disk array of the late 1990s was therefore comparable in complexity to a network of PCs. After all, these arrays had core processors such as the 80386, i960, or MicroSPARC-I, which put them one generation behind the microprocessors used in front-line PCs and workstations of the time. And these RAID processors were controlling a dozen to hundreds of disk drives, each of which, as noted, was of complexity comparable to an older PC.
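To make the point that RAID is essentially software concrete, here is a minimal sketch of the XOR parity arithmetic used by parity-based RAID levels. The function names and the four-byte "blocks" are illustrative assumptions, not any particular array's firmware.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Compute the parity block for one stripe of data blocks."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity):
    """Recover the single lost block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical stripe spread across three data disks.
stripe = [b"DISK", b"TAPE", b"RAID"]
parity = make_parity(stripe)

# Pretend the second disk failed and rebuild its block.
recovered = rebuild_missing([stripe[0], stripe[2]], parity)
print(recovered)
```

The same few lines would run equally well on a host CPU or on an outboard controller, which is the sense in which the "hardware" and "software" labels describe where the code runs rather than what it does.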

The demands on the devices were also increasing in sophistication, which of course is why the hardware increased in power. The devices were evolving from dedicated single-function items (an EIDE disk in the late 1980s) to RAID arrays controlling networks of intelligent disk drives, large-scale buffer management, and multipathing interfaces. When the devices originally became programmable, hardware resources were precious, and shaving every machine cycle counted, or else the desired function really might not fit into the cost envelope. Real-time monitors were both relatively simple and fast enough to allow maximum capability to be compressed into limited hardware. The fact that they were inexpensive certainly did not hurt.

By the turn of the century, most storage devices larger than a single disk, tape, or optical drive had arrived at a configuration that consisted of a previous-generation microprocessor, a real-time operating system, and custom application software linked directly into the real-time monitor. Larger devices, such as enterprise disk arrays, high-end NAS filers, and massive tape libraries, actually used networks of these systems.

About that time, designers started noticing that implementation was becoming increasingly difficult. Functions such as remote replication over IP networks, security frameworks such as public key encryption for secure authentication, and even Fibre Channel storage area networks created massive amounts of complicated code. Even a relatively simple function such as a remote console required the addition of serial devices, Ethernet, and the implementation of standard network protocols such as TCP/IP.

Furthermore, a funny thing started happening: a lot of the functionality being built into storage devices depended on the same technology found in the operating systems of the host computers that were processing the data.

This realization coincided with the availability of open source software, and an interesting proportion of storage devices developed in the past few years have been based on some version of UNIX, often Linux but sometimes a BSD derivative and occasionally other variants. This phenomenon might be characterized as the beginning of a trend toward the use of general purpose processors and especially software at the core of much more advanced storage processors.

The trend toward increased function in the storage shows no sign of slowing. In fact, we are arguably about to embark on a whole new round of increased storage processor function.

Virtually every data center – from Web 2.0 services to the most traditional bank to the scientific research lab – is struggling with enormous problems in its storage. From the mass of legal mandates for data retention and auditing, to tiered storage, data search, and indexing, IT organizations need many new capabilities from their storage.

High Semantic Architectures

A subtle change from the past is that virtually all of the new capabilities require that storage processors possess an understanding of data as opposed to mere storage and retrieval of bits. Subtle though it may be, this distinction represents a radical departure – a watershed, really – because it implies an architectural change in the relationship between the computers that store the data and the computers that process the data.

Historically, the relationship between the processor and storage has been a strictly master/slave arrangement. The processor was responsible for everything except the actual persistence of the data: it ran all the code that decided which blocks were used and which were free, determined allocation, and computed the contents of metadata such as ownership, permissions and size. This organization made sense, since the disk storage was physically a part of the computer. With no other computer available, there weren't many other options.

As we noted earlier, the configurations grew in size and complexity, sprouting various computers and processors in the storage. Although SANs are physically rather different from traditional directly attached disk and tape storage, they retained the same basic architecture. In particular, the host computers are still responsible for most storage computation. The salient point is that for all their sophistication, SAN disk arrays still have no understanding of the semantics of the data that they hold.

Networked file systems brought the first break from this architecture. SMB (now known as CIFS), NetWare, and NFS represent a fundamentally different architecture. The storage component in NAS architectures, as in most object storage implementations, is responsible for maintaining the metadata and for arbitrating external host access to it. The impact of this change is enormous. It permits data to be shared between multiple hosts in a relatively straightforward manner, because metadata is shared and has precisely defined semantics. Perhaps more importantly, it puts the storage processor in a peer relationship to the computational processor.
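The difference between the two architectures can be sketched in a few lines. The two classes below are hypothetical toy models, not real SCSI or NFS interfaces: the first knows only numbered blocks, while the second owns the metadata and arbitrates access to it.

```python
class BlockDevice:
    """SAN-style storage: numbered blocks only, with no notion of files."""

    def __init__(self, block_count, block_size=512):
        self.block_size = block_size
        self.blocks = [bytes(block_size)] * block_count

    def read(self, lba):
        return self.blocks[lba]

    def write(self, lba, data):
        # Pad or truncate to exactly one block; the device cannot tell
        # whether this is file data, a directory, or a database page.
        self.blocks[lba] = data.ljust(self.block_size, b"\0")[: self.block_size]


class FileServer:
    """NAS-style storage: the storage side keeps the metadata itself."""

    def __init__(self):
        self.files = {}  # path -> {"owner": str, "data": bytes}

    def write(self, path, owner, data):
        entry = self.files.get(path)
        if entry is not None and entry["owner"] != owner:
            raise PermissionError(f"{owner} does not own {path}")
        self.files[path] = {"owner": owner, "data": data}

    def stat(self, path):
        entry = self.files[path]
        return {"owner": entry["owner"], "size": len(entry["data"])}

    def list_files(self):
        return sorted(self.files)


disk = BlockDevice(block_count=8)
disk.write(3, b"payroll records")       # meaning known only to the host

filer = FileServer()
filer.write("/hr/payroll.db", "alice", b"payroll records")
print(filer.list_files(), filer.stat("/hr/payroll.db"))
```

Any host that speaks the file server's protocol can ask it what files exist and who owns them; nothing comparable can be asked of the block device without first reimplementing the host's file system.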

The case of "serverless backup" clearly demonstrates these differences. Back around 2000, the industry was searching for ways to offload the relatively expensive process of doing backups from main-line servers. To do this, a third-party copy engine had to be introduced, with access to both the disk storage and the tape drives. However, the copy engine couldn't just read the disk – it had no idea of the format of the disk volumes, since one LUN could easily be an NTFS file system and the next might be a volume replication of a UNIX file system or a raw database table. The copy engine could only do its job by asking the original server – the very system we wanted to offload – to actually walk the file system and generate a list of files and blocks to be copied to the tape. Once provided with the information, the copy engine could dump the data to tape. However, this wasn't the end of the story. To protect the process, the copy engine had to have the server lock the files until it had safely copied them, and of course users wanted those files released as soon as they were dumped, not after the entire dump completed. This further necessitated communication between the copy engine and the server – to the point that doing the "serverless" backup so involved the server that it wasn't serverless at all.

Now compare this to having an NFS or CIFS server that is hosting the data. To back up the data without disturbing the application computer, one does the backup on the NFS server. Because the server possesses all of the semantics of the data, this is a job so trivial that its simplicity is often overlooked. This is the power of sharing the data semantics with the storage, and it is clearly the path forward in storage devices.
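The contrast between the two backup approaches can be modeled very roughly by counting how often the application server must be consulted. The sketch below is a toy model under stated assumptions: one request for the file system walk plus one lock and one release per file. The file names, block numbers, and function names are hypothetical.

```python
def serverless_backup(server_files):
    """Third-party copy engine: returns (tape, requests made to the server)."""
    server_requests = 0
    tape = []

    # The copy engine cannot parse the volume format itself, so it asks the
    # server to walk the file system and return a file-to-blocks map.
    server_requests += 1
    file_map = dict(server_files)

    for name, blocks in file_map.items():
        server_requests += 1              # ask the server to lock the file
        tape.append((name, list(blocks)))  # copy the listed blocks to tape
        server_requests += 1              # ask the server to release it
    return tape, server_requests


def filer_backup(filer_files):
    """Backup run on the file server, which already holds the semantics."""
    tape = [(name, list(blocks)) for name, blocks in filer_files.items()]
    return tape, 0


# Hypothetical file system contents: file name -> block numbers.
files = {"a.txt": [10, 11], "b.db": [40, 41, 42], "c.log": [7]}

tape_1, requests_1 = serverless_backup(files)
tape_2, requests_2 = filer_backup(files)
print(requests_1, requests_2)
```

Both paths put the same contents on tape; under these assumptions only the first one keeps interrupting the application server, and the interruptions grow with the number of files.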

Peer Processing

When storage assumes greater responsibility in the overall solution architecture, it naturally has greater processing demands. Most high-semantic storage devices such as NAS filers, archivers, and fixed-content providers are based on current-generation front-line processors, possibly in multiples. There is sufficient computational demand that the typical system is growing rapidly in power and capacity.

Part of that increased computational complexity is due to the interpretation of relatively complex client/server protocols. CIFS, NFS, and other high-semantic protocols require considerably more interpretation on both sides of the wire than storage-only protocols such as SCSI and ATA. The high-level protocols enable the separation of processing and storage; they are nearly always symmetric (in that there is nothing to stop any given processor from implementing either client or server code). In particular, the client and server code that runs in the storage processor is extremely close to the server and client code that runs in the computational processor – if it is different at all. These protocols are also symmetric in terms of their resource consumption; the client and server implementations generally consume fairly similar resources.

Peer processing combines with another trend to result in the widespread adoption of general-processing platforms for storage devices. In particular, data sizes are growing at rates considerably faster than that of processor capability. This much is obvious from the fact that processor capability has grown by approximately a factor of ten in the last decade, while disk and tape capacities have grown by nearly 30x in the same period. What is less obvious is that the amount of data deployed per unit of processing is also increasing. Note that the metric is the amount of data rather than the amount of storage; storage is increasing at an even more rapid pace. The vastly increased data volumes and the rapidly dropping cost of processors are creating substantial pressure on the capacity of individual storage units, especially in storage consolidation configurations. The multiplicative effect of these factors leads to storage devices that are likely to be more powerful than the individual computers that they serve.
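The growth figures above can be turned into compound annual rates with a little arithmetic. The sketch below simply takes the approximate factors quoted in the text (10x and 30x per decade) as given; the function and variable names are illustrative.

```python
def annual_rate(factor, years):
    """Compound annual growth rate implied by an overall growth factor."""
    return factor ** (1 / years) - 1


DECADE = 10
processor_factor = 10   # approximate, as quoted above
capacity_factor = 30    # approximate, as quoted above

processor_rate = annual_rate(processor_factor, DECADE)
capacity_rate = annual_rate(capacity_factor, DECADE)

# Storage capacity per unit of processor capability after one decade.
capacity_per_processing = capacity_factor / processor_factor

print(f"processor capability: {processor_rate:.1%} per year")
print(f"disk/tape capacity:   {capacity_rate:.1%} per year")
print(f"capacity per unit of processing: {capacity_per_processing:.0f}x")
```

On these figures, capacity outpaces processing by roughly a factor of three per decade. This ratio describes raw capacity; the amount of data actually deployed, which is the metric the text emphasizes, grows somewhat more slowly.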

These devices are extremely complex, and the amount of effort associated with creating and debugging the extended functionality is now nearly indistinguishable from corresponding development of standard computational platforms. It is no longer economically justifiable to develop technologies such as virtual memory, multiprocessor, multi-core/multi-strand systems, and bus infrastructures for just storage-specific applications.

Summary

Storage devices have been growing in complexity for many years, to the point where individual modern devices such as disk and tape drives exceed the capability and complexity of previous-generation PCs. At the same time, strong pressures from the user community to solve many additional storage-related problems are leading to the adoption of new and older architectures that place data semantics in the storage processors rather than in host computers. The symmetric nature of these protocols, combined with the explosion of both computational resources and retained data, is leading to the development of peer-to-peer architectures in data centers. Finally, the processing demands are being concentrated, which will lead to extremely powerful storage devices that will be economical only through the re-application of general-purpose hardware and software platforms.

My next article will focus on how Sun is meeting these new storage challenges with OpenSolaris and other storage initiatives.

    [1] 1-2-3 was similar to an early version of Excel.

    [2] RAID implementations are often categorized as to "hardware" and "software" flavors, but there is little difference in practice. Essentially all "hardware" implementations are simply software running on an outboard processor.

About the Author

Brian Wong is a Sun Distinguished Engineer. His research interests include capacity planning, storage systems and operating systems, with an emphasis on simplifying the use of advanced technologies in any of these areas. He has been seen recently chasing chickens and ducks around his Virginia farm.
