Abstract: This case study shows how Solaris Zones technology was used to solve scalability and performance problems relating to a single-threaded web server.
1. Introduction
Multithreading an application is one way to scale it to meet today's growing business requirements while reducing the number of systems needed.
Multithreaded applications scale well on the new chip multithreading (CMT) systems like the Sun Fire T1000/T2000 and V490 servers. Carriers and service providers can consolidate their back-end systems using CMT and multithreading. However, some applications are single-threaded, which is an obstacle to performance and scalability. The Solaris 10 OS offers Solaris Zones, a virtualization environment that can be used to overcome this problem and scale horizontally on a single vertical system. This paper presents a case study of how Solaris Zones technology was used to overcome a scalability bottleneck related to a single-threaded web server.
Market Development Engineering (MDE) at Sun Microsystems works with independent software vendors (ISVs) to make ISV applications run as well as possible on Sun systems. Tunathon is a yearly program run by MDE to bring together ISVs and the various product groups at Sun. The goal of the tunathon is to improve the performance of the ISVs' applications. The ISVs have direct access to the product groups to resolve any performance-related issues and make their applications run faster on Sun. Sun provides the necessary hardware and software, along with engineering. TransNexus, an ISV, develops operations and billing support software (OSS/BSS) for Voice over IP (VoIP) and network management. TransNexus participated in Tunathon2005 because it was facing difficulties scaling the performance of a single-threaded web server.
1.1 TransNexus Applications
Open Settlements Protocol (OSP) is an international standard for VoIP carriers that provides a secure mechanism for IP communication. An OSP server authorizes call setup between peer VoIP gateways. The source gateway (the originating gateway in a call setup) sends an authorization request message to the OSP server to obtain the IP address of a destination gateway that can complete the call to the dialed number. The OSP server sends an authorization response message back to the source gateway. The authorization response message contains the IP address of the destination gateway that can complete the call to the dialed number and also a digitally signed token to be used by the source gateway in a call setup. The source gateway uses the digitally signed token to connect to the destination gateway; the destination gateway verifies the token to make sure that it's coming from a trusted source.
The major steps involved in a call setup are shown in the following figure.
When the call is over, the source gateway and destination gateway both send a UsageIndication message to the OSP server. This message is confirmed by a UsageConfirmation message from the OSP server to the source gateway and destination gateway, as shown in the following figure.
Most of the TransNexus applications follow the above-mentioned OSP protocol for IP communication.
1.2 NexSRS Server
NexSRS server is an OSP server from TransNexus that runs on the Solaris 10 OS. Clients communicate with NexSRS using HTTP. NexSRS uses an external web server (Xitami) to process the HTTP requests. The external web server passes on the client request to NexSRS for processing. HTTP connections have configurable persistence, so the same HTTP connection can be used for multiple transactions.
2. The Problem
The external web server used by NexSRS is single-threaded, and it becomes a bottleneck for the scalability of the NexSRS server and other NexSRS components. The external web server reaches 100% utilization of a single CPU and cannot scale beyond that, limiting NexSRS performance. NexSRS itself is multithreaded but is limited by the number of HTTP requests coming in from the external web server. This limits NexSRS and the external web server to two- or four-CPU systems.
3. The Solution
The idea is to scale horizontally on a vertical system through virtualization. With Solaris Zones, you can create virtualized application execution environments within a single instance of the operating system. The virtualization environment allows multiple workloads to be run in isolation (see Solaris Zones: Operating System Support for Consolidating Commercial Workloads (pdf)). Applications running in one environment do not affect or see the data of another application running in another environment. The virtualized environment does not allow one application to hog system resources, and it also provides facilities to monitor, secure, configure, and administer at the application level as well as at the virtualization environment level.
In this scenario, multiple zones can be created on a 2-, 4-, 8- or 12-CPU system. An instance of the external web server and NexSRS can be deployed in each zone. Each zone now behaves as a separate system, allowing horizontal scaling on a single system. Each system can handle independent workloads or be load balanced to handle a single workload.
3.1 Introduction to Solaris Zones
Zones in the Solaris 10 OS provide multiple virtual operating system environments (zones) sharing the same kernel instance: A single physical server is divided into multiple virtual servers, each with its own operating system environment. The two kinds of zones are global and non-global. The global zone is the Solaris OS environment that is bootable by the system hardware; non-global zones are created and managed by the global zone administrator. The global zone is a fully functional Solaris environment and is comparable to a normal Solaris instance. By default, the global zone is always running on a system even when no other zone is configured. Each non-global zone has its own file systems, networking, security, and operating system resources.
The zonecfg, zoneadm, and zlogin commands are used by the global zone administrator to create, configure, install, and boot non-global zones. Theoretically, 8192 zones can be created on any system; in practice, however, the number depends on the total available resources and on how the various application modules use them.
A non-global zone can be in the following states:
A non-global zone transitions through the following states during a typical bring-up process: Configured --> Installed --> Ready --> Running.
These states are shown in the following figure.
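The bring-up sequence above can be pictured as a small state table. The following is only an illustrative sketch: the state order comes from the text, the zoneadm subcommands are listed for reference as strings (nothing is executed), and the function names are hypothetical. Note that booting an installed zone is expected to pass through the ready state without a separate step.

```python
# Bring-up order taken from the text. The zoneadm subcommand associated
# with each transition is shown for reference only; nothing is executed.
BRING_UP = [
    ("configured", "installed", "zoneadm -z <zone-name> install"),
    ("installed", "ready", "zoneadm -z <zone-name> ready"),
    ("ready", "running", "zoneadm -z <zone-name> boot"),
]


def next_state(state):
    """Return (next_state, command) for a state, or None if there is none."""
    for current, following, command in BRING_UP:
        if current == state:
            return following, command
    return None


def bring_up_path(state="configured"):
    """List the states a zone passes through from `state` up to running."""
    path = [state]
    step = next_state(state)
    while step is not None:
        path.append(step[0])
        step = next_state(step[0])
    return path


if __name__ == "__main__":
    print(" --> ".join(bring_up_path()))
```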
3.1.1 Planning for Zone Creation
We need to know the virtual IP addresses of the non-global zones to be created; each virtual IP address is configured on a physical network interface of the system.
Every non-global zone has a unique name. The name global and any name starting with SUNW are reserved, so these can't be specified as a zone name. Two types of non-global zone file systems can be created, sparse root and whole root. The sparse root zone model optimizes sharing of resources while the whole root zone model provides maximum configurability. Non-global zones that have inherit-pkg-dir resources are called sparse root zones. Every non-global zone has its own root directory; the path to that directory is relative to the root directory of the global zone and is configured by setting the zonepath parameter while configuring the non-global zone. The zonepath directory for every non-global zone must be created by the global zone administrator with mode 700 permissions to prevent other users in the global zone from accessing the non-global zone's files.
The following information is needed at the time of zone creation:
The only mandatory parameter for zone creation is zonepath. All other parameters are optional.
3.1.2 Zone Creation and Configuration
The global zone administrator uses the zonecfg -z <zone-name> command to create and configure a non-global zone.
If we use the info command at this time, we can see whatever configuration has been done so far.
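As a rough sketch of what such a configuration involves, the helper below builds the text of a zonecfg command file. It only assembles standard zonecfg subcommands (create, set, add net, end, verify, commit) as a string and runs nothing. The function name is hypothetical, and the IP address and interface name in the example are placeholders rather than values from this case study; only zonepath is required, matching the planning notes above.

```python
def zonecfg_commands(zonepath, address=None, physical=None, autoboot=False):
    """Build the text of a zonecfg command file, one subcommand per line."""
    # zonepath is the only mandatory setting.
    lines = ["create", f"set zonepath={zonepath}"]
    if autoboot:
        lines.append("set autoboot=true")
    # A net resource needs both the virtual IP address and the physical
    # interface on which that address is plumbed.
    if address and physical:
        lines += [
            "add net",
            f"set address={address}",
            f"set physical={physical}",
            "end",
        ]
    lines += ["verify", "commit"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 192.0.2.10 and bge0 are placeholders, not values from the case study.
    print(zonecfg_commands("/zone1", address="192.0.2.10", physical="bge0"), end="")
```

The printed text could be saved to a file and passed to zonecfg -z SRS1 -f <file>, or the same subcommands could be typed one by one at the interactive zonecfg prompt.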
3.1.3 Zone Installation
The zoneadm command is used to verify and install the zone, SRS1, at this point.
At this point, a boot environment (BE) is created at the /zone1 location; zone SRS1 is now ready to be booted.
3.1.4 Booting the Zone
Before booting the first time, zlogin -C zonename can be used to connect and establish a console session with the zone. This will prompt for information such as host name, time zone (TZ), and name service to complete the system identification. The global zone administrator can then boot the non-global zone using the zoneadm -z <zone-name> boot command from another terminal window.
Following is a snapshot:
After issuing the zoneadm -z SRS1 boot command, we can see that the status of the zone SRS1 has changed from installed to running. Zone SRS1 has the ID 1. It behaves just like another running system, with its own IP address and Solaris OS environment. We can ping this virtual system:
The zonename command can be used to obtain the zone name.
Note: The system identification parameters can also be set in the /<zonepath>/root/etc/sysidcfg file before the first boot, avoiding the need to enter system identification parameters during the first boot.
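For illustration, the sketch below generates the kind of sysidcfg text that would answer the first-boot prompts mentioned above (host name, time zone, name service). The keywords are standard sysidcfg keywords as far as we know, but the function name, the locale, the terminal type, and the example time zone are assumptions, not settings recorded in this case study.

```python
def sysidcfg_text(hostname, timezone, name_service="NONE", root_password_hash=None):
    """Build sysidcfg contents that pre-answer the first-boot questions."""
    lines = [
        "system_locale=C",   # assumed locale
        "terminal=xterm",    # assumed terminal type
        f"network_interface=primary {{ hostname={hostname} }}",
        "security_policy=NONE",
        f"name_service={name_service}",
        f"timezone={timezone}",
    ]
    # The root password must be supplied already encrypted; it is left out
    # entirely unless the caller provides a hash.
    if root_password_hash:
        lines.append(f"root_password={root_password_hash}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "moe" is the non-global zone's host name in this study; the time
    # zone value is a placeholder.
    print(sysidcfg_text("moe", "US/Pacific"), end="")
```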
4. How Solaris Zones Help in Scaling Horizontally
NexSRS does not scale beyond two to four CPUs because it is limited by the external web server it uses to receive HTTP requests. As the load increases, the external web server reaches 100% usage of a CPU and cannot accept requests any faster. Since the external web server is single-threaded, this is the maximum load that NexSRS can handle.
Solaris Zones technology allows a single instance of the Solaris OS to be virtualized into multiple application execution environments. NexSRS can be installed in these multiple virtualized environments, allowing multiple instances of the external web server to be run. Each instance of the external web server can now process the incoming requests and pass them on to NexSRS for further processing.
4.1 What We Did
We created a non-global zone and installed a copy of the external web server (Xitami) and NexSRS in the zone. One instance each of Xitami and NexSRS ran in the global zone, and another pair ran in the new zone. We used two load generators to load the two instances of NexSRS. The load generators saw the two zones as separate systems. Each instance of NexSRS had its own configuration and processed the HTTP requests from the Xitami instance running in its zone. Note: The main system used was the Sun Fire V480 server, as shown in Figure 4.
Zones allowed two instances of Xitami to run on the same system, allowing NexSRS to receive more HTTP requests and scale horizontally, overcoming the 100% CPU bottleneck.
4.1.1 Steps for Global Zones
We followed these steps:
4.1.2 Steps for Non-Global Zones
We followed these steps:
4.2 How We Generated the Load
The load was generated from a Sun Fire V880 server with eight CPUs. We used the ApacheBench (ab) tool to load NexSRS. We started three instances of ApacheBench in three different shells, each sending a message to load NexSRS in the global zone -- see "Load gen (1)" in the following figure. Similarly, we used another three instances of ApacheBench in three different shells to load NexSRS in the non-global zone -- see "Load gen (2)" in Figure 5.
4.2.1 Loading NexSRS in a Global Zone
ApacheBench was used to load NexSRS in the global zone; see "Load gen (1)" in Figure 5. Note: The host name is eagle:
4.2.2 Loading NexSRS in a Non-Global Zone
We used ApacheBench again to load NexSRS in the non-global zone -- see "Load gen (2)" in Figure 5. Note: The host name is moe. The load was executed in parallel with the above load:
4.3 How We Measured the CPS
Calls per second (CPS) was measured by tailing the nexus.log file. The log file reports calls per minute (CPM), so each value must be divided by 60 to obtain CPS. ApacheBench also outputs its request rate per second at the end of the test, and this was compared with the log file to ensure that the tests ran successfully.
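The conversion and the cross-check described above amount to simple arithmetic, sketched here. The function names and the 5% tolerance are hypothetical choices made for this illustration; the format of nexus.log is not reproduced in this paper, so no log parsing is attempted.

```python
def cpm_to_cps(calls_per_minute):
    """Convert a calls-per-minute figure into calls per second."""
    return calls_per_minute / 60.0


def rates_agree(log_cpm, ab_requests_per_second, tolerance=0.05):
    """Check that the log-derived rate and the ApacheBench rate are close.

    `tolerance` is a relative difference; 5% is an arbitrary assumption.
    """
    log_cps = cpm_to_cps(log_cpm)
    if log_cps == 0:
        return ab_requests_per_second == 0
    return abs(log_cps - ab_requests_per_second) / log_cps <= tolerance


if __name__ == "__main__":
    # CPM figures reported in section 5 of this paper.
    for label, cpm in [("Solaris 9", 6227), ("Solaris 10", 8235), ("with zones", 12955)]:
        print(f"{label}: {cpm} CPM = {cpm_to_cps(cpm):.1f} CPS")
```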
4.3.1 System Performance
CPU Usage on Solaris 9 OS
Xitami was bound to a 1-CPU processor set.
CPU Usage on Solaris 10 OS (No Zones)
Xitami was bound to a 1-CPU processor set, and xitami.environment was set to 0.
Global Zone CPU Usage on Solaris 10 OS (With Zones)
Xitami was bound to a 1-CPU processor set, and xitami.environment was set to 0.
Non-Global Zone CPU Usage on Solaris 10 OS (With Zones)
xitami.environment was set to 0.
Xitami was not bound to a processor set, since there were not enough idle resources. With sufficient idle resources, Xitami could be bound to a 1-CPU processor set, further improving performance.
5. Improvement in Performance
Without zones we were getting 6227 CPM on the Solaris 9 OS and 8235 CPM on the Solaris 10 OS, and with zones we were able to reach 12955 CPM (7735 global zone CPM + 5220 non-global zone CPM). On the same system, this was a 108% performance improvement over the Solaris 9 result and about a 57% improvement over the Solaris 10 result without zones.
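The percentages can be checked with a few lines of arithmetic. The sketch below uses only the CPM figures quoted above; the function name is ours.

```python
def improvement_percent(baseline, result):
    """Relative gain of `result` over `baseline`, as a percentage."""
    return (result - baseline) / baseline * 100.0


# CPM figures quoted in the text.
SOLARIS_9_CPM = 6227
SOLARIS_10_CPM = 8235
GLOBAL_ZONE_CPM = 7735
NON_GLOBAL_ZONE_CPM = 5220

combined_cpm = GLOBAL_ZONE_CPM + NON_GLOBAL_ZONE_CPM

if __name__ == "__main__":
    print(f"combined: {combined_cpm} CPM")
    print(f"vs Solaris 9:  {improvement_percent(SOLARIS_9_CPM, combined_cpm):.0f}%")
    print(f"vs Solaris 10: {improvement_percent(SOLARIS_10_CPM, combined_cpm):.0f}%")
```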
5.1 Reason for Improvement
Solaris Zones allowed multiple instances of a single-threaded web server to run on a single system. Without the zones, only one instance of the web server could be run, which limited performance and scaling on a system. As the load increased, the single-threaded web server became a bottleneck once it was using 100% of a CPU. With the zones, multiple instances of the web server can be run in their own virtualized environments, allowing the system to scale beyond the performance of a single instance.
6. Conclusion
Solaris Zones technology offers horizontal scaling on a vertical system, allowing multiple virtualized environments to run in isolation. Multiple separate systems can be consolidated into a single system, saving operational and capital expenditure (opex/capex), while improving the performance on a single system. In this tunathon, we used two instances of the external web server, Xitami, and two instances of NexSRS to improve performance. The two NexSRS instances made the system appear as two separate systems. The separate NexSRS instances could be replaced with a single NexSRS instance running in the global or non-global zone, and the two Xitami instances could communicate with this NexSRS instance. A load balancer could distribute the load to the two Xitami instances.
A global zone and a non-global zone were used for scalability here; two non-global zones could also be used to achieve the same result, and they might be easier to manage. We used a processor set to manage the CPU resources. These could also be managed with the Solaris resource manager, which uses resource pools to allow resource management at the zone level (see References section for more information).
7. Acknowledgments
We would like to thank all the people who helped us with this project. Thanks to Hashamkha Pathan, Prashant Srinivasan, and Jan Van Bruaene for taking the time to review the paper. Their suggestions have increased the readability of the document. We would also like to acknowledge the developers.sun.com staff. Finally, we would like to thank William Murray from TransNexus who spent a lot of time helping us get started on the project.
8. References
About the Authors
Ashutosh Kumar has been with Sun since December 2004. He has been working on performance tuning for ISV applications, especially relating to Java technology and JVM performance. He does a lot of volunteer work with destitute women and composes poetry in his free time.
Nagendra Nagarajayya has been working with Sun for the last 12 years. He is a Staff Engineer in Market Development Engineering (MDE), working with ISVs in the telecommunications (telco) industry on issues related to architecture, performance tuning, sizing and scaling, benchmarking, porting, and so on. He specializes in multithreaded issues, concurrency and parallelism, HA, distributed computing, networking, and performance tuning.
Dmitry Isakbayev has worked at TransNexus since 1997 and leads all software development. TransNexus has been an innovator of commercial and open source VoIP Operations and Billing Support Systems (OSS/BSS) since 1997. Key features of the TransNexus OSS/BSS solution include least cost routing, quality of service routing, secure inter-domain VoIP peering, traffic analysis and control, management reports, and more.