Sun Java Solaris Communities My SDN Account

Article

Determining Free Physical RAM and Using It to Improve Your Application

 
By Greg Nakhimovsky, March 17, 2004  
Topics
 
Abstract
This article explores how application programs can find out how much free physical memory is available and how you can take advantage of this information. Specifically for the Solaris Operating System (Solaris OS), it provides a routine to determine the currently available physical memory, plus a sample program demonstrating how this interface can be used.

Introduction
One of the fundamental concepts in computer programming is the space/time tradeoff. The general idea is that by using more memory you may be able to increase the performance of your program. Alternatively, you may be able to use less memory if you are willing to sacrifice some performance for it. In this article, I concentrate on the former proposition, that is, exploring how to improve application programs if you know that you can count on a certain amount of available physical memory.

Application programs can trade the available physical memory for speed in many different ways. Here are a few examples:
  • Table lookup. In many cases, you can precompute a number of values, save them in memory, and then look up the result instead of computing it. Accessing a table is a very fast operation if the entire table can fit into physical memory. For a simple example using table lookup, see Example: Compute Population Counts below. Also, see "Table Lookup" (Reference [7]).

  • Caching, used in a variety of ways, for example in web browsers and network file systems. Instead of computing the result or transferring it from a remote location each time, you can store the result in memory and reuse it when it is needed again later.

  • Hashing, which provides a fast way to search a large, unsorted data set at the cost of extra memory. For an example, see "Hashing in Forth" (Reference [8]).

  • Using display lists for OpenGL graphics. An OpenGL display list is a specially optimized cache of OpenGL commands stored locally. You can define it once, and then you can use it as many times as necessary, which may increase the performance of your graphics application.
Before you can use any kind of space/time tradeoff in your application, you need to evaluate from within your program how much free physical memory you can use without causing paging to disk.

Such an evaluation and subsequent usage of this information can be difficult, for many reasons:
  • Different systems provide different ways (if any) to determine the amount of free physical memory available at the moment. So the method to obtain this information is highly platform-dependent. This information may also be more or less accurate and reliable on different systems.

  • The amount of free memory you get is subject to various race conditions, particularly under multi-user and multi-processing operating systems (which almost all of them are nowadays). The amount of free memory you obtain may become invalid at any time, because another program may consume some additional memory and change the amount of free memory at any moment. If your application is multithreaded, the other threads in your own application can also consume more memory and thus invalidate this information.

  • Your application may or may not have knowledge of its own memory requirements in the future. For example, interactive applications' future memory requirements may depend on the actions of the user (which are unknown so far).
As a result of these difficulties, most application programmers don't use any explicit space/time tradeoffs to improve their applications. They simply assume that the system will provide enough memory for the chosen algorithms. When there is a shortage of physical memory, most modern operating systems will engage virtual memory capabilities using disk swap space to complement the physical memory.

The main problem with virtual memory systems when they start paging to disk is that the performance of your application can suffer tremendously when that happens. The typical disk access speeds are many orders of magnitude lower than any kind of physical memory access speeds. Therefore, you need to avoid paging to disk as much as possible. No amount of theoretical performance improvement is worth introducing paging to disk in most cases.

So the challenge is to figure out a way to determine the amount of free physical memory and use it productively while not causing any paging to disk.
 
How to Determine Available Physical Memory Under the Solaris OS
The Solaris OS has the kstat(3KSTAT) facility that allows any program to access various kernel statistics values. The kstat library, libkstat.so, defines the application user interface (API) to the system's kstat facility.

In the Solaris OS (especially starting with Solaris 8), the amount of free physical memory currently available may be determined from the freemem and lotsfree values. Paging to disk will not occur until freemem falls below lotsfree. Therefore, the following quantity represents the amount of memory that an application can safely use without causing any paging to disk (that is, without awakening the page scanner):
freemem - lotsfree
Solaris 8 introduced a new file system caching architecture, the cyclical page cache. As a result, freemem reports a much more accurate amount of free physical memory. (Before Solaris 8, freemem was always reported very low because the file cache memory was subtracted from it.)

Here is routine phys_mem_avail() that you can use to obtain the amount of free physical memory in kilobytes at the moment. It is based on the kstat(3KSTAT) interface.

phys_mem_avail.c

As you can see, this routine obtains kernel values freemem and lotsfree and returns their difference converted to kilobytes.

Note that the returned value may be negative, which would mean that paging to disk is imminent or already occurring. The larger the absolute value of this negative value is, the higher the page scan rate will be, and more paging to disk will be occurring.

In this routine, the freemem value is the same as free memory reported by the vmstat(1) utility in its output lines after the first one. The lotsfree value defaults to 1/64-th of the total physical RAM in the system (decreased by certain small amounts of memory used for internal kernel needs). The default lotsfree value may be overridden in /etc/system, so it is a good idea to obtain its actual value from the kernel.

Most system parameters used in the phys_mem_avail() routine are only determined once, when it is called for the first time. The only value updated each time is freemem.

Note that you can also determine more information than the amount of free physical memory currently available. In the Solaris OS, the intensity of paging to disk (scan rate) increases linearly when freemem goes down below lotsfree and until it reaches minfree. Therefore, you can find out the percentage of maximum paging to disk with code like this:
   static size_t minfree;
   float percent_paging;

   if(pagesize == 0)
   {
...
     if((kn=kstat_data_lookup(sys_pagesp, "minfree")) == 0)
       perror("kstat_data_lookup(minfree)"),exit(1);
     minfree = kn->value.ul;
   }
...

   if(freemem > lotsfree)
     percent_paging = 0.0;
   else
     percent_paging = (float)(lotsfree - freemem) /
       (float)(lotsfree - minfree)*100.0;
For more information about the kstat(3KSTAT) facility, see Sun Product Documentation ([1]) and Solaris Internals ([2]), as well as the online article "Solaris Kernel Statistics," Parts One and Two ([4]). The article "The Solaris 8 Memory Architecture: Better, Simpler, and Faster" ([3]) and Princeton University Unix Systems group documentation ([5]) also provide useful information on the subject.

Under the Linux operating system, similar information can be determined using the sysinfo(2) system call. The freeram value in the resulting structure will contain the number of bytes of currently available free physical memory.
 
Detecting Paging to Disk From Your Program
You can use at least two methods under the Solaris OS to determine if the system is paging to disk at the moment or is close to paging to disk.

  1. Determine the amount of free physical memory available (as described above) and see if it is negative, which would indicate that your system is paging to disk or is about to start.

  2. Use microstate accounting for a particular process (your own process) to find the percentage of time spent by the process on waiting for data from memory (reported in the DFL column by prstat -m). A nonzero value means the system is paging to disk.
You can use both of these methods to check on each other. In this article, I'll use the first method.

Once you know that the system is paging to disk, one useful thing your application can do is issue a warning to the user. Indicate the fact that the system is paging to disk, and that it is most likely causing a significant performance degradation. Provide a numeric indication as well, such as the number of kilobytes by which the free memory has fallen below the lotsfree limit.

It may be a good idea for an application program to check for this condition every once in a while (say, once per minute). You may be able to do this in a special thread dedicated to this purpose. You can call sleep(3C) or nanosleep(3C) to suspend the current thread's execution between the calls to phys_mem_avail().

It would also be useful to outline the steps the user can take to attempt to eliminate paging to disk. Under the Solaris OS, the user can try the following, even without having to exit the application:
  • Close other applications that may be consuming a lot of memory.
  • Delete unnecessary files from /tmp (swapfs using memory).
Of course, adding RAM is another possible solution, although that would certainly require exiting the application first.

You can output such warnings into a certain log file or display them in separate window if you use a graphical user interface with your application.

A more sophisticated way to improve your application once you have this knowledge is to adjust the logic of your program to the low-memory situation dynamically once you detect paging to disk. For example, in addition to alerting the user, you may be able to switch to a slower but less memory-consuming algorithm in that case. This would make the application more self-adjusting and thus more intelligent.
 
Improving Application Performance by Using Available Memory
This is more difficult and less certain in a general case than simply warning the user that paging to disk is going on, but it is possible in many cases.

Generally, an application may not have enough information to make intelligent decisions regarding its memory usage. However, there are ways to obtain the missing information, at least in some cases:
  • You, as an application programmer, may simply know the situation of the user. For example, your application may require or assume a dedicated machine, not allowing any other user or application to use the system or at least to consume any significant resources on it while your application is running.

  • You may also know at a certain point that the application is very unlikely to request more memory than it has already consumed.
Once you have this extra knowledge, you can use it to decide which algorithms would provide the best performance under the circumstances. If calling phys_mem_avail() tells you there is enough physical memory at the moment, and if you can determine that you can count on that free memory being there long enough for your purposes, then you can trade this memory for performance by dynamically selecting a faster algorithm.

If you don't know the user situation and cannot safely assume what it is, you may be able to ask the user and give him/her a choice of which algorithm to use: faster and more memory-consuming or slower but less memory-consuming.

You can ask the user in different ways. In interactive applications, you can provide a dialog box, for example. In other cases, some kind of a configuration file might be more appropriate.

This situation resembles the manual versus automatic transmission dilemma in a car. There, too, an automatic transmission does not have enough information to make the most optimal decision in many cases. (For example, will you go uphill or downhill during the next minute? Will you increase or decrease speed?) A manual transmission allows the user to make these decisions, at the cost of being less automatic and less self-adjusting. Note that some "power" users enjoy having more control for themselves, both in case of their car transmissions and their computer applications. Therefore, providing such a choice to the application user is a good idea.
 
Example: Compute Population Counts
In the article "Once Upon a Time-Memory Tradeoff" (Reference [6]), Mark Stamp describes a simple example of a space/time tradeoff: computing population counts. (He also describes much more complex examples there, but I will only use the simplest one here.) Algorithms like this find applications in cryptography, among other fields.

The population count popcnt(x) of a given non-negative integer x is defined as the number of ones in the binary representation of x. For example, popcnt(13) is 3, because the binary form of 13 is 1101.

Note that this sample program does not consume much memory with any of these algorithms. Also, there are more clever and faster algorithms to count the number of bits in non-negative integers. Therefore, this sample program is not very practical. I am only using it here to demonstrate how effective trading the available physical memory for performance can be and how it can be accomplished in an application program.

Here is the sample program:

popcnt.c

The task is to compute the total of population counts for the first 100,000,000 numbers starting with 0.

The program computes the required total three times: first using an obvious brute-force approach, then by precomputing the first 256 values, and finally by precomputing the first 65536 values. See "Once Upon a Time-Memory Tradeoff" (Reference [6]) for the details of the algorithm and why precomputing a number of values and using a table lookup can improve performance in this case.

Compiling and running this sample program produces the following on a 360-MHz Ultra 10 workstation under the Solaris 9 OS. Note that for the purposes of this sample program, I am not using any compiler optimization, just to make the issues more clear. Of course, for real applications using compiler optimization is a very good idea.
% cc -c phys_mem_avail.c
% cc popcnt.c phys_mem_avail.o -lkstat
% a.out
SUM(Obvious_popcnt) = 1314447104
 Time:    33.02 sec
-----------------
Available physical memory: 246456 Kbytes
Memory required for storing 256 values:  1 Kbytes
Time to precompute 256 values:     0.00 sec
SUM(using 256 precomputed values) = 1314447104
 Time:     6.25 sec
-----------------
Available physical memory: 246456 Kbytes
Memory required for storing 65536 values:  256 Kbytes
Time to precompute 65536 values:     0.03 sec
SUM(using 65536 precomputed values) = 1314447104
 Time:     3.45 sec
% 
As you can see, precomputing and storing some numbers in memory, and then using a table lookup, have improved this program's performance this much:
33.02/3.45 = 9.57 times
You can also see from these results how using more memory to keep precomputed data has resulted in higher performance.

Invoking phys_mem_avail() and making sure there is enough available physical memory before using the faster algorithms has made this program more intelligent and robust, and the performance gain more certain.
 
Acknowledgments
I would like to thank Iain Bason, a Sun colleague, for the idea of using kstat(3KSTAT) to determine the currently available physical memory in the Solaris OS.
 
About the Author
Greg Nakhimovsky is a Sun Microsystems engineer working with application software vendors to make sure their products run well on Sun systems. He has over 20 years of industry experience developing, performance tuning, and troubleshooting technical computer applications.
 
References
[1] Sun Product Documentation
[2] Solaris Internals: Core Kernel Architecture by Richard McDougall and Jim Mauro
[3] The Solaris 8 Memory Architecture: Better, Simpler, and Faster
[4] Solaris Kernel Statistics - Accessing libkstat with C and Solaris Kernel Statistics, Part II: Accessing libkstat with Shell Scripts by Peter Boothby
[5] Princeton University Unix Systems Group documentation
[6] Once Upon a Time-Memory Tradeoff (PDF) by Mark Stamp
[7] Table Lookup from C++ In Action: Industrial-Strength Programming Techniques by Bartosz Milewski
[8] Hashing in Forth by Xan Gregg on the Forth Interest Group site
 
Rate and Review
Tell us what you think of the content of this page.
Excellent   Good   Fair   Poor  
Comments:
Your email address (no reply is possible without an address):
Sun Privacy Policy

Note: We are not able to respond to all submitted comments.

Oracle is reviewing the Sun product roadmap and will provide guidance to customers in accordance with Oracle's standard product communication policies. Any resulting features and timing of release of such features as determined by Oracle's review of roadmaps, are at the sole discretion of Oracle. All product roadmap information, whether communicated by Sun Microsystems or by Oracle, does not represent a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. It is intended for information purposes only, and may not be incorporated into any contract.