By Max Bruning, June 2006
Many developers are writing applications to run under the Linux operating system.
With the many new features of the Solaris 10 OS, and with the new emphasis Sun has placed on supporting the Solaris OS on AMD and Intel processor-based machines, developers are becoming interested in being able to develop their applications on the Solaris platform.
This article examines similarities and differences in the development environments
of both operating systems. Someone responsible for porting applications from Linux to the Solaris OS, or programmers with prior Linux experience who want to learn development on the Solaris OS, should benefit from this article.
In this article, the term "Solaris" refers to the Solaris 10 OS (and OpenSolaris), and "Linux" refers to Linux 2.6. Many of the details covered will also apply to
earlier versions of Solaris and Linux. The Linux distribution is meant to be
generic, though examples have been tested on SuSe 9.1.
Also, the article concentrates on applications written using the
C programming language, though C++ should behave the same.
Since Java technology-based applications should not be making function calls specific to Linux or the Solaris OS, they should be portable as is.
Introduction
This article discusses similarities and differences that will be visible to
application programmers and analysts on the Solaris OS and Linux.
It is not meant as an exhaustive description of differences, nor
is it meant to show that one OS is superior to the other.
Rather, the article tries to help developers experienced in one
of the OSes to work with the other OS as quickly as possible.
A simple application that is POSIX-compliant and doesn't use any system calls
or library functions specific to the Solaris OS or Linux should be portable between the OSes without changes. You should be able to write your app, compile it for the Solaris OS or Linux, and simply recompile it for the other OS, and it should work. Most of the system calls and library routines on both OSes fall into
this category.
Many system calls in Linux exist as library functions in the Solaris OS, and
vice versa. For instance, sched_setscheduler() is
a system call in Linux and a library function that calls
the priocntl(2) system call in the Solaris OS.
The priocntl(2) system call does not exist in Linux, but then Linux
does not support scheduling classes beyond time share and real time.
The next section of this article groups system calls into functional
sections and compares what is available in each OS.
Most of the applications and toolkits from the Linux world will compile and
run without changes. These include gcc, emacs, MySQL, perl, and many others.
Precompiled binaries for many packages are available at http://www.sunfreeware.com.
A few articles are available comparing Linux and the Solaris OS, but most of them
are comparing older versions of both. You can
find them by searching for "Linux and Solaris comparison" on the web.
See the Seal Rock Research White Paper (pdf) on the Solaris OS and Linux, which does cover the Solaris 10 OS and 2.6 Linux. Migrating Solaris Applications to Linux is the first of several pages that discuss issues in porting applications from the Solaris OS to Linux.
Various administrative differences exist between the Solaris OS and Linux, and within Linux, between different distributions. The Solaris 10 OS has introduced the "Service Management Framework" (SMF), which is a big change from previous versions of Solaris. Coverage of system administration differences will not be handled in this paper, except where it affects developers.
System Calls and Libraries
Most of the system calls and libraries that exist in Linux also exist in the Solaris OS.
This section will cover system calls and library routines that are different between
the two systems. The system calls and library routines are categorized as follows:
The Solaris OS keeps a list of system calls in /usr/include/sys/syscall.h. Linux maintains the same information in /usr/include/asm/unistd.h.
(Note that both Linux and the Solaris OS have unistd.h and syscall.h files, and that in some cases, the files agree in content.)
Documentation for system calls is available in the Solaris OS and on Linux at /usr/share/man/man2.
(The Solaris OS has a symbolic link from /usr/man to the same place.)
Library routines are documented in various manual sections. See man intro.3 for
an overview of the library sections on Linux and on the Solaris OS. Note that the Solaris OS breaks down
the library routines more finely than Linux. For instance, aio_read()
is documented at aio_read(3RT) on the Solaris OS, while on Linux,
it is documented at aio_read(3).
The result of this is that when compiling a program using aio_read() on
the Solaris OS, one must include the real-time library via -lrt with the compilation/link
command, which is not necessary on Linux.
Both Linux and the Solaris OS come with over 200 different libraries, with more than 50,000 functions defined
within the libraries.
The following table lists some libraries on Linux and the Solaris OS.
Note that this is not meant to be a complete listing.
Also note that some of these libraries must be downloaded
and installed separately from normal installation of the system.
| Solaris OS | Linux | Description |
|---|---|---|
| libc | libc | The standard C library (POSIX, SysV, ANSI, etc.). See man libc on the Solaris OS. |
| libucb | libc | UCB (University of California, Berkeley) compatibility library |
| libmalloc | libc | There are several different malloc libraries; the default is in libc. |
| libsocket | libc | Socket library (sockets are in libc on Linux). |
| libxnet | libc | X/Open Networking library |
| libresolv | libresolv | DNS routines (and on the Solaris OS, inet_* routines) |
| libnsl | libnsl/libc | Network services library (on Linux, NIS/NIS+ routines) |
| librpc | librpc | RPC functions |
| libslp | libslp | Service Location Protocol |
| libsasl | libsasl | Simple Authentication and Security Layer |
| libaio | libaio | Asynchronous I/O library |
| libdoor | | Door support (door_create(), door_return(), etc.) |
| librt | librt | POSIX Real Time library |
| libcfgadm | | Configuration administration library |
| libcontract | | Contract management library (see man contract.4 on the Solaris OS) |
| libcpc | | CPU performance counter library (on Linux, may need to install kernel module?) |
| libdat | | |
| libelf | libelf | ELF support library |
| libm | libm | Math library |
The next sections take a closer look at some of the system calls and libraries.
We'll concentrate on what's different between the systems.
Sockets and Networking
Most of the socket and networking code should simply need to be recompiled for the OS you are using, and the resulting executable should work.
This section compares network-related system calls and library routines that are typically used on the Solaris OS and Linux.
socket()
The socket() routine, in addition to the AF_UNIX, AF_INET, and AF_INET6 domain arguments, has additional values on the Solaris OS and Linux.
On the Solaris OS, the AF_NCA domain is used to specify the Network Cache and Accelerator (see nca(1))
for use with a socket.
Most of the address families (domains) exist on both Linux and the Solaris OS.
See /usr/include/sys/socket.h on the Solaris OS and
/usr/include/linux/socket.h on Linux for the possible
address families.
Note, however, that you may need to download or write code to support some of the domains.
Linux has several additional domains documented on the socket(2) man page. The additional documented domains on Linux are:
- AF_IPX: Novell IPX protocols (may be for SuSe only?).
- AF_NETLINK: Kernel/user interface device; allows users to access kernel modules. Note: Other ways exist to do this on the Solaris OS (and on Linux, for that matter).
- AF_X25: X.25 protocol. On the Solaris OS, this domain is included with the Solstice X.25 product.
- AF_AX25: Amateur radio AX.25 protocol.
- AF_ATMPVC: Permanent Virtual Circuits over ATM.
- AF_APPLETALK: See man ddp on Linux. Also exists on the Solaris OS but is not documented.
- AF_PACKET: Raw packet interface; see man packet.7 on Linux. On the Solaris OS, open the NIC device and use getmsg(2)/putmsg(2) to receive/send raw packets using DLPI. (See Data Link Provider Interface (DLPI), Version 2 for details on DLPI.)
bind()
The Linux man page (man bind.2) includes some information about different address families
besides AF_INET and AF_UNIX. The Solaris man page is man bind.3socket.
listen()
On both Linux and the Solaris OS, the backlog argument (the second argument to listen())
refers to the queue length for established connections that are waiting to be accepted. The Linux man page says this, while the Solaris man page just refers to the "queue of pending connections".
accept()
Linux supports three connection-based socket types: SOCK_STREAM, SOCK_SEQPACKET,
and SOCK_RDM, whereas the Solaris OS only documents SOCK_STREAM. On Linux, the socket returned by accept() does not inherit file status flags such as O_NONBLOCK and O_ASYNC from the listening socket. This may
differ from other implementations.
connect()
The Linux man page (man connect.2) documents SOCK_SEQPACKET, while the Solaris OS does not.
On Linux, the association between a connectionless socket and its peer can be dissolved
by calling connect() with an address whose sa_family member (in struct sockaddr)
is set to AF_UNSPEC. This behavior is not documented in the Solaris OS.
send()/recv()
As in the other socket library functions, these behave almost identically
between the systems. Linux has some additional flags argument documentation
on the man page.
shutdown()
No noticeable difference between the Solaris OS and Linux.
Networking Example
It can be useful to look at an application where some of the differences appear.
The tracedump program uses a packet capture
library (libpcap) to read Ethernet packets
at the user level. The code to read raw Ethernet is quite different between
the Solaris OS and Linux. (libpcap can also be used to examine the differences with other systems, such as FreeBSD, HP-UX, and AIX.)
The applicable code in libpcap is at pcap-linux.c
and pcap-dlpi.c. The DLPI code is used for Solaris, HP-UX, AIX, and other operating systems. Linux provides a mechanism for reading raw socket packets
via the standard socket calls. The Solaris OS uses the getmsg(2) and putmsg(2) calls to receive and send DLPI packets.
The following code demonstrates a way to do user-level
packet capture on a network interface in the Solaris OS.
This is followed by the analogous code in Linux.
This code is a (very greatly) simplified extraction from the libpcap
library.
#include <sys/types.h>
#include <sys/dlpi.h>
#include <sys/stream.h>
#include <stdio.h>
#include <errno.h>
#include <stropts.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>

int
main(int argc, char *argv[])
{
    int fd;
    union DL_primitives dlp;
    dl_bind_req_t bindreq;
    dl_attach_req_t attachreq;
    dl_promiscon_req_t promisconreq;
    struct strbuf ctl, data;
    int flags;
    char buffer[8192];

    if (argc != 2) {
        fprintf(stderr, "Usage: %s device\n", argv[0]);
        return (1);
    }
    fd = open(argv[1], O_RDWR);    /* for instance, /dev/elxl0 */
    if (fd == -1) {
        perror(argv[1]);
        return (1);
    }

    /* attach to a specific interface */
    attachreq.dl_primitive = DL_ATTACH_REQ;
    attachreq.dl_ppa = 0;    /* assume we want /dev/xxx0 */
    ctl.maxlen = 0;
    ctl.len = sizeof(attachreq);
    ctl.buf = (char *)&attachreq;
    flags = 0;
    /* send attach req */
    putmsg(fd, &ctl, (struct strbuf *)NULL, flags);
    ctl.maxlen = sizeof(dlp);
    ctl.len = 0;
    ctl.buf = (char *)&dlp;
    /* get ok ack, may contain error */
    getmsg(fd, &ctl, (struct strbuf *)NULL, &flags);

    memset((char *)&bindreq, 0, sizeof(bindreq));
    /* the following bind might not need to be done */
    bindreq.dl_primitive = DL_BIND_REQ;
    bindreq.dl_sap = 0;
    bindreq.dl_max_conind = 1;
    bindreq.dl_service_mode = DL_CLDLS;
    bindreq.dl_conn_mgmt = 0;
    bindreq.dl_xidtest_flg = 0;
    ctl.maxlen = 0;
    ctl.len = sizeof(bindreq);
    ctl.buf = (char *)&bindreq;
    flags = 0;
    /* send bind req */
    putmsg(fd, &ctl, (struct strbuf *)NULL, flags);
    ctl.maxlen = sizeof(dlp);
    ctl.len = 0;
    ctl.buf = (char *)&dlp;
    /* get bind ack */
    getmsg(fd, &ctl, (struct strbuf *)NULL, &flags);

    /* ask for every packet on the wire, whatever its destination */
    promisconreq.dl_primitive = DL_PROMISCON_REQ;
    promisconreq.dl_level = DL_PROMISC_PHYS;
    ctl.maxlen = 0;
    ctl.len = sizeof(promisconreq);
    ctl.buf = (char *)&promisconreq;
    flags = 0;
    /* send promiscuous on req */
    putmsg(fd, &ctl, (struct strbuf *)NULL, flags);
    ctl.maxlen = sizeof(dlp);
    ctl.len = 0;
    ctl.buf = (char *)&dlp;
    /* get ok ack */
    getmsg(fd, &ctl, (struct strbuf *)NULL, &flags);

    /* ask for every SAP (protocol type) as well */
    promisconreq.dl_primitive = DL_PROMISCON_REQ;
    promisconreq.dl_level = DL_PROMISC_SAP;
    ctl.maxlen = 0;
    ctl.len = sizeof(promisconreq);
    ctl.buf = (char *)&promisconreq;
    flags = 0;
    /* send promiscuous on req */
    putmsg(fd, &ctl, (struct strbuf *)NULL, flags);
    ctl.maxlen = sizeof(dlp);
    ctl.len = 0;
    ctl.buf = (char *)&dlp;
    /* get ok ack */
    getmsg(fd, &ctl, (struct strbuf *)NULL, &flags);

    /* read and echo to stdout whatever comes to us */
    while (1) {
        data.buf = buffer;
        data.maxlen = sizeof(buffer);
        data.len = 0;
        ctl.buf = (char *)&dlp;
        ctl.maxlen = sizeof(dlp);
        ctl.len = 0;
        flags = 0;
        if (getmsg(fd, &ctl, &data, &flags) < 0) {
            perror("getmsg");
            break;
        }
        /* a len of -1 means that part of the message is absent */
        write(1, "\nCTL:\n", 6);
        if (ctl.len > 0)
            write(1, ctl.buf, ctl.len);
        write(1, "\nDAT:\n", 6);
        if (data.len > 0)
            write(1, data.buf, data.len);
    }
    close(fd);
    return (0);
}
The Solaris code forms DLPI requests and gets DLPI responses to tell the interface
that the application wants a copy of all packets arriving at the interface.
The code in Linux is much simpler, as a socket(2) call allows one to specify raw packets. Linux does not use DLPI or STREAMS.
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <netinet/in.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <net/if_arp.h>
#include <stdio.h>

int
main(int argc, char *argv[])
{
    int sock_fd;
    struct sockaddr_ll sll, from;
    socklen_t fromlen;
    ssize_t packet_len;
    char buffer[8192];

    /* a packet socket normally requires root privileges */
    sock_fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (sock_fd == -1) {
        perror("socket");
        return 1;
    }
    memset(&sll, 0, sizeof(sll));
    sll.sll_family = AF_PACKET;
    sll.sll_ifindex = 0;    /* 0 matches any interface */
    sll.sll_protocol = htons(ETH_P_ALL);
    if (bind(sock_fd, (struct sockaddr *) &sll, sizeof(sll)) == -1) {
        perror("bind");
        return 1;
    }
    while (1) {
        fromlen = sizeof(from);
        /*
         * With MSG_TRUNC the return value is the full packet length,
         * which can be larger than the buffer.
         */
        packet_len = recvfrom(
            sock_fd, buffer, sizeof(buffer), MSG_TRUNC,
            (struct sockaddr *) &from, &fromlen);
        if (packet_len == -1) {
            perror("recvfrom");
            break;
        }
        if ((size_t)packet_len > sizeof(buffer))
            packet_len = sizeof(buffer);
        write(1, buffer, packet_len);
    }
    close(sock_fd);
    return 0;
}
Process/Processor Management
A process on both the Solaris OS and Linux is a running instance
of a program. In both the Solaris OS and in Linux (2.6), a process is
a container for an address space and one or more threads.
Every process in the system has a unique process ID (PID), which
remains unique for some time after the process dies.
Processes are created using fork(2) and its variants.
On Linux, processes (and threads) can also be created using clone(2),
but pthread_create(3) is more portable.
On the Solaris OS, the undocumented lwp_create() system call is somewhat analogous to clone(2).
vfork() performs similarly on both systems. The Solaris OS also has fork1() and forkall(). With fork1(), the child process has only the thread that executed the fork call; with forkall(), all the threads that were in the parent are replicated in the child. The default fork() has fork1() semantics; forkall() must be used explicitly.
forkall() does not exist in Linux (that is, Linux only supports fork1() semantics).
The ps -elfL command can be used on both the Solaris OS and Linux to see the threads
in a process. Both systems report the number of LWPs and the
lwpid for each thread in the process.
Note that an lwpid is unique across processes in Linux. In the Solaris OS, the lwpid is unique only within the process.
In Linux, the process ID of a multithreaded process is actually a thread group ID.
The thread group ID is equivalent to the process ID of the main thread.
In Linux, sending a signal (via kill(1)/kill(2)) to any lwpid is equivalent to sending the signal to the process. In the Solaris OS, you send the signal to the pid. In both cases, if the default action is taken, the process typically exits and all threads are terminated.
See the man page for ps(1) for more details.
Both Linux and the Solaris OS support the notion of binding a process or thread to a processor. Linux allows binding to a set of processors for non-exclusive use of those processors. The Solaris OS allows binding to a set of processors for exclusive use (that is, CPU fencing), but does not allow binding to a group for non-exclusive use (except, possibly, via Solaris Zones).
Linux does not have a mechanism for CPU fencing, though implementations can
be found on the web (see, for example, the CPUSETS for Linux page on the bullopensource.org site).
The Linux system calls that are processor affinity based are sched_setaffinity(2) and sched_getaffinity(2).
The Solaris OS has the following:
- processor_bind(2) to bind/unbind LWPs or processes to a processor
- pset_create(2) to set up a processor set
- pbind(1) and psrset(1), which are command-line interfaces
For completeness, output of the ps(1) command, first on Linux, then
on the Solaris OS, is shown in the section on Threads.
On Linux and the Solaris OS, all forms of the exec system call
result in calling execve(2). The Solaris OS documents all six flavors
of exec(2) on the same manual page. The Linux man page exec(3) documents execv, execl, execle, execlp, and execvp. A separate page covers execve(2).
The /proc file system exists in slightly different variations on Linux
and the Solaris OS.
On both systems, /proc is a directory containing files whose names are the process IDs of the current active processes on the system. Each PID-named file is in turn a directory. /proc on Linux has various other directories besides processes.
Most of these deal with processors, devices, and statistics on the system.
On Linux, one looks in /proc to find information about processes, processors, devices, machine architecture, and so on. On the Solaris OS, the same kind of information is typically available by using a command.
For instance, prtconf(1) can be used to learn about machine configuration on the Solaris OS.
On Linux, this is done largely by looking at files in /proc.
The virtual address space used by processes can be examined using pmap(1)
on the Solaris OS, and by catting the /proc/pid/maps
file on Linux, as shown below.
See pmap(1) on the Solaris OS and proc(5) on Linux
for more details.
<-- on solaris, address space of this instance of bash -->
bash-3.00$ pmap -x $$
1043: /usr/bin/bash -i
Address Kbytes RSS Anon Locked Mode Mapped File
08045000 12 12 4 - rw--- [ stack ]
08050000 528 468 - - r-x-- bash
080E3000 76 72 8 - rwx-- bash
080F6000 124 108 40 - rwx-- [ heap ]
FED8E000 4 4 - - rwxs- [ anon ]
FEDA0000 4 4 - - rwx-- [ anon ]
FEDB0000 760 660 - - r-x-- libc.so.1
FEE7E000 24 24 8 - rw--- libc.so.1
FEE84000 8 8 - - rw--- libc.so.1
FEE90000 24 8 4 - rwx-- [ anon ]
FEEA0000 524 324 - - r-x-- libnsl.so.1
FEF33000 20 20 4 - rw--- libnsl.so.1
FEF38000 32 - - - rw--- libnsl.so.1
FEF50000 44 40 - - r-x-- libsocket.so.1
FEF6B000 4 4 - - rw--- libsocket.so.1
FEF70000 4 4 4 - rwx-- [ anon ]
FEF80000 144 132 - - r-x-- libcurses.so.1
FEFB4000 28 24 - - rw--- libcurses.so.1
FEFBB000 8 - - - rw--- libcurses.so.1
FEFC0000 4 4 - - r-x-- libdl.so.1
FEFC7000 140 140 - - r-x-- ld.so.1
FEFFA000 4 4 4 - rwx-- ld.so.1
FEFFB000 8 8 4 - rwx-- ld.so.1
-------- ------- ------- ------- -------
total Kb 2528 2072 80 -
bash-3.00$
For the equivalent on Linux, see Figure 1. Note that Linux shows the full
path name to libraries (the output has been edited to only show
the library name). To get the full path names to libraries on the Solaris OS, use
pldd(1).
Figure 1: Examining Virtual Address Space Used by Processes in Linux
Threads
Linux and the Solaris OS support POSIX threads, Linux via
The Native POSIX Thread Library for Linux, and the Solaris OS as part of the standard C library.
See
Multithreaded Programming Guide, specifically, Chapter 5 Programming with the Solaris Software, for details of threads on the Solaris OS.
Also quite good is the white paper
Multithreading in the Solaris Operating Environment.
In addition to POSIX threads, the Solaris OS supports "Solaris threads".
The threads(5) man page describes the similarities and differences between the POSIX thread library and the Solaris thread library.
The implementations are interoperable and can be used with care
within the same application. The following is straight from the man page.
Similarities
Most of the functions in the libpthread and libthread libraries have a counterpart in the other corresponding library. POSIX function names, with the exception of the semaphore names, have a "pthread" prefix. Names for similar POSIX and Solaris functions have similar endings. Typically, similar POSIX and Solaris functions have the same number and use of arguments.
Differences
- POSIX threads are more portable.
- POSIX threads establish characteristics for each thread according to configurable attribute objects.
- POSIX pthreads implement thread cancellation.
- POSIX pthreads enforce scheduling algorithms.
- POSIX pthreads allow for clean-up handlers for fork(2) calls.
- Solaris threads can be suspended and continued.
- Solaris threads implement interprocess robust mutex locks.
- Solaris threads implement daemon threads, for whose demise the process does not wait.
The following is a very simple MT program. Very few differences are found in the ways in which multithreaded applications work between the two OSes.
Of course, the underlying implementations have several differences.
#include <pthread.h>
#include <stdio.h>

void *fcn(void *);

int
main(int argc, char *argv[])
{
    pthread_t tid;

    pthread_create(&tid, NULL, fcn, NULL);
    /* pthread_t is an integer on both systems, but its width differs */
    (void) printf("main thread id = %lx\n", (unsigned long)pthread_self());
    pthread_join(tid, NULL);
    return 0;
}

void *
fcn(void *arg)
{
    printf("new thread id = %lx\n", (unsigned long)pthread_self());
    return NULL;
}
Use the following to compile and run the program on the Solaris platform:
bash-3.00$ cc simplepthread.c -o simplepthread
bash-3.00$ ./simplepthread
main thread id = 1
new thread id = 2
bash-3.00$
Using gcc on the Solaris platform gives the same results.
On Linux it appears thus:
max@linux:~/source> cc simplepthread.c
/tmp/cc8u7kZs.o(.text+0x1e): In function `main':
simplepthread.c: undefined reference to `pthread_create'
/tmp/cc8u7kZs.o(.text+0x4a):simplepthread.c: undefined reference
to `pthread_join'
collect2: ld returned 1 exit status
max@linux:~/source> cc simplepthread.c -lpthread -o simplepthread
max@linux:~/source> ./simplepthread
main thread id = 4015c6c0
new thread id = 4035cbb0
max@linux:~/source>
On Linux, the POSIX thread library needs to be explicitly linked. Note that
Solaris 9 and earlier versions also require this. In the Solaris 10 OS, POSIX threads are in the standard C library (libc.so).
Note also that the Solaris OS assigns thread IDs using a monotonically increasing integer starting at 1. Linux uses the user virtual address of
the pthread structure (structure used internally by the thread library).
Visibility to threads is provided on both systems by the ps(1) command,
and via the /proc file system.
See Figure 2 for the output of the ps(1) command on the Solaris platform and Figure 3 for the output on Linux. You'll see that, given the same options, the output is very similar between the machines.
Figure 2: Output of ps(1) Command on Solaris Platform
Figure 3: Output of ps(1) Command on Linux
The command shows state, user, PID, parent PID, LWP ID, number of LWPs (for user processes, this is the number of threads), scheduling class, scheduling priority, user virtual size, wait channel,
start time, tty, time spent running, and command.
Linux does not report ADDR, and the Solaris OS shows the (kernel) virtual address of the proc_t data structure, which the kernel uses to maintain the process.
Linux shows WCHAN as a symbol, while the Solaris OS shows it as an address.
In the Solaris OS, the WCHAN column is the address of a synchronization variable
on which the thread is blocked. On Linux, WCHAN is the routine in which the thread is sleeping.
To get the equivalent information in the Solaris OS, use ::threadlist -v inside of mdb -k.
Note that on a machine running a 64-bit kernel (that is, SPARC or AMD64 architecture based), the
ADDR and WCHAN fields will display a question mark (?). To see the values for these two fields, use ps -e -o addr,wchan,comm.
More likely, you are interested in what the application threads are doing.
For this, use pstack(1) on the process ID of interest.
There is a pstack on Linux, but it must be downloaded.
Search for it on http://rpmfind.net/linux/RPM/.
Note that it only gives the stack backtrace of one thread (the thread ID that is passed to it as an argument).
If you want a backtrace of all threads within a process, you need to pass the thread IDs as separate arguments.
<-- get user-level stack(s) of a process on Solaris -->
bash-3.00$ pstack `pgrep mozilla-bin`
21528: /usr/sfw/bin/../lib/mozilla/mozilla-bin -UILocale en-US
----------------- lwp# 1 / thread# 1 --------------------
fef68967 pollsys (896dac8, 9, 0, 0)
fef2b2aa poll (896dac8, 9, ffffffff) + 52
fe793242 g_main_context_iterate () + 39d
----------------- lwp# 2 / thread# 2 --------------------
fef68967 pollsys (fbf5bd04, 1, 0, 0)
fef2b2aa poll (fbf5bd04, 1, ffffffff) + 52
fede047d _pr_poll_with_poll (816fa0c, 1, ffffffff, fbf5bf64,
fc0558aa, 816fa0c) + 2d5
fede05f1 PR_Poll (816fa0c, 1, ffffffff) + 11
fc0558aa __1cYnsSocketTransportServiceEPoll6M_i_ (816f6b8) + 58
fc055f7d __1cYnsSocketTransportServiceDRun6M_I_ (816f6b8) + 18f
fc3d1262 __1cInsThreadEMain6Fpv_v_ (816eb60) + 32
fede1693 _pt_root (816fcc0) + 9e
fef67b30 _thr_setup (feec2400) + 51
fef67f40 _lwp_start (feec2400, 0, 0, 0, 0, 0)
----------------- lwp# 4 / thread# 4 --------------------
fef67f7b lwp_park (0, fa87deb8, 0)
fef620bb cond_wait_queue (825cfec, 816b8d0, fa87deb8, 0) + 3e
fef62462 cond_wait_common (825cfec, 816b8d0, fa87deb8) + 1e9
fef62691 _cond_timedwait (825cfec, 816b8d0, fa87df38) + 4a
fef62722 cond_timedwait (825cfec, 816b8d0, fa87df38) + 27
fef62761 pthread_cond_timedwait (825cfec, 816b8d0,
fa87df38) + 21
feddc598 pt_TimedWait (825cfec, 816b8d0, f1c) + b8
feddc767 PR_WaitCondVar (825cfe8, f1c) + 64
fc3d417e __1cLTimerThreadDRun6M_I_ (81e5108) + 16e
fc3d1262 __1cInsThreadEMain6Fpv_v_ (820d690) + 32
fede1693 _pt_root (820e6b0) + 9e
fef67b30 _thr_setup (fb520400) + 51
fef67f40 _lwp_start (fb520400, 0, 0, 0, 0, 0)
bash-3.00$
Here is an equivalent on Linux. It is interesting that programs like Mozilla and xemacs are stripped on Linux and not stripped on the Solaris OS.
max@linux:~> cd /proc/`pgrep mozilla`/task
max@linux:/proc/3991/task> pstack *
3991: /opt/mozilla/lib/mozilla-bin
(No symbols found)
0xffffe410: ???? (8803488, 8, ffffffff, 8803488, 9, 400fbea0) + 40
0x404b0a6d: ???? (8129258, 4035236c, 57f, 4011e4e6, 4048de14,
403513c4) + 20
0x404b0d07: ???? (814b898, 814b898, 0, 0, 415a8f64, 814b898) + 30
0x401dc11f: ???? (8106350, bfffee80, bfffede8, 807673e, 8084cf4, 0)
0x415c4006: ???? (8106350, 0)
0x414fbae4: ???? (8105ee8, 0, 8079c2c, bfffee90, 80a67b8,
40ad841c) + 1f0
0x08059b7c: ???? (80e7f08, bffff058, 40017068, 14, 4081ccf8,
1) + 90
0x08055a47: ???? (1, bffff134, bffff13c, 4081ccf8, 406eebd0,
400168c0) + 40
0x405f2500: ???? (8055840, 1, bffff134, 80557b0, 8055740,
4000d330) + 40000ed8
4001: /opt/mozilla/lib/mozilla-bin
(No symbols found)
0xffffe410: ???? (413eb7f0, 1, ffffffff, 18, 413eb7f8, 0) + 230
0x400c7439: ???? (818911c, 1, ffffffff, 40c5a0a8, ffffffff,
8188dec)
0x40bc8a52: ???? (8188dc8, 8188df4, 1, 8188dec, 8188f7c, 1) + 10
0x40bc8bcb: ???? (8188dc8, 413ebbb0, 40102ce0, 400d5238,
8189478, 0)
0x40a8da6b: ???? (81893f8, 8189478, 4000ca40, 40102be8, 0, 0)
0x400cb7a6: ???? (8189478, 413ebac4, 0, 0, 0, 0) + 54
0x400fa9dd: ???? (413ebbb0, 0, 0, 0, 0, 0) + bec144d4
4004: /opt/mozilla/lib/mozilla-bin
(No symbols found)
0xffffe410: ???? (40656756, 400d5238, 81ed160, 81ed2d0, 41ffba08,
400c5721) + 170fd55
crawl: Input/output error
Error tracing through process 4004
0x1afcdbf8: ????max@linux:/proc/3991/task>
Solaris threads are given a default user stack size of 1MB. For Linux,
the default stack size is 2MB (SuSe 9.1).
Synchronization
Both OSes support POSIX synchronization mechanisms, i.e., mutexes, condition variables,
reader/writer locks, semaphores, and barriers.
The underlying mechanisms rely on mutexes.
In Solaris, user-level mutexes are implemented using "adaptive" spin locks.
On Linux, the mechanism is the "futex", or fast userspace mutex.
Both mechanisms avoid going into the kernel in the non-contention case, and should
give comparable performance and behavior.
The Solaris user-level adaptive spin mutexes are described in
Multithreading in the Solaris Operating Environment (pdf).
Linux futexes are described in
Futexes Are Tricky (pdf).
The Solaris OS mechanisms lwp_park() and lwp_unpark(),
and Linux mechanisms futex_up() and futex_down(),
can be used by applications. However, I have not found any source code examples.
It is probably best to stick with the POSIX APIs.
If you want to compare relative speeds of the POSIX locking mechanisms (as well as
performance of various other library routines and system calls), I recommend
getting a copy of the libmicro micro benchmark and trying it out on both the Solaris OS and Linux. (You can download libmicro from the OpenSolaris site.)
Be aware that the upcoming Solaris 11 release (the latest build available through OpenSolaris and Solaris Express, code named Nevada) is a debug build, which will affect any performance numbers you see.
Memory Management
Without describing differences in the kernels' handling of memory, we can say that
at user level several different memory allocation (malloc) libraries exist,
most of which are available (or can be built) for either OS.
A comparison of some of the user-level memory allocators
can be found in the Sun Developer Network article
A Comparison of Memory Allocators in Multiprocessors. "A Memory Allocator" at http://gee.cs.oswego.edu/dl/html/malloc.html
contains a (dated) description of a memory allocator used on Linux.
More comments can be found in the source code.
Timers
At application level, the Solaris OS and Linux both offer POSIX timer routines,
including timer_create(), timer_delete(), and
nanosleep(). The Solaris OS has an additional timer, CLOCK_HIGHRES, that attempts to
use an optimal hardware source, and may give close to nanosecond resolution.
A CLOCK_HIGH_RES timer may give similar resolution
on Linux, but needs to be installed as a kernel patch
(see the home page for the high resolution timers project at http://high-res-timers.sourceforge.net/ for details).
The following is example code that uses the CLOCK_HIGHRES timer
to fire on user-specified intervals for a user-specified duration.
The interval is specified in nanoseconds, and the duration in seconds.
When the program completes, it prints the number of times the timer fired, and
the number of times the timer was "overrun".
The "overrun" value is a count of the number of timer expirations that
occurred between the time a timer fired (causing a signal to be generated), and
the time the signal is handled (see timer_getoverrun(3RT)).
Running the program real-time with too short an interval may cause
the system to hard hang.
#include <pthread.h>
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <time.h>
#include <errno.h>
#define DURATION 120 /* default time to run in seconds */
/* default .5 seconds in nanosecs */
#define INTERVAL (1000*1000*500)
void* timer_fcn(void* arg);
void* signaler_thd(void* arg);
/* Program globals (errno is declared by <errno.h>) */
int duration = DURATION;
int interval = INTERVAL;
int
main(int argc, char *argv[])
{
sigset_t mask;
pthread_t wtid = 0;
pthread_t stid = 0;
int rval;
int n;
if (argc >=2) {
errno = 0;
if (argc == 2)
duration = strtol(argv[1], NULL, 0);
else if (argc == 3) {
interval = strtol(argv[1], NULL, 0);
duration = strtol(argv[2], NULL, 0);
}
if (errno || argc > 3 || interval <= 0
|| duration <= 0) {
fprintf(stderr, "Usage: %s [[interval] duration]\n",
argv[0]);
fprintf(stderr, "interval nsecs, duration seconds\n");
exit(1);
}
}
/* mask SIGALRM signals */
sigemptyset(&mask);
sigaddset(&mask, SIGALRM);
sigaddset(&mask, SIGUSR1);
rval = pthread_sigmask(SIG_BLOCK, &mask, NULL);
if(rval != 0) {
printf("%s: pthread_sigmask failed, errno = %d.\n",
argv[0], rval);
exit(1);
}
rval = pthread_create(&wtid, NULL, timer_fcn, NULL);
if (rval != 0) { /* Waiter create call create failed */
perror ("Waiter create");
printf ("Waiter create call failed: %d.\n", rval);
exit (1);
}
/* Do signaler thread */
rval = pthread_create(&stid, NULL, signaler_thd, &mask);
if (rval != 0) { /* Signaler call create failed */
printf ("Signaler call create failed: %d.\n", rval);
exit (1);
}
/* Wait for waiter and signaler to finish */
rval = pthread_join(stid, NULL);
if (rval != 0) { /* Signaler call join failed */
printf ("Signaler call join failed: %d.\n", rval);
exit (1);
}
rval = pthread_join(wtid, NULL);
if (rval != 0) { /* Waiter call join failed */
printf ("Waiter call join failed: %d.\n", rval);
exit (1);
}
printf("done\n");
exit(0);
}
pthread_mutex_t mp;
pthread_cond_t cv;
int time_expired = 0;
int timerentered;
int timeroverrun;
timer_t itimerid;
void *
timer_fcn(void *arg)
{
struct itimerspec value;
struct sigevent event;
value.it_interval.tv_sec = 0;
value.it_interval.tv_nsec = interval; /* nsec intervals */
value.it_value.tv_sec = 1; /* starting in 1 second */
value.it_value.tv_nsec = 0; /* plus 0 nanosecs */
event.sigev_notify = SIGEV_SIGNAL;
event.sigev_signo = SIGALRM;
event.sigev_value.sival_int = 0;
if (timer_create(CLOCK_HIGHRES, &event,
&itimerid) == -1) {
perror("timer_create failed");
exit(1);
}
/* the second arg can be set to TIMER_ABSTIME */
if (timer_settime(itimerid, 0, &value, NULL) == -1) {
/* else time value is relative to when the call is made */
perror("timer_settime failed");
exit(1);
}
pthread_mutex_lock(&mp);
while (time_expired == 0)
pthread_cond_wait(&cv, &mp);
printf("timerentered = %d\n", timerentered);
printf("timeroverrun = %d\n", timeroverrun);
pthread_mutex_unlock(&mp);
exit(0);
}
int timerset;
void *
signaler_thd(void *arg)
{
int signo;
while (1) {
signo = sigwait(arg);
if (signo == SIGALRM) {
if (!timerset) {
struct itimerspec value;
struct sigevent event;
timer_t endtimerid;
++timerset;
value.it_interval.tv_sec = 0;
value.it_interval.tv_nsec = 0;
value.it_value.tv_sec = duration; /*wait duration secs*/
value.it_value.tv_nsec = 0; /* plus 0 nanosecs */
event.sigev_notify = SIGEV_SIGNAL;
event.sigev_signo = SIGUSR1;
event.sigev_value.sival_int = 0;
if (timer_create(CLOCK_HIGHRES, &event,
&endtimerid) == -1) {
perror("timer_create failed");
exit(1);
}
/* the second arg can be set to TIMER_ABSTIME */
if (timer_settime(endtimerid, 0, &value, NULL)
== -1) {
perror("timer_settime failed");
exit(1);
}
} else { /* if (!timerset) */
++timerentered;
timeroverrun += timer_getoverrun(itimerid);
}
} else { /* SIGUSR1 */
struct itimerspec value;
struct sigevent event;
/* cancel the interval timer */
value.it_interval.tv_sec = 0;
value.it_interval.tv_nsec = 0; /* nanosecond intervals */
/* setting the following to 0 should stop the timer */
value.it_value.tv_sec = 0;
value.it_value.tv_nsec = 0; /* plus 0 nanosecs */
event.sigev_notify = SIGEV_SIGNAL;
event.sigev_signo = SIGALRM;
event.sigev_value.sival_int = 0;
pthread_mutex_lock(&mp);
if (timer_settime(itimerid, 0, &value, NULL) == -1) {
perror("timer_settime failed");
exit(1);
}
++time_expired;
pthread_cond_signal(&cv);
pthread_mutex_unlock(&mp);
}
}
}
And here are some examples of running the compiled code.
<-- realtime library and best optimization -->
bash-3.00$ cc timerex1.c -lrt -o timerex1 -O -fast
bash-3.00$ ./timerex1 <-- only root can use high res timer
timer_create failed: Not owner
bash-3.00$ su
Password:
<-- default interval is .5 seconds, duration is 120 seconds -->
# ./timerex1
timerentered = 240 <-- timer fired every .5 seconds
timeroverrun = 0
# ./timerex1 1000000 10 <-- interval is 1 msec for 10 secs
timerentered = 9912
timeroverrun = 88
# priocntl -e -c RT ./timerex1 1000000 10 <-- run it real time
timerentered = 10000 <-- timer fired once each msec for 10 secs
timeroverrun = 0
# ./timerex1 100000 10 <-- interval is 100 usecs for 10 seconds
timerentered = 99615 <-- we missed a few
timeroverrun = 386
# priocntl -e -c RT ./timerex1 100000 10 <-- try real time
timerentered = 99871 <-- almost 1 every 100 microseconds
timeroverrun = 129
# ./timerex1 10000 10 <-- interval is 10 microseconds
timerentered = 485905 <-- here we miss over half
timeroverrun = 514125 <-- (sig handler takes > 10 usecs?)
<-- using RT 1 usec interval causes hang on my machine -->
# priocntl -e -c RT ./timerex1 1000 10
IPC
Both the Solaris OS and Linux support System V IPC (shared memory, message queues, and semaphores).
Both systems also support pipes and the real-time shared memory operations (shm_open(), shm_unlink(),
and so on).
Both systems support the tmpfs file system (using memory and swap space for files).
The Solaris OS places /tmp, /var/run, and /etc/svc/volatile in tmpfs. Linux uses /dev/shm. Both systems allow other mount points to be added.
Here are the steps for using tmpfs on the Solaris OS; steps for Linux are shown below. Note that "swap" on the Solaris OS uses memory as well as disk (if needed). In other words, files created in /tmp are stored in memory. If memory gets full, the pageout daemon may write data from /tmp to swap space on disk.
# mkdir /foo
<-- create a tmpfs file system using swap on /foo
# mount -F tmpfs swap /foo
# df -h /foo
Filesystem size used avail capacity Mounted on
swap 652M 0K 652M 0% /foo
# df -h /tmp
Filesystem size used avail capacity Mounted on
swap 652M 52K 652M 1% /tmp
#
And here are the analogous steps on Linux.
linux:/home/max # mkdir /foo
<-- tmpfs also uses swap space and memory -->
linux:/home/max # mount tmpfs /foo -t tmpfs
linux:/home/max # df -h /foo
Filesystem Size Used Avail Use% Mounted on
tmpfs 248M 0 248M 0% /foo
linux:/home/max # df -h /dev/shm
Filesystem Size Used Avail Use% Mounted on
tmpfs 248M 16K 248M 1% /dev/shm
linux:/home/max #
It might be interesting to run the libmicro benchmarks mentioned earlier
in the article to get some idea of relative performance between the systems.
Signal Handling
The Solaris OS and Linux treat signals similarly. Some signals exist in the Solaris OS and not in Linux, and vice versa. Also, some of the same signals use different signal numbers. Both OSes recommend using sigaction(2) over signal() to catch signals, and the use of sigwait() to handle asynchronous signals in multithreaded applications.
The sigwait(3) manual page on Linux has a BUGS
section.
Linux signal handling differs from the POSIX standard.
POSIX states that an asynchronously delivered signal (a signal sent from outside
the process) is handled by any thread that does not currently have the signal blocked.
In Linux, asynchronous signals may be sent to specific threads (signals can be
sent to the thread ID via kill(1)). The Solaris OS implements the POSIX behavior:
there is no way to send a signal to a specific thread from outside the process.
One can send a signal via kill(1) to the process, but not to a specific thread
within the process.
Some of the differences are described in "Building Applications with the Linux Standard Base" at http://lsbbook.gforge.freestandards.org/sig-handling.html.
Note that this page may not be entirely accurate. For instance, the page says that Linux
sets SIGBUS to SIGUNUSED because
there is no "bus error" in Linux. However, the Linux man page for mmap(2) documents receiving SIGBUS when accessing a memory range that does not correspond to a valid location in the file that mmap was used with.
(The Solaris OS does the same.)
On both the Solaris OS and Linux, signals are handled when a non-held, non-ignored signal is found pending for a thread returning from kernel to user mode.
On both systems, SIGKILL and SIGSTOP take priority
over other signals. Otherwise, on Solaris signals are handled in
an undocumented order (lowest signal number first). On Linux, signals
are handled in the order they are delivered (again, excepting SIGKILL
and SIGSTOP).
On the Solaris OS, to see the signal settings for a running process,
use psig.
bash-3.00$ psig $$ <-- signal disp for current shell
954: /usr/bin/bash -i
HUP caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
INT caught sigint_sighandler 0
QUIT ignored
ILL caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
TRAP caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
ABRT caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
EMT caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
FPE caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
KILL default
BUS caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
SEGV caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
SYS caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
PIPE caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
ALRM caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
TERM ignored
USR1 caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
USR2 caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
CLD blocked,caught 0x807d4d7 0
PWR default
WINCH caught 0x807e182 0 <-- not all syms are present
URG default
POLL default
STOP default
TSTP ignored
CONT default
TTIN ignored
TTOU ignored
VTALRM caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
PROF default
XCPU caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
XFSZ caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
WAITING default
LWP default
FREEZE default
THAW default
CANCEL default
LOST caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
XRES default
JVM1 default
JVM2 default
RTMIN default
RTMIN+1 default
RTMIN+2 default
RTMIN+3 default
RTMAX-3 default
RTMAX-2 default
RTMAX-1 default
RTMAX default
bash-3.00$
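As far as I can tell, there is no equivalent tool in Linux that shows
the handler names, although /proc/<pid>/status does report the blocked,
ignored, and caught signal sets as hexadecimal bitmasks (the SigBlk,
SigIgn, and SigCgt fields). Someone has probably implemented a kernel
patch or module to give you more complete information.
Certainly it should be doable with User Mode Linux.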
Conclusions
Generally, if you are developing a POSIX-compliant application
on Linux or the Solaris OS, the application should port to the other OS
simply by recompilation. Of course, many applications will have
parts that are not addressed by POSIX. For instance, device
ioctl(2) handling tends to be OS (and, of course, device)
specific.
Getting documentation for the Solaris OS is reasonably straightforward, since most
of the documentation is at http://docs.sun.com.
Getting documentation for Linux is sometimes simple (search on the web), and
sometimes not so simple.
You'll find that Linux typically offers multiple ways to do the same thing
(different implementations of threads, for example).
My impression is that much of the Linux documentation is
in the source code itself. This is fine if you have access to
all the source code.
You do have access to all of the source code, but it is not all
in one place. In fact, it seems scattered all over the place.
Sun's source is currently available all in one place
(http://www.opensolaris.org),
but not all of it is there.
I expect that over time, developers will add software to OpenSolaris
that may not be available in the OpenSolaris source tree.
This article touched on some of the visibility tools
available on the two systems, but did not get into much detail.
Prior to Sun's coming out with OpenSolaris, Linux advocates could
always point to the source as a differentiator when it
came to visibility as to how things work.
Now, with OpenSolaris and tools such as DTrace, Linux will have to play catch up. And at the rate of change of Linux, I'm sure it won't take long.
I'm looking forward to both systems benefiting from each other's
good features, and learning from their mistakes.