Optimizing Build Times Using Parallel "make"

 
By Morgan Herrington, July 16, 2004  
Contents
 
Overview
Introduction
Rationale
Putting It into Practice
Problem 1: Implicit Dependencies
Problem 2: Reuse of Temporary Files
Problem 3: Resource Exhaustion
Problem 4: Serial Tools
Problem 5: Old Versions of gmake and NFS
Summary
Resources
 
Overview
Most C/C++ development environments depend on some version of the make utility to manage the build process, yet many engineers don't take advantage of its parallel capabilities to reduce compile times. This article will explain how to use this feature and provide explanations and solutions for the most common pitfalls.
 
Introduction

Unlike the procedural programming languages we typically use (for example, C++, Perl, and Java), the dataflow language of make doesn't specify a particular order for operations. Instead, each step is taken when the necessary dependencies have been completed. This property permits parallel execution of all non-dependent operations. However, you must attend to a few details in order to avoid inconsistent or even incorrect results.

In more sophisticated development environments, parallel builds can be distributed across multiple machines or even multiple networks (these are sometimes called distributed builds, as distinct from parallel builds). This may require non-trivial preparation (for example, setting up a grid), which is normally handled by an administrator or a build master so that individual engineers don't need to be aware of the fine details of the implementation.

Here we'll address the simpler case where there is an existing serial build environment, on a single machine, and you just want to decrease your build times. Note, however, that most of the issues discussed for this simpler situation must also be addressed in the more general environment.

 
Rationale

On a multiprocessor machine, the advantage of parallel make seems obvious, since the additional CPUs can perform multiple compilations faster than a single CPU. Less obviously, a parallel make can be faster even on a uniprocessor, because the CPU and I/O demands of multiple compiles can be overlapped (improving the overall throughput). In the simplest situation, where a single source file has changed, using a parallel make won't help. However, whenever multiple files (or headers used by multiple sources) have changed, a parallel build should decrease the total build time.

 
Putting It into Practice

While most make variants support a parallel build capability, this article will discuss GNU make because it is generally available and widely used. Other make utilities provide special syntactic shortcuts for some of the suggestions illustrated below; however, using those features will render your Makefiles less portable.

A parallel build with gmake can be invoked using the -j flag, for example:

    $ gmake -j 4

The invocation syntax to specify the number of jobs may differ for other parallel make utilities like dmake, pmake, qmake, or mwmake, but the behavior is similar.

Your results will vary based on the particular compiler, options, and language being compiled, as well as whether the sources are local or remote. A common rule of thumb is to request approximately 1.5 times as many parallel jobs as there are CPUs available on the machine.
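
As a rough illustration of that rule of thumb, the following shell sketch derives a job count from the CPU count. The function names jobs_for_cpus and detect_cpus are hypothetical, and the getconf variable name is an assumption that holds on Linux and macOS but may differ on other systems, so the sketch falls back to 1 CPU when the query fails.

```shell
# Round 1.5 x CPUs up to a whole number of jobs.
jobs_for_cpus() {
    echo $(( ($1 * 3 + 1) / 2 ))
}

# Assumption: getconf knows _NPROCESSORS_ONLN on this system.
# Any empty or non-numeric answer falls back to a single CPU.
detect_cpus() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null)
    case $n in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}
```

With these definitions, a build could be started as:

    $ gmake -j "$(jobs_for_cpus "$(detect_cpus)")"

Treat the result as a starting point and adjust it after timing a few builds.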

If this initial invocation works (and usually, it will), then you should start seeing reduced build times on uniprocessor machines and even more of a reduction on multiprocessor machines. However, if it does not work, then the following sections may help explain some problems that are specific to parallelization, how to recognize them, and how to fix them.

 
Problem 1: Implicit Dependencies

Possibly the most common failure is caused by an unstated dependency that has inadvertently been introduced in the Makefile (that is, a Makefile bug). These can go unnoticed for serial builds, but will cause a parallel build to fail.

For example, consider an application where a module implementing the menus for a user interface is automatically generated by scanning the other object files, looking for functions matching a particular naming convention. The offending make rules might look something like:

    OBJECTS=app.o helper.o utility.o ...

    application: ${OBJECTS} menu.o
        cc -o application ${OBJECTS}  menu.o

    # Automatically generate menu.c from the other modules
    menu.o:
        nm ${OBJECTS} | pattern_match_and_gen_code > menu.c
        cc -c menu.c

The subtlety here is that for a serial build, make will work from left to right on the dependency list for application. First it will create all of the objects in ${OBJECTS}, and then it will apply the rules to build menu.o. There is no problem in the serial build because by the time menu.o is processed, all of the necessary objects in ${OBJECTS} will already be in place. For a parallel build, however, make is free to process all of the dependencies in parallel, so the creation and compilation of menu.o could be started before all of the objects are ready.

To make matters worse, the failure may be intermittent. Because this is a parallel race condition, sometimes it will work correctly and sometimes it will fail. A naive user might just learn to distrust parallel make without realizing that there is a bug in the Makefile.

You can recognize this situation because a serial build will succeed, but a parallel build will fail (with the specific failure being that some dependencies are not created). If you retry the build, it will then succeed.
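
Because the failure is intermittent, a single successful parallel build proves little. One way to smoke out such races is to repeat a clean parallel build several times, as in the sketch below. The function name stress_parallel_build is hypothetical, and the sketch assumes that the Makefile provides a "clean" target and that gmake is on the PATH (override MAKE otherwise).

```shell
# Assumption: gmake is the GNU make binary; set MAKE to override.
MAKE=${MAKE:-gmake}

# Run a clean parallel build several times and report the first failure.
# $1 = number of parallel jobs, $2 = number of rounds.
stress_parallel_build() {
    njobs=$1
    rounds=$2
    i=1
    while [ "$i" -le "$rounds" ]; do
        # Assumption: the Makefile has a "clean" target.
        "$MAKE" clean >/dev/null 2>&1 || return 2
        if ! "$MAKE" -j "$njobs" >/dev/null 2>&1; then
            echo "parallel build failed on round $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $rounds parallel builds succeeded"
}
```

For example, "stress_parallel_build 4 10" runs ten clean builds with four jobs each. A failure on any round, combined with a reliable serial build, points to a missing dependency.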

Out of frustration, some engineers resort to using the compound command "gmake;gmake" to force the parallel build to complete. However, a more appropriate fix is to make the dependencies explicit, allowing the built-in dataflow analysis of make to process all targets in the correct order. In the following example, notice that ${OBJECTS} was added to the dependency list for menu.o:

    application: ${OBJECTS} menu.o
        cc -o application ${OBJECTS}  menu.o

    menu.o: ${OBJECTS}
        nm ${OBJECTS} | pattern_match_and_gen_code > menu.c
        cc -c menu.c

 
Problem 2: Reuse of Temporary Files

Another common problem can arise from the reuse of intermediate files. For example, consider an application which uses yacc to generate two distinct parsers. Without considering the issue of parallel builds, the Makefile author might inadvertently allow yacc to use the same (default) intermediate source file for both targets:

    application: application.o parser1.o parser2.o
        cc -o $@ application.o parser1.o parser2.o

    # Both parser rules write to (and read from) the same y.tab.c
    parser1.o: parser1.y
        yacc parser1.y
        cc -o $@ -c y.tab.c

    parser2.o: parser2.y
        yacc parser2.y
        cc -o $@ -c y.tab.c

This problem becomes even more subtle when using the default make rules. In this case, rather than being explicitly visible (like the previous example), this intermediate file would only be referenced from a system default file (for example, on the Solaris Operating System, /usr/share/lib/make/make.rules) which has a generic build rule like the following:

    .y.o:
        $(YACC.y) $<
        $(COMPILE.c) -o $@ y.tab.c
        $(RM) y.tab.c

In either case, this works without any problem for a serial build because the intermediate file, y.tab.c, is used and then discarded by each individual build rule. However, when executed in parallel, the two rules conflict (if both try to write to the same intermediate file at the same time).

This particular example is easy to fix because yacc provides an option, -b, to change the prefix of the intermediate file name so that it will be unique between invocations. Similarly, if the intermediate file is being generated by a shell script, the script is usually easy to generalize so that it uses a unique file name. In the worst-case scenario, one workaround is to build one of the targets in a unique (possibly temporary) subdirectory:

    parser2.o: parser2.y
        mkdir _temp && cd _temp && yacc ../parser2.y
        cc -o $@ -c _temp/y.tab.c
        $(RM) -rf _temp
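
For the shell-script case mentioned above, one possible approach is to write the intermediate output to a uniquely named temporary file and rename it only when generation succeeds. The sketch below assumes that the mktemp utility is available. The function name generate_source and the printf line standing in for the real generator are hypothetical.

```shell
# Write generated output to a unique temporary file, then rename it,
# so that two parallel invocations never share an intermediate file.
# $1 = name of the final output file.
generate_source() {
    out=$1
    # Assumption: mktemp(1) exists and accepts a template argument.
    tmp=$(mktemp "${out}.XXXXXX") || return 1
    # Stand-in for the real generator command.
    if printf '/* generated for %s */\n' "$out" > "$tmp"; then
        mv "$tmp" "$out"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Because each invocation gets its own temporary name, two rules that call the script at the same time cannot overwrite each other's intermediate output.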

 
Problem 3: Resource Exhaustion

If your machine is under-configured to handle the amount of parallelism that you've requested, you may run out of either virtual or physical memory.

In the first case (virtual memory is exhausted), you'll see the message:

    Fatal error: fork failed: Not enough space

In the second case (physical memory is exhausted), you will see a significant slowdown in compile speed (because the compiler is being forced to page to disk).

In both cases, the solution is to reduce the amount of parallelism (or add memory).

A slightly different, but related, situation is when your parallel build consumes more than your fair share of a multi-user machine. In that case, gmake provides an option, -l, to limit parallelism based on a load average upper limit.
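
As a sketch of how the two options combine, the hypothetical function below prints a set of gmake flags that cap both the job count and the load average. Using the CPU count as the load ceiling is an assumption made for illustration, not a recommendation from gmake.

```shell
# Print gmake flags for a shared machine.
# $1 = number of CPUs; jobs = 1.5 x CPUs (rounded up), load limit = CPUs.
shared_host_flags() {
    cpus=$1
    echo "-j $(( ($cpus * 3 + 1) / 2 )) -l $cpus"
}
```

For a four-CPU machine, "gmake $(shared_host_flags 4)" expands to "gmake -j 6 -l 4", which stops gmake from starting new jobs while the load average is above 4.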

 
Problem 4: Serial Tools

In some cases, the compiler itself can inhibit parallel builds. In particular, older versions of the Sun C++ compiler serialize when accessing the template cache.

The fix for this problem is to avoid using the template cache by:

  • Upgrading to Sun Studio 8 software, which uses a different template mechanism
  • Working with options to the older compiler (for example, -instances=static), which can be used to bypass the template cache. (See the C++ User's Guide for instructions and restrictions.)
 
Problem 5: Old Versions of gmake and NFS

Very old versions of gmake will sometimes fail for parallel builds (but succeed for serial builds) when invoked in an NFS-mounted directory. The failure will manifest itself as a report that either of these has occurred:

  • One of the sources cannot be created:
        gmake: *** No way to make target 'some_source.c'.  Stop.
        gmake: *** Waiting for unfinished jobs....

  • stat was interrupted:
        gmake: stat: some_source.c: Interrupted system call

However, if you manually inspect the directory using ls, you will see that the source file does exist. If gmake is reinvoked, more of the build will succeed, but it might require multiple invocations to complete the build.

To fix this problem, install a more recent version of gmake, for example version 3.79 (or newer).
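
To check whether an installed gmake is recent enough, a script can parse the version banner and compare it with the minimum version. The sketch below uses hypothetical function names and assumes that the first line of the banner contains the version as its first dotted number (for example, "GNU Make 3.81"). It compares only the major and minor numbers.

```shell
# Extract the first dotted number from a version banner line.
parse_make_version() {
    printf '%s\n' "$1" | sed -n '1s/^[^0-9]*\([0-9][0-9.]*\).*/\1/p'
}

# Succeed if version $1 is at least version $2 (major.minor only).
# Assumption: both arguments contain at least one dot, as in "3.79".
version_at_least() {
    have_major=${1%%.*}; have_rest=${1#*.}; have_minor=${have_rest%%.*}
    want_major=${2%%.*}; want_rest=${2#*.}; want_minor=${want_rest%%.*}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}
```

A build script might use these as follows:

    banner=$(gmake --version | head -1)
    version_at_least "$(parse_make_version "$banner")" 3.79 || echo "gmake is too old"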

 
Summary
This optimization is usually easy to implement and should noticeably reduce build times on a uniprocessor and dramatically reduce them on a multiprocessor. The cleanup usually does not complicate the Makefile and is a necessary first step for more sophisticated distributed builds.
 
Resources
 