Solaris Live Transcripts Index October 18, 2001
Chat Title: Techniques for Optimizing Applications: High Performance Computing
This is a moderated chat.
adele: Hello, and welcome to Solaris Live! Our guests today are Rajat P. Garg and Ilya Sharapov, authors of "Techniques for Optimizing Applications: High Performance Computing". Rajat or Ilya, can you give us an introduction to your book and the topics it covers?
rajat: The book is primarily written for developers of computationally intensive programs who want to optimize their applications on Sun UltraSPARC systems. We cover a full spectrum of topics: from measuring performance to compiler optimizations, linking optimized libraries, source code modifications, and an extensive discussion of program parallelization methods.
test: What tools are discussed?
ilya: We have a few chapters on Sun compilers and discuss various features and options for serial optimization and parallelization. Special attention is paid to performance monitoring and profiling tools as they apply to serial and parallel programs.
test: Do you have examples of how optimization can be applied?
rajat: Yes, we have numerous examples in the book on how a particular optimization trick works. One of our objectives was to have a complete example program demonstrating a specific technique. For example, a program that shows how
Joseph F. McGrath: In one example, the lines
rajat: Joe, this is an example of strength reduction. It was actually motivated by a similar optimization in a SPEC benchmark. Floating-point divides are typically more expensive than floating-point multiplies and additions. The various tools that we discuss in the book will help identify such tuning opportunities. In particular, I would like to mention the analyzer tool. Using its annotated disassembly/source feature, you can find where the hotspots are in the source. Identifying the hotspot is the tough part.
test: What optimizations at the source code level are discussed in your book?
ilya: We discuss general optimization techniques, such as optimization for memory hierarchies (for example, cache blocking or reducing cache conflicts), aliasing optimizations, and optimization related to data alignment. Special focus is on loop optimizations, such as loop tiling/unrolling, fusion/fission, peeling, etc.
oscar: Is the information in the book also useful for system administrators?
rajat: Oscar, yes, there is information in the book that might be of use to system administrators. Specifically, we cover topics such as a description of the features of our product lines (hardware, software) and Solaris commands to identify system configuration (such as processor speed, size of memory, size of caches, Solaris kernel settings, versions of compilers, HPC ClusterTools, etc.). We do not cover detailed sys-admin topics such as networking setup, disk management, etc. The primary target audience of the book is application developers.
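The code lines Joe asks about are not reproduced in this transcript. As a rough illustration of the strength reduction Rajat describes, the following C sketch replaces a per-element floating-point divide with a multiply by a reciprocal computed once. The function names `scale_div` and `scale_mul` are hypothetical and are not taken from the book.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: one floating-point divide per element. */
void scale_div(double *out, const double *in, size_t n, double d)
{
    assert(d != 0.0);
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] / d;
}

/* Strength-reduced version: a single divide is hoisted out of the loop
 * and each element costs one multiply instead.
 * Caveat: 1.0 / d is itself rounded, so results may differ from
 * scale_div in the last bit unless the reciprocal is exactly
 * representable (for example, when d is a power of two). */
void scale_mul(double *out, const double *in, size_t n, double d)
{
    assert(d != 0.0);
    const double inv = 1.0 / d;
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * inv;
}
```

Because the two versions are not always bit-for-bit identical, a compiler will generally apply this transformation on its own only when told that relaxed floating-point semantics are acceptable; doing it by hand in the source is a deliberate accuracy trade-off.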
Bill Walster: Hi Rajat and Ilya. Bill Walster here, so you know the kind of question I have. I have recently been playing with a little example that most programmers will think is obvious: Suppose I am computing the expression
Bill Walster: P.S. to my question: I should have mentioned that there is a difference in accuracy when rounding errors are made in the process of performing the computation. If everything is machine representable, then there is no problem.
ilya: Hi Bill. Yes, we do address potential roundoff problems, as well as the general issue of correctness of the results. We have a section on IEEE arithmetic and discuss where the problems can come from. For each of the optimization techniques (e.g. compiler optimizations, such as
test: I am porting a Fortran program from Cray computers (where the default size of real and integer variables is 8 bytes). How can I promote real and integer variables to 8 bytes on the Sun platform without hand-changing the source?
rajat: Test: We have a compiler option that facilitates porting of code where the default sizes of basic data-types need to be changed. For your specific case, you can use
Bill Walster: Answer to "test". There is a command-line option for the compiler
ilya: Bill, yes
test: Half of the book covers parallelization; is this a manual for parallelizing applications?
ilya: It isn't really a manual; we mostly discuss optimization aspects of parallel programming, assuming that the reader has some parallelization skills, or at least a good reference handy. But we do provide theoretical background and perspective for parallelization and parallel performance. We also talk about the usage of compilers and other tools for parallelizing applications and monitoring parallel performance.
Joseph F. McGrath: A generalization of mine on parallel processing has been questioned lately. In scientific and engineering applications, the approaches to parallelization on shared-memory and distributed-memory platforms have been narrowing down to OpenMP and MPI, respectively. That is, all other specifications and standards are falling by the wayside. Am I correct? Which others continue to be widely used? Which others show promise for the future?
rajat: Joe: Yes, we seem to be converging towards using MPI for message-passing programs and OpenMP for compiler-directive-based parallelization on shared address space systems. The one other approach that continues to be in use is explicit multithreading via P-threads. This is primarily used in C (and C++) applications, since the APIs were developed for the C language. Once some of the current deficiencies of the OpenMP standard are addressed (such as support for dynamic task creation/destruction, and signal and exception handling), I think people will move their P-threads applications to OpenMP. In the future, the approaches that provide explicit support for exploiting non-uniform memory access in a shared address space, cache-coherent system, while maintaining simplicity of programming, will be the ones to become popular. In my opinion, OpenMP (with such extensions) is promising.
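For readers unfamiliar with the compiler-directive style Rajat refers to, here is a minimal C sketch of an OpenMP-parallelized loop. It is only an illustration, not an example from the book; the function name `dot` is hypothetical. Only the standard `parallel for` directive and `reduction` clause are used.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product parallelized with a single OpenMP directive.
 * The reduction clause gives each thread a private partial sum and
 * combines the partial sums when the loop finishes.
 * If the compiler is not run in OpenMP mode, the pragma is ignored
 * and the loop simply runs serially with the same result. */
double dot(const double *x, const double *y, long n)
{
    assert(n >= 0);
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}
```

The same loop written with explicit P-threads would need thread creation, a manual split of the index range, and a join step, which is the programming-simplicity gap discussed above. Note that a parallel reduction may add the terms in a different order than the serial loop, so results can differ by rounding error.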
ilya: In case some of the participants plan to go to Supercomputing 2001, we'll have a copy of the book on display in the Sun booth there, as well as many experts who can talk about HPC on Sun platforms.
rajat: A question to participants: What are your impressions of the book? Are there any topics missing that should be covered, or topics that we should have covered in more detail?
test: Why is pointer alias analysis important in C programs, and what options, if any, are available in Sun C compilers?
ilya: Answer to the pointer aliasing question: In C programs, pointer variables can point to overlapping regions of memory, leading to ambiguous data dependencies in the program. As a result, the compiler treats operations through potentially aliased pointers conservatively. This may suppress many optimizations that could otherwise be performed and add extra load and store instructions to the code. New Sun compilers include two options that can improve the compiler's alias disambiguation analysis.
Bill Walster: Do you have anything in your book about how to get performance from Java codes? There continues to be a lot of interest in Java, and I know of at least one person who is working on techniques to get better performance from Java applications than one might think is possible.
rajat: Bill: The book is primarily written for tuning applications written in Fortran and C. Although a lot of the techniques can be applied to C++ programs, we do not cover any optimization methods for Java programs.
test: Do you discuss different parallelization approaches?
rajat: Yes: we have chapters that discuss explicit multithreading (P-threads), compiler-directive (OpenMP) parallelization, and message passing (MPI). For each of these approaches, we discuss the advantages and disadvantages as well as specific programming models.
test: What do the terms spatial and temporal locality mean?
ilya: These terms refer to the way in which memory accesses take place in a program. Temporal locality implies that a data item used now is likely to be used again soon, while spatial locality implies that if a data item is referenced, then data in a neighboring location is also likely to be used in the computation.
adele: This is all we have time for today. Thank you very much for the participation from our audience. Rajat and Ilya, do you have any parting comments?
ilya: Thank you all for participation and for the questions!
Further questions about our book or Sun products for HPC can be sent to blueprints@sun.com. We'll be glad to respond. Thanks again!
rajat: Thank you, Adele, for moderating the chat, and thanks to all the participants for asking insightful questions. We hope that you find the book useful and would very much like to get your feedback. Specifically, we would like to know what topics should be covered in additional detail and what topics (if any) do not belong in the book at all. Also, if you find any bugs or errors in the code samples, we would appreciate hearing about them so we can correct them.
adele: Rajat and Ilya, thank you very much for being our guests today. Your book, "Techniques for Optimizing Applications: High Performance Computing", is reviewed on Solaris Developer Connection at http://soldc.sun.com. We wish you great success with it. The transcript for this chat will be available at http://soldc.sun.com/developer/chat/.
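To make the pointer-aliasing answer given earlier in the chat more concrete, here is a minimal C sketch. It uses the C99 `restrict` qualifier, a standard language feature, rather than the Sun compiler options Ilya mentions, which are not named in this transcript. The function name `add_arrays` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Without restrict, the compiler must assume that a store to out[i]
 * could modify a[] or b[], so it cannot freely reorder or combine the
 * loads and stores in the loop.
 * With restrict, the programmer promises that out does not overlap
 * the memory read through a or b, which removes that ambiguity. */
void add_arrays(double *restrict out, const double *restrict a,
                const double *restrict b, size_t n)
{
    /* Cheap sanity check only: it catches identical start addresses,
     * not partial overlap. Keeping the promise is the caller's job. */
    assert(out != a && out != b);
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Calling such a function with overlapping arrays is undefined behavior, so the qualifier is an assertion made by the programmer, not something the compiler verifies.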
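The definitions of spatial and temporal locality given in the chat can be illustrated with a small C sketch. This is an assumption-laden illustration, not an example from the book: the matrix is assumed to be stored in row-major order in a flat array, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Row-by-row traversal: consecutive iterations of the inner loop touch
 * adjacent memory locations, so each cache line that is fetched gets
 * fully used (good spatial locality). The accumulator s is reused on
 * every iteration (temporal locality). */
double sum_rows_first(const double *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    double s = 0.0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            s += m[i * cols + j];
    return s;
}

/* Column-by-column traversal of the same row-major data: consecutive
 * inner iterations are cols elements apart, so for large matrices
 * most accesses land on a different cache line (poor spatial locality). */
double sum_cols_first(const double *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    double s = 0.0;
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            s += m[i * cols + j];
    return s;
}
```

Both functions visit the same elements and differ only in access order. In Fortran, where arrays are stored column by column, the favorable loop order is reversed.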