• Performance Prediction (Wednesday 1:30-3:00PM)
    Room A110/112
    Chair: Richard F. Barrett, Los Alamos National Laboratory

    • Title: On Using SCALEA for Performance Analysis of Distributed and Parallel Programs
    • Authors:
      Hong-Linh Truong (University of Vienna)
      Thomas Fahringer (University of Vienna)
      Georg Madsen (Technical University of Vienna)
      Allen D. Malony (University of Oregon)
      Hans Moritsch (University of Vienna)
      Sameer Shende (University of Oregon)
    • Abstract:
      In this paper we give an overview of SCALEA, a new performance analysis tool for OpenMP, MPI, HPF, and mixed parallel/distributed programs. SCALEA instruments, executes, and measures programs and computes a variety of performance overheads based on a novel overhead classification. Source-code and hardware profiling are combined in a single system, which significantly extends the scope of overheads that can be measured and examined, ranging from hardware counters, such as the number of cache misses or floating-point operations, to more complex performance metrics, such as control of parallelism or loss of parallelism. Moreover, SCALEA uses a new representation of code regions, called the dynamic code region call graph, which enables detailed overhead analysis for arbitrary code regions. An instrumentation description file is used to relate performance information to code regions of the input program and to reduce instrumentation overhead. Several experiments with realistic codes covering MPI, OpenMP, HPF, and mixed OpenMP/MPI demonstrate the usefulness of SCALEA.
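
      As a rough illustration of the dynamic code region call graph described above, the following Java sketch models a call-graph node that accumulates per-region overheads; the class, method, and category names are assumptions made for illustration and do not reflect SCALEA's actual design or API.

      import java.util.ArrayList;
      import java.util.EnumMap;
      import java.util.List;
      import java.util.Map;

      /** Hypothetical node of a dynamic code region call graph (not SCALEA's API). */
      public class RegionNode {

          /** Hypothetical coarse overhead categories, not SCALEA's classification. */
          enum Overhead { DATA_MOVEMENT, SYNCHRONIZATION, CONTROL_OF_PARALLELISM, LOSS_OF_PARALLELISM }

          private final String regionName;   // e.g. a loop, subroutine, or MPI call site
          private final List<RegionNode> children = new ArrayList<>();
          private final Map<Overhead, Double> overheads = new EnumMap<>(Overhead.class);

          RegionNode(String regionName) { this.regionName = regionName; }

          void addChild(RegionNode child) { children.add(child); }

          void recordOverhead(Overhead category, double seconds) {
              overheads.merge(category, seconds, Double::sum);
          }

          /** Total overhead attributed to this region and every region called from it. */
          double inclusiveOverhead() {
              double total = overheads.values().stream().mapToDouble(Double::doubleValue).sum();
              for (RegionNode child : children) total += child.inclusiveOverhead();
              return total;
          }

          public static void main(String[] args) {
              RegionNode main = new RegionNode("MAIN");
              RegionNode loop = new RegionNode("LOOP_100");
              main.addChild(loop);
              loop.recordOverhead(Overhead.DATA_MOVEMENT, 0.6);
              main.recordOverhead(Overhead.CONTROL_OF_PARALLELISM, 0.2);
              System.out.printf("inclusive overhead of %s: %.2f s%n", main.regionName, main.inclusiveOverhead());
          }
      }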

    • Title: Modeling and Detecting Performance Problems for Distributed and Parallel Programs with JavaPSL
    • Authors:
      Thomas Fahringer (University of Vienna)
      Clóvis Seragiotto Júnior (University of Vienna)
    • Abstract:
      In this paper we present JavaPSL, a Performance Specification Language that can be used for a systematic and portable specification of large classes of experiment-related data and performance properties for distributed and parallel programs. Performance properties are described in a generic and normalized way, which greatly simplifies their interpretation and comparison. Moreover, JavaPSL provides meta-properties that describe new properties based on existing ones and relate properties to each other.

      JavaPSL uses Java and its powerful mechanisms, in particular polymorphism, abstract classes, and reflection, to describe experiment-related data and performance properties. JavaPSL can also be seen as a performance-information interface on which sophisticated performance tools can be built and through which other tools can access performance data in a portable way.

      We have implemented a prototype performance tool that uses JavaPSL to automatically detect performance bottlenecks in MPI, OpenMP, and mixed OpenMP/MPI programs. Several experiments with realistic codes demonstrate the usefulness of JavaPSL.
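
      The property style described above can be made concrete with a short sketch: an abstract base class defines a common interface and a subclass encodes one candidate bottleneck. The class names, method signatures, and the 5% imbalance threshold below are assumptions for illustration, not JavaPSL's actual API.

      /** Hypothetical base class for performance properties (signatures are illustrative). */
      abstract class PerformanceProperty {
          abstract boolean holds();        // does the property hold for the experiment data?
          abstract double severity();      // ranking value in [0, 1]; larger means worse
      }

      /** Hypothetical property: work is unevenly distributed across processes. */
      class LoadImbalance extends PerformanceProperty {
          private final double[] perProcessSeconds;

          LoadImbalance(double[] perProcessSeconds) { this.perProcessSeconds = perProcessSeconds; }

          private double max()  { double m = 0; for (double t : perProcessSeconds) m = Math.max(m, t); return m; }
          private double mean() { double s = 0; for (double t : perProcessSeconds) s += t; return s / perProcessSeconds.length; }

          @Override boolean holds()    { return max() > 1.05 * mean(); }   // assumed 5% threshold
          @Override double severity()  { return 1.0 - mean() / max(); }
      }

      public class PropertyDemo {
          public static void main(String[] args) {
              PerformanceProperty p = new LoadImbalance(new double[] { 1.0, 1.1, 2.4, 1.0 });
              System.out.printf("load imbalance holds=%b, severity=%.2f%n", p.holds(), p.severity());
          }
      }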

    • Title: Predictive Performance and Scalability Modeling of a Large-Scale Application
    • Authors:
      D. J. Kerbyson (Los Alamos National Laboratory)
      H. J. Alme (Los Alamos National Laboratory)
      A. Hoisie (Los Alamos National Laboratory)
      F. Petrini (Los Alamos National Laboratory)
      H. J. Wasserman (Los Alamos National Laboratory)
      M. Gittings (SAIC and Los Alamos National Laboratory)
    • Abstract:
      In this work we present a predictive analytical model that encompasses the performance and scaling characteristics of an important ASCI application. SAGE (SAIC's Adaptive Grid Eulerian hydrocode) is a multidimensional hydrodynamics code with adaptive mesh refinement. The model is validated against measurements on several systems, including ASCI Blue Mountain, ASCI White, and a Compaq AlphaServer ES45, and shows high accuracy. The model is parametric: basic machine performance numbers (latency, MFLOPS rate, bandwidth) and application characteristics (problem size, decomposition method, etc.) serve as input. We apply the model to gain insight into the performance of current systems, to reveal bottlenecks, and to show where tuning efforts can be effective. We also use the model to predict performance on future systems.
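
      To convey what "parametric" means here, the following toy sketch combines machine parameters (latency, bandwidth, MFLOPS rate) and application parameters (cells per processor, work per cell, boundary data) into a per-timestep runtime estimate; the formula and all constants are assumptions for illustration only and are not the SAGE model from the paper.

      /** Toy parametric runtime model in the general style described in the abstract. */
      public class ParametricModel {

          static double predictedSeconds(
                  double latencySeconds,      // network latency per message
                  double bandwidthBytesPerS,  // network bandwidth
                  double mflopsRate,          // sustained per-processor MFLOPS
                  long cellsPerProcessor,     // problem size after decomposition
                  double flopsPerCell,        // work per cell per timestep
                  double boundaryBytes,       // data exchanged with neighbours per timestep
                  int messagesPerStep) {      // number of boundary exchanges per timestep
              double compute = cellsPerProcessor * flopsPerCell / (mflopsRate * 1e6);
              double communicate = messagesPerStep * latencySeconds
                                 + boundaryBytes / bandwidthBytesPerS;
              return compute + communicate;   // assumes no compute/communication overlap
          }

          public static void main(String[] args) {
              // Example: predicted per-timestep runtime for a hypothetical machine.
              double t = predictedSeconds(5e-6, 300e6, 500, 100_000, 5_000, 2e6, 6);
              System.out.printf("predicted time per step: %.3f s%n", t);
          }
      }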