mpitrace

A library for measuring communication in distributed-memory parallel applications that use the standard Message-Passing Interface (MPI).

Primary language : C   ;   license : MIT

This repository contains a collection of tools for analyzing
distributed-memory parallel applications written with MPI.  The profiling
interface provided by MPI makes it possible to collect detailed information
about messaging, and it also provides a convenient place to enable other
performance tools, including program sampling via interrupts and collection
of aggregate values for hardware counter events.  How to build and use each
tool is described in its own directory; a brief overview is sketched here.
Please consult the README files in each directory for more information.

================================================================================
Tool : libmpitrace.so   ;   directory = src

purpose : Collect and report information on MPI calls, task placement,
          memory utilization, user and system time.

outputs : text files : mpi_profile.jobid.rank
          optional binary program-sampling outputs : vmon.out.jobid.rank

requires : mpicc with the underlying C compiler set to gcc.

optional : Enable program sampling via the profil() routine.
           Program-sampling requires GNU binutils development files.
           Note : a different program sampling method using hardware counters
           is preferred ... see the section on libhpmprof.so.

build : cd src
        ./configure   (builds only the MPI wrappers)
   or   ./configure --with-vprof --with-binutils=/path/to/binutils
        make libmpitrace.so

typical use :  export LD_PRELOAD=/path/to/libmpitrace.so
               mpirun --np 2048 your.exe
               unset LD_PRELOAD
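
The export/unset pair above can also be folded into a single command, so that
the preload never leaks into the rest of the shell session.  A minimal sketch,
assuming a POSIX shell; run_preloaded is a hypothetical helper name, not part
of this repository, and whether the variable reaches ranks on remote nodes
depends on your MPI launcher:

```shell
# Hypothetical helper (not part of mpitrace): run one command with a
# library preloaded, leaving LD_PRELOAD untouched in the calling shell.
run_preloaded() {
    lib=$1
    shift
    # The assignment prefix applies only to the environment of "$@".
    LD_PRELOAD="$lib" "$@"
}

# Example (paths are placeholders):
#   run_preloaded /path/to/libmpitrace.so mpirun --np 2048 your.exe
```

The same helper works for the other preload libraries described below, since
only the library path changes.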

================================================================================
Tool : libmpihpm.so   ;   directory = src

purpose : Provides the same MPI information as libmpitrace.so, plus
          enables collection and reporting of aggregate values for
          hardware counters.  By default counts are reported from
          MPI_Init() to MPI_Finalize(), but one can instrument the 
          code with calls to HPM_Start("label"); HPM_Stop("label");
          to collect counts for specific code sections.

outputs : text files : mpi_profile.jobid.rank      ... MPI data
                       hpm_job_summary.jobid.group ... counter data

requires : mpicc with the underlying C compiler set to gcc.
           PAPI include and library paths, and a suitable set of
           hardware counters for your system's CPUs.

build : cd src
        ./configure --with-hpm=core   --with-papi=/path/to/papi
   or   ./configure --with-hpm=uncore --with-papi=/path/to/papi
        make libmpihpm.so

typical use :  export LD_PRELOAD=/path/to/libmpihpm.so
               mpirun --np 2048 your.exe
               unset LD_PRELOAD
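
Since every report file carries the job id in its name, the files from one run
can be swept into a directory of their own.  A minimal sketch, assuming the
report files were written to the current directory; collect_job_outputs and
the destination directory name are hypothetical, not something the library
provides:

```shell
# Hypothetical helper: move the MPI and counter reports of one job
# (mpi_profile.jobid.* and hpm_job_summary.jobid.*) into a directory.
collect_job_outputs() {
    jobid=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for f in mpi_profile."$jobid".* hpm_job_summary."$jobid".*; do
        # Skip a pattern that matched nothing and was left unexpanded.
        [ -e "$f" ] || continue
        mv "$f" "$dest"/
    done
}

# Example: collect_job_outputs 12345 results_12345
```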

================================================================================
Tool : libhpmprof.so   ;   directory = hpmprof

purpose : Provides the same MPI information as libmpitrace.so, plus
          enables interrupt-based program sampling via hardware counters.
          This library is the preferred method for program sampling for
          systems that enable user-level access to hardware counters.

outputs :  text files : mpi_profile.jobid.rank      ... MPI data
           binary files : hpm_histogram.jobid.rank  ... pc sampling data

requires : mpicc with the underlying C compiler set to gcc.
           PAPI include and library paths with a suitable set of
           hardware counters for your system's CPUs, and GNU binutils
           development files.

build : cd hpmprof
        ./configure --with-binutils=/path/to/binutils --with-papi=/path/to/papi
        make libhpmprof.so

typical use :  export LD_PRELOAD=/path/to/libhpmprof.so
               mpirun --np 2048 your.exe
               unset LD_PRELOAD
               bfdprof your.exe hpm_histogram.jobid.rank > source_profile.txt
               annotate_objdump your.exe hpm_histogram.jobid.rank > asm_profile.txt

================================================================================
Tools : bfdprof  and  annotate_objdump   ;   directory = bfdprof

purpose : These tools are required to analyze outputs generated by either of
          the program-sampling methods.  The bfdprof utility provides function
          and statement-level profile data, and the annotate_objdump utility
          provides profile data at the assembly level.

outputs : text files

requires : GNU binutils development files.

build : cd bfdprof
        ./configure --with-binutils=/path/to/binutils
        make

typical use :  bfdprof your.exe hpm_histogram.jobid.rank > source_profile.txt
               annotate_objdump your.exe hpm_histogram.jobid.rank > asm_profile.txt
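
When several ranks wrote sampling files, the two commands above are usually
repeated once per file.  A minimal sketch of that loop, assuming the
hpm_histogram files sit in the current directory and both tools are on PATH;
profile_all and the per-file output names are hypothetical choices, not
something the tools define:

```shell
# Hypothetical helper: run both analysis tools on every sampling file
# in the current directory, writing one pair of reports per file.
profile_all() {
    exe=$1
    for hist in hpm_histogram.*.*; do
        # Skip the unexpanded pattern when no sampling files exist.
        [ -e "$hist" ] || continue
        tag=${hist#hpm_histogram.}      # remaining suffix: jobid.rank
        bfdprof "$exe" "$hist" > "source_profile.$tag.txt"
        annotate_objdump "$exe" "$hist" > "asm_profile.$tag.txt"
    done
}

# Example: profile_all your.exe
```

The sketch only covers hpm_histogram files; files from the profil() method
(vmon.out.jobid.rank) would need the pattern changed accordingly.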

================================================================================
Alternate builds of libmpitrace.so

directory : ctx   Adds the ability to separately report MPI profile data from
                  different code regions.  The user must annotate the source
                  code and mark start/stop boundaries for each code block of
                  interest.

directory : nvtx  Adds NVIDIA nvtx range markers around entry and exit of each
                  MPI function for graphical display using NVIDIA's visual
                  profiling tools.  This is intended to add insight into the
                  timelines for MPI calls along with GPU kernel execution.