*****************************************************************************
**                      TAU Portable Profiling Package                     **
**                      http://tau.uoregon.edu                             **
*****************************************************************************
**    Copyright 1997-2008                                                  **
**    Department of Computer and Information Science, University of Oregon **
**    Advanced Computing Laboratory, Los Alamos National Laboratory        **
*****************************************************************************

Change log:
------------
Version 2.17 changes (from 2.16):
1. Added support for IBM BG/P (-arch=bgp).
2. Added a new tool for generating wrapper libraries, tau_wrap.
3. Improvements in Eclipse plugin for external tool support.
4. Improvements in paraprof and perfexplorer.
5. Improvements for SiCortex support and tauex. 
6. Added support for atomic events in TAU library layered over VampirTrace.
7. Added a Posix I/O wrapper (-iowrapper) for tracking volume and bandwidth of I/O.
8. Added an MPI wrapper library for Windows Cluster 2003. 
9. Added support for Scalasca 1.0. Works with both Kojak and Scalasca. 
10. Added Opari in TAU.
11. Added IBM BG/P metadata for torus node information in profiles.
12. PerfExplorer adds support for user-defined events and improvements in custom charts.
13. Posix I/O tracking implemented without need for enabling profiling (for tracing).
14. Improved tau_inc.pl for generating include lists for Scalasca/Kojak based on callpath profiling. 
15. Eclipse TAU plugin has support for two stage communication analysis.
16. Added -BGPTIMERS for IBM BG/P. Compatible with -BGLTIMERS.
17. Added support for the environment variables TAU_VERBOSE, TAU_SYNCHRONIZE_CLOCKS, and TAU_PROFILE_FORMAT (snapshot).
18. GCC 4.3.0 compatibility
19. Added bandwidth and bytes written info for MPI I/O write routines.
20. Added support for GNU, PathScale and PGI compilers on Cray XT systems [ORNL].
21. ParaProf can now generate selective instrumentation files.
22. TAU_THROTTLE=0 disables throttling of events. Use TAU_VERBOSE=1 to see the setting in effect.
23. perfdmf_configure now stores weka.jar files in ~/.ParaProf directory.
24. Added support for DyninstAPI 5.2.
25. Bug fixes for tau_instrumentor, context events, and tau2slog2.
26. Added support for pointer based profiling API (examples/profilercreate/README) [LSU].
27. -spec option for tau_instrumentor allows generic timer instrumentation support [FZJ].
28. Paraprof allows new windows for multiple metrics to compare data [SiCortex].
29. Posix I/O tracking now uses context events instead of user-defined events.
30. Added support for compiler based instrumentation for Intel 9.1, 10.x, GNU, and PathScale compilers. 
31. Added extensions in PerfExplorer to support CQoS analysis, drawing charts from script [CCA]. 
32. Bug fixes in paraprof for selective instrumentation, printer support [NASA].
33. taucxx, taucc, tauf90 now use -optCompInst by default, tau_[cxx,cc,f90].sh use -optPDTInst by default.
34. Added -opari support in installtau. 
35. Fixes for IBM BGL/BGP configuration.
36. Added support for tracking memory utilization and headroom in Python [ALCF].



Version 2.16 changes (from 2.15):
1. Added a new tool for correcting network time drifts in traces (tau_timecorrect).
2. Added support for an Eclipse analysis wizard and a graphical instrumentor.
3. Added support for 3D stereo visualizations in ParaProf.
4. Added support for a source browser in ParaProf.
5. Added support for generating source code information in tau_instrumentor.
6. Added support for Perflib based instrumentation and perf2tau [Jeff Brown, LANL]. 
7. Updated KTAU support in TAU for registering fork for kernel profiling [ANL]. 
8. Added tau_validate tool for checking if the TAU library is built correctly [UTK].
9. Added support for loading multiple ppk files in paraprof on the commandline [ORNL]. 
10. Enhancements in ParaProf and PerfExplorer for using metadata. [PERI]
11. Added support for capturing date and other cpu information in profiles. [LLNL]
12. Added support for Vampirtrace [LLNL]. 
13. Added support for Scalasca 0.5, and KOJAK 2.2 [FZJ, UTK]. 
14. Enhancements to Eclipse PTP plugin to support PAPI counter selection. [UTK]
15. Supports PDT v3.10 with EDG v3.8 C++/C parsers [LLNL].
16. Added support for application signatures [RENCI].
17. Added support for SiCortex Linux platform [SiCortex].
18. Added support for tracking leaks and dynamic memory allocation/deallocations in Fortran.
19. Improved tau memory tracking module to handle multi-line statements in Fortran.
20. Python profiler overhead is greatly reduced.
21. tauex script added for switching between libraries.
22. -optShared option added in tau_compiler.sh for linking in TAU's shared objs.
23. Easy to use TAU API (TAU_START("string"), TAU_STOP("string")) introduced.
24. Paraprof enhancements include support for Cube 3.
25. PAPI threads (configure -papithreads) and PAPI Domains added for x86 linux.
26. Clock synchronization in traces.
27. Metadata fields in ppk files.
28. Custom charts with XML metadata in PerfExplorer. 
29. TAU portal scripts to upload data to perfdmf database.
30. Added support for persistent communication events in traces. 
31. Added support for KTAU OS level shared counter coupling.
32. Eclipse/PTP updates for accessing TAU options and build configurations.
33. Added support for tracking Fortran I/O. 
34. Added support for accessing multiple databases, configuration of databases,
    and context event displays in paraprof. 
35. Added support for -arch=mips32 for SiCortex 32 bit compilation.
36. Updates for Epilog on Cray XT3 support using TAU.
37. Updates for Lahey 64 bit Fortran under Linux. 
38. Added support for generating and viewing profile snapshots.
39. Added support for specifying phases and timers (static/dynamic) in the 
    instrumentation specification file (see examples/timerphase).
40. Updates for Eclipse/PTP plugin for supporting external tools such as 
    VampirTrace, Kojak and Perfsuite using TAU's tool plugin. 
41. Added support for Cray Compute Node Kernel for XT4 (-arch=craycnl).
42. Updates for tauex to include tau_load.sh functionality for generating
    MPI performance data for shared library MPI. 
43. Added signal handlers (SIGUSR1 and SIGUSR2) to dump performance data and
    toggle instrumentation (enable/disable instrumentation) respectively.
44. Full compiler names and the -show option are available for compiler scripts.
45. Added support for reading in OMPP profiles in paraprof. 
46. Added support for Intel 10.x compilers, NAGWare Fortran, and g95 compilers.


Version 2.15 changes (from 2.14):
1. Added support for phase and comparative displays in ParaProf [UO]
2. Updated PerfExplorer [UO]
3. Added support for Eclipse CDT, FDT [LANL]
4. Added support for OTF (tau2otf) [LLNL]
5. Added support for runtime throttling of events (TAU_THROTTLE) [UCAR]
6. Added support for ORC Open64 compiler [U. Houston/NCSA]
7. Added support for Solaris on x86_64 (Opterons) [SUN]
8. Added support for nested OpenMP calls [SUN, Aachen]
9. Added support for Cray XT3 and SHMEM wrapper [PSC] 
10. Added support for multi-platform traces and a trace writer library [UFL, ORNL]
11. Added support for top level timer in OpenMP [UCAR]
12. Added support for PAPI on BGL and XT3 [ANL, PSC]
13. Added support for converting TAU traces to profiles. 
14. Added support for converting TAU callpath profiles to phase profiles [LLNL].
15. Enhancements to Paraprof. 
16. Better support for Intel compilers for linking C and Fortran codes to TAU [NOAA]. 
17. Added support for FreeBSD [ARL]. 
18. Added support for Eclipse PTP [LANL]. 
19. Added support for scripting in Paraprof using Jython to create custom views [LLNL].
20. Added support for Python 2.4 with instrumentation for C calls [LLNL]. 
21. Added support for loop level instrumentation for C and C++ [UTK, LANL]. 
22. Added support for parameter based profiling (-PROFILEPARAM) [UTK].
23. Added support for tau_load.sh for runtime MPI library instrumentation [UTK]. 
24. Added support for outer-loop level instrumentation for Fortran [UTK, LLNL].
25. Added a new tool: tau_ompcheck that completes OpenMP Fortran directives [NCAR]. 
26. Added support for preprocessing Fortran sources in tau_compiler.sh (-optPreProcess) [GSFC].
27. Added support for invoking tau_ompcheck in tau_compiler.sh [NCAR]. 
28. Added support for DB2 and Derby in PerfDMF [UTK, LLNL]. 
29. Added support for Infiniband MPICH on Opterons [NERSC]. 
30. Added support for Cray XT3 Memory headroom information and Cray Timers [PSC]. 
31. Added support for GNU Gfortran parser in PDT for tau_compiler.sh [LANL]. 
32. Added support for parameter based profiling (-PROFILEPARAM) for workload characterization [UTK].
33. Added support for upgrading from one version of TAU to another (upgradetau) [NERSC]. 
34. Added support for automatic instrumentation of pthread programs using PDT [Walt Disney]. 
35. Added Java TAU trace writer library [U. Reading]. 
36. Added support for gotos in outer-loop level instrumentation [UTK].
37. Added support for automatic MPI library level instrumentation using tau_poe [UTK]. 
38. Updated tau_ompcheck [NCAR]. 
39. Better support for instrumentation and parsing of Fortran programs [Goddard].
40. PerfExplorer enhancements (normal probability plots, event data, distribution info of events)
41. Automatic memory leak detection (-optDetectMemoryLeaks) for C/C++ malloc/free [UTK]. 
42. TAU Portal (tau.nic.uoregon.edu) to access database. 


Version 2.14 changes (from 2.13):
1. MPI-2 support and Fortran wrappers added. 
2. Support for Oracle database in PerfDMF. 
3. VTF support for multiple PAPI counters in Vampir/VTF format trace files. 
4. Improvements in Paraprof displays and database connectivity. 
5. Improvements in tau_compiler.sh to automatically instrument applications.
6. Added support for phase based profiling and dynamic timers. 
7. Introduced vtf2profile tool to get profiles from VTF3 traces.
8. Added histograms, full callgraph, not-normalized displays to paraprof. 
9. Added support for PathScale compilers and -exec-prefix option. 
10. Improved support for locking of performance data in multi-threaded apps. 
11. Added 3D displays in Paraprof.
12. Added support for SLOG2 traces (to use TAU with Jumpshot) [ANL]. 
13. Added better support for configuring for BG/L (-arch=bgl) [ANL].
14. Added support for depth limit profiling and tracing (-DEPTHLIMIT) [ORNL].
15. Changes to the MPI wrapper library (for S3D) [ORNL]. 
16. TAU_MPI_MESSAGE_SIZE now reports sizes for MPI_Send, Recv, Allreduce, etc. [ORNL].
17. Added support for Charm thread library [UIUC, LLNL].
18. Added support for gfortran compiler (-fortran=gfortran). 
19. Added support for reverse callpaths in paraprof [LLNL].
20. Added support for storing trials in paraprof [UTK].
21. Added support for user defined context events (callpaths) [ANL]. 
22. Added support for measuring memory headroom available (-PROFILEHEADROOM, examples/headroom) [ANL].
23. Added tau2elg trace conversion tool to convert to Epilog trace format [UTK].
24. Added search options to paraprof windows [LLNL].
25. Added support for -MPITRACE option for Kojak [UTK].
26. Paraprof has text table window now for callpath profiles [LLNL].
27. Changes to TAU_COMPILER to support Opari in Kojak [UTK].
28. Fixed bugs in tau2elg to support Kojak v 2.1 and 2.1.1 [FZJ].
29. Fixed a bug in TAU_COMPILER (when opari is not used) [UTK].
30. Added support for cube (importer) in paraprof [UTK].
31. Added support for PGI v6.0 compilers.
32. Added Jumpshot/Slog2 package to TAU [ANL].
33. Added support for trace files > 2GB in TAU and VTF3 [TACC].
34. TAU no longer needs merged pdb files from PDT's F95 parser [UTK].
35. Enhancements in Paraprof to choose metrics for summary table, std. dev [LLNL].
36. TAU_COMPILER does not need -optReset for IBM xlf90 to eliminate -D* flags.
37. TAU scripts (tau_[cxx,cc,f90].sh) for use on commandline [UFL].
38. TAU Java Eclipse plugin [LANL].
39. Updated documentation.
40. Added PerfExplorer performance data mining and knowledge discovery framework [LLNL].
41. Enhancements in MPI libraries for scalability [LLNL].
42. Phase based profiling allows you to identify phases in paraprof.
43. Added tau_setup GUI for TAU installations [LANL].


Version 2.13 changes (from 2.12):
1. Paraprof enhancements.
2. TAU MPI wrapper library layer enhancements [CCA].
3. Better support for autoinstrumentation of F95 source code using PDT [LANL].
4. Support for autoinstrumentation of Java using JDK 1.3 and 1.4.x JVMPI.
5. Introduced the TAU Trace Input Library (TIL) [VNG, TUDresden].
6. Added support for detecting papi wallclock timer overflow [LLNL]. 
7. Added support for Power4 Linux 64 bit compilation (-arch=ibm64linux) [LLNL].
8. Paraprof enhancements for groups and multiple counters with multithreaded loading [LANL].
9. Added TAU Instrumentation Language for enhancing tau_reduce. 
10. Added support for RTTI with g++ [ITT].
11. Added support for PAPI 3 so that TAU works with both PAPI 2 & 3 [UTK].
12. Paraprof enhancements for callpath profiling [LLNL].
13. Timer overhead measurements for callpath profiling. 
14. Compensation of timing overhead introduced. 
15. Malloc/free wrappers pinpoint memory allocation bugs (examples/malloc) [LLNL].
16. Added memory utilization tracking (examples/memory) [LLNL].
17. Added MUSE user defined events with TAU interrupt handlers [LANL].
18. Paraprof improvements (clickable callpaths, image, XML support) [LLNL]. 
19. Fuzzy matching of file names in tau_instrumentor (/home/foo.cpp ./foo.cpp) [TACC].
20. Added support for TAU_TRACK_MEMORY_HERE() [LLNL].
21. Improvements in PerfDMF and ParaProf's ability to connect to database [LLNL].
22. Added support for native PAPI events (setenv COUNTER1 PAPI_NATIVE_<nm>) [LLNL].
23. Added support for DyninstAPI v4.1 [UMD]. 
24. Added support for VTF3 binary trace generation library for Vampir.
25. Added hardware performance counters and other user defined events to trace.
26. Introduced hierarchical trace merging using tau_merge (both offline/online).
27. Added -PROFILEMEMORY option that tracks memory at each routine entry [LLNL].
28. Improved support for MySQL and PostgreSQL databases in PerfDMF. 
29. Added automated trace merge/convert with tau2vtf using TAU_TRACEFILE env. 
30. Added $(TAU_COMPILER) shell script/makefile variable for automatic instrumentation.

  
 

Version 2.12 changes (from 2.11):
1. Enhancements in jracy for supporting multiple counter data [LLNL].
2. Improved memory handling and drawing speeds in jracy [LLNL].
3. Configuration changes for LAM MPI, PAPI, Tru64 [Utah, NCSA, LANL].
4. Added support for Python bindings [CACR, LLNL]. 
5. Added MPI shared library examples [CACR]. 
6. Added support for building multiple configurations (installtau) [LANL, LLNL].
7. Added support for Python under AIX and OSX [LLNL].
8. Bug fixes for IA-64 and Intel 7.1 compiler [NCSA].
9. Added TAU_CALLPATH_DEPTH env. variable specification for callpath profiling [LLNL].
10. Added support for -arch=ibm64. It supports PAPI 64 bit/Power4. [UTK]
11. Bug fixes for shared libraries with MPI, g++/KCC under AIX 5.1. [LLNL]
12. Introduced paraprof profile browser (jracy symlinks to paraprof). [ASCI]
13. Added support for dumping profiles in python using a prefix. [LLNL]
14. Added support for DyninstAPI 4.0 including binary rewriting. [U. Maryland]
15. Added support for KOJAK's implementation of Opari and EPILOG. [FZJ]
16. Added support for file level selective instrumentation (PDT, Dyninst). [Utah] 
17. Fixed Apple's OS X sscanf bug for reading long doubles in pprof. 
18. Added support for DyninstAPI under AIX. [NERSC]
19. Added support for Cray X1 and AMD Opteron (ASCI Red Storm). [Cray]
20. Added support for MAGNET/MUSE. [LANL]
21. Added support for Performance Database [ASCI]. 
22. Added support for Multiplecounters with CRAY_TIMERS, MUSE and message size [CCA]. 


Version 2.11 changes (from 2.10):
1. Added -i header option for tau_instrumentor [CASC]. 
2. Added -LINUXTIMERS option for low overhead Linux wallclock time [CACR].
3. Added -c|-c++|-fortran options to tau_instrumentor [CACR].
4. Lowered the overhead of timers of disabled profile groups [CACR].
5. Added support for PAPI v2.1 [CACR]. 
6. Updated PCL bindings. 
7. Added support for selective instrumentation [CACR]. 
8. Added support for multiple counters [CACR]. 
9. Added support for Paraver trace visualizer (CEPBA) in tau_convert. 
10. Opari and PDT related changes (examples/opari/pdt_f90) [FZJ].
11. Added support for online access to performance data [CACR]. 
12. Added support for LINUXTIMERS for PGI and other Linux compilers [FZJ].
13. Changes to online access API [CACR].
14. Improved jracy GUI [ALPS]. 
15. Added support for EPILOG tracing package [FZJ]. 
16. Added support for Hitachi SR8000 [FZJ]. 
17. Added support for browsing by profile groups in jracy [ALPS, LLNL]. 
18. Made some modifications to Paraver trace format conversion [CEPBA]. 
19. Added support for NEC SX-5 [HLRS]. 
20. Added support for -mpilibrary option [LLNL].
21. Added support for g++ 2.96/3.x for tau_merge/tau_convert [ST].
22. Fixed a problem with the MPI wrapper library for Intel IA-64 compilers [NCSA].
23. Added support for tracking message sizes using user defined events [Rutgers].
24. Added support for low overhead, high resolution timers under IA-64 Linux [NCSA].
25. Added support for alternative returns in PDT based C instrumentation [PETSc, ANL].
26. Added a new tool - tau_reduce for reducing instrumentation overhead.  
27. Added support for callpath profiling.
28. Fixed pprof to support exclusive percentage in callpath profiling.
29. Changes for CCA, jracy & DyninstAPI on IRIX, Sun. 


Version 2.10 changes (from 2.9):
1. Better support for C instrumentation [HDF5].
2. Fixes for IBM.
3. Added support for multiple instrumentation requests per line [CACR F90/C++].
4. Added support for detecting threaded versions of MPI at configuration.
5. Made some modifications for PDT v2.1 [CACR C++/C].
6. Added jracy, TAU's new Java based profile browser to replace racy.
7. Added support for specifying a fortran compiler during configuration.
8. Added support for auto-detection of mpi libs and include dirs (-mpi).
9. Added IBM specific libs for MPI so we don't have to use mpCC, mpKCC [CACR].
10. Added TAU_LDFLAGS to MPI Makefiles [CACR].
11. Added support for enabling/disabling profile groups at runtime [PDT, CACR]. 

Version 2.9 changes (from 2.8):
1. Better support for mixed model programming
2. Changes for KCC and KAP/Pro.
3. Added support for MPI with DyninstAPI.
4. Added support for selective profiling in Java (-XrunTAU:exclude=java,sun)
5. Java RMI support changes.
6. Introduced TAU Java source instrumentation API.
7. Added support for enabling and disabling group level instrumentation.
8. Added support for PCL 2.0.
9. Fixed tau_instrumentor for PDT 1.3 using SGI CC and examples.
10. Fixed F90 bug on string concatenation.
11. Changed TauGroup_t to 64 bits (unsigned long). [Mapping addresses].
12. Added TAU_SHLIBS so DSOs are created every time.
13. Support for incremental profile dumps.
14. PAPI on Solaris and other platforms requires linking with a static library.
15. Added support for Compaq Alpha (cxx, cc, f90).
16. Fix for MPT 1.4 under IRIX 6.5.
17. Changes in tau_merge to support Uintah.
18. Added support for SGI sproc threads.
19. Added support for dumping profile data in a consistent state (profile snapshot).
20. Added support for Opari OpenMP directive rewriting tool [EWOMP'01].
21. Improved MPI wrapper library support [Uintah].
22. Added support for gcc-3.0 (pprof).
23. Added a bug fix for Vampir (tau_convert -pv -longsymbolbugfix) [SAMRAI].
24. Changed Opari options (omperf to pomp name change).
25. Added support for dynamically assigning group names [SAMRAI].
26. Added support for evaluating perturbation of TAU_DB_DUMP() [Uintah].
27. Added support for C in tau_instrumentor.
28. Fixed RtsLayer bug for PDT based instrumentation of multi-threaded C++ 
    applications.
29. Added -noinline flag to tau_instrumentor to suppress instrumentation of 
    inlined functions [POOMA].
30. Added support for F90 in tau_instrumentor. 
31. Added support for abnormal exit in C [UPS].
32. Added support for Opari-1.1 [flush_enter/exit calls].
33. Added MPI wrapper layer for SGI Fortran [SAGE]. 
34. Made changes to SGI Fortran MPI layer [MPI_Init].
35. Added IA-64 support (threads, PDT, MPI ...) using RH 7.1 gcc 2.96.

Version 2.8 changes (from 2.7):
1. Added support for PAPI (Perf. API for accessing HW Perf. Counters).
2. Added better support for Dyninst.
3. Added support for CPUTIME (pthread/Linux). 
4. Added support for multi-language programming for Java + C (JNI). 
5. Added support for mpiJava. 
6. Added support for tracing all MPI interprocess communication (incl. async.)
7. Added support for PAPIWALLCLOCK (with -papi=<...>) for low overhead timers.
8. Added support for PAPIVIRTUAL (with -papi=<...>) for user time using PAPI.
9. Added support for OpenMP and OpenMPI (PGI, KAP, IBM, SGI)
10. More compilers: IBM xlC, xlc, xlf90 on SP (See INSTALL file)

Version 2.7 changes (from 2.6):
1. Added support for Java (JDK 1.2+).
2. Added support for DYNINST Dynamic Instrumentation Package from U. Maryland.
3. Added support for SUN 5.0 CC, F90 compilers
4. Added support for Microsoft Windows. 
 
Version 2.6 changes (from 2.5):
1. TAU Mapping API introduced.
2. More platforms: Cray T3E with F90, Alpha/Linux, Intel/Linux 
   with PGI and Fujitsu compilers (C++/C/F90)
3. Added support for threadsafety in Fortran/C. 
4. Added support for Program Database Toolkit for instrumenting C++ 
   sources using tau_instrumentor 
5. Added support for Performance Counter Library for accessing Hardware
   Performance Counters on Cray, Intel, Alpha, UltraSparcs, MIPS, and 
   IBM Power platforms
6. TAU MPI wrapper library introduced for profiling MPI routines. 
7. Added NAS Parallel Benchmark 2.3 LU & SP suites as Fortran90/MPI examples.

Version 2.5 changes (from 2.4):
1. Automatic instrumentation support using DUCTAPE.
2. Changes in directory structure and configuration.
3. Integrated with POOMA and SMARTS.

Version 2.4 changes (from 2.3):
1. Added support for SMARTS and Tulip user level threads.
2. Added support for Fortran and F90 API.
3. Added threadsafe user defined events.
4. Added threadsafe trace library.

Version 2.3 changes (from 2.2):
1. Added pthread support.
2. Added C-API support with the same lib/API.
3. Introduced User Events

Version 2.2 changes (from 2.1):
1. Added callstack profile viewing tool
2. Blitz++ compatibility changes.

Version 2.1 changes (from 2.0):
1. Better colors in racy
2. Support for T3E.
3. Support for Tcl/Tk 8.0 as the default.
4. Introduced Callstack profiling.
5. Blitz specific changes. 

Version 2.0 changes (from 1.0):
1. Introduced Tracing.
