How to profile FPC code.

Short guide on how to profile view3dscene (or any other FPC binary, actually)
with valgrind. This reflects some Kambi preferences and tastes.
Do not follow it blindly, your real guide is documentation of
FPC+valgrind and valgrind/callgrind manual.

http://valgrind.org/docs/manual/cl-manual.html
http://wiki.lazarus.freepascal.org/Profiling

------------------------------------------------------------------------------
Compile for valgrind:

- Make sure you use proper options when compiling with FPC.

  - If you use our build tool for compilation
    ( https://github.com/castle-engine/castle-engine/wiki/Build-Tool )
    then just use --mode=valgrind command-line option.
    This will pass correct options to FPC underneath.

  - Otherwise, make sure:
    - You MUST use -gv option, this adds stuff necessary for valgrind.
    - You SHOULD use -gl (line info) to get line number information.
    - You SHOULD NOT use -Xs (strip debug info), it would strip useful
      function info from your exe.

  - If you compile from command-line using direct "fpc ..." command
    and "@castle-fpc.cfg", then search in ../castle-fpc.cfg for
    valgrind options and uncomment them.
    Be sure to also comment out -Xs inside.

  - Otherwise, just add
    -gv
    -gl
    to your ~/.fpc.cfg. Remember to remove them later.

  - Be sure that you don't strip symbols.
    If you use castle-fpc.cfg to compile, comment out -Xs option inside.

- Clean units (to recompile all for valgrind), like
    make clean
  in CGE directory. Otherwise, you will not get profiling info inside CGE
  routines.

- Compile your project as usual. Compile a release version (with -dRELEASE),
  you want to profile released code (otherwise you may find
  serious time eaters in stuff like range or overflow checking,
  and they will skew your results).

------------------------------------------------------------------------------
For speed profiling:

Use valgrind's callgrind tool.
Running program through callgrind adds an enormours slowdown,
especially with instrumentation (this is when actual measurements take place).
So it's advised to start without instrumentation, and only turn it on
for the interested code part.

valgrind --tool=callgrind --instr-atstart=no ~/bin/view3dscene

# from other shell:
callgrind_control -i on
callgrind_control -i off

# investigate the report:
kcachegrind

------------------------------------------------------------------------------
For memory usage profiling:

- Try using CMem: add CMem as first unit on uses clause of the main program,
  or compile with -FaCMem. This avoids possible complications of default
  FPC memory manager (that may be lazy about releasing memory to OS,
  in order to gain some speed; although this didn't have any noticeable bad
  impact on memory use in my tests, so FPC memory manager does a good job;
  but still it's worth checking.)

  Note that valgrind (-gv) option conflicts with using CMem explicitly.
  That is because, as far as I know, using -gv means that CMem
  is already, implicitly, added (looks like CMem is required for valgrind to
  duplicated in uses clause. correctly interpret results).
  So "-gv -FaCMem" will say that Cmem is duplicated in uses clause.
  So actually any one of "-gv" or "-FaCMem" (but don't use both)
  will result in your program being tested with CMem memory manager.

- Use valgrind's massif tool.
  Compile like described previously for callgrind (with -gv), then run like

    valgrind --tool=massif --run-libc-freeres=no ... ~/bin/view3dscene

  Actually, we have a script that wraps a couple of useful massif
  options for you in ../../scripts/massif_fpc . Just copy it to your $PATH,
  like ~/bin/, and then execute

    massif_fpc ~/bin/view3dscene

  Afterwards investigate the resulting massif.out.xxx file, by

    ms_print massif.out.xxx
