NDP library
===========

Home:             http://haskell.org/haskellwiki/GHC/Data_Parallel_Haskell
Darcs repository: http://darcs.haskell.org/packages/ndp

Building
--------

The NDP library works ONLY with GHC. You really want to use HEAD, which
includes many improvements to the simplifier which are crucial for good
performance.

The recommended way to build the NDP library is by building the latest GHC
from the darcs repository (http://hackage.haskell.org/trac/ghc/wiki/Building),
then unpacking the sources (or, preferably, getting the latest version from
the darcs repository) in ghc/libraries and then doing

cd ndp
make boot
make
cd examples
make

Writing NDP programs
--------------------

Import Data.Array.Parallel.Unlifted (sequential combinators) and/or
Data.Array.Parallel.Unlifted.Parallel (parallel combinators). The subdirectory
examples contains several NDP programs.

Before invoking any parallel combinators you must initialise the gang threads,
usually by calling

  Data.Array.Parallel.Unlifted.Distributed.setGang <T>

where T is the number of threads. Normally, you want to run the program with
+RTS -N<T> as well as you won't get any parallelism otherwise. This is a
regrettable hack which will go away eventually.

Compiling NDP programs
----------------------

The file ndp.mk shows the options you'll usually want for compiling NDP
programs. You don't need -fbang-patterns unless you actually use them. You
might get away with omitting -fmax-simplifier-iterations6; however, sometimes
GHC will really need that many iterations to fully optimise NDP programs.

NDP on NUMA machines
--------------------

NUMA machines, for instance multiprocessors based on AMD Opterons, might
benefit from interleaved memory allocation. Under Linux, this can be achieved
by running

  numactl --interleave=all <cmd>

where cmd is the NDP program you want to run. Also, explicitly setting the
processor affinity (taskset in Linux) might be helpful.

Implemented functionality
-------------------------

At the moment, the library only really supports flat arrays. Segmented arrays
are provided but you must really know what you are doing to get efficient
code. A lot of combinators (especially for segmented arrays) are still
missing.

Haddock documentation
---------------------

Haddock doesn't support various language features used by the library at the
moment and can't be used to generate documentation. In principle, haddock.ghc
from http://darcs.haskell.org/SoC/haddock.ghc should be able to process the
library but it doesn't work for me at the moment.

