
Description of benchmarks:
==========================
1 -- transposeT: transposition of a matrix, implemented using array backpermute
  
2 -- transposePT: transposition implemented using Unlifted.backpermute.

3 -- transposeDFT: transpose using the array default backpermute. This is just a test to
     compare the performance of default backpermute and backpermute. Strangely enough, it is
     slightly faster(!) than backpermute.

4 -- transposeDT: delayed version of transposeT
  
5 -- transposePDT: delayed version of transposePT

6 -- relaxT: for each element of a matrix, calculate the average of the element's value and
     its four neighbours, using backpermuteDft

7 -- relaxDT: delayed version of relaxT

8 -- mmT: matrix-matrix multiplication, strict

9 -- mmDT: matrix-matrix multiplication, delayed

10 -- hmDT: dq matrix multiplication, delayed

11 -- hhmDT: hierarchical dq matrix multiplication, delayed
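
Illustration of the operations named above:
The sketch below is a minimal, self-contained model of what the benchmarks compute. It is
not the benchmark code and not the library's API. The type Delayed, the helpers fromLists
and toLists, and the signatures chosen for backpermute and backpermuteDft are assumptions
made for illustration only. The boundary handling in relax (a missing neighbour falls back
to the element itself) is likewise an assumption. The sketch covers only the plain delayed
transpose, relaxation and matrix multiplication, not the dq or hierarchical variants.

```haskell
-- Hypothetical delayed matrix: an extent (rows, cols) plus an index function.
-- Nothing is computed until the matrix is forced with toLists.
data Delayed a = Delayed (Int, Int) ((Int, Int) -> a)

-- Wrap a list of rows as a delayed matrix (assumes all rows have equal length).
fromLists :: [[a]] -> Delayed a
fromLists rows = Delayed (length rows, cols) (\(i, j) -> rows !! i !! j)
  where
    cols = if null rows then 0 else length (head rows)

-- Force a delayed matrix into a list of rows.
toLists :: Delayed a -> [[a]]
toLists (Delayed (r, c) f) =
  [[f (i, j) | j <- [0 .. c - 1]] | i <- [0 .. r - 1]]

-- backpermute: the result has the given extent; each result index is mapped
-- to the source index whose element it takes.
backpermute :: (Int, Int) -> ((Int, Int) -> (Int, Int)) -> Delayed a -> Delayed a
backpermute ext g (Delayed _ f) = Delayed ext (f . g)

-- Default backpermute: the index mapping may fail (Nothing), in which case
-- the element is taken from the default matrix at the same position.
backpermuteDft :: Delayed a -> ((Int, Int) -> Maybe (Int, Int)) -> Delayed a -> Delayed a
backpermuteDft (Delayed ext d) g (Delayed _ f) =
  Delayed ext (\ix -> maybe (d ix) f (g ix))

-- Transposition as a backpermute that swaps row and column indices.
transposeD :: Delayed a -> Delayed a
transposeD m@(Delayed (r, c) _) = backpermute (c, r) (\(i, j) -> (j, i)) m

-- Relaxation: average of each element with its four neighbours.
-- Assumption: a neighbour outside the matrix is replaced by the element itself.
relaxD :: Delayed Double -> Delayed Double
relaxD m@(Delayed ext@(r, c) f) = Delayed ext avg
  where
    inside (i, j) = i >= 0 && i < r && j >= 0 && j < c
    shift (di, dj) =
      backpermuteDft m
        (\(i, j) -> let p = (i + di, j + dj) in if inside p then Just p else Nothing)
        m
    index (Delayed _ g) = g
    neighbours = map (index . shift) [(-1, 0), (1, 0), (0, -1), (0, 1)]
    avg ix = (f ix + sum [n ix | n <- neighbours]) / 5

-- Matrix-matrix multiplication on delayed matrices
-- (assumes the inner dimensions agree).
mmD :: Num a => Delayed a -> Delayed a -> Delayed a
mmD (Delayed (r, n) f) (Delayed (_, c) g) =
  Delayed (r, c) (\(i, j) -> sum [f (i, k) * g (k, j) | k <- [0 .. n - 1]])
```

For example, toLists (transposeD (fromLists [[1,2,3],[4,5,6]])) evaluates to
[[1,4],[2,5],[3,6]]. In this model a "delayed" operation only composes index functions;
a strict version would force the result (here, via toLists) after every step.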




