Design of a multi-language build tool, called gprmake.
Main TN: D215-001

The purpose of this tool is to automatically handle builds for most languages,
based on the GNAT Project files.

Similarly, other GNAT tools will need to be reviewed wrt multi-language support,
in particular:

- gnatclean: verify that gnatclean handles multi-language objects/projects.
  See D612-002.

- gnatname: add support for non Ada languages in gnatname
  by taking the Languages attribute and the Naming package into account.
  -> a new TN should be created

- gnatls: should probably support non Ada objects/dependency files, and
  give info about the status of the corresponding sources.
  -> a new TN should be created

This document focuses on the general project-related aspects of properly
handling multi-language projects, and more particularly on the implications
wrt gprmake.

Requirements
============

- Be extensible to new programming languages without having to modify
  gprmake itself, including languages only mentioned in the project files.
- Avoid as much as possible hard coding of languages and/or behavior.
  Provide useful defaults to ease set up and use with GNU compilers so
  that gprmake can both be very simple to use in the default cases (by
  not having to change dozens of settings), and be configured for many
  different languages and toolchains.
- Handle source dependencies automatically
- Handle different compilers for different languages
  - including Ada: Ada should be a first class citizen of course, but not a
    special case as it is now.
  - not restricted to GCC
- Support all GNAT project file features on all languages when this makes
  sense. In particular, but not exclusively:
  - Provide language-specific compilation/link options
  - Provide file-specific compilation options
  - Provide conditional sections based on configurable variables
  - Handle building and use of libraries (static, shared)
- Use a process as similar as possible for each language
- Should be independent of the compiler version (in particular of the
  gnat version) as much as possible.
- Be as compatible as possible with gpr2make
  This is not a hard requirement since gpr2make has always been presented as
  a 'beta' tool, but is still desirable to ease the transition of our
  current users.

Note: a prototype implementation of gprmake exists, which mimics closely the
implementation of gpr2make+Makefile.generic. In the document below, we will,
for simplicity, refer to the whole gpr2make+Makefile.generic+Makefile.prolog+
gprcmd as 'gpr2make', without distinguishing which file does actually what.

In the document below, we give hints about default values and default behavior.
These should be taken as guidelines, and not as strong requirements.
For example, when we mention calling 'gnat bind', we mean 'as if we were
calling gnat bind, following the same semantics'. Having an equivalent built-in
implementation, or calling another tool with equivalent semantics is also
possible of course.

Since gpr2make was based on GNU makefiles, and gprmake is a tool written in
Ada, some decisions made for gpr2make do not necessarily make sense for gprmake.
This document is a good opportunity to fix these.

Although the behavior of the tool will be similar, the design and
implementation is expected to be different.

A long term goal with gprmake is to make the 'gnat' and 'gprmake' tools the
two main tools in the GNAT technology, and have gnatmake become less visible
(therefore possibly moving some functionalities from gnatmake to gprmake).

Related tools: gnatmake, gcc, make, ant

Process
=======

- The tool will take a GNAT project file as its input, parse it,
  and generate calls to underlying tools to handle the compilation, bind and
  link process.

- source files are recompiled when
  - no dependency info is available for this file (case of a new source file)
  - one of the dependencies has a timestamp more recent than the corresponding
    object file (as done by the 'make' utility), when using makefile fragments
    (see below).
  - the rules applied by gnatmake require it, when using .ali files
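
  The timestamp rule used with makefile fragments can be sketched as follows.
  This is only an illustration in Python, not the gprmake implementation: the
  function name and its arguments are hypothetical, the .ali case (gnatmake
  rules) is left out, and treating a missing object or a missing dependency
  as "rebuild" is an assumption of this sketch.

```python
import os


def needs_recompile(object_file, dependencies):
    """Return True if object_file must be rebuilt.

    dependencies is the list of files the object depends on, or None
    when no dependency info is available (case of a new source file).
    """
    if dependencies is None or not os.path.exists(object_file):
        return True
    object_time = os.path.getmtime(object_file)
    # Rebuild as soon as one dependency is more recent than the object.
    # Assumption: a dependency that no longer exists also forces a rebuild.
    return any(
        not os.path.exists(dep) or os.path.getmtime(dep) > object_time
        for dep in dependencies
    )
```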

- We do not want to get too deep in the handling of extra build processes (e.g.
  build of documentation, automatic code generation tools, extra preprocessing,
  ...), so this is intentionally left outside the control of this tool, to
  keep the process straight and simple.

  However, a simple mechanism will be provided to perform such basic operations,
  as a way to generalize the current use of 'gnatmake' as the Ada driver
  (as opposed to calling separate 'gcc' for other languages and checking the
  dependencies). See below for more information.

Output created by gprmake
=========================

- For the convenience of tools launched, gprmake will set an environment
  variable to the full path of the current project file being handled, so that
  this info is shared easily among tools.

  This environment variable could be called GNAT_PROJECT

  Note: this raises the interesting question of whether some gnat tools
  (e.g. the gnat driver) should support this environment variable, and if set
  and no -P switch is specified on the command line, use this project instead.
  This will make things simpler for gprmake, in particular for setting
  the Builder_Driver attribute described below.

- Each -Xname=val gprmake switch is translated into setting the environment
  variable "name" to "val"

- (optionally) dependency files in the object directory

- object files, results of compilations

- temporary files, results of bind

- executable and libraries, results of link
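
  The environment set up for the launched tools (GNAT_PROJECT and the
  -Xname=val switches) could look like the sketch below. It is written in
  Python for illustration only; the function name and its parameters are
  hypothetical, and GNAT_PROJECT is the tentative variable name proposed
  above.

```python
import os


def tool_environment(project_file, x_switches, base_env=None):
    """Build the environment passed to the tools launched by gprmake."""
    env = dict(os.environ if base_env is None else base_env)
    # Full path of the project file currently being handled.
    env["GNAT_PROJECT"] = os.path.abspath(project_file)
    for switch in x_switches:
        # "-Xname=val" sets the environment variable "name" to "val".
        name, _, value = switch[2:].partition("=")
        env[name] = value
    return env
```

  For instance, tool_environment("prj.gpr", ["-XBUILD=debug"]) returns a copy
  of the current environment in which BUILD is set to "debug".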

Input needed by gprmake
=======================

- Specify compiler command
  Currently specified using IDE'Compiler_Command (<language>)

  IDE is not the best package to store this info now that we are using it
  in non IDE tools and that we are generalizing the notion of multi-language
  support.

  -> Consider a transition path for existing project files created by GPS,
  such as displaying a warning

  Should the GNAT compiler command be "gcc" or "gnatmake" ?
  It's currently "gnatmake" because that's what GPS needed and what was
  convenient for gpr2make.

  -> The compiler command should be "gcc", since there's no reason to make
  a distinction between how Ada and non-Ada files are compiled. However,
  from a practical point of view, being able to delegate to gnatmake the
  process of compiling Ada files has also its advantages, in particular it
  insulates gprmake from incompatible changes in e.g. ALI format.

  Therefore, in order to support and generalize the use of gnatmake for Ada
  files on one hand, and on the other hand, support the invocation of
  'external tools' to handle build of a set of sources, we're adding a new
  Builder_Driver (<language>) that will, if specified, be
  called to check sources and compile them if needed. In this case, gprmake
  won't perform any timestamp check nor compilation for the sources of the
  designated language.

  That being said, having gprmake read ali files directly *and* being
  compatible with most GNAT versions is a desirable long term goal that should
  be kept in mind.

Setting of default values for existing project attributes
=========================================================

gprmake will set default values for the file extensions in the package
Naming for various languages, as already done for Ada (.ads/.adb)

Proposed default values:

"c": .h/.c
"c++": .h/.cpp
"fortran": .f for body (there's no notion of spec in fortran)
"objective-c": .m
"java": .java (no notion of spec)
"assembler": .s (no notion of spec)

Note: is .cc or .cpp the best default for C++ sources ? Input needed
from C++ experts.

Addition of new attributes in the project file
==============================================

In order to ease the reuse of these attributes in various projects via
package renaming, we introduce a new package tentatively called
Language_Processing with the following attributes:

  - Builder_Driver (<language>): If set, use this tool to check
    dependencies of sources for a given language, and rebuild them if
    needed.
    Defaults to ("<prefix>gnat", "make", "-c") for "ada".
    Defaults to () otherwise.

    where <prefix> is an optional prefix computed based on the name
    of the gprmake tool when launched (e.g. calling powerpc-elf-gprmake
    will set the default Builder_Driver to
    ("powerpc-elf-gnat", "make", "-c")).

    Note: the default for Ada assumes that the GNAT_PROJECT environment
    variable has been set and is recognized by the gnat driver.
    Alternatively, we could provide a wrapper that would recognize it,
    and call gnatmake -c -P $GNAT_PROJECT directly.

    This tool is called for each subproject where Builder_Driver is set,
    and gprmake assumes that it will recursively build all the dependent
    projects. This means for instance that if Builder_Driver is set on the
    root project, gprmake itself performs no compilation for this language.
    The current directory is set to the subproject's directory
    before each call to provide a straightforward environment (for instance,
    to ease calling "make -f relative-make-file-path")

  - Compiler_Kind (<language>): Kind of compiler used for a given
    language for the current project. Recognized values are:
    - "GNU" (the default)
    - possibly others in the future, such as "Diab"
    This attribute is used by gprmake to set various defaults and change the
    way the build is handled in some cases.
    For now, any value other than "GNU" is considered as non-gnu compiler
    (e.g. "", or "unknown").

  - Compiler_Driver (<language>): name and switches of the tool that knows
    how to compile files of the given language for the current project.
    Defaults to ("<prefix>gcc", "-c", "-x", <language>) for GNU compilers.
    This tool is called inside the project's object directory.

    For non GNU compilers, defaults to ("cc", "-c")

    Also, if no options are specified, the following default options
    are appended:

    - ("-c", "-x", <language>) for GNU compilers
    - ("-c") for non GNU compilers

    This allows for ease of use, such as: ("gcc") instead of having to
    specify ("gcc", "-c", "-x", "c")

    In addition, if only one empty option is specified, it means that no
    default option should be appended, e.g: ("jgnat", "") will call jgnat
    without any "-c".

  See below 'Handling of dependencies' for more info on the
  following attributes:

  - Dependency_Option (<language>): list of switches to
    be used to tell the Compiler_Driver to generate a dependency file
    as part of the compilation process, e.g: ("-Wp,-MD,"),
    appended by Object_Dir & "file".d
    Defaults to ("-Wp,-MD,") for "c" and "c++" if the Compiler_Kind is set
    to "GNU".
    Defaults to () otherwise

    Note: the object_dir+file.d is concatenated directly to the last
    option, so that you can use either a single switch such as
    -MD,file.d using ("-MD,"), or two separate switches such as
    -dependency file.d using ("-dependency", "").

    If set, the Compute_Dependency attribute is ignored.

  - Compute_Dependency (<language>): Command used to compute
    the dependency for a given language. This is a list of strings, the
    first string being the command itself, and the remaining strings are
    the options. The path to the filename will be appended to the command,
    and the source path will be set as for the compiler (using CPATH and
    Include_Option).
    - Defaults to:
      (Compiler_Driver (<language>), "-M") & Default_Switches (<language>)
      for c & c++.
    - Defaults to () otherwise.

    If set and Dependency_Option isn't, this command is called right
    after a source file has been recompiled to create or update its
    dependencies.

  -> If Dependency_Option and Compute_Dependency are both undefined or set
     to the empty value, gprmake will consider that sources for this language
     have no dependencies and will only compare the timestamp of the object
     and its corresponding source file.

    Dependency files are assumed to be generated in Makefile format, see
    'Handling of dependencies' below for more details.

  - Include_Option (<language>)
    List of switches to specify a source search dir, defaults to () if
    Compiler_Kind = "GNU"; defaults to ("-I") otherwise.

    If Include_Option is not empty, each source dir is put on the
    command line, prepended with Include_Option, e.g:
    if Source_Dirs is set to ("/a/b", "../foo"), gprmake appends
    ("-I/a/b", "-I../foo") to the compilation switches, translating if
    needed relative paths to absolute paths, resolved based on the
    project file's directory.

    Clarification: the source dir is concatenated directly to the last
    option of Include_Option, so as to allow both forms:
    -I/path: ("-I")
    -include /path: ("-include", "")

  - Binder_Driver (<language>)
    Defaults to ("<prefix>gnat", "bind") for "ada"
    Defaults to () for other languages

    See binder section below for more info.

  - Linker_Driver (<language>)
    Defaults to ("<prefix>gnat", "link") for Ada
    Defaults to Compiler_Driver (<language>) for other languages

    See linker section below for more info.

  - Default_Linker
    Defaults to (), meaning use the built-in linker.

    See linker section below for more info.
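
The defaulting rules described above for Compiler_Driver (tool prefix,
default options, single empty option) can be summarized by the following
Python sketch. It is an illustration under stated assumptions, not the
actual implementation: the function names and parameters are hypothetical,
and attribute values are modeled as plain lists of strings.

```python
import os


def tool_prefix(invoked_as):
    """Return the optional prefix, e.g. "powerpc-elf-" for powerpc-elf-gprmake."""
    name = os.path.basename(invoked_as)
    return name[: -len("gprmake")] if name.endswith("gprmake") else ""


def compiler_driver(language, declared=None, compiler_kind="GNU", prefix=""):
    """Resolve Compiler_Driver (<language>) to a command plus options."""
    gnu = compiler_kind == "GNU"
    default_options = ["-c", "-x", language] if gnu else ["-c"]
    if not declared:
        # Attribute not set: use the default command and default options.
        command = (prefix + "gcc") if gnu else "cc"
        return [command] + default_options
    command, options = declared[0], list(declared[1:])
    if not options:
        # Only the command was given: append the default options.
        return [command] + default_options
    if options == [""]:
        # A single empty option means "append no default option".
        return [command]
    return [command] + options
```

With this sketch, compiler_driver("c", ("gcc",)) yields the same command as
the default, while compiler_driver("ada", ("jgnat", "")) calls jgnat without
any "-c".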

Handling of source path
=======================

The source path is extracted from the Source_Dirs attribute.
Depending on the compiler, gprmake will need to:

    - use compiler switches (-I/path)
    - set an environment variable (ADA_INCLUDE_PATH, CPATH)

    gprmake will always set the CPATH env variable to the current
    subproject's source path before calling Compiler_Driver.
    The source dirs are concatenated, separated by the appropriate
    Path_Separator (":" under unix, ";" under Windows), and the previous
    value of CPATH, if any, is appended.
    Relative source paths are translated to absolute paths, based on the
    project file's directory.

    CPATH is chosen because it's already recognized by gcc and therefore no
    wrapper is needed when using gnu compilers.

    Note: if needed, wrappers can use this variable to set others (e.g.
    export ADA_INCLUDE_PATH="$CPATH") or use another mechanism (such as
    creating an input file).

    A new attribute Include_Option is added to handle the common case
    of source path specified on the command line.
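
    A possible way to compute the CPATH value and the Include_Option
    switches is sketched below in Python. This is only an illustration:
    all function names are hypothetical, and the sketch always converts
    source dirs to absolute paths, which is one reading of "translating if
    needed". The helper attach_to_last_option implements the rule that the
    value is concatenated directly to the last option of the attribute.

```python
import os


def absolute_source_dirs(source_dirs, project_dir):
    """Resolve each source dir against the project file's directory."""
    return [os.path.normpath(os.path.join(project_dir, d)) for d in source_dirs]


def cpath_value(source_dirs, project_dir, previous=None):
    """Value of CPATH: the source dirs, then the previous CPATH if any."""
    parts = absolute_source_dirs(source_dirs, project_dir)
    if previous:
        parts.append(previous)
    return os.pathsep.join(parts)


def attach_to_last_option(option, value):
    """Concatenate value directly to the last string of option."""
    option = list(option)
    if not option:
        return []
    return option[:-1] + [option[-1] + value]


def include_switches(include_option, source_dirs, project_dir):
    """Switches added to the compilation command for the source dirs."""
    switches = []
    for directory in absolute_source_dirs(source_dirs, project_dir):
        switches += attach_to_last_option(include_option, directory)
    return switches
```

    The same helper covers Dependency_Option, which follows the same
    concatenation rule: ("-Wp,-MD,") gives a single switch, while
    ("-dependency", "") gives two separate arguments.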

Handling of object path
=======================

See handling of bind phase below.

Handling of dependencies
========================

  - how are dependencies handled in other languages ?
    and verify that the proposed scheme would allow handling of these languages

    - fortran: does not really have the notion of dependencies between files,
    basically each program is self contained, so the handling of dependencies is
    a simple matter of comparing the timestamp of the object file against the
    timestamp of the source file (this corresponds to having no
    Dependency_Option/Compute_Dependency attribute).

    - java: it seems that recompiling java files is handled either by IDEs
    directly or using tools such as 'ant', which is a make-like tool written in
    Java. Using 'ant' is in line with the Builder_Driver attribute
    proposed. Similarly with sun's javac.

    - assembler: no dependencies with assembler files.
    With a GNU toolchain, using default values should work out of the
    box: "gcc" as the Compiler_Driver, and ("-c", "-x", "assembler") for the
    Compiler'Compilation_Option. No dependency files will be created.

  - for simplicity, gprmake will, at least in a first step, only understand
    Makefile fragments for handling dependencies.

    In the future, it will make sense to handle other formats
    directly, in particular ALI files, so that compilation of Ada files can be
    handled without using gnatmake, and have Ada be handled very much
    like C or C++ (instead of calling gnatmake).
    -> This means that the handling of dependencies should be properly isolated
    so that it can easily be adapted to support multiple dependency formats
    in the future.

    makefile syntax recognized:

    source: src1 src2 [...]

    A "\" followed by a newline sequence is ignored, and the next line is
    considered part of the source dependency list.

    The above syntax means that "source" depends on src1 and src2 and [...]

    Below we assume that .ali files are also supported to handle dependencies:

    After a compilation launched by gprmake is finished, gprmake will look
    in the object directory for either a source-file.ali or source-file.d file.
    If none exist, gprmake will assume that there are no extra dependencies
    for source-file other than the object file itself.

    If source-file.ali exists, gprmake will take it into account to handle
    dependencies, following current gnatmake rules.

    Otherwise, if source-file.d exists, gprmake will take it into account,
    following standard make rules.

    Question: should we try to generalize the .d and .ali extensions and
    handling above ? It's not clear how one could specify other dep formats in
    a project file without having to modify gprmake, except by providing
    a shared lib that would contain a dependency handler, following a
    documented API... For now, only consider .d and .ali

    Note: if some compiler (e.g. proprietary Ada compiler) does not provide
    the capabilities of generating a Makefile, the idea would be to provide a
    wrapper that would extract the proprietary information
    (from e.g. the Ada 83 library) and generate the information in the
    expected format (either .d or .ali file).
    Another approach would be to use gcc -c -gnatc to generate .ali files.
    In this case, Compute_Dependency would be set to e.g.
    ("gcc", "-c", "-gnatc") or a wrapper if some post processing is needed.

  Note: as opposed to gpr2make which needed to post process the Makefile
  fragments (*.d files), gprmake does not need to modify these files, since
  the extra capability can be handled by the tool directly. Namely, handling
  of added/removed files in the project.
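
  As an illustration of the makefile syntax recognized and of the lookup of
  the dependency file, here is a minimal Python sketch. The function names
  are hypothetical; the sketch assumes that base_name is the source file
  name without its extension, and it does not handle target names that
  contain a ":" (such as Windows drive letters).

```python
import os


def parse_dependency_fragment(text):
    """Parse 'source: src1 src2 [...]' rules into {source: [dependencies]}."""
    # A backslash followed by a newline is ignored: the next line is
    # considered part of the same dependency list.
    joined = text.replace("\\\r\n", " ").replace("\\\n", " ")
    rules = {}
    for line in joined.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        target, _, deps = line.partition(":")
        rules.setdefault(target.strip(), []).extend(deps.split())
    return rules


def find_dependency_file(object_dir, base_name):
    """Return the .ali file if it exists, else the .d file, else None."""
    for extension in (".ali", ".d"):
        candidate = os.path.join(object_dir, base_name + extension)
        if os.path.exists(candidate):
            return candidate
    return None
```

  Keeping the parser behind a single function of this kind is one way to
  isolate the handling of dependencies, so that other formats can be added
  later.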

Ability to specify that a given project should not be built
===========================================================

  This is needed so that external scripts/makefiles can be provided
  instead, and have other gpr-related tools such as GPS still be able to
  take advantage of the source list associated to a given project (for
  instance, to provide source navigation in GPS).

  There is also a similar need to specify that a library project is read-only,
  and should not be rebuilt (such as an installed library).

  ??? to be completed:

  -> Are the two needs the same ?

  -> What should the builder do during the link phase ?
  - either not do anything (assuming the objects/libs are specified manually
    in the Linker'Default_Switches)
  - or look for the lib (if lib project) or object files, and take them into
    account as usual.
    -> Should an error be emitted if there are missing libs/objects ?

  Proposal from Arno:
  This can be achieved by specifying an empty value to Builder_Driver: ("")

  Proposal from Vincent:

  I don't think that using this switch is appropriate, because it is
  supposed to be a switch indexed by the language (Builder_Driver
  (<language>)).

  How about a new project level attribute:

     for Library_Externally_Built use "true"; 

  The only valid values (case-insensitive) would be "true" and "false",
  the default of course being "false".

  -> Of course, the name would need to be generalized to non libraries to
  work (e.g. Externally_Built).

Handling of bind phase between languages
========================================

  At bind time, the environment variable LPATH is set to
  the list of object dirs of the project hierarchy, separated by
  Path_Separator so that binder and linker drivers can take advantage of it.

  GNAT_PROJECT is set to the root project file and never changed
  until gprmake ends (since it can also be used during the link phase).

  Since languages other than Ada either do not have a bind phase, or this
  phase is hidden (e.g. collect2 with g++, or done at run time in java), it
  is not clear what the behavior should be for other hypothetical languages
  that would require a bind phase. In any case, the simple approach below
  is proposed, which has the advantage of being compatible with Ada, and
  is an attempt at generalizing it. If we find that some other language needs
  binding and the approach below is not sufficient, we will revisit this issue.

  If non null, Binder_Driver (<language>) is called for the root project,
  for each main for each language supported by the project hierarchy.
  This tool is called from the root project's object directory.

  If no main is specified and the project is not a library project, no
  bind is performed.

  If no main is specified and the project is a library project, bind is
  performed with no main specified.

  The name of the main object is specified if a main is available.

  The following changes are needed for "gnat bind" to be compatible with this
  approach:

  - support the GNAT_PROJECT env var
  - support object-file instead of ali file, and use corresponding ali file
    internally
  - when no object file is specified, consider all ali files from the project
    hierarchy for the bind phase

Handling of link phase between languages
========================================

  Examples of current special cases, that are generalized below:
  - C: use Compiler_Driver ("c") (e.g. gcc)
  - C++: use Compiler_Driver ("c++") (e.g. g++)
  - Ada: use gnatlink with GNAT
  - C & C++: use Compiler_Driver ("c++")
  - C & Ada: use gnatlink with GNAT
  - Ada & C++: use gnatlink --LINK=Compiler_Command ("c++")

  The Default_Linker is called in the object directory of the main
  project. This attribute is empty by default: (),
  meaning that a built-in driver is used, which behaves as follows:

  - for projects with a single language, call Linker_Driver (<language>)
  followed by ("-o", <executable name>),
  followed by the object of the main unit, followed by all the libraries
  corresponding to library project files (and -L options),
  followed by lib<root_project>.a if one has been created,
  followed by Linker'Default_Switches (<language>) or
  Linker'Switches (<main>) if defined.

  executable name is computed based on the project info and defaults, as
  done by gnatmake for Ada projects.

  Note: for Ada, since gnatlink expects an ali file rather than an object
  file, we need to enhance gnat link (and possibly gnatlink as well) as
  follows:

  - consider that if no .ali is specified, then the
    first .o file should be used instead if a corresponding .ali file exists,
    and use this .ali file.

  - for projects with multiple languages excluding Ada, call
    Linker_Driver (<language of the main unit>) followed by the same arguments
    as single language projects.

  - for projects with Ada & C++ (& optionally other languages):
    use Linker_Driver ("ada") & ("--LINK=" & Linker_Driver ("c++"))

  - for projects with Ada & languages other than C++: use Linker_Driver ("ada")

  Note: if needed, "--LINK=" & Linker_Driver (language-x) can be added to
  the root project's Linker'Default_Switches to cover other simple cases.
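
  The choice made by the built-in driver can be sketched as follows in
  Python. This is an illustration only: the function name and its arguments
  are hypothetical, languages are assumed to be lower-case names, and using
  only the command name of Linker_Driver ("c++") after "--LINK=" is an
  assumption, since the exact concatenation is not specified here.

```python
def builtin_link_driver(languages, main_language, linker_driver):
    """Return the start of the link command chosen by the built-in driver.

    languages is the set of languages used in the project hierarchy and
    linker_driver maps each language to its Linker_Driver value.
    """
    if len(languages) == 1:
        (only,) = languages
        return list(linker_driver[only])
    if "ada" not in languages:
        # Several languages, none of them Ada: use the main unit's language.
        return list(linker_driver[main_language])
    command = list(linker_driver["ada"])
    if "c++" in languages:
        # Ada & C++: the Ada linker driver delegates to the C++ one.
        command.append("--LINK=" + linker_driver["c++"][0])
    return command
```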

  For more exotic cases, an alternate (non built-in) Default_Linker
  can be used. Typically, it is expected that this non built-in default driver
  will parse the project file (using the GNAT_PROJECT env variable) and then
  act accordingly.
  Note: except in some very special cases, we do not expect users to actually
  provide an alternate Default_Linker, but since linking is a tricky
  phase and requires lots of fiddling, it seems a good idea to keep this
  general option available. In general, we would expect that AdaCore would
  be contracted to develop a small linker driver based on the gnat project
  sources to solve a particular need.

  At link time, when some sources or executables are not up-to-date,
  an archive is built containing the object files of each language
  source for which no Builder_Driver is set, and recreated in the object
  directory of the main project as lib<root_project>.a.
  If Builder_Driver is set for all supported languages, no archive is created
  nor used (rationale: some systems do not support creating empty archives).

  Rationale: it is expected that when Builder_Driver is set, the specific
  builder driver will keep track of the object files required itself, and
  that they will be handled either by Linker_Driver, or via
  Linker'Default_Switches.
  Concerning the approach of building a single big archive, see also file
  link-order.txt for more info on why we're doing it.

  If a previous archive exists, it is removed before creating a new one, so
  as to avoid picking obsolete object files coming from deleted or moved source
  files.

  Linker'Default_Switches (<language>) or Linker'Switches (<executable>)
  are taken into account by gprmake and added to the end of the link options,
  after the lib<root_project>.a file.
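
  The resulting order of the link arguments can be illustrated by the small
  Python sketch below. The function names and parameters are hypothetical,
  and the switches argument stands for whichever of
  Linker'Default_Switches (<language>) or Linker'Switches (<executable>)
  applies.

```python
def archive_name(root_project):
    """Name of the archive created in the main project's object directory."""
    return "lib" + root_project + ".a"


def link_command(driver, executable, main_object,
                 library_options=(), archive=None, switches=()):
    """Assemble the link command in the order described above."""
    command = list(driver) + ["-o", executable, main_object]
    # Libraries coming from library project files (and their -L options).
    command += list(library_options)
    # The archive is omitted when Builder_Driver handles every language.
    if archive is not None:
        command.append(archive)
    # Linker switches always come last, after the archive.
    command += list(switches)
    return command
```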

  For each main specified in the project file or on the command line,
  gprmake will call the linker as described above.

  If no main is specified, no link is performed except for library projects,
  where the library is created (as done by gnatmake currently).

Command line options supported by gprmake
=========================================

  In order to keep gprmake as generic as possible, options such as
  -ccargs or -cxxargs should be avoided.

  Current gprmake options are:

  gprmake -P<project file> [opts]  [name] {[-cargs opts] [-ccargs opts]
          [-cxxargs opts] [-largs opts] [-gargs opts]}

  name is zero or more file names

  -c       Compile only
  -f       Force recompilations
  -k       Keep going after compilation errors
  -o name  Choose an alternate executable name
  -Pproj   Use GNAT Project File proj
  -q       Be quiet/terse
  -u       Unique compilation. Only compile the given files
  -v       Verbose output
  -vPx     Specify verbosity when parsing Project Files
  -Xnm=val Specify an external reference for Project Files

  -cargs opts    opts are passed to the Ada compiler
  -ccargs opts   opts are passed to the C compiler
  -cxxargs opts  opts are passed to the C++ compiler
  -largs opts    opts are passed to the linker
  -gargs opts    opts directly interpreted by gprmake

  The first set of options is kept as is.
  A clarification: each name is considered as a main unit as far as bind and
  link phases are concerned. If no main is specified, all the mains listed
  in the project are used instead. If no main is listed in the project
  file, all projects are compiled, and if some of them are library
  projects, they are linked to produce the corresponding libraries.

  The -cargs, -ccargs and -cxxargs options are replaced with:

  -cargs:<lang> opts

  where <lang> is an arbitrary, case-insensitive, value. The switches are
  appended to all the compiler switches for <lang>, in particular to
  Compiler'Default_Switches (<language>) as well as to file specific options.

  For compatibility with gnatmake, -cargs is equivalent to -cargs:Ada

  For example, to specify -O2 -g for Ada, one of these will do:
     -cargs:Ada -O2 -g
     -cargs:ada -O2 -g
     -cargs -O2 -g

  In addition and for consistency, we give access to the Builder and Binder
  options:

  -margs:<lang> opts

  same as -cargs, for Builder_Driver (<lang>) (the 'make' driver), if any.

  Typical usage from GPS: gprmake -P project -margs:ada -d

  -bargs:<lang> opts (with -bargs equivalent to -bargs:Ada)

  same as -cargs, for Binder_Driver (<lang>), if any.

  -largs is kept as is. Options following -largs apply to the linker whatever
  the language of the main.

  -gargs is kept as is.

  In addition, a new switch "-d" is added, to display the build progress,
  in a way similar to what gnatmake -d does now, for the benefit of
  GPS.
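
  The grouping of command line arguments by -cargs:<lang>, -margs:<lang>,
  -bargs:<lang>, -largs and -gargs can be sketched in Python as follows.
  The function name is hypothetical. The sketch assumes that arguments given
  before any section switch are interpreted by gprmake itself (as after
  -gargs), and that -margs without a language defaults to Ada like -cargs
  and -bargs, which is not stated explicitly above.

```python
SECTION_SWITCHES = ("-cargs", "-margs", "-bargs")


def split_switch_sections(args):
    """Group arguments by the section switch that precedes them.

    Keys are (section, language) pairs, with language set to None for
    the language-independent sections "largs" and "gargs".
    """
    sections = {}
    current = ("gargs", None)
    for arg in args:
        head, _, lang = arg.partition(":")
        if head in SECTION_SWITCHES:
            # <lang> is case-insensitive; no language means Ada.
            current = (head[1:], (lang or "ada").lower())
        elif arg in ("-largs", "-gargs"):
            current = (arg[1:], None)
        else:
            sections.setdefault(current, []).append(arg)
    return sections
```

  With this sketch, "-cargs:Ada -O2 -g", "-cargs:ada -O2 -g" and
  "-cargs -O2 -g" all end up under the same ("cargs", "ada") key.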

Unanswered questions
====================

  - How to handle default values ?
    Lots of default values are suggested in this document.
    Would it be worth having a meta-configuration file used to set/change
    these default values ? This issue can be resolved in a later phase.

