[[title
                 Towards Data Structures for Tla 2.0
                                 ---
                               draft A
]]

  `tla 1.x' uses `./src/tla/libawk' to provide a bunch of "awk-like"
  data structures.  `libawk' is good, as far as it goes, but it isn't
  suitable for a thoroughly librified `libarch'.
 
  [[blockquote

    */problems with the current `libawk':/*

    /bogus error handling -- / `libawk' makes little attempt to
    propagate errors to callers in an orderly way: it assumes it
    is running in a one-shot (short-lived) process and is free
    to exit on error.

    /leaky abstraction barrier -- / programs using the current
    `libawk' too often wind up referring to libawk strings as `t_uchar
    *' (which is incompatible with the Unicode plans) or, even worse,
    explicitly freeing, allocating, or modifying
    supposed-to-be-opaque fields of `libawk' data structures.
    The API isn't quite a clean abstraction.

    /missing functionality -- / 2.0 needs Unicode support which would
    be hard to retrofit onto `libawk'.  While working on `1.x', I
    sorely missed some minor generalizations of `libawk' such as
    number valued list and table entries.

    /awkward memory management (no pun intended) -- / Programs must
    explicitly free all `libawk' data structures allocated as 
    "stack locals".   It is easy to flub this, at least along some
    execution paths -- the result would be a memory-leaking 2.0
    library.

  ]]

  Here is what I plan to take the place of `libawk' in `tla 2.0'.


* Namespaces

 The central data structure used by `libarch' in `tla-2.0' is called a
 `namespace'.

 ^*Roughly*^ speaking, a `namespace' is a kind of dictionary: a
 dynamically modifiable collection of named variables.   Programs
 create and delete variables within a namespace.   Programs read and
 write the values of variables in a namespace.

 However, namespaces have considerably more structure than your
 average dictionary:

** Namespaces at a Glance

 A */namespace/* is a data structure which maps *variable names* to
 *locations*.  What is a "variable name"?  What is a "location"?

 A */location/* is mutable storage for a single */scalar value/*.  The
 set of scalar values includes *strings*, *numbers*, *symbols*,
 *booleans*, and the value *`nil'*.  A location works similarly to a
 /C/ variable or structure field of scalar type: programs can read the
 value stored there;  programs can store a new value there.

 [[tty
        Need a picture here.
        A "location" pictured as a box, 
        containing a scalar.
 ]]


 A */variable name/* is comprised of at least a */scope number/* and
 */identifier/*.  The scope number is a small integer, the identifier
 is a string (constrained to conform to "identifier name syntax").

 A namespace contains 1 or more dynamically allocated scopes.   Each
 scope is a disjoint namespace: the same identifier may be bound to
 two distinct locations, in two different scopes.


*** Simple Variables

 A */simple variable name/* is comprised of *only* a scope number
 and identifier.   Given just the identifier naming a simple
 variable, and its scope, programs can find its location and 
 therefore both read and modify the value of the variable.

 [[tty
	Need a picture here.
        A scope, containing only simple variables, 
        pictured as a 2-col table with variable names
        in the left column, scalar values in the right.

        A namespace pictured (for now) as an array
        of such scopes.
 ]]


*** Non-Simple Variables: Lists and Tables

 Some identifiers, however, are bound to more complex variables.

 A */list variable/* is an identifier bound within a namespace 
 scope to a */dynamically resizable 1-d array of locations/*.
 
 To find a location within a list variable, a program must supply
 a scope, an identifier name, and an *integer list offset*.

 Similarly, a */table variable/* is an identifier bound (within
 a given scope) to a */dynamically resizable array of rows, each
 row being a dynamically resizable array of columns, each column
 within a row being a single location/*.

 To find a location within a table variable, a program must supply
 a scope, an identifier name, an *integer row offset*, and
 an *integer column offset*.

 [[tty
	Picture

        Picture scopes as a table:

	     name:		binding:

	     simple_var		[ 42 ]

	     list_var		size=2
				[ "hello" ]
                                [ "world" ] 

	     table_var		n_rows=2
				row[0]= [ "hello" ] [ "world" ] 
				row[1]= [ "hello" ] [ "sailor" ] [14]


        A composite of several of those forming a namespace.

	Some variable names with arrows pointing to the addressed
        location.

	E.g.   "simple_var" points to the box around 42
		"table_var[1][1]" points to the box around "sailor"


   ]]
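
  Until there is a real picture, here is a rough sketch in C of how
  `table_var[1][1]' might be resolved to a location.  The names
  `struct row', `struct table', `table_ref' and `demo_table_ref' are
  hypothetical, and every cell is stored as a string for brevity (the
  `14' in the picture would really be a number).

```c
#include <assert.h>
#include <stddef.h>

/* One row of a table variable: a resizable array of locations.
 * (Hypothetical layout; fixed arrays stand in for resizable ones.) */
struct row
{
  size_t n_cols;
  const char ** cols;
};

/* A table variable: an array of rows of possibly different widths. */
struct table
{
  size_t n_rows;
  const struct row * rows;
};

/* Bounds-checked lookup.  Returns the value stored at (row, col), or
 * NULL if the table has no such location. */
static const char *
table_ref (const struct table * t, size_t row, size_t col)
{
  assert (t != NULL);
  if (row >= t->n_rows || col >= t->rows[row].n_cols)
    return NULL;
  return t->rows[row].cols[col];
}

/* Build the table from the picture above and look up one cell. */
static const char *
demo_table_ref (size_t row, size_t col)
{
  static const char * row0[] = { "hello", "world" };
  static const char * row1[] = { "hello", "sailor", "14" };
  static const struct row rows[] = { { 2, row0 }, { 3, row1 } };
  static const struct table table_var = { 2, rows };

  return table_ref (&table_var, row, col);
}
```

  With this sketch, `demo_table_ref (1, 1)' yields the location
  holding "sailor", while `demo_table_ref (0, 2)' yields NULL because
  row 0 has only two columns.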

*** Fancy Tricks With Scopes

  The `namespace' data structure provides *somewhat* efficient
  operations to:

  *Clear a scope -- * Remove all bindings from an indicated scope. 
  (All names in the scope are, in effect, simple variables whose 
  current value is `nil'.)

  *Push a scope -- * Clear the scope but, first, remember the old
  values on a stack.

  *Pop a scope -- * The inverse of pushing a scope.

  There are some */"standard scopes"/*, intended to be used with
  these operations, used to implement a simple (albeit heavyweight)
  "calling convention" based on namespaces.   The standard
  scopes are:

  [[tty
     environment_scope
     global_scope
     params_scope
     locals_scope
     returns_scope
  ]]



** Overview of Using Namespaces

 The `namespace' data structure will be used in `tla-2.0' to provide a
 uniform and completely "reflective" API to `libarch'.

 Using `tla-1.x', a "client program" of `libarch' has little choice
 but to run `tla' as a subprocess.   To invoke a `libarch' entry
 point today, a program has to build an `argv' array of the
 parameters, `fork' and `exec' the command, wait for the command and 
 collect any return parameters.

 Using `namespaces', a `2.0' client will do something very similar, 
 but considerably easier in the details:

 To run a command, a client can:  (1) allocate a `namespace';  (2)
 initialize the `namespace' by setting variables to reflect parameters
 to the command (and the name of the command to run);  (3) call
 `arch_run';  (4) read back exit status and results from the
 `namespace';  (5) free the namespace.

 (There is also the possibility of namespaces persisting across
 multiple `libarch' invocations, of course.)



** /Function:/ {*`alloc_namespace'}

  /Prototype:/

    [[prototype
      ssize_t alloc_namespace (void);
    ]]

  /Description:/

    [[blockquote

      Normally, return a small positive integer: the "namespace
      descriptor" for a newly allocated namespace.

      Upon a recoverable allocation failure (a retry might succeed if
      the allocator permits it), return 0.

      Upon catastrophic failure, return a value less than 0.  Here and
      elsewhere, a catastrophic failure (usually indicated by a return
      value less than 0) indicates that most calls into `libarch' are
      no longer safe.  Callers receiving a "catastrophic error" return
      value should, presumably, arrange to make an emergency exit from
      their process as quickly as possible.
    ]]
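
  A caller might handle the three kinds of return value roughly as
  follows.  This is only a sketch: `alloc_namespace' is stubbed out
  here (it is not implemented yet), and `alloc_with_retry',
  `stub_failures_left' and the idea of a fixed retry limit are
  assumptions made for illustration.

```c
#include <assert.h>
#include <sys/types.h>   /* ssize_t */

/* Stand-in for the proposed alloc_namespace(): fails recoverably
 * `stub_failures_left' times, then hands out descriptors 1, 2, ...
 * (Hypothetical stub, not the real implementation.) */
static int stub_failures_left = 0;
static ssize_t stub_next_descriptor = 1;

static ssize_t
alloc_namespace (void)
{
  if (stub_failures_left > 0)
    {
      --stub_failures_left;
      return 0;                 /* recoverable allocation failure */
    }
  return stub_next_descriptor++;
}

/* Try up to `max_tries' times.  Returns a descriptor (> 0), 0 if every
 * attempt failed recoverably, or a negative value on catastrophic
 * failure (which the caller should treat as "exit soon"). */
static ssize_t
alloc_with_retry (int max_tries)
{
  int i;

  assert (max_tries > 0);
  for (i = 0; i < max_tries; ++i)
    {
      ssize_t nspace = alloc_namespace ();

      if (nspace != 0)
        return nspace;          /* success (> 0) or catastrophic (< 0) */
    }
  return 0;
}
```

  The point of the design is that a caller needs only the sign of the
  result to decide between "use it", "maybe retry" and "bail out".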

** /Function:/ {*`free_namespace'}

  /Prototype:/

    [[prototype
      int free_namespace (ssize_t nspace);
    ]]

  /Description:/

    [[blockquote

      `nspace' should be a descriptor previously returned by
      `alloc_namespace'.

      Free the indicated namespace and release all associated
      resources.

      Normally, return 0.

      Return a negative value upon a catastrophic error.

      Should not, currently, return a positive value.
    ]]


* Scopes

  Each `namespace' data structure can contain *multiple, disjoint
  identifier name mappings at once*.  Each disjoint mapping is called
  a */namespace scope/*.

  In other words, a namespace contains `N' scopes.  Every identifier
  name can be bound to a variable in each of those `N' scopes.  Two
  scopes can contain separate variables for a single name.



** Function: {*`namespace_create_scope'}

  /Prototype:/

    [[prototype
      ssize_t namespace_create_scope (ssize_t nspace);
    ]]

  /Description:/

  [[blockquote

    Return a positive integer which serves as the name for a new
    disjoint identifier mapping within `nspace', a namespace
    previously returned by `alloc_namespace'.

    Return -1 for catastrophic failures and 0 for potentially
    transient failures (such as some kind of allocation failure).

    There is no corresponding function to */release/* a previously
    allocated scope.  Programs are not expected to create large
    numbers of scopes.
  ]]


** Standard Scopes

  This section defines {*`namespace_environment'}, {*`namespace_globals'},
  {*`namespace_locals'}, {*`namespace_params'}, {*`namespace_returns'}.

  The `namespace' library provides some standard, built-in 
  scopes.   The integer identifiers for these scopes are 
  the same in all `namespace' instances:

  /Prototypes:/

    [[prototype
      ssize_t namespace_environment (void);
      ssize_t namespace_globals (void);
      ssize_t namespace_locals (void);
      ssize_t namespace_params (void);
      ssize_t namespace_returns (void);
    ]]


  /Description:/

  [[blockquote

    Return the scope number for each of the 5 standard scopes.
  ]]



** Scope Lists

  Every scope represents a mapping from identifiers to variables.
  In fact, scopes have additional structure beyond that.

  Let's call a simple mapping from identifiers to variables a
  */symbol table/*.

  A *scope* then has two parts: */a list of symbol tables/* and
  a */current offset into that list/*.

  In pseudo-code, we might declare a `scope' data structure
  this way:

  [[tty
      struct scope
      {
        int current_list_pos;
        list_of<struct symbol_table> symbtabs;
      }
  ]]

  If a user asks for the variable named `X' in scope `S', then `X' is
  looked up in the symbol table `S.symbtabs[S.current_list_pos]'.
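
  The lookup rule can be sketched in plain C.  This is an assumption
  about one possible representation, not the planned implementation:
  fixed-size arrays stand in for resizable lists, variables hold bare
  `int' values, and `struct binding', `scope_lookup' and `demo_lookup'
  are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BINDINGS 8
#define MAX_SYMBTABS 4

struct binding
{
  const char * name;
  int value;                    /* stand-in for a real variable */
};

struct symbol_table
{
  int n_bindings;
  struct binding bindings[MAX_BINDINGS];
};

struct scope
{
  int current_list_pos;
  int n_symbtabs;
  struct symbol_table symbtabs[MAX_SYMBTABS];
};

/* Look `name' up in the *current* symbol table only.  Returns a
 * pointer to its value, or NULL if the name is unbound there. */
static int *
scope_lookup (struct scope * s, const char * name)
{
  struct symbol_table * t;
  int i;

  assert (s->current_list_pos >= 0 && s->current_list_pos < s->n_symbtabs);
  t = &s->symbtabs[s->current_list_pos];
  for (i = 0; i < t->n_bindings; ++i)
    if (!strcmp (t->bindings[i].name, name))
      return &t->bindings[i].value;
  return NULL;
}

/* Two symbol tables bind "X" differently; which one is seen depends
 * only on current_list_pos.  Returns -1 for an unbound name. */
static int
demo_lookup (int pos, const char * name)
{
  struct scope s = { 0, 2, { { 1, { { "X", 10 } } },
                             { 1, { { "X", 20 } } } } };
  int * loc;

  s.current_list_pos = pos;
  loc = scope_lookup (&s, name);
  return loc ? *loc : -1;
}
```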


** Using Scope Lists as Stacks

*** Function: {*`namespace_push_scope'}

  [[prototype
      int namespace_push_scope (ssize_t nspace, ssize_t scope);
  ]]

  Allocate a new symbol table and append it to the `symbtabs'
  list of the indicated scope.   Set the `current_list_pos'
  of that scope to point to this newly appended symbol table.
  The new symbol table is initially empty (no identifiers bound
  to variables).

*** Function: {*`namespace_pop_scope'}

  [[prototype
      int namespace_pop_scope (ssize_t nspace, ssize_t scope);
  ]]

  Discard the last element of the `symbtabs' list of the
  indicated `scope'.   Set the `current_list_pos' pointer of
  the scope to point to the new last element of `symbtabs'.

  If this operation would otherwise leave the `symbtabs' list
  empty, instead, the list is reinitialized to contain a single
  symbol table, initially containing no bindings.
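
  Here is a minimal sketch of the push and pop rules, including the
  "never leave the list empty" case.  The contents of a symbol table
  are reduced to a binding count, `scope_init', `scope_push',
  `scope_pop' and `demo_push_pop' are hypothetical names, and
  returning 1 when the fixed-size list is full is my assumption (the
  return values of the real functions are not specified above).

```c
#include <assert.h>

#define MAX_SYMBTABS 16

struct symbol_table { int n_bindings; };      /* contents elided */

struct scope
{
  int current_list_pos;
  int n_symbtabs;
  struct symbol_table symbtabs[MAX_SYMBTABS];
};

/* A fresh scope holds exactly one empty symbol table. */
static void
scope_init (struct scope * s)
{
  s->n_symbtabs = 1;
  s->current_list_pos = 0;
  s->symbtabs[0].n_bindings = 0;
}

/* Append an empty symbol table and make it current.
 * Returns 0 on success, 1 if the (fixed-size) list is full. */
static int
scope_push (struct scope * s)
{
  if (s->n_symbtabs == MAX_SYMBTABS)
    return 1;
  s->symbtabs[s->n_symbtabs].n_bindings = 0;
  s->current_list_pos = s->n_symbtabs;
  ++s->n_symbtabs;
  return 0;
}

/* Discard the last symbol table and make the new last one current.
 * If that would empty the list, start over with one empty table. */
static int
scope_pop (struct scope * s)
{
  assert (s->n_symbtabs >= 1);
  --s->n_symbtabs;
  if (s->n_symbtabs == 0)
    scope_init (s);
  else
    s->current_list_pos = s->n_symbtabs - 1;
  return 0;
}

/* A caller's bindings survive a callee's push/pop pair. */
static int
demo_push_pop (void)
{
  struct scope s;

  scope_init (&s);
  s.symbtabs[0].n_bindings = 3;         /* pretend 3 names are bound */
  scope_push (&s);                      /* callee sees an empty table */
  if (s.symbtabs[s.current_list_pos].n_bindings != 0)
    return -1;
  scope_pop (&s);                       /* caller's table is current again */
  return s.symbtabs[s.current_list_pos].n_bindings;
}
```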


** Randomly Accessing Scope Lists

*** Function: {*`namespace_n_scope_elements'}

  [[prototype
      ssize_t namespace_n_scope_elements (ssize_t nspace, 
                                          ssize_t scope);
  ]]

  Return the number of elements in the `symbtabs' list of
  the indicated scope.


*** Function: {*`namespace_set_symbtab'}

  [[prototype
      ssize_t namespace_set_symbtab (ssize_t nspace, 
                                     ssize_t scope,
                                     ssize_t symbtab_list_pos);
  ]]


  Change the `current_list_pos' field of the indicated scope.
  I.e., change which symbol table in the `symbtabs' list is used, by
  default, to look up variable names.



* Variables, Indexes, and Locations

  \//So.//\ Namespaces contain scopes.  Each scope is a dynamic list
  of symbol tables plus a "current symbol table" index.  Each symbol
  table maps identifiers to variables.   Please make sure you have
  absorbed enough from the preceding sections to understand the
  description in this paragraph before continuing.

  We're left with at least two questions: *What are identifiers?* and
  *What are variables?*.


** Identifiers

  */Identifiers/* are represented as ASCII strings, beginning
  with an alphabetic character, containing only alphabetic,
  numeric, and underscore characters.

  In `namespace' APIs, identifiers are usually passed as
  `t_uchar *' pointers to 0-terminated strings.   
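
  The identifier syntax is simple enough to check by hand.  A sketch,
  assuming `t_uchar' is `unsigned char' (as I believe it is in
  `libhackerlab') and using the hypothetical name
  `is_valid_identifier'.  The character tests are spelled out rather
  than taken from `<ctype.h>' so that the result does not depend on
  the locale:

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char t_uchar;  /* assumed to match libhackerlab */

static int
is_ascii_alpha (t_uchar c)
{
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static int
is_ascii_digit (t_uchar c)
{
  return c >= '0' && c <= '9';
}

/* Return 1 if the 0-terminated string `id' begins with an ASCII
 * letter and contains only ASCII letters, digits and underscores;
 * return 0 otherwise (including for the empty string). */
static int
is_valid_identifier (const t_uchar * id)
{
  assert (id != NULL);
  if (!is_ascii_alpha (id[0]))
    return 0;
  for (++id; *id; ++id)
    if (!(is_ascii_alpha (*id) || is_ascii_digit (*id) || *id == '_'))
      return 0;
  return 1;
}
```

  Under these rules `list_var2' is an identifier, while `_x', `2x'
  and `a-b' are not.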


** Variables and Locations: Singletons, Lists, and Tables

  Namespace */variables/* are */containers for one or more
  mutable ^locations^/*.

  Each */location/* holds a */scalar value/*.  A scalar value can be a
  number, (immutable) string, symbol, boolean, or the `nil' value.

  */Singleton variables/* consist of just a single location.
  They hold a single scalar value.   To access the scalar value stored
  in a singleton variable, you need only the variable's name.

  */List variables/* consist of a dynamically sized ordered collection
  of locations.  New locations can be prepended to, appended upon, or
  inserted into the list.  Locations can be deleted, too, from
  arbitrary positions within the list.  To access a scalar value
  stored in a list variable, you need both the variable's name and an
  integer list element index.

  */Table variables/* consist of a "dynamically sized ordered
  collection of lists of locations" (*whew!*).  In plainer English, a
  table variable is a resizable list of rows, and each row is a
  resizable list of columns.  Each column within a row is a separate
  location, containing some scalar value.  To access a scalar value
  stored in a table variable, you need the variable's name, an
  integer row index, and an integer column index.



*** Lists and Tables *Not* Values

  Don't make the mistake of thinking that a *list variable*
  is a *variable whose value is a list*.

  There is no such thing as a *value which is a list*: all values
  in namespaces are immutable scalars.   A list, by contrast, can be
  modified and is composite: it contains `N' locations, each
  containing a separate scalar.

  Think instead that some variables happen to be *list structured* (or
  *array structured* or whatever) -- instead of consisting of a single
  location, they happen to consist of a modifiable list of locations.
  The *list* in this equation is part of the variable -- not part of
  the value stored in the variable.

  Got it?

  \\/Note:/\\ Please pay special attention to the 
  function {`namespace_copy'}, documented below.
  Understanding its semantics is vital to understanding
  how to use namespaces effectively.

*** Function: {*`namespace_rename'}

  /Prototype:/

    [[prototype 
      int namespace_rename (ssize_t nspace,
                            t_uchar * old_name,
                            ssize_t old_scope,
                            t_uchar * new_name,
                            ssize_t new_scope);
    ]]

  /Description:/

  [[blockquote

    Change the name and scope of a variable.  If the old and new names
    or scopes differ, the old name becomes (in effect) a singleton
    variable bound to nil and the new name is bound to the variable
    formerly bound to the old name.
  ]]


*** Function: {*`namespace_copy'}

  /Prototype:/

    [[prototype
      int namespace_copy (t_uchar * to_name,
                          ssize_t to_scope,
                          ssize_t nspace,
                          t_uchar * from_name,
                          ssize_t from_scope);
    ]]

  /Description:/

  [[blockquote

    If the `from' variable is a singleton variable, then make the `to'
    variable a singleton variable containing an equal scalar value.

    If the `from' variable is a list or table variable, then the `to'
    variable is made to be a *reference* to that same list or table.
    By *reference*, I mean that modifications made to either variable
    are visible as modifications to both -- they refer to the same
    underlying list or table.

    Although two variables can refer to the same list or table,
    nevertheless, each list or table specifically "belongs" to one
    variable in particular.  If that variable is destroyed or
    converted to some other kind of variable, then the list or table
    is destroyed.  When that happens, all *other* variables that refer
    to the same list or table are implicitly converted into
    singleton variables, containing the value `nil'.

    In other words, if you `namespace_copy' variable `A' to variable
    `B', and `A' was a list variable at the time, then:

    [[blockquote

      \1.\ modifications to the `A' list affect the `B' list and vice
           versa.

      \2.\ if `A' is destroyed or is converted to some other kind of
           variable, then `B' becomes a singleton variable,
           initialized to the value `nil'.

      \3.\ if `B' is destroyed or is converted to some other kind of
           variable, on the other hand, `A' is unaffected.

    ]]

    In effect, `A' has been copied to `B' *by reference* with the
    caveat that, using our `namespace' interfaces, the representation
    of references is "safe" (e.g., can't result in de-referencing
    invalid pointers).
  ]]
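
  The ownership rule for list variables can be modelled in a few
  lines.  This sketch is an assumption about one way to get the
  behaviour described above, not the planned implementation: variables
  are slots in a small global array, only the list case is modelled
  (singleton values are not), and `make_list', `copy_var',
  `destroy_var', `list_location' and `demo_copy' are hypothetical
  names.

```c
#include <assert.h>

enum var_kind { VAR_SINGLETON_NIL, VAR_LIST };

#define N_VARS   4
#define LIST_LEN 4

struct variable
{
  enum var_kind kind;
  int owner;                    /* slot of the variable owning the list */
  int items[LIST_LEN];          /* storage; meaningful only in the owner */
};

static struct variable vars[N_VARS];    /* all start as singleton nil */

static void
make_list (int v)
{
  vars[v].kind = VAR_LIST;
  vars[v].owner = v;
}

/* Destroying an owner turns every reference to its list into nil;
 * destroying a mere reference affects nothing else. */
static void
destroy_var (int v)
{
  int i;

  if (vars[v].kind == VAR_LIST && vars[v].owner == v)
    for (i = 0; i < N_VARS; ++i)
      if (i != v && vars[i].kind == VAR_LIST && vars[i].owner == v)
        vars[i].kind = VAR_SINGLETON_NIL;
  vars[v].kind = VAR_SINGLETON_NIL;
}

/* Copy by reference: `to' becomes an alias for the list that `from'
 * refers to.  Ownership stays where it was. */
static void
copy_var (int to, int from)
{
  int owner;

  assert (to != from && vars[from].kind == VAR_LIST);
  owner = vars[from].owner;
  assert (owner != to);         /* case not handled by this sketch */
  destroy_var (to);             /* `to' is converted, so drop what it held */
  vars[to].kind = VAR_LIST;
  vars[to].owner = owner;
}

static int *
list_location (int v, int pos)
{
  assert (vars[v].kind == VAR_LIST && pos >= 0 && pos < LIST_LEN);
  return &vars[vars[v].owner].items[pos];
}

/* Returns 1 if B reverts to nil once its owner A is destroyed. */
static int
demo_copy (void)
{
  enum { A = 0, B = 1 };

  make_list (A);
  copy_var (B, A);
  *list_location (B, 0) = 42;           /* a write through B ... */
  if (*list_location (A, 0) != 42)      /* ... is visible through A */
    return -1;
  destroy_var (A);
  return vars[B].kind == VAR_SINGLETON_NIL;
}
```

  Because a reference stores the owner's slot rather than a pointer
  into the list, a stale reference can only ever be observed as `nil',
  never as a dangling pointer.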

  
*** Namespace "Addresses" (aka *Indexes*)

  Locations within a namespace are analogous to byte locations
  within the memory of a general purpose computer:  they can contain
  a simple "scalar" value and *they have an address*.

  *Namespace location addresses* are the topic of this section.

  To avoid confusion over the word "address", the actual name we use
  for *namespace location addresses* is */namespace indexes/*.


**** Type {*`t_namespace_index'}

  /Prototype:/

    [[prototype
      typedef <unspecified> t_namespace_index;
    ]]

  /Description:/

  [[blockquote

    The type of address-like namespace indexes.

    A namespace index functions similarly to an address: given a
    namespace and a namespace index, a unique (although possibly
    non-existent) location is referred to.

    Given an index (and its namespace), a program can read and write
    the contents of the named location --- in that way, an index
    functions similarly to a pointer.

    Unlike pointers, namespace indexes are reliably bounds checked.
    If your program has bugs, dereferencing or changing the location
    named by an index might return unexpected data or store data in an
    unintended part of the namespace --- but at least the namespace
    data structure will remain internally consistent.  You won't wind
    up dereferencing an invalid C pointer, for example.

  ]]

**** Function {*`namespace_index'}

  /Prototype:/

    [[prototype
      int namespace_index (t_namespace_index * index_ret,
                           ssize_t nspace,
                           t_uchar * var_name,
                           ssize_t scope);
    ]]

  /Description:/

  [[blockquote

    Fill in `*index_ret' with an index that refers to the singleton
    location bound to `var_name' in the indicated scope.

    Normally, return 0.

    Upon catastrophic error, return a value less than 0.

  ]]


**** Function {*`namespace_list_index'}

  /Prototype:/

    [[prototype
      int namespace_list_index (t_namespace_index * index_ret,
                                ssize_t nspace,
                                t_uchar * var_name,
                                ssize_t scope,
                                ssize_t list_pos);
    ]]


  /Description:/

  [[blockquote

    Fill in `*index_ret' with an index that refers to the list element
    location bound to `var_name' in the indicated scope, at list
    offset `list_pos'.

    Normally, return 0.

    Upon catastrophic error, return a value less than 0.

  ]]


**** Function {*`namespace_array_index'}

  /Prototype:/

    [[prototype
      int namespace_array_index (t_namespace_index * index_ret,
                                 ssize_t nspace,
                                 t_uchar * var_name,
                                 ssize_t scope,
                                 ssize_t row,
                                 ssize_t col);
    ]]

  /Description:/

  [[blockquote

    Fill in `*index_ret' with an index that refers to the array
    element location bound to `var_name' in the indicated scope, at
    array position `row, col'.

    Normally, return 0.

    Upon catastrophic error, return a value less than 0.

  ]]



** Setting and Getting Scalars Stored in Locations

  Namespace indexes give us a way to translate location names
  within a namespace into a form of "address" for the indicated 
  location.   The functions in this section let you read or write
  the scalar stored in a given location.

  Scalar values may be numbers, strings, symbols, booleans, or the
  value `nil'.

*** The Value `nil'

**** Function: {*`namespace_is_nil'}

  /Prototype:/

    [[prototype
      int namespace_is_nil (ssize_t nspace, 
                            t_namespace_index index);
    ]]


  /Description:/

    [[blockquote
      Return 1 if the indicated location exists and contains `nil',
      0 otherwise. 
      
      Return a value less than 0 upon catastrophic error.
    ]]

**** Function: {*`namespace_set_to_nil'}

  /Prototype:/

    [[prototype
      int namespace_set_to_nil (ssize_t nspace, 
                                t_namespace_index index);
    ]]

  /Description:/

    [[blockquote
      Store `nil' in the location indicated by `index'.

      If the indicated location does not currently exist, 
      return 1, otherwise return 0.

      (Except) return a value less than 0 for catastrophic errors.

    ]]

*** Number Values

**** Function: {*`namespace_is_number'}

  /Prototype:/

    [[prototype
      int namespace_is_number (ssize_t nspace, 
                               t_namespace_index index);
    ]]


  /Description:/

    [[blockquote
      Return 1 if the indicated location exists and contains a number,
      0 otherwise. 
      
      Return a value less than 0 upon catastrophic error.
    ]]

**** Function: {*`namespace_set_to_int32'}

    /Prototype:/

    [[prototype
      int namespace_set_to_int32 (ssize_t nspace, 
                                  t_namespace_index index,
                                  t_int32 new_value);
    ]]

  /Description:/

    [[blockquote
      Store `new_value' in the location indicated by `index'.

      If the indicated location does not currently exist, 
      return 1, otherwise return 0.

      (Except) return a value less than 0 for catastrophic errors.

    ]]


**** Function: {*`namespace_get_int32'}

  /Prototype:/

    [[prototype
      int namespace_get_int32 (t_int32 * n_ret,
                               ssize_t nspace, 
                               t_namespace_index index);
    ]]

  /Description:/

    [[blockquote
      Retrieve the value stored in the location addressed 
      by `index', presuming that that location exists and 
      contains a number representable as a 32-bit integer.
      Return 0 in this case.

      If the location does not exist or contains a non-number,
      return a value greater than 0.

      Upon catastrophic error, return a value less than 0.

    ]]
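
  The return conventions of the setter and getter can be illustrated
  with a toy model.  Everything here is an assumption made for the
  sketch: a plain `long' stands in for `t_namespace_index', `int32_t'
  for `t_int32', the "namespace" is a fixed array of four locations,
  and the `sketch_' functions are hypothetical, not the proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum scalar_type { SCALAR_NIL, SCALAR_NUMBER, SCALAR_BOOLEAN };

struct location
{
  enum scalar_type type;
  int32_t number;               /* valid only when type is SCALAR_NUMBER */
};

#define N_LOCATIONS 4
static struct location locations[N_LOCATIONS];  /* all start as nil */

/* Bounds-checked resolution: NULL means "no such location". */
static struct location *
find_location (long index)
{
  if (index < 0 || index >= N_LOCATIONS)
    return NULL;
  return &locations[index];
}

/* Returns 0 normally, 1 if the location does not exist. */
static int
sketch_set_to_int32 (long index, int32_t new_value)
{
  struct location * loc = find_location (index);

  if (!loc)
    return 1;
  loc->type = SCALAR_NUMBER;
  loc->number = new_value;
  return 0;
}

/* Returns 0 and fills in *n_ret if the location exists and holds a
 * number; returns a value greater than 0 otherwise. */
static int
sketch_get_int32 (int32_t * n_ret, long index)
{
  struct location * loc;

  assert (n_ret != NULL);
  loc = find_location (index);
  if (!loc || loc->type != SCALAR_NUMBER)
    return 1;
  *n_ret = loc->number;
  return 0;
}
```

  Note that a bad index is reported through the return value and
  never turns into an out-of-bounds memory access.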


*** Boolean Values

**** Function: {*`namespace_is_boolean'}

  /Prototype:/

    [[prototype
      int namespace_is_boolean (ssize_t nspace, 
                                t_namespace_index index);
    ]]


  /Description:/

    [[blockquote
      Return 1 if the indicated location exists and contains a boolean,
      0 otherwise. 
      
      Return a value less than 0 upon catastrophic error.
    ]]

**** Function: {*`namespace_set_to_boolean'}

    /Prototype:/

    [[prototype
      int namespace_set_to_boolean (ssize_t nspace,
                                    t_namespace_index index,
                                    int new_value);
    ]]

  /Description:/

    [[blockquote
      Store `!!new_value' in the location indicated by `index'.

      If the indicated location does not currently exist, 
      return 1, otherwise return 0.

      (Except) return a value less than 0 for catastrophic errors.

    ]]


**** Function: {*`namespace_get_boolean'}

    /Prototype:/

    [[prototype
      int namespace_get_boolean (int * bool_ret,
                                 ssize_t nspace, 
                                 t_namespace_index index);
    ]]

  /Description:/

    [[blockquote

      Retrieve the 0-or-1 value stored in the location addressed by
      `index', presuming that that location exists and contains a
      boolean. Return 0 in this case.

      If the location does not exist or contains a non-boolean, return
      a value greater than 0.

      Upon catastrophic error, return a value less than 0.

    ]]


*** Symbol Values

**** Function: {*`namespace_is_symbol'}

  /Prototype:/

    [[prototype
      int namespace_is_symbol (ssize_t nspace, 
                               t_namespace_index index);
    ]]


  /Description:/

    [[blockquote
      Return 1 if the indicated location exists and contains a symbol,
      0 otherwise. 
      
      Return a value less than 0 upon catastrophic error.
    ]]

**** Function: {*`namespace_set_to_symbol'}

  /Prototype:/

    [[prototype
      int namespace_set_to_symbol (ssize_t nspace, 
                                   t_namespace_index index,
                                   t_uchar * symbol);
    ]]

  /Description:/

    [[blockquote
      Store `symbol' in the location indicated by `index'.

      `symbol' should be a string returned by `identifier_intern' (in
      `libhackerlab').  It is an /undetected error/ if it is not.
      Therefore, most programs should stick to
      `namespace_set_to_symbol_str'.

      If the indicated location does not currently exist, 
      return 1, otherwise return 0.

      (Except) return a value less than 0 for catastrophic errors.

    ]]


**** Function: {*`namespace_set_to_symbol_str'}

  /Prototype:/

    [[prototype
      int namespace_set_to_symbol_str (ssize_t nspace,
                                       t_namespace_index index,
                                       t_uchar * symbol_name);
    ]]

  /Description:/

    [[blockquote

      Intern the symbol named by 0-terminated `symbol_name' and store
      the resulting symbol in the location indicated by `index'.

      If the indicated location does not currently exist, 
      return 1, otherwise return 0.

      (Except) return a value less than 0 for catastrophic errors.

    ]]


**** Function: {*`namespace_get_symbol'}

  /Prototype:/

    [[prototype
      int namespace_get_symbol (t_uchar * identifier_ret,
                                ssize_t nspace,
                                t_namespace_index index);
    ]]

  /Description:/

    [[blockquote

      Retrieve the symbol value stored in the location addressed by
      `index', presuming that that location exists and contains a
      symbol. Return 0 in this case.

      If the location does not exist or contains a non-symbol, return
      a value greater than 0.

      Upon catastrophic error, return a value less than 0.

    ]]



*** String Values


**** Function: {*`namespace_is_string'}

  /Prototype:/

    [[prototype
      int namespace_is_string (ssize_t nspace, 
                               t_namespace_index index);
    ]]


  /Description:/

    [[blockquote
      Return 1 if the indicated location exists and contains a string,
      0 otherwise. 
      
      Return a value less than 0 upon catastrophic error.
    ]]


**** Function: {*`namespace_set_to_string_str'}

  /Prototype:/

    [[prototype
      int namespace_set_to_string_str (ssize_t nspace,
                                       t_namespace_index index,
                                       t_uchar * str);
    ]]

  /Description:/

    [[blockquote

      Store a copy of the 0-terminated string `str' in the location
      indicated by `index'.

      If the indicated location does not currently exist, 
      return 1, otherwise return 0.

      (Except) return a value less than 0 for catastrophic errors.

    ]]


**** Function: {*`namespace_get_string_str_n'}

  /Prototype:/

    [[prototype
      int namespace_get_string_str_n (t_uchar * str_ret,
                                      ssize_t * len_ret,
                                      ssize_t nspace,
                                      t_namespace_index index);
    ]]

  /Description:/

    [[blockquote

      Retrieve the string value stored in the location addressed by
      `index', presuming that that location exists and contains a
      string. Return 0 in this case.

      If the location does not exist or contains a non-string, return
      a value greater than 0.

      Upon catastrophic error, return a value less than 0.

    ]]



* Namespace Buffers

  `libhackerlab' provides the module `hackerlab/buffers' -- a data
  structure for editable strings supporting "markers".

  In particular, `hackerlab/buffers/buffers.h' provides for "buffer
  sessions" -- flat namespaces of explicitly allocated and 
  freed buffers.

  Every namespace has an associated buffer session:

** Function: {*`namespace_buffer_session'}

  /Prototype:/

    [[prototype
        ssize_t namespace_buffer_session (ssize_t nspace);
    ]]

  /Description:/

    [[blockquote
       Return the buffer session id associated with the indicated
       namespace or a value less than 0 upon error.

       Return values less than 0 do *not* signal catastrophic
       errors.   This function can not result in a catastrophic 
       error.
    ]]


* Namespace Graphs

 `libhackerlab' provides the module `hackerlab/graphs' --
 a data structure for editable directed graphs.

 The namespace data structure permits programs to allocate graphs
 which, if not otherwise freed, are guaranteed to be freed when
 the namespace itself is freed:

** Function: {*`namespace_alloc_digraph'}, {*`namespace_free_digraph'}

  /Prototypes:/

    [[prototype
      ssize_t namespace_alloc_digraph (ssize_t nspace);
      int namespace_free_digraph (ssize_t nspace, ssize_t digraph);
    ]]

  /Description:/

    [[blockquote
      Allocate (or free) a digraph associated with namespace `nspace'.

      Such graphs are automatically freed, if they have not already
      been explicitly freed, when the namespace is freed.
    ]]


* Namespace Descriptors and Subprocesses

  *function prototypes not provided*

  Similarly, namespaces provide for certain file descriptors to be
  automatically closed and for certain subprocesses to be killed and
  reaped when a namespace is freed.


* Virtual Threads

 Recall that, within a namespace, a *scope* consists of
 `symbtabs', a list of symbol tables and `current_list_pos', 
 an index into the symbol table list.

 Operations such as `namespace_push_scope' allow us to use scopes
 as a kind of "call frame stack".   A function can save part of its
 caller's bindings, install its own, then later restore the caller's
 bindings (for example).

 The gist is that within each scope, there can be multiple symbol
 tables, and which symbol table is current can change over
 time.

 We can usefully repeat that abstraction at the next higher level.
 Instead of just saving and restoring individual symbol tables 
 (aka, independent collections of bindings), we can instead save
 and restore entire sets of scopes.

 A */namespace thread/* is a data structure for holding a saved set of
 scopes.   Programs can move the current values of any selected subset
 of a namespace's scopes to a thread object.   In the namespace, each
 moved scope is replaced by an empty scope, containing no bindings.
 Programs can also restore the values of scopes from a thread: that
 discards the scopes replaced by those being restored and it leaves
 the thread object "empty".

** Function: {*`namespace_alloc_thread'}, {*`namespace_free_thread'}

  /Prototypes:/

    [[prototype
      ssize_t namespace_alloc_thread (ssize_t nspace);
      int namespace_free_thread (ssize_t nspace, ssize_t thread);
    ]]

  /Description:/

    [[blockquote
      Allocate (or free) a namespace thread within `nspace'.
    ]]


** Function: {*`namespace_freeze'}, {*`namespace_thaw'}

  /Prototypes:/

    [[prototype
      int namespace_freeze (ssize_t nspace,
                            ssize_t thread,
                            int n_scopes,
                            ssize_t * scope_v);

      int namespace_thaw (ssize_t nspace, ssize_t thread);
    ]]

  /Description:/

    [[blockquote
       Save (or restore) the indicated scopes in a namespace
       thread.
    ]]
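
  A toy model of the freeze/thaw semantics may help.  It is only a
  sketch: it assumes a scope can be stood in for by one opaque
  pointer, and every name in it (`toy_namespace', `toy_thread',
  `toy_freeze', `toy_thaw') is hypothetical rather than part of the
  proposed API.

```c
#include <stddef.h>

#define TOY_N_SCOPES 4

/* Hypothetical stand-ins: a scope is modelled as one opaque pointer,
 * with NULL meaning "empty scope, no bindings". */
struct toy_namespace { void *scope[TOY_N_SCOPES]; };
struct toy_thread    { void *saved[TOY_N_SCOPES]; int holds[TOY_N_SCOPES]; };

/* Move the selected scopes into the thread.  Each moved scope is
 * replaced in the namespace by an empty scope.  Returns 0 on
 * success, a positive code on a bad scope index. */
int
toy_freeze (struct toy_namespace *ns, struct toy_thread *th,
            int n_scopes, const size_t *scope_v)
{
  int i;

  for (i = 0; i < n_scopes; ++i)
    if (scope_v[i] >= TOY_N_SCOPES)
      return 1;

  for (i = 0; i < n_scopes; ++i)
    {
      size_t s = scope_v[i];

      if (th->holds[s])
        continue;               /* already saved; don't clobber it */
      th->saved[s] = ns->scope[s];
      th->holds[s] = 1;
      ns->scope[s] = NULL;
    }
  return 0;
}

/* Restore every scope held by the thread, replacing whatever the
 * namespace has in those slots (a real implementation would free
 * the replaced scopes), and leave the thread empty. */
int
toy_thaw (struct toy_namespace *ns, struct toy_thread *th)
{
  size_t s;

  for (s = 0; s < TOY_N_SCOPES; ++s)
    if (th->holds[s])
      {
        ns->scope[s] = th->saved[s];
        th->saved[s] = NULL;
        th->holds[s] = 0;
      }
  return 0;
}
```

  Scopes that were not named in `scope_v' are untouched by both
  operations, which is what lets a program freeze only part of a
  namespace.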


* Error Codes

  *not written yet*

  /notes:/

  rbcollins and I talked about 

  [[tty
     struct error_code
     {
       ssize_t error_class;
       ssize_t error_index_in_class;
     };
  ]]

  The APIs above assume single integer error codes, divided into
  negative and positive codes.

  The APIs are returning `error_index_in_class'.   If the caller
  knows what class of errors the callee can produce (and we 
  expect callees, by convention, to produce only one class of
  error each) then the caller can form the complete `struct
  error_code'.

  If the caller *doesn't* know the error class, then
  it is significant if the error code is non-0 and 
  sometimes significant if a non-0 code is positive
  or negative.
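
  As a rough sketch of that convention (the helper names and the
  class number below are hypothetical, only `struct error_code' comes
  from the notes above):

```c
#include <sys/types.h>   /* ssize_t */

struct error_code
{
  ssize_t error_class;
  ssize_t error_index_in_class;
};

/* Hypothetical class number for errors produced by namespace calls. */
enum { TOY_ERROR_CLASS_NAMESPACE = 1 };

/* A caller that knows the callee's error class combines it with the
 * integer the callee returned to form the complete code. */
struct error_code
toy_complete_error (ssize_t known_class, int returned)
{
  struct error_code e;

  e.error_class = known_class;
  e.error_index_in_class = (ssize_t) returned;
  return e;
}

/* A caller that does NOT know the class can still tell whether there
 * was an error at all, and can look at the sign of the code. */
int
toy_is_error (int returned)
{
  return returned != 0;
}

int
toy_is_negative_error (int returned)
{
  return returned < 0;
}
```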



* Basic Namespace Utils

  The earlier sections have built up quite a bit of structure in 
  namespaces.

  This section describes a set of "namespace utility functions"
  that can be built on the above.   It would be tedious to 
  make a complete list of all desirable utility functions,
  so here are just a few samples to illustrate:

** List Operations

  Namespace variables can be list variables.   Given that,
  it's convenient to have functions like:

  [[tty

    int namespace_list_append (t_namespace_index append_to_list,
                               ssize_t nspace,
                               t_namespace_index append_from_list);
  ]]

  which, if both indexed variables are lists, appends a copy of
  `append_from_list' to `append_to_list'.


** Relational Operations

  Given two variables which are tables:

  [[tty

    int namespace_join (t_namespace_index output_table,
                        ssize_t nspace,
                        ssize_t join_column,
                        t_scalar_comparison_fn (*cmp)(),
                        void * cmp_rock,
                        t_namespace_index table_a,
                        t_namespace_index table_b,
                        t_join_field_spec output_field,
                        ...);

  ]]

  and so forth.


  E.g., basic string/list/table ops, each of the general form:

  int fn (output_var_specs, nspace, input_var_specs + params);


* Entry Points and Calling Conventions

  A thoroughly "librified" `libarch' should include provisions which
  make its entry points easily accessible to the run-time environments
  of scripting languages (and the like).

  Such access to entry points generally requires:

  [[blockquote

    /1. A facility for finding and invoking entry points by symbolic
        name./ Many of the most convenient ways to make `libarch'
        entry points available as functions in a scripting language
        involve having the ability to look up the list of (symbolic
        names for) available entry points at run time, and to be able
        to invoke an entry point given only its name.


    /2. A facility for marshalling parameters and collecting return
        values from entry points, using a generic mechanism./
        If every entry point in `libarch' has its own /C/ function
        type, then calling those entry points from a scripting
        language involves a lot of (programming) work.  Each 
        such entry point must be "wrapped", either by hand or using
        a tool such as Swig.   It is simpler if there is a generic
        way to collect the arguments for or return values from a
        `libarch' entry point;  in other words, if scripting languages
        binding to `libarch' can get by with a single, generic 
        wrapper that works for all entry points rather than `N+1'
        wrappers, one for each separate entry point.
        

    /3. Useful invariants and error handling./  `libarch' entry 
        points need to make reasonably strong and universal 
        guarantees.  For example, absent a catastrophic error,
        they should neither leak resources nor ever leave the
        internal `namespace' data structures in an inconsistent
        state.

  ]] 

  
  How can we do that?


** Just What *is* an Entry Point

  For simplicity, I take the view that `libarch' in `tla-2.0' should
  function as a kind of extra-fancy *Turing machine*.   Recall that a 
  Turing machine has two parts: a finite state machine defining the
  computational steps the machine can take, and an "infinite tape"
  which serves as the "memory" for the computation run by the Turing
  machine.

  In `libarch''s case, I regard a `namespace' data structure as taking
  the place of our "infinite tape".  Namespaces are similar to Turing
  tapes in many ways: they have a simple topological structure and are
  divided up into locations, each of which contains a scalar value.

  Namespaces are different from Turing tapes in some of their
  arbitrary details.   For example, namespaces divide their storage
  into "scopes" and each scope has a list of symbol tables.   That's
  much more complicated than Turing's 1-D tape but the added
  complexity also adds realism:  symbol tables are easier to program
  than the 1-d tape, even if they are logically equivalent;  scopes
  are cheap to implement and handy, even if on a 1-d tape they would
  be absurdly expensive to simulate.   A namespace is a 1-d tape
  modified in response to a bunch of pragmatic considerations.

  If a `namespace' takes the place of the infinite tape, then the
  entry points in `libarch' take the place of the finite state
  machine.

  Indeed, although `libarch' may include some static data as a
  performance optimization, from the perspective of its API, `libarch'
  in 2.0 will be completely "stateless" --- all persistent state
  between `libarch' calls will be stored in a namespace, not in
  `libarch' itself.

  A collection of stateless /C/ entry points is, indeed, a form of
  finite state machine.



*** A Single Entry Point

  `libarch' *can* get by with a single entry point (although doing so
  is not *literally* proposed).

**** Function: {*`arch_run'}

  /Prototype:/

    [[prototype
      int arch_run (ssize_t nspace,
                    int (*poll)(void *),
                    void * poll_rock);
    ]]

  /Description:/

    [[blockquote
      Perform a single `libarch' state transition.   Usually this
      means invoking the command selected by the current state
      of the namespace `nspace'.

      As a side effect, the indicated namespace is modified to 
      reflect the results of the state transition.

      Normally return 0.

      Returns a value less than 0 upon catastrophic error.

      Returns a value greater than 0 upon recoverable error
      (such as there being no currently defined transition).

      The `poll' parameter may be 0 or a user-supplied "poll
      function".  `arch_run' is free to (not required to)
      periodically call `poll'.  If `poll' returns non-0,
      `arch_run' will attempt to return to its caller as 
      quickly as possible, even if that entails returning
      a (recoverable) error.
    ]]
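
  To make the `poll' protocol concrete, here is a minimal sketch of a
  poll function and of a work loop that honors it.  Both
  `toy_budget_poll' and `toy_run_steps' are hypothetical.  The loop
  only stands in for whatever long-running work `arch_run' might do,
  and the choice to poll once per step is an assumption.

```c
#include <stddef.h>

/* A poll function with the signature `arch_run' expects.  The rock
 * points at an int budget; poll returns non-0 once it is used up. */
int
toy_budget_poll (void *rock)
{
  int *remaining = rock;

  if (*remaining <= 0)
    return 1;
  --*remaining;
  return 0;
}

/* Hypothetical work loop: poll before each step and bail out with a
 * recoverable (positive) code as soon as poll returns non-0.
 * `*steps_done' reports how far the loop got.  `poll' may be 0. */
int
toy_run_steps (int n_steps, int (*poll)(void *), void *poll_rock,
               int *steps_done)
{
  int i;

  *steps_done = 0;
  for (i = 0; i < n_steps; ++i)
    {
      if (poll && poll (poll_rock))
        return 1;
      ++*steps_done;
    }
  return 0;
}
```

  A GUI client, for example, could pass a poll function that checks
  whether the user pressed "cancel".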


  Provided that a client program suitably modifies the namespace
  `nspace' before each call to `arch_run', that can be a complete
  interface.  (I'm assuming, of course, that the client has the
  separate `namespace' interface available to set up parameters before
  calling `arch_run' and read back results after `arch_run' returns.)


** The `arch_run' Calling Convention

  Upon a call to `arch_run':

*** The Parameter Variable `argv'

  The namespace variable named `"argv"' in the standard
  scope `namespace_params()' */must/* be initialized 
  much like an `argv' you would pass to `exec(2)':

  The namespace `argv' variable */must/* be a list variable.

  `argv[0]' */must/* be the symbolic name of a `libarch' "state
  transition" entry point.   Roughly, this should correspond to 
  a `tla 1.X' subcommand name.

  `argv[1..n]' */may/* contain a list of options and arguments
  to the entry point named in `argv[0]'.

*** The Standard Error Buffer

  The namespace variable named `"stderrbuf"' in the 
  standard scope `namespace_globals()' */may/* contain 
  a non-negative integer.  If so, that integer is taken to
  be the buffer id of the *"standard error buffer"*, allocated
  from the namespace's buffer set.

  Within `libarch', code that wants to generate an error message
  should *prepend* that message to the `"stderrbuf"'.  If the
  buffer is not empty before prepending the new message, `libarch'
  code should first prepend `"\n\n---\n\n"' to the buffer.
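
  The prepend rule can be sketched as follows, assuming (purely for
  illustration) that the buffer is a heap-allocated C string rather
  than a real namespace buffer.  `toy_stderr_prepend' is a
  hypothetical name.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_ERR_SEPARATOR "\n\n---\n\n"

/* Prepend `msg' to the heap-allocated string `*buf' (NULL or "" both
 * mean "empty"), putting the separator between the new message and
 * the old contents when the buffer was not empty.
 * Returns 0 on success, -1 on allocation failure (buffer unchanged). */
int
toy_stderr_prepend (char **buf, const char *msg)
{
  const char *old = (*buf) ? *buf : "";
  int nonempty = (old[0] != '\0');
  size_t len = strlen (msg)
               + (nonempty ? strlen (TOY_ERR_SEPARATOR) : 0)
               + strlen (old)
               + 1;
  char *fresh = malloc (len);

  if (!fresh)
    return -1;

  strcpy (fresh, msg);
  if (nonempty)
    strcat (fresh, TOY_ERR_SEPARATOR);
  strcat (fresh, old);

  free (*buf);
  *buf = fresh;
  return 0;
}
```

  Because messages are prepended, the most recently generated message
  is always the first thing a client reads from the buffer.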


*** The Standard Output Buffer

  Similarly, the namespace variable named `"stdoutbuf"' in the 
  standard scope `namespace_globals()' */may/* contain 
  a non-negative integer.  If so, that integer is taken to
  be the buffer id of the *"standard output buffer"*, allocated
  from the namespace's buffer set.
  
  `libarch' code can generate "normal output" by appending to this
  buffer.


*** The Standard Input Buffer

  Similarly, the namespace variable named `"stdinbuf"' in the 
  standard scope `namespace_globals()' */may/* contain 
  a non-negative integer.  If so, that integer is taken to
  be the buffer id of the *"standard input buffer"*, allocated
  from the namespace's buffer set.
  
  `libarch' code can read "default input" by consuming the contents
  of this buffer.


*** The Return Variables `retv' and `status'

  The namespace variable `"retv"' in the standard scope
  `namespace_returns()' is used symmetrically to `"argv"'.

  Upon return from `arch_run', `"retv"' will be a list variable,
  containing the 0 or more "returned values" from the entry
  point (regarded as a function call).

  Upon return, the variable `"status"', also in the
  `namespace_returns()' scope, will be set to an integer
  value: the same integer returned from `arch_run'.


*** Callee-Preserves Locals

  Upon return, `libarch' will not have changed any variable values in
  the standard scope `namespace_locals()'.

  If `libarch' wants to use the `locals' scope internally, it will
  generally do so by "pushing" (`namespace_push_scope') that scope
  on entry to `arch_run' and popping the scope, to return the caller's 
  bindings, before return.
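
  A toy model of that push/pop discipline, assuming a scope is just a
  small array of opaque symbol tables plus an index of the current
  one.  The names (`toy_scope', `toy_push_scope', `toy_pop_scope',
  `toy_current_table') are hypothetical, and the real
  `namespace_push_scope' may well behave differently in detail.

```c
#include <stddef.h>

#define TOY_MAX_DEPTH 8

/* A scope: a list of symbol tables (opaque pointers here) and the
 * position of the current table in that list. */
struct toy_scope
{
  void *symtab[TOY_MAX_DEPTH];
  size_t current;
};

/* Install `fresh_table' as the current table, keeping the caller's
 * table underneath it.  Returns 0, or 1 if the list is full. */
int
toy_push_scope (struct toy_scope *s, void *fresh_table)
{
  if (s->current + 1 >= TOY_MAX_DEPTH)
    return 1;
  s->current += 1;
  s->symtab[s->current] = fresh_table;
  return 0;
}

/* Drop the current table and make the caller's table current again.
 * Returns 0, or 1 if there is nothing to pop. */
int
toy_pop_scope (struct toy_scope *s)
{
  if (s->current == 0)
    return 1;
  s->symtab[s->current] = NULL;
  s->current -= 1;
  return 0;
}

void *
toy_current_table (const struct toy_scope *s)
{
  return s->symtab[s->current];
}
```

  An entry point that pushes on entry and pops on every return path
  leaves the caller's table exactly as it found it.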


*** Callee-Preserves Parameters

  The `namespace_params()' scope is preserved similarly to the `locals'
  scope.


*** Registering Commands

  The `0' element of the namespace variable `"argv"' in the
  scope `namespace_params()' contains the name of the command
  to be invoked by `arch_run'.

  How is that name translated into an actual choice of which code
  to run?

**** Function: {*`arch_register_command'}

  /Prototype:/

    [[prototype
        int arch_register_command (t_uchar * name, 
                                   int (*fn) (ssize_t nspace, void * rock),
                                   void * fn_rock);
    ]]

  /Description:/

    [[blockquote
      Remember that `fn' (provided the `fn_rock' argument) implements
      the `libarch' entry point of the indicated `name'.
    ]]
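
  One simple way such a registry could work is a static table
  searched by name.  This is only a sketch under that assumption:
  `toy_register_command', `toy_dispatch', the error constants and the
  fixed table size are all hypothetical, and a real implementation
  would more likely use a hash table and take `argv[0]' from the
  namespace rather than as a C string.

```c
#include <string.h>
#include <sys/types.h>   /* ssize_t */

typedef int (*toy_command_fn) (ssize_t nspace, void *rock);

struct toy_command
{
  const char *name;      /* not copied: must outlive the registry */
  toy_command_fn fn;
  void *fn_rock;
};

enum { TOY_MAX_COMMANDS = 64 };
enum { TOY_DUPLICATE_NAME = 1, TOY_TABLE_FULL = 2, TOY_NO_SUCH_COMMAND = 3 };

static struct toy_command toy_commands[TOY_MAX_COMMANDS];
static int toy_n_commands = 0;

static struct toy_command *
toy_find_command (const char *name)
{
  int i;

  for (i = 0; i < toy_n_commands; ++i)
    if (0 == strcmp (toy_commands[i].name, name))
      return &toy_commands[i];
  return 0;
}

/* Remember that `fn' (given `fn_rock') implements `name'.
 * Returns 0 on success, a positive code otherwise. */
int
toy_register_command (const char *name, toy_command_fn fn, void *fn_rock)
{
  if (toy_find_command (name))
    return TOY_DUPLICATE_NAME;
  if (toy_n_commands == TOY_MAX_COMMANDS)
    return TOY_TABLE_FULL;

  toy_commands[toy_n_commands].name = name;
  toy_commands[toy_n_commands].fn = fn;
  toy_commands[toy_n_commands].fn_rock = fn_rock;
  ++toy_n_commands;
  return 0;
}

/* Translate a command name into a call of the registered function. */
int
toy_dispatch (ssize_t nspace, const char *name)
{
  struct toy_command *c = toy_find_command (name);

  if (!c)
    return TOY_NO_SUCH_COMMAND;
  return c->fn (nspace, c->fn_rock);
}

/* Example command: ignores the namespace and returns the int that
 * its rock points at. */
int
toy_return_rock (ssize_t nspace, void *rock)
{
  (void) nspace;
  return *(int *) rock;
}
```

  The same table is also what a "list the available commands"
  function would walk.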

**** Listing Commands

   */Not Illustrated:/ functions for listing the available commands
   and perhaps conventions for linking them to help messages and into
   help categories.


** `arch_run' Illustrated

  If your module defines a new `libarch' entry point (say, `"my-id"')
  then during initialization, you'll need something like:

*** `arch_run' Initialization Illustrated

  [[tty

    if (0 != arch_register_command ("my-id",
                                    my_id_fn,
                                    (void *)0))
      ... uh-oh, catastrophic initialization error ...;

  ]]

*** `arch_run' Client Interface Illustrated

  A `libarch' client can call your new entry point in a style
  reminiscent of using `fork' and `exec':

  [[tty

    ssize_t namespace;
    t_namespace_index argv0;
    t_namespace_index retv0;
    t_uchar * my_id;
    ssize_t my_id_len;

    namespace = alloc_namespace ();
    if (namespace < 0)
      ... catastrophic error ...;

    if (0 != namespace_list_index (&argv0,
                                   namespace,
                                   "argv", namespace_params(),
                                   0))
      ... catastrophic error ...;

    if (0 != namespace_set_to_string (argv0,
                                      namespace,
                                      "my-id"))
      ... catastrophic error ...;

    /* No poll function: pass 0 for `poll' and `poll_rock'. */
    if (0 != arch_run (namespace, 0, (void *)0))
      ... some kind of error during the run of `my-id' ...;

    
    if (0 != namespace_list_index (&retv0,
                                   namespace,
                                   "retv", namespace_returns(),
                                   0))
      ... catastrophic error ...;

    if (0 != namespace_get_string_str_n (&my_id, &my_id_len,
                                         namespace, retv0))
      ... catastrophic error ...;

    /* We just called `my-id' with no parameters and got back
     * the string value `my_id', of length `my_id_len'.
     * 
     * That string pointer remains valid until the namespace
     * binding of `"retv"' changes, for any reason.
     */
  ]]

*** `arch_run' Internal Interfaces Illustrated
  
  Finally, here is what your implementation of `my-id'
  might look like:

  [[tty

    int
    my_id_fn (ssize_t nspace, void * rock)
    {
      t_uchar * id_string;
      int answer;

      if (1 != namespace_list_length (nspace, "argv", namespace_params()))
        {
          ... my id was called with bogus parameters ...;
          ... spew an error message to `stderrbuf' ...;
          ... then return with a non-0 exit code: ...;

          return arch_return_from_run (2);
        }

      id_string = low_level_call_to_compute_my_id ();
      if (!id_string)
        {
          ... spew an error message to `stderrbuf' ...;
          return arch_return_from_run (1);
        }

      answer = namespace_list_set_to_string (nspace, 
                                             "retv",
                                             namespace_returns(),
                                             0,
                                             id_string);
      free (id_string);

      return answer;
    }

  ]]


 `my-id' turns out to be a particularly simple example.
 A more complicated entry point might, for example, need to use
 namespace local variables.   That would involve calling
 `namespace_push_scope' to save the scope `namespace_locals()'
 on entry, and calling `namespace_pop_scope' to restore that
 scope before returning.

* Copyright

 /Copyright (C) 2004 Tom Lord/

 This program is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation; either version 2, or (at your option)
 any later version.

 This program is distributed in the hope that it will be useful,
 but /WITHOUT ANY WARRANTY/; without even the implied warranty of
 /MERCHANTABILITY/ or /FITNESS FOR A PARTICULAR PURPOSE/.  See the
 GNU General Public License for more details.

 You should have received a copy of the GNU General Public License
 along with this program; if not, write to the Free Software Foundation,
 Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

 See the file `COPYING' for further information about
 the copyright and warranty status of this work.



[[null
  ; arch-tag: Tom Lord Mon Nov 29 08:31:58 2004 (communications/tla-2.0-data.txt)
]]

