DESIGN
======

What follows is my best attempt at giving the big ASCII picture
of what happens when `smd-pull` is run. `smd-push` simply
swaps `smd-server` and `smd-client`. Note that the sync direction is always
from `smd-server` to `smd-client`, so running them on the opposite hosts
inverts the sync direction.

    Your mail server           Your laptop
    ----------------           -----------
    
    
          --- sync direction ---> smd-pull
                                    |
                                    |
    smd-server ------- ssh ----- smd-client
        |                           |
        |                           |
      mddiff                      mddiff



END USER tools 
==============

smd-pull and smd-push
---------------------

The idea is quite simple. If `===` is a double pipe (a pair of pipes, one
for `stdin` and one for `stdout`), `smd-pull` simply performs the following

    smd-client $CLIENTNAME $MAILBOX === tee log === \
      ssh $SERVERNAME smd-server $CLIENTNAME $MAILBOX

The `tee` command is used only for logging; if `$DEBUG` is `false`, it is
replaced by `cat`. Conversely, `smd-push` performs the following

    smd-server $CLIENTNAME $MAILBOX === tee log === \
      ssh $SERVERNAME smd-client $CLIENTNAME $MAILBOX

They are both implemented in `bash`, since their main activity is to
redirect standard file descriptors, call other tools, check their exit
status and, when needed, notify the user with an extract of their logs.
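
To make the double pipe concrete, here is a minimal Python sketch of the same wiring. It is only an illustration, not how `smd-pull` is written (that is `bash`). The function name `double_pipe` and the two stand-in endpoints are hypothetical, and the `tee`/`cat` logging stage is left out.

```python
import subprocess
import sys


def double_pipe(cmd_a, cmd_b):
    """Run two commands with each one's stdout feeding the other's stdin."""
    # Start the first command with fresh pipes on both ends.
    a = subprocess.Popen(cmd_a, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # Cross-connect: b reads what a writes, and a reads what b writes.
    b = subprocess.Popen(cmd_b, stdin=a.stdout, stdout=a.stdin)
    # The parent drops its own copies, otherwise the children never see EOF.
    a.stdout.close()
    a.stdin.close()
    return a.wait(), b.wait()


# Hypothetical stand-ins for `smd-client` and `ssh ... smd-server`:
# the first sends "ping" and succeeds only if it reads "pong" back.
PINGER = [sys.executable, "-c",
          "import sys; print('ping', flush=True); "
          "sys.exit(0 if input() == 'pong' else 1)"]
PONGER = [sys.executable, "-c",
          "line = input(); "
          "print('pong' if line == 'ping' else 'nope', flush=True)"]
```

Calling `double_pipe(PINGER, PONGER)` returns the two exit statuses, which is the information the real scripts check before deciding whether to notify the user.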

smd-loop
--------

The idea is to mimic cron, but retry a failed sync attempt if the
error is transient. `smd-client` and `smd-server` output TAGS
that specify whether the error that occurred needs human intervention or not,
and also suggest some actions, like retry. `smd-loop` understands these tags,
and gives a second chance to a command that fails with an error that does not
require human intervention and for which the suggested action is retry.

It is implemented in `bash`, since it is mostly a `while true` loop. Arrays
(which are not POSIX shell compliant) are used to record failures, and to give
only one second chance to every `smd-push` or `smd-pull` command.
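
The second-chance policy can be sketched in a few lines of Python. This is a hedged illustration and not the actual `bash` code. The function names are hypothetical, the set `failures` plays the role of the bash arrays, and I assume that a retry suggestion appears as the word `retry` inside `suggested-actions(...)` and that a success clears the failure record.

```python
def is_transient(tags_line):
    """True if the TAGS line reports an error that may simply be retried."""
    avoidable = "human-intervention(avoidable)" in tags_line
    # Assumption: a retry suggestion is the word `retry` found after
    # the opening of suggested-actions(...).
    actions = tags_line.partition("suggested-actions(")[2]
    return avoidable and "retry" in actions


def run_with_second_chance(name, run, failures):
    """Run one command, tolerating a single transient failure per command.

    `run` is a callable returning (exit_status, tags_line).
    Returns True if the loop may go on, False if a human must step in.
    """
    status, tags = run()
    if status == 0:
        failures.discard(name)  # assumption: success clears the record
        return True
    if is_transient(tags) and name not in failures:
        failures.add(name)      # first transient failure: second chance
        return True
    return False
```

A command that fails twice in a row with the same transient error is therefore reported to the user on the second failure.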

smd-applet
----------

To write a hopefully eye-candy applet for GNOME, the language Vala was an
intriguing choice, since it is based on smart and sound ideas (namely,
avoiding the C++ non-standardized calling conventions) to provide a modern
object oriented programming language built around gobject and glib. Bindings
for GTK+, GConf, libnotify, etc. are available, and require no compiled
glue code, just bare text `.vapi` files.

If you are used to languages where writing bindings is not a trivial task,
you'd better look at Vala, where bindings are simple by design.

SERVER/CLIENT interaction
=========================

A server program (`smd-server`) and a client program (`smd-client`) are
used, respectively, to transmit the diff generated by `mddiff` (and, when
requested, mail headers or bodies), and to apply that diff, requesting any
necessary data from the other endpoint.

Since they mostly implement policies, like deciding whether a diff can be
applied or not, they are implemented in a high level scripting language called
[Lua](http://www.lua.org). The language choice is almost arbitrary: there
are no strong reasons for adopting Lua instead of Python or others, but its
installation is pretty small and it executes quite fast. Moreover, its
syntax is particularly simple, making it understandable to non Lua experts
too. Finally, I find it elegant.

They send and receive data on their standard input and output channels,
delegating to external tools the transmission of data across a network, and
optimizations like compressing the data, or encrypting it.
[OpenSSH](http://www.openssh.com/) can do both, and is adopted by
`smd-pull` and `smd-push` to connect `smd-client` to `smd-server`.

A simple protocol defines how `smd-client` requests data from `smd-server`
and how `smd-client` notifies `smd-server` that all changes have been
applied correctly.

The protocol
------------

The protocol is line oriented for commands, chunk oriented for data
transmission.

1. Both client and server send the following two messages, and check that
   they are equal to the ones sent by the other endpoint

    protocol NUMBER
    dbfile SHA1

   This part of the protocol is called the handshake.

2. The server sends the output of `mddiff` (which is line oriented),
   followed by the message below to conclude the first phase of the protocol.
   From now on the client is expected to reply.

    END

3. The client, from now on, can at any time send the following (alternative)
   messages

    ABORT
    COMMIT

   The former informs the server that the client was unable to apply the
   diff generated by `mddiff`, while the latter informs the server that all
   changes were applied successfully.

4. In response to a `COMMIT` message, the server will transmit an `xdelta`
   patch the client has to apply to its db file.

5. The client replies with `DONE` to complete the synchronization.

6. After point 2. and before point 3. the client can send the following
   commands to the server, which can reply either by transmitting data or
   with `ABORT`

    GET NAME
    GETHEADER NAME
    GETBODY NAME
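
The handshake of point 1 can be sketched as follows. This is a minimal Python illustration, not the Lua code of `smd-client` or `smd-server`. The names `handshake` and `handshake_lines` and the value of `PROTOCOL_VERSION` are hypothetical, and I assume that `SHA1` is the hexadecimal sha1 sum of the db file content.

```python
import hashlib
import io

PROTOCOL_VERSION = 1  # hypothetical value, for illustration only


def handshake_lines(db_bytes):
    """Build the two handshake messages for a given db file content."""
    # Assumption: SHA1 is the hex digest of the whole db file.
    sha = hashlib.sha1(db_bytes).hexdigest()
    return [f"protocol {PROTOCOL_VERSION}", f"dbfile {sha}"]


def handshake(db_bytes, incoming, outgoing):
    """Send our two lines, read the peer's two lines, and compare them."""
    mine = handshake_lines(db_bytes)
    for line in mine:
        outgoing.write(line + "\n")
    theirs = [incoming.readline().rstrip("\n") for _ in mine]
    return mine == theirs
```

Both endpoints run the same check, so a mismatch in either the protocol number or the db file is detected on both sides before any diff is sent.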

### Transmission

The server can transmit data or refuse. In the latter case it just sends
`ABORT`. In the former case it sends
 
    chunk NUMBER
    DATA

First it declares with `chunk` the number of bytes to be sent, then
it sends the data.
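
A hedged Python sketch of this framing follows. The function names are hypothetical, and I assume that the `chunk NUMBER` line ends with a single newline and that nothing follows `DATA`.

```python
import io


def send_chunk(stream, data):
    """Write one chunk: the size declaration, then the raw bytes."""
    stream.write(b"chunk %d\n" % len(data))
    stream.write(data)


def receive_reply(stream):
    """Return the payload bytes, or None if the server answered ABORT."""
    header = stream.readline().rstrip(b"\n")
    if header == b"ABORT":
        return None
    keyword, _, size = header.partition(b" ")
    if keyword != b"chunk" or not size.isdigit():
        raise ValueError("unexpected reply: %r" % header)
    data = stream.read(int(size))
    if len(data) != int(size):
        raise EOFError("stream ended before the whole chunk arrived")
    return data
```

Declaring the size first lets the receiver read mail content verbatim, without having to escape anything that might look like a protocol command.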

MAILDIR DIFF
============

Maildir diff (`mddiff`) computes the delta from an old status of a maildir
(previously recorded in the db file) and the current status, generating a
set of commands (a diff) that a third party software can apply to
synchronize a (remote) copy of the maildir.

How it works
------------

This software uses sha1 to compute snapshots of a maildir, and computes the
set of actions a client should perform to sync with the mailbox status.
This software alone is unable to synchronize two maildirs. It has to be
supported by a higher level tool that implements the application of actions,
and data transfer over the network if the twin maildir is remote.
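
As an illustration of what a snapshot entry contains, the following Python sketch computes separate sha1 sums for the header and the body of a message. It is not the `mddiff` code. The function name is hypothetical, and I assume that the header ends at the first empty line; the exact bytes `mddiff` hashes may differ.

```python
import hashlib


def message_digests(raw):
    """Return (header sha1, body sha1) for the raw bytes of one message."""
    # Assumption: the header is everything before the first empty line.
    header, _, body = raw.partition(b"\n\n")
    return hashlib.sha1(header).hexdigest(), hashlib.sha1(body).hexdigest()
```

Hashing the two parts separately is what makes it possible to tell a header-only change (for example a flag added by the MUA) from a change to the body.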

To cache the expensive sha1 calculation, a cache file is used. On every run
the program generates a new status file (appending `.new`) that must
replace the old one if the generated actions are committed to the other
maildir. Cache files are specific to the twin maildir: if you have more
than one, you must use a different cache file for each of them.

The db file (say db.txt) is paired with a timestamp (db.txt.mtime) that
is used to store the timestamp of the last run and files whose mtime
does not exceed this timestamp will not be (re)processed next time
mddiff is run.
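
The timestamp check can be sketched like this in Python. The function names are hypothetical, and treating a missing timestamp file as "process everything" is my assumption.

```python
import os


def read_last_run(mtime_path):
    """Read the timestamp of the last run from e.g. db.txt.mtime."""
    try:
        with open(mtime_path) as f:
            return int(f.read().strip())
    except FileNotFoundError:
        return 0  # assumption: no timestamp file means process everything


def needs_processing(path, last_run):
    """A file is (re)processed only if its mtime exceeds the last run."""
    return os.stat(path).st_mtime > last_run
```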

The db file format
------------------

The db file is composed of two files: a real database file (extension `.txt`)
and a timestamp (extension `.txt.mtime`). The latter contains just a number
(as printed by `date +%s`). The former is line oriented; every line has 3
space separated fields:

- the sha1 sum of the header
- the sha1 sum of the body
- the name of the file
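
Reading such a file takes only a few lines. The Python sketch below uses hypothetical function names, and splitting on the first two spaces only is my assumption, made so that a file name containing spaces would stay in one piece.

```python
def parse_db_line(line):
    """Split one db line into its three fields."""
    # Only the first two spaces separate fields; the rest is the name.
    hsha, bsha, name = line.rstrip("\n").split(" ", 2)
    return {"hsha": hsha, "bsha": bsha, "name": name}


def load_db(lines):
    """Index the db by file name, mapping to (header sha1, body sha1)."""
    db = {}
    for line in lines:
        rec = parse_db_line(line)
        db[rec["name"]] = (rec["hsha"], rec["bsha"])
    return db
```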

The commands
------------

From now on, name refers to a file name, hsha to the sha1 sum of its header
and bsha to the sha1 sum of its body.

- `ADD name hsha bsha` is generated whenever a new mail message is found,
  and there is no mail message with a different name but the same body.
- `COPY name hsha bsha TO newname` is generated if a new message is found
  and the mailbox already contains a copy of it. In case the mail has been
  moved, this message is followed by a `DELETE` command.
- `COPYBODY name bsha TO newname newhsha` is generated when a new file is
  created, and that file has the same body as an already existing file.
  In case the mail has been moved, this message is followed by a `DELETE`
  command. This happens when a new message is moved to another directory and
  marked in some way that changes its header (for example when a new message
  is moved to the trash bin).
- `DELETE name hsha bsha` is emitted when a message is no longer present.
- `REPLACEHEADER name hsha bsha WITH newhsha` is emitted whenever 
  a message that was already present has a different header but the same body.
- `REPLACE name hsha bsha WITH newhsha newbsha` is emitted whenever the body
  (and possibly the header) of a mail message changes. This never happens
  in practice, since MUAs should make a copy of the edited message, not replace
  it.
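
A tool applying the diff first has to parse these lines. The following Python sketch is table driven, with the table copied from the command list above. It is only an illustration: the names `FORMATS` and `parse_command` are hypothetical, and I assume that fields are separated by whitespace and that names contain none.

```python
# One template per command: uppercase words are literal keywords,
# lowercase words are field names.
FORMATS = {
    "ADD": "name hsha bsha",
    "COPY": "name hsha bsha TO newname",
    "COPYBODY": "name bsha TO newname newhsha",
    "DELETE": "name hsha bsha",
    "REPLACEHEADER": "name hsha bsha WITH newhsha",
    "REPLACE": "name hsha bsha WITH newhsha newbsha",
}


def parse_command(line):
    """Turn one mddiff output line into a dict, or raise ValueError."""
    verb, *args = line.split()
    template = FORMATS.get(verb)
    if template is None:
        raise ValueError("unknown command: " + verb)
    fields = template.split()
    if len(args) != len(fields):
        raise ValueError("wrong number of arguments for " + verb)
    command = {"verb": verb}
    for field, arg in zip(fields, args):
        if field.isupper():
            if arg != field:  # a keyword such as TO or WITH must match
                raise ValueError("expected " + field + ", got " + arg)
        else:
            command[field] = arg
    return command
```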

Easy to parse output messages
=============================

`smd-pull` and `smd-push` prefix all error messages with `ERROR:`, but
what follows is meant to be read by a human being. To let other tools
parse and react to error messages, a more formal output is also given.
A single line, prefixed with `TAGS:`, is output. The prefix can be followed by
`error::` or `stats::`, which denote an error message or a statistical one
respectively. Then a list of what are (improperly) called tags is output.
Their meaning should be easy to guess.

    <M>    ::= "error::" <ET> | "stats::" <ST>
    <ET>   ::= "context(" <STR> ")" 
               "probable-cause(" <STR> ")"
               "human-intervention(" <HI> ")"
               <SA>
    <SA>   ::= | "suggested-actions(" <ACTS> ")"
    <STR>  ::= `[^)]+`
    <HI>   ::= "necessary" | "avoidable"
    <ACTS> ::= <A> | <A> <ACTS>
    <A>    ::= "run(" <STR> ")" 
            |  "display-mail(" <STR> ")" 
            |  "display-permissions(" <STR> ")"
    <ST>   ::= "new-mails(" <NUM> ")" <SPC> "del-mails(" <NUM> ")"
    <NUM>  ::= `[0-9]+`
    <SPC>  ::= ` *,? *`

