RELEASE NOTES FOR SLURM VERSION 2.4
25 May 2012


IMPORTANT NOTE:
If using the slurmdbd (SLURM DataBase Daemon) you must update this first.
The 2.4 slurmdbd will work with SLURM daemons of version 2.1.3 and above.
You will not need to update all clusters at the same time, but it is very
important to update slurmdbd first and having it running before updating
any other clusters making use of it.  No real harm will come from updating
your systems before the slurmdbd, but they will not talk to each other
until you do.  Also at least the first time running the slurmdbd you need to
make sure your my.cnf file has innodb_buffer_pool_size equal to at least 64M.
You can accomplish this by adding the line

innodb_buffer_pool_size=64M

under the [mysqld] reference in the my.cnf file and restarting the mysqld.
This is needed when converting large tables over to the new database schema.

SLURM can be upgraded from version 2.3 to version 2.4 without loss of jobs or
other state information.


HIGHLIGHTS
==========
* Up to 500% higher throughput for short-lived jobs (depending upon
  configuration).
* Major modifications to support IBM BlueGene/Q systems.
* New SPANK callbacks added to slurmd: slurm_spank_slurmd_{init,exit} and
  job epilog/prolog: slurm_spank_job_{prolog,epilog}.
* Added MPI plugin, mpi/pmi2, which supports MPI_Comm_spawn() function.

CONFIGURATION FILE CHANGES (see "man slurm.conf" for details)
=============================================================
* "PriorityFlags" added
* "RebootProgram" added
* "ReconfigFlags" added
* "SlurmdDebugLevel" and "SlurmctldDebugLevel" now accept string names in
  addition to numeric values (e.g. "info", "verbose", "debug", etc.). Output
  of scontrol and sview commands also use the string names.
* Changed default value of "StateSaveLocation" configuration parameter from
  "/tmp" to "/var/spool" to help avoid purging.
* Change default "SchedulerParameters" "max_switch_wait" field value from 60 to
  300 seconds.
* Added new "SchedulerParameters" of "bf_max_job_user", maximum number of jobs
  to attempt backfilling per user.
* Modify SchedulerParamters option to match documentation: "bf_res="
  changed to "bf_resolution=".

COMMAND CHANGES (see man pages for details)
===========================================
* Modified advance reservation to select resources optimized for network
  topology and accept multiple specific block sizes rather than a single node
  count.
* Added trigger flag for a permanent trigger. The trigger will NOT be purged
  after an event occurs, but only when explicitly deleted.
* Added the ability to reboot all compute nodes after they become idle. The
  RebootProgram configuration parameter must be set and an authorized user
  must execute the command "scontrol reboot_nodes".
* Added the ability to update a node's NodeAddr and NodeHostName with scontrol.
* Added the option "--name" to the sacct and squeue commands.
* Add support for job allocations with multiple job constraint counts. For
  example: salloc -C "[rack1*2&rack2*4]" ... will allocate the job 2 nodes
  from rack1 and 4 nodes from rack2. Support for only a single constraint
  name been added to job step support.
* Changed meaning of squeue "-n" option to job name from node name for
  consistency with other commands. The "-w" option was added for a short
  node name option. Long options --names and --nodes remain unchanged.
* Sinfo output format of "%P" now prints "*" after default partition even if
  no field width is specified (previously included "*" only if no field width
  was specified. Added output format of "%R" to print partition name only
  without identifying the default partition with "*").
* Added cpu_run_min to the output of sshare --long.
* Modify scontrol to require "-dd" option to report batch job's script.


OTHER CHANGES
=============
* Improve task binding logic by making fuller use of HWLOC library,
  especially with respect to Opteron 6000 series processors.
* Changde to output tools labels from "BP" to "Midplane" (i.e. "BP_List" was
  changed to "MidplaneList").
* Modified srun to fork a processes which can terminate the job and/or step
  allocation if the initial srun process is abnormallly terminated (e.g. by
  SIGKILL).
* Added support for Cray GPU memory allocation as GRES (Generic RESources).
* Correct setting of CUDA_VISIBLE_DEVICES for gres/gpu plugin if device files
  to be used are not in numeric order (e.g. GPU 1 maps to "/dev/nvidia4").
* Cray - Add support for zero compute note resource allocation to run batch
  script on front-end node with no ALPS reservation. Useful for pre- or post-
  processing. NOTE: The partition must be configured with MinNodes=0.

API CHANGES
===========
* Added the UserID of the user issuing the RPC to the job_submit/lua functions.

Changed members of the following structs
========================================


Added the following struct definitions
======================================
block_info_t:		cnode_err_cnt added
slurm_ctl_conf_t	priority_flags, reboot_program and reconfig_flags added
trigger_info_t:		flags added
update_node_msg_t:	node_addr and node_hostname added
slurmdb_association_cond_t:	  grp_mem_list added
slurmdb_association_rec_t:	  grp_mem added
slurmdb_qos_rec_t:		  grp_mem added

Changed the following enums and #defines
========================================
TRIGGER_FLAG_PERM	Added


Added the following API's
=========================


Changed the following API's
===========================
