Batch Processing Systems

The purpose of a batch processing system is to schedule and initiate execution of batch jobs, and to route these jobs between hosts. A good system will let an administrator constrain how much of each resource a particular job or user may consume.
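
System: Condor (Condor)
Organization: University of Wisconsin
License: Free
Comments: Checkpointing
URL: http://www.cs.wisc.edu/condor/
Checkpointing: Y   GUI: ?   Source code: N

System: NQS (Network Queueing System)
Organization: Sun Microsystems, University of Sheffield
License: Free
Comments: Can be integrated with commercial NQS products. Old; forerunner of DQS.
URL: http://www.gnqs.org/
Checkpointing: ?   GUI: ?   Source code: ?

System: PBS (Portable Batch System)
Organization: NASA, LLNL
License: Free
Comments: Many features; steep learning curve. Rewrite of NQE.
URL: http://pbs.mrj.com/service.html
Checkpointing: 2/4   GUI: Y   Source code: Y

System: DQS (Distributed Queueing System)
Organization: Florida State University
License: Free
Comments: -
URL: http://www.genias.de/welcome.html
Checkpointing: 3/4   GUI: Y   Source code: Y

System: EASY-LL (Extensible Argonne Scheduling sYstem - LoadLeveler)
Organization: Cornell University, IBM
License: Free
Comments: -
URL: http://www.tc.cornell.edu/Software/EASY-LL/
Checkpointing: ?   GUI: ?   Source code: ?

System: CODINE (Distributed Queueing System)
Organization: Genias Software
License: Commercial
Comments: Commercial version of DQS.
URL: http://www.scri.fsu.edu/~pasko/dqs.html
Checkpointing: Y   GUI: Y   Source code: Y
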
UC Irvine, Dept of Physics
AENEAS (Array of Enhanced Nodes Supercomputer)
http://aeneas.ps.uci.edu/aeneas
Job Scheduling: EASY, DQS

UC Santa Barbara
http://www.mrl.ucsb.edu/computing/help-pages/beowulf.html
Job Scheduling: PBS

Galaxy Beowulf Cluster
11 Celeron 300A compute nodes and one frontend node with a channel-bonded network.
It is primarily used for particle transport calculations.
http://nurapt.kaist.ac.kr/~jhpark
Job Scheduling: DQS

COst effective COmputing Array
50-processor Pentium II-400 Beowulf cluster of the Aerospace Dept, Penn State University.
Primarily used for Computational Fluid Dynamics (CFD) applications.
http://cocoa.ihpca.psu.edu
Job Scheduling: DQS

Condor

Condor supports checkpointing. If you link a program in the right way, it will be checkpointed every so often. Note the word link: no source changes or recompilation are needed. This means that you know the program is "doing something", and in the event of a crash you lose only the work done since the last checkpoint. Condor can resume the job from that checkpoint when the system comes back up or, if you wish, migrate it to another system.

Some programs can't be linked in the above manner (say, because you don't have the source or object code). These are called "vanilla executables" and can be literally any a.out generated on the system. Vanilla executables can still be scheduled and run by Condor, but they won't be checkpointed.

ClassAd mechanism: A user can require that a job run on a machine with 64MB RAM, but state a preference for 128MB if available. Likewise, a workstation can state a preference in a resource offer to run jobs from a certain set of users, and it can require that no interactive workstation activity be detectable between 9am and 5pm before starting a job. Job requirements and preferences, and resource availability constraints, can be described with powerful expressions, which lets Condor adapt to nearly any desired policy.
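
The matchmaking idea can be sketched in Python. This is not Condor's ClassAd language or API: the dictionaries, the field names (memory_mb, keyboard_idle) and the match function are all hypothetical, chosen only to illustrate how hard requirements on both sides filter the candidates and how a preference then ranks what is left.

```python
# Hypothetical sketch of requirement/preference matchmaking.
# Not Condor's ClassAd syntax or API; every name here is illustrative.

def match(job, machines):
    """Return the machine the job prefers most, or None if none qualifies."""
    candidates = [
        m for m in machines
        if job["requirements"](m)        # the job's hard constraint on the machine
        and m["requirements"](m, job)    # the machine's hard constraint on the job
    ]
    if not candidates:
        return None
    # Among acceptable machines, pick the one the job ranks highest.
    return max(candidates, key=job["rank"])


job = {
    "owner": "alice",
    "requirements": lambda m: m["memory_mb"] >= 64,   # must have at least 64MB
    "rank": lambda m: m["memory_mb"] >= 128,          # prefers 128MB or more
}

# Machine policy: only accept jobs while nobody is using the workstation.
idle_only = lambda machine, job: machine["keyboard_idle"]

machines = [
    {"name": "ws1", "memory_mb": 64, "keyboard_idle": True, "requirements": idle_only},
    {"name": "ws2", "memory_mb": 128, "keyboard_idle": True, "requirements": idle_only},
    {"name": "ws3", "memory_mb": 256, "keyboard_idle": False, "requirements": idle_only},
]

best = match(job, machines)
print(best["name"] if best else "no match")
```

Here ws3 is excluded by its own policy even though it has the most memory, and ws2 wins over ws1 because the job prefers 128MB. A machine-side preference for certain users could be modeled the same way, with a rank function on the machine.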

PBS: Portable Batch System

  1. Has a GUI as well as a command line.
  2. Interfaces with NASA's site-wide accounting system, ACCT++, and the NASA Centralized Test Management System (CTMS).
  3. Is a drop-in replacement for NQS.
  4. Users can specify the priority of their jobs. Defaults can be specified at queue and system level.
  5. Users can define a wide range of dependencies between different batch jobs, like "make sure A finishes before running B".
  6. Can allow/deny access on a per-system, per-group or per-user basis.
  7. Comes with an API so we can program our own scheduling applications.
  8. Users can set load leveling based on hardware configuration, resource availability and keyboard activity.
  9. Supports MPI, MPL, PVM and HPF.
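
Item 5, job dependencies, amounts to finding an execution order in which every job runs only after the jobs it depends on. A minimal Python sketch of that idea follows, using the standard library's graphlib module (Python 3.9 or later). It does not use PBS's own interface; the job names and the run_order helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical job names. Each key maps to the set of jobs that must finish first.
depends_on = {
    "B": {"A"},        # make sure A finishes before running B
    "C": {"A"},
    "D": {"B", "C"},   # D waits for both B and C
}


def run_order(depends_on):
    """Return one execution order that respects every dependency."""
    return list(TopologicalSorter(depends_on).static_order())


order = run_order(depends_on)
print(order)
```

A circular dependency (A waits for B, B waits for A) makes static_order raise graphlib.CycleError, which is the kind of request a batch system has to reject.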

PBS and Maui

PBS is more of a resource manager, and it is broken up into multiple parts for modularity. Its internal scheduler is less than optimal, so it is designed to be replaced by an external scheduler. That is where Maui steps in. Together the two make a very flexible and powerful combination.

PBS can do queueing and scheduling: it comes with a simple first-in, first-out (FIFO) scheduler. That's good enough for some sites. You could set up PBS to handle batch queues and process the jobs. Think about printers and their queues and you're on the right track for thinking about batch processing. It's really the same concept.

Alternate schedulers like Maui come into play when FIFO is no longer appropriate. Maui can schedule the order of execution for jobs queued in PBS. In a PBS/Maui setup, PBS queues the jobs and Maui decides when and where to run them. Maui adds scheduling concepts over and above FIFO, such as reservations, backfilling of jobs, and job priorities, just like some high-end print management packages.

It's easy to see how FIFO might create some big bottlenecks and run a cluster at low utilization from time to time (imagine a queue where half the jobs require more than half the cluster: whenever two of those jobs are consecutive in the queue, the cluster is going to have some idle nodes). On the other hand, if most of the jobs you're going to run are similar in size and run time, and can be configured to utilize most of a cluster, FIFO might be all you need.
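
The idle-node effect can be reproduced with a small simulation. The sketch below is hypothetical and is not how PBS or Maui is implemented: jobs are (nodes, duration) pairs, and the allow_skip flag is only a crude stand-in for backfilling, since it lets a later job start whenever it fits without any guarantee that the job at the head of the queue is not delayed.

```python
def simulate(jobs, total_nodes, allow_skip=False):
    """Run a queue of (nodes, duration) jobs; return (makespan, utilization)."""
    if any(nodes > total_nodes for nodes, _ in jobs):
        raise ValueError("a job asks for more nodes than the cluster has")
    if not jobs:
        return 0, 0.0

    queue = list(jobs)
    running = []            # (finish_time, nodes) for each started job
    free = total_nodes
    now = 0
    busy_node_time = 0      # sum of nodes * duration over started jobs

    while queue or running:
        # Start whatever is allowed to start at the current time.
        i = 0
        while i < len(queue):
            nodes, duration = queue[i]
            if nodes <= free:
                free -= nodes
                running.append((now + duration, nodes))
                busy_node_time += nodes * duration
                queue.pop(i)
            elif allow_skip:
                i += 1      # look further down the queue for a job that fits
            else:
                break       # strict FIFO: the head job blocks everything behind it

        # Jump to the next completion and release its nodes.
        now = min(finish for finish, _ in running)
        free += sum(n for finish, n in running if finish == now)
        running = [(finish, n) for finish, n in running if finish != now]

    return now, busy_node_time / (total_nodes * now)


# Two consecutive jobs that each need more than half of a 10-node cluster.
jobs = [(6, 4), (6, 4), (4, 4), (4, 4)]
fifo = simulate(jobs, total_nodes=10)
skip = simulate(jobs, total_nodes=10, allow_skip=True)
print("FIFO:", fifo)
print("skip:", skip)
```

With this queue, strict FIFO leaves four nodes idle while the first 6-node job runs and finishes everything at time 12 with about 67% utilization. Letting the 4-node jobs start early finishes at time 8 with every node busy throughout.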

Maui is a scheduler in the narrowest sense: it decides which jobs to run where. This includes features such as reservations, backfilling, and job priorities.

Maui doesn't have the ability to accept jobs from users (queue), start jobs, or monitor nodes. For these it depends on what is often called "resource management" software.

PBS provides all these capabilities. In fact, PBS itself also distinguishes between resource management and scheduling, and it includes several schedulers that you can choose from. Maui interfaces with the PBS resource management components (the server and the moms) in the same way that the PBS schedulers do.