Summary Batch Scheduling Software

From: Craig Kruck (kruckc@hitachi.hitachi.com)
Date: Sun Jun 27 1993 - 16:45:13 CDT


Many thanks for all the input regarding batch scheduling applications/software.
While my primary interest was in batch scheduling, many of the responses provided information on load balancing which will be very helpful in the future.
My original question was:

> I'm trying to locate vendors who produce batch scheduling software for
> UNIX platforms. We are in the early stages of a downsizing project and
> any information would be greatly appreciated.
>
> The product will have to support triggers and predecessors along with
> the normal requirements.

We currently have Computer Associates' Unicenter product in house. After
speaking with a few of the vendors mentioned below we have decided that the
Unicenter product is the most robust and powerful product on the market. Along
with the Workload Management features, it also supports Console, File, Problem,
Report, and Tape Management as well as some very enhanced Security (well beyond
what I require).

Again, thank you for the information:

"Peter W. Osel" <pwo@ztivax.zfe.siemens.de>
charest@CANR.Hydro.Qc.CA (Claude Charest)
nolfb@jcdbs.2000.disa.mil (Bill Nolf)
stern@sunne.East.Sun.COM (Hal Stern - NE Area Systems Engineer)
"Marty Leisner" <leisner@eso.mc.xerox.com>
kevin@uniq.com.au (Kevin Sheehan {Consulting Poster Child})
David Fetrow <fetrow@biostat.washington.edu>
crm (Charlie's Login)
amir@matis.ingr.com (Amir J. Katz)
Hebets <radian!markh@natinst.com>
mwp.michael <MWP.MICHAEL@MELPN1.CV.COM>

_______________________________________________________________________________
From: "Peter W. Osel" <pwo@ztivax.zfe.siemens.de>
        There is a product ``Load Balancer'' by Freedman Sharp And
Associates Inc.: I did not have a closer look at it, but here we go
anyway:

                   ----LOAD BALANCER VERSION 3.3----

Load Balancer is a UNIX batch queueing and load sharing system. It
ensures that jobs (ie: applications) submitted from any host in a
network end up running on the best available host. It takes into
account many performance factors, as well as other real-world factors
such as security, licensing, and interactive user detection. Users can
run, kill, and adjust the status of their jobs from any host, and can
also get status information on jobs, queues, hosts, applications, and
users from any host. No application or kernel modifications are
necessary, since Load Balancer deals exclusively with whole
applications, not application fragments.

Load Balancer is useful to sites looking to maximize the performance
that they can extract from their workstations and servers. By running
each job on the best-available machine, Load Balancer gives each user
the power to run more jobs in a given amount of time, increasing
productivity and reducing time-to-market for products being designed.

Example uses of Load Balancer are: allowing an engineer to run many
simulation runs concurrently on as many hosts as are available,
ensuring that results are achieved as soon as possible; allowing many
people to submit large jobs to a limited set of host machines, using
Load Balancer's batch queueing system to ensure that the hosts do not
become overloaded with too many jobs running at once; allowing a s/w
developer to compile code on the best available hosts, reducing compile
time drastically; giving users the ability to pull up a shell on a
lightly loaded machine for general purpose work; and so on.
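As a rough illustration of the batch queueing idea described above (this is
not Load Balancer's actual implementation, which is not described here), the
following Python sketch holds submitted jobs in a queue until some host has a
free slot. The class name, host names, and the per-host slot limits are all
hypothetical.

```python
from collections import deque


class BatchQueue:
    """Toy batch queue: holds jobs until a host has a free slot."""

    def __init__(self, slots_per_host):
        # slots_per_host: hypothetical mapping of host name -> max concurrent jobs
        self.free = dict(slots_per_host)
        self.pending = deque()
        self.running = {}  # job name -> host it was started on

    def submit(self, job):
        self.pending.append(job)
        self._dispatch()

    def finish(self, job):
        # Release the slot held by a finished job, then start waiting jobs.
        host = self.running.pop(job)
        self.free[host] += 1
        self._dispatch()

    def _dispatch(self):
        # Start queued jobs in FIFO order while any host has a free slot.
        while self.pending:
            host = max(self.free, key=self.free.get)  # host with most free slots
            if self.free[host] == 0:
                break  # every host is full; leave the rest queued
            job = self.pending.popleft()
            self.free[host] -= 1
            self.running[job] = host


queue = BatchQueue({"hostA": 1, "hostB": 1})
for name in ["sim1", "sim2", "sim3"]:
    queue.submit(name)
print(queue.running, list(queue.pending))

queue.finish("sim1")
print(queue.running, list(queue.pending))
```

With two single-slot hosts, the third job waits in the queue until one of the
first two finishes. A real product would choose hosts using measured load and
policy rather than a simple slot count.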

Load Balancer v3.3 is extremely full featured, giving the system
administrator maximum flexibility to set policies about who can run
what, where, when. Load Balancer v3.3 is available on Sun, HP, SGI,
IBM, and DEC UNIX computers. A Microsoft Windows front-end is also
available, giving PC users a point-and-click method to start UNIX
jobs.

For more information about Load Balancer, please send mail to
dan@fsa.ca, or call Dan Freedman at the phone number listed above.
------------------------------------------------------------------------------
From: charest@CANR.Hydro.Qc.CA (Claude Charest)
        Here is a previous summary about batch scheduling I had kept. It was
written by Glenn Carver from Cambridge University. I hope that it will help
you and that this data is not too old...

> From: Glenn Carver <glenn@atmos-modelling.chemistry.cambridge.ac.uk>
> Glenn Carver Email: carver@atm.ch.cam.ac.uk
> Subject: SUMMARY: batch control
>
> 1. SunOS batch, at, cron.
> -------------------------
> Can be configured for multiple queues per machine. You can specify the number
> of jobs per queue, nice value and retry time for jobs. See man page for
> queuedefs for more details. Very limited capabilities.
>
> 2. Using the print spooler.
> ---------------------------
> Several people pointed out that you can use the print spooler mechanism to
> set up and manage distributed batch queues by running scripts instead of
> printing. No one sent me details of a working mechanism and I haven't tried
> it yet. It might be very useful in combination with some of the non-distributed software.
>
> 3. dsh.
> -------
> Alan Stebbens <aks%anywhere@edu.ucsb.hub> pointed me in the direction of
> 'dsh'. 'dsh' implements a distributed shell which finds the least loaded
> machine and runs the command on it. dsh is available by anonymous ftp
> from hub.ucsb.edu in pub/shells/dsh.tar.Z
>
> 4. Batch.
> ---------
> Ken Lalonde <ken@edu.toronto.cs> has written a batch control package. It is
> a collection of programs and scripts that allows you to set up various
> queues on a machine with characteristics such as the priority of jobs,
> job resource limits and so on. It runs a daemon which monitors the load
> on the machine and can halt jobs when the load reaches a settable level.
> Batch is not networked. Several people recommended this package. It's
> available by anonymous ftp from ftp.cs.toronto.edu in pub/batch.tar.Z.
>
> 5. QBATCH
> ---------
> Thanks to Milt Ratcliff <milt@pe-nelson.com> for mailing me about QBATCH.
> QBATCH was developed by Alan Saunders on Sun workstations. It is
> not networked but does provide a comprehensive set of job control
> options, more than Batch (4.) but does not halt jobs if load reaches some
> predetermined level. QBATCH is available from several anonymous ftp sites. I
> got it from lth.se in netnews/alt.sources/volume91/jul but it's also available
> from cs.dal.ca in pub/bio as qbatch.tar.Z.
>
> 6. Condor.
> ----------
> Many replies mentioned the Condor package. Condor was written at the
> University of Wisconsin and is quite sophisticated and well documented.
> It is fully distributed, machines enter and leave a 'pool' which condor uses
> to run jobs. Jobs are checkpointed and can be moved from one machine which
> leaves the pool and continued on a machine that enters. The snags appear to
> be that a replacement version of the libc.a library is required to enable the
> checkpointing (programs must be statically linked) and I/O is not implemented
> well for FORTRAN. For more info contact condor-request@cs.wisc.edu. Condor
> is available from many ftp sites as Condor_4.0.0.tar.Z. Use 'archie' to find
> one (USA: quiche.cs.mcgill.ca; EUROPE nic.funet.fi; log in as user archie).
>
> 7. NQS.
> -------
> The Network Queueing System was developed on contract from NASA. There is a
> version (I assume to be the original) on permac.space.swri.edu in
> public/convexug/nqs.tar.Z (and other anonymous ftp sites). NQS is also marketed
> by several companies and improved over the original: COSMIC, 382 East Broad St.,
> Athens GA 30602 supporting SGI, Sun, VAX & Stardent, Sterling Software (415
> area code, sorry no other details). Cray also have a version and sell a version
> called RQS for remote queueing on Cray machines. COSMIC are also rumoured to
> be developing NQS II. For those with money to spend, this may be the one.
>
> At a first glance NQS seems to give similar sort of capabilities as Condor but
> this is quite a big package and I haven't had time to go through it all. I
> did hear from someone who had successfully installed the permac version
> on a multiarchitecture environment (including Suns, although it required a
> bit of work).
>
> 8. MDQS
> -------
> MDQS was developed at the U.S. Army Ballistic Research Lab. and is available
> from ftp.brl.mil in arch/mdqs.tar.Z. MDQS stands for Multi-Device Queueing
> System and appears to have been originally developed to handle a large number
> of network printer devices (multiple devices per queue, multiple queues per
> device) but also includes facilities for batching jobs on machines. This
> appears to be a powerful package with a lot of documentation to it.
>
>
> 9. DNQS
> -------
> Tom Green <green@edu.fsu.scri.ds17> mailed me about DNQS. This is available
> from ftp.fsu.edu in the directory pub/DNQS. This package supports a multi-
> architecture environment in a distributed way but doesn't include some of
> the fancier features of the above packages. However, it was developed
> for a workstation environment rather than a few high-speed processors (such
> as NQS). Documentation is good (not always the case!) and it looks fairly easy
> to set up (although I haven't done it yet). Won't halt jobs when machine load
> is too high, relies on nice priority to do that. Known to run on
> Sun, VAX, DecStation, SGI & IBM.
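
Two ideas recur in the quoted summary above: dsh picks the least loaded machine
for a command, and Batch halts jobs once the load reaches a settable level.
A minimal Python sketch of both decisions follows. It is not taken from either
package; the host names, load figures, and the limit of 2.0 are made up, and
local_load() relies on os.getloadavg(), which is only available on Unix-like
systems.

```python
import os


def least_loaded(loads):
    """Return the host name with the lowest load average."""
    return min(loads, key=loads.get)


def over_limit(load, limit):
    """True when a host's load has reached the configured halt level."""
    return load >= limit


def local_load():
    """1-minute load average of this machine (Unix-like systems only)."""
    return os.getloadavg()[0]


# Made-up load averages, as if collected from each host in the pool.
loads = {"sun1": 2.40, "sun2": 0.35, "sgi1": 1.10}

target = least_loaded(loads)
halted = [host for host, load in loads.items() if over_limit(load, 2.0)]
print(target, halted)
```

Here the new job would go to sun2, and jobs on sun1 would be candidates for
halting. How the load figures are gathered across the network is the hard part
and is left out of the sketch.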
______________________________________________________________________________
From: nolfb@jcdbs.2000.disa.mil (Bill Nolf)
I believe SAIC and HP have a batch scheduler and load balancer product,
however I don't remember the name. It should be in one of the trade rags.
______________________________________________________________________________
From: stern@sunne.East.Sun.COM (Hal Stern - NE Area Systems Engineer)
if you want all of the dependencies (like triggers) then
check out OpenVision's Distributed Task Scheduler (formerly
the fusion systems group product). they're in pleasanton, CA
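
For readers unsure what job dependencies look like in practice, here is a
small Python sketch of predecessor handling: a job becomes ready only once
every job it depends on has finished. It uses the standard library's graphlib
module (Python 3.9 or later) and says nothing about how any product named in
this summary works. The job names are hypothetical, and event triggers such as
file arrival are not modelled.

```python
from graphlib import TopologicalSorter

# Hypothetical nightly job stream: each job maps to the set of
# predecessors that must finish before it may start.
predecessors = {
    "extract": set(),
    "load": {"extract"},
    "report": {"load"},
    "backup": {"load"},
}


def run_jobs(preds, run):
    """Run every job after its predecessors; return the order actually used."""
    sorter = TopologicalSorter(preds)
    sorter.prepare()  # raises graphlib.CycleError if the jobs depend on each other in a loop
    finished = []
    while sorter.is_active():
        for job in sorter.get_ready():  # jobs whose predecessors are all done
            run(job)
            sorter.done(job)  # completion releases any successors
            finished.append(job)
    return finished


order = run_jobs(predecessors, lambda job: print("running", job))
```

"extract" always runs first and "load" second; "report" and "backup" may then
run in either order, since neither depends on the other.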
______________________________________________________________________________
From: "Marty Leisner" <leisner@eso.mc.xerox.com>
What's the matter with batch(1) and at(1) and cron(1)?
marty
______________________________________________________________________________
From: kevin@uniq.com.au (Kevin Sheehan {Consulting Poster Child})
You can use archie to look for NQS, but we also got some info
on a product called Load Balancer. lb@fsa.ca is the Email alias
I have for it. (Load Balancer@Freedman Sharp & A.)

Haven't used it yet, but it looks interesting.
______________________________________________________________________________
From: David Fetrow <fetrow@biostat.washington.edu>
 Low-end solutions include:

        Good old "at" and "cron"

        NAQ (there are free and commercial versions)

..I'm a low-end kind of guy so don't know about anything better.
______________________________________________________________________________
From: crm (Charlie's Login)
I heard that Computer Associates of Islandia NY have a product for Sun like this - you may want to check it out.
______________________________________________________________________________
From: amir@matis.ingr.com (Amir J. Katz)
There is a commercial product called CONTROL-M that runs on various UNIX
platforms. It is a full-featured batch processing scheduler which is also
available on IBM mainframes, AS/400 and VAX/VMS.

This product is a part of an architecture called IOA (Integrated Operations
Architecture) which is developed and distributed by 4th DIMENSION SOFTWARE.

For more information, please contact Mr. Itai Ben-Dor at:

   4th DIMENSION SOFTWARE Ltd.
   P.O.Box 43227
   Tel Aviv 61430
   ISRAEL
   Tel. +972-3-491211
   Fax. +972-3-491002

or Mr. Joseph Hollander at:

   4th DIMENSION SOFTWARE Ltd.
   One Park Plaza, 11th Floor,
   Irvine, CA 92714
   Tel. (714) 757-4300
   Fax. (714) 756-3900

Disclaimer: I am associated with this product.
------------------------------------------------------------------------------
From: Hebets <radian!markh@natinst.com>
I'm not sure what you mean by "triggers and predecessors",
maybe I just don't recognize the non-Unix jargon.

1) All Unixes will include the "crontab" facility for scheduling
recurring batch processes. All newer Unixes (SVR3? SVR4?) will include
the capability for each user to maintain their own crontab schedule.
Crontab will allow you to fire off commands every few minutes, every few
hours, at 4:23 a.m. on every Tuesday, etc., etc. You can expect the
crontab implementation to be very solid on most Unixes, because the OS
uses it to schedule some routine maintenance.

2) All newer Unixes will include the "batch" and "at" facilities
for submitting jobs to a queue now (batch) or at some later
time (at). I haven't been real impressed with the quality of
implementation for batch or at on most of our machines, unfortunately.

3) I've been looking for some products to spread batch queues
over many machines on a network. I've turned up four vendors,
two of whom have actually answered E-mail and sound like they
have real products.

The two I've exchanged mail with:
  Platform Computing, Utopia Load Sharing Facility --- zhou@platform.com
  Freedman Sharp Associates, Load Balancer --- dan@fsa.ca

The two I haven't been able to contact:
  Sterling, the Network Queueing System (NQS)
  VXM Technologies, Inc. (I forget the name of the product.)
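
To make the crontab schedules from point 1 above concrete: an entry has five
time fields (minute, hour, day of month, month, day of week), so "4:23 a.m. on
every Tuesday" is written "23 4 * * 2". The Python sketch below checks whether
a given time matches such an entry. It is a simplified illustration, not how
any cron daemon is implemented: it handles only "*", "*/n" (a step form that
not every cron supports), and comma-separated numbers, and it requires all
five fields to match, which differs from real cron when both day fields are
restricted.

```python
from datetime import datetime


def field_matches(spec, value):
    """Match one crontab field: '*', '*/n', or comma-separated numbers."""
    if spec == "*":
        return True
    if spec.startswith("*/"):
        return value % int(spec[2:]) == 0
    return value in {int(part) for part in spec.split(",")}


def cron_matches(entry, when):
    """True if datetime `when` falls on the schedule in the five-field `entry`."""
    minute, hour, dom, month, dow = entry.split()
    cron_dow = when.isoweekday() % 7  # crontab counts Sunday as 0
    return (field_matches(minute, when.minute)
            and field_matches(hour, when.hour)
            and field_matches(dom, when.day)
            and field_matches(month, when.month)
            and field_matches(dow, cron_dow))


tuesday = datetime(1993, 6, 29, 4, 23)  # a Tuesday
sunday = datetime(1993, 6, 27, 4, 23)   # a Sunday
print(cron_matches("23 4 * * 2", tuesday))
print(cron_matches("23 4 * * 2", sunday))
```

The first call matches and the second does not, since the entry names Tuesday
(day 2) only.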
------------------------------------------------------------------------------
From: mwp.michael <MWP.MICHAEL@MELPN1.CV.COM>
CV have a batch product for Unix. It is a port of our PRIMOS software and I
know little about it (like, when it becomes official, whether it supports
triggers etc, platforms supported), but if you contact your local CV
office and ask about the BATCH/open product they should be able to help.
------------------------------------------------------------------------------


