Class Runner

java.lang.Object
edu.rit.pj.job.Runner

public class Runner extends Object
Class Runner is a parallel program that runs a group of Jobs created by a JobGenerator. The job generator is specified on the command line as a constructor expression. The Runner program constructs an instance of the class named in that expression, passing the constructor arguments given in the expression. For further information, see class Instance.

The Runner program is targeted at three use cases:

  • Sequential jobs on a cluster parallel computer. Each job is a sequential (single-threaded) program. The Runner program is running on a cluster parallel computer with N nodes and one CPU per node. Run the Runner program as follows:
         java -Dpj.nn=N edu.rit.pj.job.Runner . . .
     
    The Runner program runs with one process per node and one thread per process.
  • Sequential jobs on a hybrid parallel computer. Each job is a sequential (single-threaded) program. The Runner program is running on a hybrid SMP cluster parallel computer with N nodes and C total CPUs. (For example, on a hybrid parallel computer with 10 nodes and 4 CPUs per node, C = 40.) Run the Runner program as follows:
         java -Dpj.nn=N -Dpj.np=C edu.rit.pj.job.Runner . . .
     
    The Runner program runs with multiple processes per node and one thread per process.
  • SMP parallel jobs on a hybrid parallel computer. Each job is an SMP parallel (multi-threaded) program. The Runner program is running on a hybrid SMP cluster parallel computer with N nodes and multiple CPUs per node. Run the Runner program as follows:
         java -Dpj.nn=N edu.rit.pj.job.Runner . . .
     
    The Runner program runs with one process per node and multiple threads per process, typically as many threads as there are CPUs on the node.
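
The three configurations above differ only in whether -Dpj.np is supplied alongside -Dpj.nn. As a minimal sketch, the helper below assembles the corresponding command lines. The class and method names are hypothetical and are not part of the PJ library. The sketch also assumes every node has the same number of CPUs, so that C is the node count times the CPUs per node. The generator expression or checkpoint file name is passed through unchanged.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Parallel Java library. It only
// assembles the command lines listed in the three use cases above.
final class RunnerCommandLine {
    private RunnerCommandLine() {}

    // Sequential jobs on a cluster, or SMP parallel jobs on a hybrid
    // computer: one process per node, so only -Dpj.nn is set.
    static List<String> oneProcessPerNode(int nodes, String generatorOrFile) {
        checkPositive(nodes, "nodes");
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Dpj.nn=" + nodes);
        cmd.add("edu.rit.pj.job.Runner");
        cmd.add(generatorOrFile);
        return cmd;
    }

    // Sequential jobs on a hybrid computer: one process per CPU.
    // Assumption: all nodes have the same CPU count, so C = nodes * cpusPerNode.
    static List<String> oneProcessPerCpu(int nodes, int cpusPerNode,
                                         String generatorOrFile) {
        checkPositive(nodes, "nodes");
        checkPositive(cpusPerNode, "cpusPerNode");
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Dpj.nn=" + nodes);
        cmd.add("-Dpj.np=" + Math.multiplyExact(nodes, cpusPerNode));
        cmd.add("edu.rit.pj.job.Runner");
        cmd.add(generatorOrFile);
        return cmd;
    }

    private static void checkPositive(int value, String name) {
        if (value < 1) {
            throw new IllegalArgumentException(name + " must be at least 1: " + value);
        }
    }
}
```

The returned list can be handed to java.lang.ProcessBuilder or joined with spaces for a shell script.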

All these processes form a worker team. The Runner program uses the job generator specified on the command line to create jobs and sends each job to a worker team process to be executed.
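
The generate-and-dispatch pattern described above can be illustrated with a single-JVM analogy. This is only a sketch and not how the Runner program is implemented: a fixed thread pool stands in for the worker team processes, an Iterator of Runnable stands in for the job generator, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical single-JVM analogy of a worker team; threads stand in for
// the worker processes that the real Runner program spreads across nodes.
final class WorkerTeamSketch {
    private WorkerTeamSketch() {}

    // Pulls jobs from the generator one at a time, hands each to a worker,
    // waits for all of them, and returns the number of jobs that finished.
    static int runAll(Iterator<Runnable> generator, int workers) {
        ExecutorService team = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> pending = new ArrayList<>();
            while (generator.hasNext()) {
                pending.add(team.submit(generator.next()));
            }
            int finished = 0;
            for (Future<?> job : pending) {
                try {
                    job.get(); // blocks until this job has finished
                    finished++;
                } catch (ExecutionException e) {
                    throw new IllegalStateException("job failed", e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            return finished;
        } finally {
            team.shutdown(); // release the worker threads
        }
    }
}
```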

When the Runner program starts, it prints the job generator constructor expression on the standard output. Whenever a job starts or finishes, the Runner program prints a log message on the standard output consisting of the job's number and description.

Checkpointing. It is recommended to redirect the Runner program's standard output into a checkpoint file. If a failure occurs before the Runner program finishes running all the jobs, the checkpoint file contains a record of the job generator that was used as well as which jobs did and did not finish. To resume the Runner program where it left off, specify the checkpoint file name on the command line instead of a job generator constructor expression. The Runner program reads the checkpoint file to determine the job generator and the jobs that finished. The Runner program then generates and runs the jobs that did not finish.
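
The resume step amounts to a set difference between the jobs the generator produces and the jobs recorded as finished. The sketch below shows only that step. The checkpoint file format is not described here, so the sketch starts from job numbers that are assumed to have been extracted already, and its class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the resume decision, not the Runner
// program's own code. Parsing the checkpoint file is deliberately left out.
final class ResumeSketch {
    private ResumeSketch() {}

    // Returns, in generation order, the job numbers that still need to run:
    // every generated job number that is not in the finished set.
    static List<Integer> jobsToRun(List<Integer> generated, Set<Integer> finished) {
        List<Integer> remaining = new ArrayList<>();
        for (Integer jobNumber : generated) {
            if (!finished.contains(jobNumber)) {
                remaining.add(jobNumber);
            }
        }
        return remaining;
    }
}
```

A job that started but has no matching finish record is not in the finished set, so it is run again.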

Usage: java edu.rit.pj.job.Runner { generator | file }
generator = Job generator constructor expression
file = Checkpoint file name

Version:
22-Oct-2010
Author:
Alan Kaminsky
Method Details

main

public static void main(String[] args) throws Exception
Main program.
Parameters:
args - the command line arguments: either a job generator constructor expression or a checkpoint file name.
Throws:
Exception - if any exception is thrown.