util.cluster

Classes and functions for managing a cluster of compute nodes via SSH.

class glimpse.util.cluster.Builder

Provides a builder syntax for constructing job lists.

AddJob(commands=[], files=[], name=None, repeat=None)

Add information for a new job specification.

Parameters:
  • commands (list of str) – Path to binary, and arguments.
  • files (list of str) – Path to files needed to run this job.
  • name (str) – Name of job, for debugging.
  • repeat (int) – Number of times to replicate job.

MakeJobSpecs()

Generate the list of job specs described by previous calls to this object.

SetRepeat(repeat)

Set the default number of times to replicate a job.
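As a rough illustration of the builder pattern described above, the following self-contained sketch mirrors the documented method names with a stand-in class. SketchBuilder and its internals are hypothetical and are not the glimpse implementation. In particular, the idea that a per-job repeat overrides the default set by SetRepeat(), and the use of plain dicts as job specs, are assumptions.

```python
class SketchBuilder:
    """Hypothetical stand-in that mirrors the documented Builder methods."""

    def __init__(self):
        self._default_repeat = 1   # assumed default: run each job once
        self._pending = []         # (commands, files, name, repeat) tuples

    def SetRepeat(self, repeat):
        self._default_repeat = repeat

    def AddJob(self, commands=(), files=(), name=None, repeat=None):
        # Assumption: a per-job repeat overrides the builder-wide default.
        if repeat is None:
            repeat = self._default_repeat
        self._pending.append((list(commands), list(files), name, repeat))

    def MakeJobSpecs(self):
        # Expand every pending entry into `repeat` plain-dict job specs.
        specs = []
        for commands, files, name, repeat in self._pending:
            for _ in range(repeat):
                specs.append({"cmds": commands, "files": files, "name": name})
        return specs


builder = SketchBuilder()
builder.SetRepeat(2)
builder.AddJob(commands=["/bin/echo", "hello"], name="echo")
builder.AddJob(commands=["/bin/true"], name="once", repeat=1)
job_specs = builder.MakeJobSpecs()
print([spec["name"] for spec in job_specs])
```

In this sketch the first job is replicated twice because it inherits the default, while the second job is emitted once.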

glimpse.util.cluster.CheckLocalCommand(cmd, verbose=False)

Run a single command on local machine, returning stdout.

This function raises an exception if the command fails.

Parameters:
  • cmd (list of str) – Command to run.
  • verbose (bool) – Flag controlling whether extra logging information is printed.
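The behaviour described above (return stdout, fail loudly on a non-zero exit) can be approximated with the standard library. This is a minimal sketch, not the glimpse source: the name check_local_command is hypothetical, and raising subprocess.CalledProcessError is an assumption about which exception type is appropriate.

```python
import subprocess
import sys


def check_local_command(cmd, verbose=False):
    """Run cmd locally and return its stdout as text; raise on failure."""
    if verbose:
        print("running:", " ".join(cmd), file=sys.stderr)
    # check_output raises CalledProcessError on a non-zero exit status.
    return subprocess.check_output(cmd, text=True)


# A command that succeeds: its stdout is returned to the caller.
output = check_local_command([sys.executable, "-c", "print('hello')"])

# A command that fails: the error surfaces as an exception.
try:
    check_local_command([sys.executable, "-c", "raise SystemExit(3)"])
    failed = False
except subprocess.CalledProcessError as err:
    failed = err.returncode == 3
```

Passing the command as a list of strings, as the cmd parameter does, avoids shell quoting problems because no shell is involved.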

glimpse.util.cluster.CheckRemoteCommands(host, cmds, verbose=False)

Open a pipe via SSH, execute given commands, close the pipe, and return stdout.

Parameters:
  • host (str) – Name of remote node.
  • cmds (list of str) – Commands to execute on remote node.
  • verbose (bool) – Flag controlling whether extra logging information is printed.

Note

Requires that the command ‘ssh’ be on the local path.
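One way to run several commands through a single SSH pipe is to feed them to a remote shell on standard input. The sketch below assumes that approach; the helper names, the use of `bash -s` on the remote side, and the host name node01 are all hypothetical and not taken from glimpse.

```python
import subprocess


def build_ssh_invocation(host, cmds):
    """Return (argv, stdin_text) for feeding cmds to a remote shell."""
    # Assumption: the remote side reads commands from stdin via `bash -s`.
    argv = ["ssh", host, "bash", "-s"]
    stdin_text = "".join(cmd + "\n" for cmd in cmds)
    return argv, stdin_text


def check_remote_commands(host, cmds):
    """Run cmds on host over SSH and return stdout (needs `ssh` on PATH)."""
    argv, stdin_text = build_ssh_invocation(host, cmds)
    result = subprocess.run(argv, input=stdin_text, capture_output=True,
                            text=True, check=True)
    return result.stdout


# Only the invocation is built here; nothing is sent over the network.
argv, stdin_text = build_ssh_invocation("node01", ["cd /tmp", "ls"])
```

Separating the command construction from the call to subprocess.run makes the construction easy to inspect without access to a remote node.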

glimpse.util.cluster.CopyLocalFilesToRemoteHost(host, remote_path, *files, **check_opts)

Copy a set of files to a remote node.

Parameters:
  • host (str) – Name of remote node.
  • remote_path (str) – Path of remote directory to which local files are copied.
  • files (list of str) – Set of local files to copy.
  • check_opts – Optional arguments for CheckLocalCommand().

Note

Requires that the command ‘scp’ be on the local path.
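For orientation, a copy of this kind typically reduces to a single scp invocation with the local files first and a host:path destination last. The helper below is a hypothetical sketch that only builds that argument list; the host and file names are placeholders, and the result could be passed to a local command runner.

```python
def build_scp_command(host, remote_path, *files):
    """Return an scp argv that copies files into remote_path on host."""
    if not files:
        raise ValueError("at least one local file is required")
    # scp takes the sources first and a single host:path destination last.
    return ["scp", *files, "%s:%s" % (host, remote_path)]


cmd = build_scp_command("node01", "/tmp/job", "model.dat", "run.sh")
print(cmd)
```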

class glimpse.util.cluster.Job(job_spec)

The record of a job, including its spec and ID.

host = None

Node on which this job is running.

id_ = None

Unique identifier for this job.

spec = None

(JobSpec) The job specification.

class glimpse.util.cluster.JobSpec(cmds, files=[], name=None)

Describes the commands necessary to launch a job.

cmds = None

(list of str) Path to binary, and arguments.

files = None

(list of str) Path to files needed to run command.

name = None

(str) Name of job, for debugging.

class glimpse.util.cluster.LoggingManager(sleep_time_in_secs, log)

Manager that logs job events to disk.

HandleJobDone(job)

Event handler, called when job completes.

HandleJobLaunch(job)

Event handler, called when job begins.

HandleSleep()

Event handler, called when manager waits on finished jobs.

last_was_sleep = None

(bool) Flag indicating that the last activity was a sleep event.

log = None

(file) Destination for logging messages.
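The event handlers and the last_was_sleep flag suggest a simple logging pattern, sketched below with a stand-in class. The message format and the idea that the flag is used to collapse consecutive sleep events into one log line are assumptions, not documented behaviour.

```python
import io


class SketchLoggingHandlers:
    """Hypothetical event handlers that write one line per event."""

    def __init__(self, log):
        self.log = log
        self.last_was_sleep = False

    def HandleJobLaunch(self, job):
        self._write("LAUNCH %s" % job)

    def HandleJobDone(self, job):
        self._write("DONE %s" % job)

    def HandleSleep(self):
        # Assumption: consecutive sleeps are logged once to keep the log short.
        if not self.last_was_sleep:
            self.log.write("SLEEP\n")
        self.last_was_sleep = True

    def _write(self, message):
        self.log.write(message + "\n")
        self.last_was_sleep = False


log = io.StringIO()   # any file-like object works as the destination
handlers = SketchLoggingHandlers(log)
handlers.HandleJobLaunch("job-1")
handlers.HandleSleep()
handlers.HandleSleep()
handlers.HandleJobDone("job-1")
lines = log.getvalue().splitlines()
```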

class glimpse.util.cluster.Manager(sleep_time_in_secs)

Allocates N jobs to M hosts, with N > M.

HandleJobDone(job)

Event handler, called when job completes.

HandleJobLaunch(job)

Event handler, called when job begins.

HandleSleep()

Event handler, called when manager waits on finished jobs.

LaunchJob()

Launch the next job in the queue on a free host.

ProcessJobs()

Run all jobs in the queue, blocking until they finish.

Setup(job_specs, network)

Initialize the manager to process a set of jobs.

Parameters:
  • job_specs – Command, arguments, and files required to launch each of a set of jobs.
  • network (Network) – The network on which to launch the jobs.

UpdateJobStatus()

Poll hosts for job status.

Returns: True if any job has finished.
Return type: bool
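The launch, poll, and wait cycle implied by LaunchJob(), UpdateJobStatus(), and ProcessJobs() can be sketched as a small scheduling loop. This is a simplified simulation under stated assumptions: process_jobs and fake_poll are hypothetical names, jobs and hosts are plain strings, and the polling function stands in for a real status query against remote nodes.

```python
import collections


def process_jobs(job_names, hosts, poll):
    """Run every job on some host; poll(running) returns finished jobs."""
    pending = collections.deque(job_names)
    free_hosts = list(hosts)
    running = {}            # job name -> host it occupies
    history = []            # (job, host) pairs in launch order
    while pending or running:
        # Launch jobs while there is both work and a free host.
        while pending and free_hosts:
            job, host = pending.popleft(), free_hosts.pop(0)
            running[job] = host
            history.append((job, host))
        # Poll for finished jobs and return their hosts to the free pool.
        # A real manager would sleep here when nothing has finished.
        for job in poll(dict(running)):
            free_hosts.append(running.pop(job))
    return history


def fake_poll(running):
    # Stand-in for a status query: every running job finishes in one poll.
    return list(running)


history = process_jobs(["a", "b", "c"], ["h1", "h2"], fake_poll)
print(history)
```

With three jobs and two hosts, the third job waits until a host is released, which is the situation the N > M description refers to.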

class glimpse.util.cluster.Network(results_dir, clusters, verbose=False)

Handles remote command invocations.

GetHosts()

Get the set of hosts in the network.

GetJobStatus(jobs)

Get the current state for a list of jobs.

GetStderr(job)

Get job data that was written to the standard error stream.

GetStdout(job)

Get job data that was written to the standard output stream.

LaunchJob(job, host)

Start a job on a remote host, returning the allocated job ID.

Requires that the command ‘gjob’ be on the remote path of the worker node. The set of job commands is launched as a bash script.

class glimpse.util.cluster.Queue(states, objects)

Maintains the state for each element in a fixed list of objects.

AllInState(state)

Determine if all objects are in given state.

AnyInState(state)

Determine if any objects are in given state.

ChooseNext(state)

Choose first object in given state.

ChooseRandom(state)

Choose random object in given state.

InState(state)

Get list of objects in given state.

SetState(obj, state)

Set the state of an object.

Parameters:
  • obj – The resource.
  • state – The new state of the object.
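To make the state-tracking idea concrete, here is a minimal stand-in with the same method names. SketchQueue is hypothetical: the initial state of each object, the requirement that objects be hashable, and the return of None when no object matches are assumptions rather than documented behaviour.

```python
import random


class SketchQueue:
    """Hypothetical state tracker for a fixed list of objects."""

    def __init__(self, states, objects):
        self.states = list(states)
        # Assumption: every object starts in the first listed state.
        self._state_of = {obj: self.states[0] for obj in objects}

    def SetState(self, obj, state):
        if obj not in self._state_of:
            raise KeyError("unknown object: %r" % (obj,))
        if state not in self.states:
            raise ValueError("unknown state: %r" % (state,))
        self._state_of[obj] = state

    def InState(self, state):
        return [o for o, s in self._state_of.items() if s == state]

    def AnyInState(self, state):
        return any(s == state for s in self._state_of.values())

    def AllInState(self, state):
        return all(s == state for s in self._state_of.values())

    def ChooseNext(self, state):
        matches = self.InState(state)
        return matches[0] if matches else None

    def ChooseRandom(self, state):
        matches = self.InState(state)
        return random.choice(matches) if matches else None


hosts = SketchQueue(["free", "busy", "error"], ["h1", "h2", "h3"])
hosts.SetState("h1", "busy")
print(hosts.InState("free"))
print(hosts.ChooseNext("free"))
```

A manager could keep one such tracker for hosts and another for jobs, moving each object between states as jobs launch and finish.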

glimpse.util.cluster.STATES = ['free', 'busy', 'done', 'unknown', 'error']

Set of all resource states.

glimpse.util.cluster.STATE_BUSY = 'busy'

Resource is in use (not ready).

glimpse.util.cluster.STATE_DONE = 'done'

A single-use resource has been consumed.

glimpse.util.cluster.STATE_ERROR = 'error'

Resource is not ready because an error occurred.

glimpse.util.cluster.STATE_FREE = 'free'

Resource is ready for use.

glimpse.util.cluster.STATE_UNKNOWN = 'unknown'

Resource is in an unknown state. This is usually treated as an error.

glimpse.util.cluster.WriteRemoteFile(host, remote_path, contents, verbose=False)

Write data stored in memory to a file on a remote node.

Parameters:
  • host (str) – Name of remote node.
  • remote_path (str) – Path of output file on remote node.
  • contents – Data to write.
  • verbose (bool) – Flag controlling whether extra logging information is printed.

Note

Requires that the command ‘ssh’ be on the local path.
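A common way to write in-memory data to a remote file over SSH is to run `cat` on the remote side and send the data on standard input. The sketch below assumes that technique; the helper names, host name, and path are hypothetical and are not taken from the glimpse source.

```python
import shlex
import subprocess


def build_write_command(host, remote_path):
    """Return an ssh argv whose remote side copies stdin into remote_path."""
    # shlex.quote protects paths with spaces from the remote shell.
    return ["ssh", host, "cat > " + shlex.quote(remote_path)]


def write_remote_file(host, remote_path, contents):
    """Send contents (bytes) to remote_path on host; needs `ssh` on PATH."""
    subprocess.run(build_write_command(host, remote_path),
                   input=contents, check=True)


# Only the command is built here; nothing is sent over the network.
cmd = build_write_command("node01", "/tmp/my job/params.txt")
```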