rcluster.rcluster module

class rcluster.rcluster.RCluster(aws_access_key_id, aws_secret_access_key, region_name, instance_conf, manager_runtime=None, worker_runtime=None, key_path=None, ip_ref='public_ip_address', ver='0.2.9', purge=False)

Bases: object

RCluster class object

Organizes the information needed for a boto3 connection to EC2, paramiko connections that use a consistent SSH key, the creation of EC2 instances with that same key, the creation and tracking of the manager and worker nodes that make up an R PSOCK cluster, and the networking that makes those nodes accessible from within an RStudio Server session.

__repr__()

Identifies the object as an RCluster and pretty-prints the _config dictionary.

__setattr__(key, value)

Redefined to keep an updated version of the RCluster configuration options saved. Allows for easy exporting, duplication, and modification of configurations.

See fromConfig() and writeConfig()
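
The configuration-tracking behaviour described for __setattr__() and __repr__() can be illustrated with a small stand-alone class. This is a minimal sketch, not RCluster's actual implementation: the class name ConfigTracked, the helper _is_jsonable, and the rule of recording only public, JSON-serializable values are all assumptions made for illustration.

```python
import json
import pprint


def _is_jsonable(value):
    """Return True if value can be serialized by the json module."""
    try:
        json.dumps(value)
        return True
    except (TypeError, ValueError):
        return False


class ConfigTracked:
    """Toy class that mirrors its public attributes into a _config dict."""

    def __init__(self, **options):
        # Bypass our own __setattr__ so _config exists before it is used.
        object.__setattr__(self, "_config", {})
        for key, value in options.items():
            setattr(self, key, value)

    def __setattr__(self, key, value):
        # Assumption: only public, JSON-serializable values are recorded,
        # so the dictionary can later be exported as JSON.
        if not key.startswith("_") and _is_jsonable(value):
            self._config[key] = value
        object.__setattr__(self, key, value)

    def __repr__(self):
        return "ConfigTracked\n" + pprint.pformat(self._config)


tracked = ConfigTracked(region_name="us-west-2", ver="0.2.9")
tracked.ver = "0.3.0"        # updates the saved configuration
tracked.session = object()   # not serializable, so it is not recorded
print(tracked)
```

Because every assignment passes through __setattr__, the saved dictionary stays in step with the object without any explicit bookkeeping by the caller.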

connect(instance)

Create an SSH connection to a boto3.EC2.Instance as a paramiko client.

Parameters: instance – A boto3.EC2.Instance object
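
As a rough illustration of how an SSH connection to an instance might be opened, the sketch below resolves the host address from the attribute named by ip_ref and then connects with paramiko. The helper names resolve_host and open_ssh, the "ubuntu" username, and the idea that ip_ref is used as an attribute name are assumptions, not details confirmed by this documentation.

```python
from types import SimpleNamespace


def resolve_host(instance, ip_ref="public_ip_address"):
    """Return the address stored under the attribute named by ip_ref."""
    host = getattr(instance, ip_ref, None)
    if host is None:
        raise ValueError(f"instance has no usable {ip_ref!r}")
    return host


def open_ssh(instance, key_path, username="ubuntu", ip_ref="public_ip_address"):
    """Open an SSH session to an instance (the username is an assumption)."""
    import paramiko  # imported lazily so resolve_host works without paramiko

    client = paramiko.SSHClient()
    # Accept unknown host keys; convenient for fresh instances, but it
    # skips host verification.
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(resolve_host(instance, ip_ref),
                   username=username, key_filename=key_path)
    return client


# Stand-in for a boto3 EC2 Instance, which exposes both address attributes.
fake = SimpleNamespace(public_ip_address="203.0.113.10",
                       private_ip_address="10.0.0.5")
print(resolve_host(fake))
print(resolve_host(fake, ip_ref="private_ip_address"))
```

Switching ip_ref to 'private_ip_address' would suit code that runs inside the same network as the instances.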

createAmi(base=None, setup_fn=None, ver=None, update_image=True, terminate=True, wait=True)

Create an AMI, returning the AMI ID.

Parameters:
  • base – boto3.EC2.Instance object or nothing; optional to allow for snapshotting.
  • setup_fn – The shell script used to configure the instance; optional to allow for snapshotting.
  • ver – Name of AMI, defaults to self.ver.
  • update_image – Flag; whether to change the RCluster’s instance_conf AMI ID to that of the new image.
  • terminate – Flag; whether to terminate the instance used to build the AMI (useful for debugging).
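
The snapshotting step can be sketched with the boto3 resource API alone. This is a hedged illustration rather than the body of createAmi(): the function name snapshot_instance is hypothetical, and the setup script and the update of instance_conf are deliberately left out.

```python
def snapshot_instance(instance, name, terminate=True, wait=True):
    """Create an AMI from a boto3 EC2 Instance and return the AMI ID."""
    image = instance.create_image(Name=name)
    if wait:
        # Poll DescribeImages until the new image is reported as available.
        image.wait_until_exists(
            Filters=[{"Name": "state", "Values": ["available"]}]
        )
    if terminate:
        # The build instance is no longer needed once the image exists.
        instance.terminate()
    return image.id
```

Passing terminate=False keeps the build instance alive, which matches the debugging use mentioned for the terminate flag.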

createCluster(n_workers=1, setup_pause=60, **kwargs)

Initialize the cluster. Launch a manager instance and n_workers worker instances, automating the configuration of their shared networking.

Parameters:
  • n_workers – Number of worker instances to launch (default 1)
  • setup_pause – Pause time to allow manager and workers to boot before attempting configuration steps (default 60)

createInstances(n_instances, **kwargs)

Create EC2 instances using RCluster’s configuration.

Parameters:
  • n_instances – The number of instances to be created
  • kwargs – arbitrary arguments to boto3 Session Resource ec2.create_instances; will supersede RCluster.instance_conf content
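
The precedence rule for kwargs can be shown as a plain dictionary merge. This is a minimal sketch under stated assumptions: build_launch_args is a hypothetical helper, the AMI ID and instance types are placeholders, and mapping n_instances onto the MinCount and MaxCount arguments of ec2.create_instances is an assumption about how the count is passed.

```python
def build_launch_args(instance_conf, n_instances, **kwargs):
    """Merge stored launch settings with per-call overrides."""
    args = dict(instance_conf)   # copy, so the stored configuration is untouched
    args.update(kwargs)          # per-call values supersede instance_conf
    # Assumption: launch exactly n_instances.
    args["MinCount"] = args["MaxCount"] = n_instances
    return args


# Placeholder configuration values, for illustration only.
instance_conf = {"ImageId": "ami-00000000", "InstanceType": "m4.large"}
args = build_launch_args(instance_conf, 3, InstanceType="c4.xlarge")
print(args)
```

The resulting dictionary could then be unpacked into a call such as ec2.create_instances(**args) on a boto3 EC2 service resource.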

fromConfig(fn, **kwargs)

Use RCluster JSON configuration to create RCluster object. Prompts the user to input mandatory configuration values that are missing (i.e., AWS access credentials).

Parameters:
  • fn – The filename containing RCluster configuration data
  • kwargs – Alternate or supplement RCluster configuration; will override the content of fn

retrieveAccessIp()

Identify the manager’s access IP address (if a manager has been defined).

terminateInstances(ver=None)

Terminate EC2.Instance objects created by the current configuration file.

writeConfig(fn)

Write out RCluster configuration data as JSON.

Parameters: fn – The filename to be written; any existing file is overwritten
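
The JSON round trip behind writeConfig() and fromConfig() can be sketched with the standard library. The functions write_config and load_config, the required and prompt parameters, and the sample values are hypothetical. The sketch only mirrors the documented behaviour: keyword arguments override the file content, and missing mandatory values are requested from the user.

```python
import json
import os
import tempfile


def write_config(config, fn):
    """Write the configuration as JSON, overwriting any existing file."""
    with open(fn, "w") as fh:
        json.dump(config, fh, indent=2, sort_keys=True)


def load_config(fn, required=(), prompt=input, **kwargs):
    """Read a JSON configuration, apply overrides, and fill in gaps."""
    with open(fn) as fh:
        config = json.load(fh)
    config.update(kwargs)                 # overrides win over the file content
    for key in required:
        if config.get(key) is None:       # ask only for what is still missing
            config[key] = prompt(f"{key}: ")
    return config


fn = os.path.join(tempfile.mkdtemp(), "rcluster.json")
write_config({"region_name": "us-west-2", "ver": "0.2.9"}, fn)

# A canned answer stands in for interactive input in this example.
config = load_config(fn, ver="0.3.0",
                     required=("aws_access_key_id",),
                     prompt=lambda message: "EXAMPLE_KEY_ID")
print(config)
```

Prompting for missing credentials means they do not have to be stored in the configuration file.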

rcluster.rcluster._ec2Purge(ec2_res, ver)

Utility to clear an AWS account of previous RCluster settings (useful for development). Removes resources associated with a provided version:

  • Terminates instances with the tag key ‘rcluster’ and value ver
  • Deregisters AMI named ver
  • Deletes key-pair named ver
  • Deletes placement group named ver
  • Deletes security group named ver

Parameters:
  • ec2_res – A boto3.EC2.ServiceResource
  • ver – The “version” to delete
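
The first two purge steps can be sketched against a boto3 EC2 service resource. This is an illustration under assumptions rather than the code of _ec2Purge(): the helper names are hypothetical, the key-pair, placement-group, and security-group steps are omitted, and restricting the AMI lookup to images owned by the calling account is an added precaution.

```python
def version_tag_filter(ver):
    """Filter matching instances tagged with key 'rcluster' and value ver."""
    return [{"Name": "tag:rcluster", "Values": [ver]}]


def terminate_version(ec2_res, ver):
    """Terminate every instance carrying the rcluster=ver tag."""
    ec2_res.instances.filter(Filters=version_tag_filter(ver)).terminate()


def deregister_version(ec2_res, ver):
    """Deregister AMIs named ver and return their IDs."""
    images = ec2_res.images.filter(
        Owners=["self"],  # assumption: only touch images owned by this account
        Filters=[{"Name": "name", "Values": [ver]}],
    )
    removed = []
    for image in images:
        image.deregister()
        removed.append(image.id)
    return removed
```

Both helpers take the service resource as an argument, so they can be exercised with a stub object before being pointed at a real account.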