Persistence

Note

This part of the documentation is only relevant to users who want to move the persistence layer, which stores the metadata associated with the commands performed, to a different storage system.

Tasks to be performed are stored persistently on disk. This is required to ensure that all steps computed, and the sequence of steps leading to results, are conserved across restarts of the main process or of the system.

Currently, this is implemented in an SQLite database.
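As an illustration of why an on-disk SQLite file satisfies this requirement, here is a minimal sketch using only the Python standard library. The table name `task`, its columns, and the file name are hypothetical and are not the schema used by railroadtracks; closing and reopening the connection stands in for a restart of the main process.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file; railroadtracks takes its own path through `db_fn`.
db_fn = os.path.join(tempfile.mkdtemp(), "tasks.db")

# First "session": record a task and its status, then close the connection.
conn = sqlite3.connect(db_fn)
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, label TEXT, status TEXT)")
conn.execute("INSERT INTO task (label, status) VALUES (?, ?)", ("align", "TO DO"))
conn.commit()
conn.close()

# Second "session": reopening the same file stands in for a process restart.
conn = sqlite3.connect(db_fn)
rows = conn.execute("SELECT label, status FROM task").fetchall()
conn.close()
```

The row written before the connection was closed is still available afterwards, which is the property the task list relies on.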

Persistence/memoization for the DAG

class railroadtracks.hortator.DbID(id, new)
id

Alias for field number 0

new

Alias for field number 1
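The "Alias for field number" wording indicates that DbID is a named tuple. The sketch below defines a local stand-in with the same field order rather than importing the class from the package; reading `new` as "the row was just created" is an assumption based on the field name.

```python
from collections import namedtuple

# Local stand-in; field order mirrors the documented aliases (id is 0, new is 1).
DbID = namedtuple("DbID", ["id", "new"])

entry = DbID(id=42, new=True)

# Fields can be read by name or by position.
by_name = (entry.id, entry.new)
by_position = (entry[0], entry[1])

# Tuple unpacking works as with any tuple.
row_id, was_created = entry
```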

class railroadtracks.hortator.PersistentTaskGraph(db_fn, model, wd='.', force_create=False, isolation_level=None)[source]

List of tasks stored on disk.

class StoredEntityNoLabel(id, clsname, entityname)
clsname

Alias for field number 1

entityname

Alias for field number 2

id

Alias for field number 0

PersistentTaskGraph.finalsteps()[source]

Concrete steps for which all targets are final

PersistentTaskGraph.get_parenttask_of_storedentity(stored_entity)[source]

Return the task producing a stored entity. There should be exactly one such task; an Exception is raised otherwise.

Parameters:
  • stored_entity (StoredEntity) – the stored entity in the database
Return type:

a StepConcrete_DbEntry namedtuple, or None

PersistentTaskGraph.get_sourcesofactivity(activity)[source]

Retrieve the sources of steps performing a specific activity.

Parameters:
  • activity (Enum) – an activity

PersistentTaskGraph.get_srcassets(concrete_step_id)[source]

Return the source files for a given concrete step ID.

Parameters:
  • concrete_step_id – ID for the concrete step in the database.
Return type:

generator

PersistentTaskGraph.get_targetassets(concrete_step_id)[source]

Return the target files for a given concrete step ID.

Parameters:
  • concrete_step_id – ID for the concrete step in the database.
Return type:

generator

PersistentTaskGraph.get_targetsofactivity(activity)[source]

Retrieve the targets of steps performing a specific activity.

Parameters:
  • activity (Enum) – an activity

PersistentTaskGraph.get_targetsoftype(clsname)[source]

Return all targets of a given type.

PersistentTaskGraph.get_targetstepconcrete(stored_entity)[source]

Return the tasks using a given stored entity.

Parameters:
  • stored_entity (StoredEntity or StoredSequence) – the stored entity in the database.
Return type:

a StepConcrete_DbEntry namedtuple, or None

PersistentTaskGraph.id_step_activity(activity)[source]

Conditionally add an activity (add only if not already present).

Parameters:
  • activity – one activity name
Return type:

ID for the activity as an integer

PersistentTaskGraph.id_step_type(activities)[source]

Conditionally add a step type (add only if not already present).

Parameters:
  • activities – sequence of activity names
Return type:

ID for the step type as an integer

PersistentTaskGraph.id_step_variant(step, activities)[source]

Return a database ID for the step variant (creating a new ID only if the variant is not already tracked).

Parameters:
  • step (core.StepAbstract) – a step
  • activities – a sequence of activity names
Return type:

ID for a step variant as an int.

PersistentTaskGraph.id_stepconcrete(step_variant_id, sources, targets, parameters, tag=1)[source]

Conditionally add a task (“concrete” step), that is a step variant (executable and parameters) to which source and target files, as well as parameters, are added.

Parameters:
  • step_variant_id (integer) – ID for the step variant
  • sources (AssetSet) – sequence of sources
  • targets (AssetSet) – sequence of targets
  • parameters (a sequence of str) – list of parameters
  • tag (a sequence of int) – a tag, used to perform repetitions of the exact same task
Return type:

DbID

PersistentTaskGraph.id_stepparameters(parameters)[source]

Conditionally add parameters (add only if not already present).

Parameters:
  • parameters – sequence of parameters
Return type:

ID for the pattern as a DbID.
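The "conditionally add" behaviour shared by the id_* methods can be sketched as a get-or-create lookup. This is an illustrative sketch, not the package's implementation: the helper `id_for_name`, the table `item`, and its columns are hypothetical, and DbID is redefined locally with the documented field order.

```python
import sqlite3
from collections import namedtuple

# Local stand-in with the documented field order (id, new).
DbID = namedtuple("DbID", ["id", "new"])


def id_for_name(conn, name):
    """Return DbID(id, new) for `name`, inserting a row only if it is absent.

    Hypothetical helper; the table `item` and its columns are illustrative.
    """
    row = conn.execute("SELECT id FROM item WHERE name = ?", (name,)).fetchone()
    if row is not None:
        # Already tracked: reuse the existing ID and flag it as not new.
        return DbID(row[0], False)
    cursor = conn.execute("INSERT INTO item (name) VALUES (?)", (name,))
    conn.commit()
    return DbID(cursor.lastrowid, True)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")

first = id_for_name(conn, "--threads 4")   # inserted on this call
second = id_for_name(conn, "--threads 4")  # found, not inserted again
```

Returning the `new` flag alongside the ID lets the caller tell whether the entry already existed, which is what makes memoization of previously computed steps possible.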

PersistentTaskGraph.id_stored_entity(cls, name)[source]

Conditionally add a stored entity (add only if not already present).

Parameters:
  • cls – Python class for the stored entity
  • name – Parameter “name” for the class “cls”.
Return type:

ID for the pattern as a DbID.

PersistentTaskGraph.id_stored_sequence(cls, clsname_sequence)[source]

Conditionally add a sequence of stored entities (add only if not already present).

Parameters:
  • clsname_sequence – Sequence of pairs (Python class for the stored entity, parameter “name” for the class “cls”)
Return type:

ID for the sequence as a DbID.

PersistentTaskGraph.iter_finaltargets()[source]

Targets not used as source anywhere else.
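The definition of a final target, and of a final step as described under finalsteps(), can be sketched on a small in-memory example. The task names, asset names, and the helper `iter_final_targets` are hypothetical; the package computes this from its database rather than from a dictionary.

```python
# Each hypothetical task maps to (sources, targets), given as asset names.
tasks = {
    "index": (["genome.fa"], ["genome.idx"]),
    "align": (["genome.idx", "reads.fq"], ["aligned.bam"]),
    "count": (["aligned.bam"], ["counts.csv"]),
}


def iter_final_targets(tasks):
    """Yield targets that no task uses as a source."""
    used_as_source = {src for sources, _ in tasks.values() for src in sources}
    for _, targets in tasks.values():
        for target in targets:
            if target not in used_as_source:
                yield target


final_targets = list(iter_final_targets(tasks))

# A step is final when all of its targets are final.
final = set(final_targets)
final_steps = [
    name for name, (_, targets) in tasks.items() if all(t in final for t in targets)
]
```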

PersistentTaskGraph.iter_steps()[source]

Iterate through the concrete steps

PersistentTaskGraph.statuslist

Status list

PersistentTaskGraph.step_concrete_state(step_concrete_id, state_id)[source]

Set the state of a task:

  • step_concrete_id: task ID (DbID)
  • state_id: state ID

PersistentTaskGraph.version

Version for the database and package (mixing versions comes at one’s own risk)

class railroadtracks.hortator.Step(step, sources, targets, parameters, model)[source]

When used in the context of a StepGraph, a Step is a small graph, or subgraph, constituted of a vertex connected downstream to targets and upstream to sources. For more information about a StepGraph, see the documentation for it.

class railroadtracks.hortator.StepGraph(persistent_graph)[source]

The steps to be performed are stored in a directed acyclic graph (DAG).

This graph can be thought of as a two-level graph. The higher level represents the connectivity between steps (which we will call supersteps), and the lower level expands each step into sources, targets, and a step using the sources to produce the targets.

There is a persistent representation (currently an SQLite database), and this class aims at isolating this implementation detail from the user.
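The two levels can be sketched with plain dictionaries. This is only an illustration of the idea: the step and asset names are hypothetical, and StepGraph stores the graph in its database rather than in these structures.

```python
from collections import defaultdict

# Lower level: each hypothetical step with its sources and targets.
steps = {
    "index": {"sources": ["genome.fa"], "targets": ["genome.idx"]},
    "align": {"sources": ["genome.idx", "reads.fq"], "targets": ["aligned.bam"]},
    "count": {"sources": ["aligned.bam"], "targets": ["counts.csv"]},
}

# Map each asset to the step that produces it.
producer = {t: name for name, s in steps.items() for t in s["targets"]}

# Higher level: step-to-step edges, derived by following shared assets.
downstream = defaultdict(set)
for name, s in steps.items():
    for src in s["sources"]:
        if src in producer:
            downstream[producer[src]].add(name)
```

Assets such as `genome.fa` and `reads.fq` have no producer here, so they contribute no edge at the higher level.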

add(step, assets, parameters=(), tag=1, use_cache=True)[source]

Add a step, associated assets, and optional parameters, to the StepGraph.

The task graph is like a directed (presumably) acyclic multilevel graph. Asset vertices are only connected to step vertices (in other words asset vertices represent connective layers between steps).

Parameters:
  • step (a core.StepAbstract (or of child classes) object) – The step to be added
  • assets (a core.AssetStep (or of child classes) object) – The assets linked to the step added. If assets.target is undefined, the method will define it with unique identifiers, and these will be assigned in place.
  • parameters (A sequence of str elements) – Parameters for the step
  • tag – A tag to differentiate repetitions of the exact same task.
Return type:

StepConcrete_DbEntry as the entry added to the database

cleantargets_stepconcrete(step_concrete_id)[source]

Clean the targets downstream of a task (step_concrete), which means erasing the target files and (re)setting the tasks’ status to ‘TO DO’.

Parameters:
  • step_concrete_id – A task

destinationwalk_stepconcrete(step_concrete_id, func_stored_entity, func_stored_sequence, func_step_concrete, func_storedentity_stepconcrete, func_stepconcrete_storedentity)[source]

Walk down the path.

destinationwalk_storedentity(stored_entity, func_stored_entity, func_stored_sequence, func_step_concrete, func_storedentity_stepconcrete, func_stepconcrete_storedentity)[source]

Walk down the path.

provenancewalk_storedentity(stored_entity, func_stored_entity, func_stored_sequence, func_step_concrete, func_storedentity_stepconcrete, func_stepconcrete_storedentity)[source]

Walk up the path.

Parameters:
  • stored_entity – the stored entity to start from
  • func_stored_entity – a callback called with each stored entity
  • func_step_concrete – a callback called with each step concrete
  • func_storedentity_stepconcrete – a callback called with each link between a stored entity and a step concrete
  • func_stepconcrete_storedentity – a callback called with each link between a step concrete and a stored entity
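The callback-driven walk can be sketched on an in-memory graph. Everything here is hypothetical: the function `provenance_walk`, its argument names, and the two dictionaries are illustrative stand-ins, the order in which the package calls its callbacks may differ, and the callback for stored sequences is left out.

```python
# Hypothetical provenance: asset -> producing task, task -> source assets.
produced_by = {"counts.csv": "count", "aligned.bam": "align", "genome.idx": "index"}
sources_of = {
    "count": ["aligned.bam"],
    "align": ["genome.idx", "reads.fq"],
    "index": ["genome.fa"],
}


def provenance_walk(asset, on_asset, on_task, on_asset_task, on_task_asset):
    """Walk upstream from `asset`, calling back on each vertex and each link."""
    seen_assets, seen_tasks = set(), set()
    stack = [asset]
    while stack:
        current = stack.pop()
        if current in seen_assets:
            continue
        seen_assets.add(current)
        on_asset(current)
        task = produced_by.get(current)
        if task is None:
            continue  # a root asset: no task produced it
        on_asset_task(current, task)
        if task in seen_tasks:
            continue
        seen_tasks.add(task)
        on_task(task)
        for src in sources_of[task]:
            on_task_asset(task, src)
            stack.append(src)


assets, tasks, links = [], [], []
provenance_walk(
    "counts.csv",
    on_asset=assets.append,
    on_task=tasks.append,
    on_asset_task=lambda a, t: links.append((a, t)),
    on_task_asset=lambda t, a: links.append((t, a)),
)
```

Passing callbacks keeps the traversal separate from what is done at each vertex or link, for example collecting names as above or emitting a graph description.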

static stepconcrete_dirname(stepconcrete_id)[source]

Name of the directory corresponding to an ID.

Parameters:
  • stepconcrete_id – ID for a directory.

railroadtracks.hortator.TaskStatusCount

alias of TaskStatus