Persistence¶
Note
This part of the documentation concerns only users who want to move the persistence layer, which stores the metadata associated with the commands performed, to a different storage system.
Tasks to be performed are stored persistently on disk. This is required to ensure that all steps computed, and the sequence of steps leading to results, are conserved across restarts of the main process or of the system.
Currently, this is implemented in an SQLite database.
Persistence/memoization for the DAG
-
class
railroadtracks.hortator.
DbID
(id, new)¶ -
id
¶ Alias for field number 0
-
new
¶ Alias for field number 1
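The "Alias for field number" entries are what Sphinx emits for the fields of a `collections.namedtuple`. The following is a minimal sketch of how such a value can be read, assuming `DbID` behaves like an ordinary two-field namedtuple. It is redefined locally so the example runs without railroadtracks installed, and reading `new` as "the row was created by this call" is an assumption based on the field name.

```python
from collections import namedtuple

# Local stand-in mirroring the documented signature DbID(id, new).
DbID = namedtuple("DbID", ["id", "new"])

# Hypothetical values, for illustration only.
entry = DbID(id=42, new=True)

# Each field is reachable by name or by position.
by_name = (entry.id, entry.new)
by_index = (entry[0], entry[1])

# Tuple unpacking also works, since a namedtuple is a tuple.
row_id, was_created = entry
```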
-
-
class
railroadtracks.hortator.
PersistentTaskGraph
(db_fn, model, wd='.', force_create=False, isolation_level=None)[source]¶ List of tasks stored on disk.
-
class
StoredEntityNoLabel
(id, clsname, entityname)¶ -
clsname
¶ Alias for field number 1
-
entityname
¶ Alias for field number 2
-
id
¶ Alias for field number 0
-
-
PersistentTaskGraph.
get_parenttask_of_storedentity
(stored_entity)[source]¶ Return the task producing a stored entity. There should be exactly one such task; an Exception is raised otherwise.
Parameters: stored_entity (StoredEntity) – the stored entity in the database
Return type: a StepConcrete_DbEntry namedtuple, or None
-
PersistentTaskGraph.
get_sourcesofactivity
(activity)[source]¶ Retrieve the sources of steps performing a specific activity.
Parameters: activity (Enum) – an activity
-
PersistentTaskGraph.
get_srcassets
(concrete_step_id)[source]¶ Return the source files for a given concrete step ID.
Parameters: concrete_step_id – ID for the concrete step in the database.
Return type: generator
-
PersistentTaskGraph.
get_targetassets
(concrete_step_id)[source]¶ Return the target files for a given concrete step ID.
Parameters: concrete_step_id – ID for the concrete step in the database.
Return type: generator
-
PersistentTaskGraph.
get_targetsofactivity
(activity)[source]¶ Retrieve the targets of steps performing a specific activity.
Parameters: activity (Enum) – an activity
-
PersistentTaskGraph.
get_targetstepconcrete
(stored_entity)[source]¶ Return the tasks using a given stored entity.
Parameters: stored_entity (StoredEntity or StoredSequence) – the stored entity in the database.
Return type: a StepConcrete_DbEntry namedtuple, or None
-
PersistentTaskGraph.
id_step_activity
(activity)[source]¶ Conditionally add an activity (add only if not already present).
Parameters: activity – one activity name
Return type: ID for the activity as an integer
-
PersistentTaskGraph.
id_step_type
(activities)[source]¶ Conditionally add a step type (add only if not already present).
Parameters: activities – sequence of activity names
Return type: ID for the step type as an integer
-
PersistentTaskGraph.
id_step_variant
(step, activities)[source]¶ Return a database ID for the step variant (creating a new ID only if the variant is not already tracked).
Parameters: - step (
core.StepAbstract
) – a step - activities – a sequence of activity names
Return type: ID for a step variant as an int.
-
PersistentTaskGraph.
id_stepconcrete
(step_variant_id, sources, targets, parameters, tag=1)[source]¶ Conditionally add a task (“concrete” step), that is a step variant (executable and parameters) to which source and target files, as well as parameters, are added.
-
PersistentTaskGraph.
id_stepparameters
(parameters)[source]¶ Conditionally add parameters (add only if not already present).
Parameters: parameters – sequence of parameters
Return type: ID for the pattern as a DbID.
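The `id_*` methods documented here share a "conditionally add" contract: look the item up, insert it only if it is missing, and return its ID. The sketch below shows one way such a contract can be implemented with the standard `sqlite3` module. The table `demo_label`, the function `get_or_create`, and the local `DbID` stand-in are hypothetical names chosen for illustration; they are not the schema or code used by railroadtracks.

```python
import sqlite3
from collections import namedtuple

# Local stand-in for the documented DbID(id, new) namedtuple.
DbID = namedtuple("DbID", ["id", "new"])


def get_or_create(conn, label):
    """Return a DbID for `label`, inserting a row only if none exists yet."""
    cur = conn.cursor()
    cur.execute("SELECT id FROM demo_label WHERE label = ?", (label,))
    row = cur.fetchone()
    if row is not None:
        # Already tracked: reuse the existing ID and flag it as not new.
        return DbID(row[0], False)
    cur.execute("INSERT INTO demo_label (label) VALUES (?)", (label,))
    return DbID(cur.lastrowid, True)


# In-memory database with a hypothetical one-table schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_label (id INTEGER PRIMARY KEY, label TEXT UNIQUE)"
)

first = get_or_create(conn, "align")
second = get_or_create(conn, "align")
n_rows = conn.execute("SELECT COUNT(*) FROM demo_label").fetchone()[0]
```

Calling the function twice with the same label yields the same `id`, with `new` set only on the first call, so callers can tell whether work was already registered.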
-
PersistentTaskGraph.
id_stored_entity
(cls, name)[source]¶ Conditionally add a stored entity (add only if not already present).
Parameters:
- cls – Python class for the stored entity
- name – Parameter “name” for the class “cls”.
Return type: ID for the pattern as a DbID.
-
PersistentTaskGraph.
id_stored_sequence
(cls, clsname_sequence)[source]¶ Conditionally add a sequence of stored entities (add only if not already present).
Parameters: clsname_sequence – Sequence of pairs (Python class for the stored entity, parameter “name” for the class “cls”)
Return type: ID for the sequence as a DbID.
-
PersistentTaskGraph.
statuslist
¶ Status list
-
PersistentTaskGraph.
step_concrete_state
(step_concrete_id, state_id)[source]¶ Set the state of a task:
- step_concrete_id: task ID (DbID)
- state_id: state ID
-
PersistentTaskGraph.
version
¶ Version for the database and package (mixing versions comes at one’s own risk)
-
class
railroadtracks.hortator.
Step
(step, sources, targets, parameters, model)[source]¶ When used in the context of a StepGraph, a Step is a small graph, or subgraph, constituted of a vertex connected downstream to targets and upstream to sources. For more information about a StepGraph, see its documentation.
-
class
railroadtracks.hortator.
StepGraph
(persistent_graph)[source]¶ The steps to be performed are stored in a directed acyclic graph (DAG).
This graph can be thought of as a two-level graph. The higher level represents the connectivity between steps (which we will call supersteps), and the lower level expands each step into sources, targets, and a step using the sources to produce the targets.
There is a persistent representation (currently an SQLite database), and this class aims to isolate this implementation detail from the user.
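The two levels described above can be illustrated with plain dictionaries. In this sketch, which is an assumption-laden toy and not the StepGraph data structure, the lower level records the sources and targets of each step, and the higher level (the supersteps) is derived by linking two steps whenever a target of one is a source of the other. The step and file names are hypothetical.

```python
# Lower level: each step with the assets it consumes and produces.
steps = {
    "align": {"sources": {"reads.fq", "index"}, "targets": {"aligned.bam"}},
    "count": {"sources": {"aligned.bam"}, "targets": {"counts.csv"}},
}


def superstep_edges(steps):
    """Derive step-to-step edges from shared assets (higher-level view)."""
    edges = set()
    for up_name, up in steps.items():
        for down_name, down in steps.items():
            # A step feeds another when one of its targets is a source there.
            if up_name != down_name and up["targets"] & down["sources"]:
                edges.add((up_name, down_name))
    return edges


edges = superstep_edges(steps)
```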
-
add
(step, assets, parameters=(), tag=1, use_cache=True)[source]¶ Add a step, associated assets, and optional parameters, to the StepGraph.
The task graph is like a directed (presumably) acyclic multilevel graph. Asset vertices are only connected to step vertices (in other words asset vertices represent connective layers between steps).
Parameters: - step (a
core.StepAbstract
(or of child classes) object) – The step to be added - assets (a
core.AssetStep
(or of child classes) object) – The assets linked to the step added. If assets.target
is undefined, the method will define it with unique identifiers, and these will be assigned in place. - parameters (A sequence of
str
elements) – Parameters for the step - tag – A tag to differentiate repetitions of the exact same task.
Return type: StepConcrete_DbEntry
as the entry added to the database
-
cleantargets_stepconcrete
(step_concrete_id)[source]¶ Clean the targets downstream of a task (step_concrete), which means erasing the target files and (re)setting the tasks’ status to ‘TO DO’.
Parameters: step_concrete_id – A task
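To make the cleaning behaviour concrete, here is a minimal sketch of the idea on an in-memory task table. It assumes that "downstream" means the task itself plus every task that directly or indirectly consumes its targets; the function `clean_targets`, the dictionary layout, and the file names are hypothetical and do not reflect the actual database-backed implementation.

```python
import os
import tempfile


def clean_targets(tasks, task_id):
    """Erase target files of a task and its downstream tasks; reset status."""
    seen = set()
    stack = [task_id]
    while stack:
        tid = stack.pop()
        if tid in seen:
            continue
        seen.add(tid)
        task = tasks[tid]
        for path in task["targets"]:
            if os.path.exists(path):
                os.remove(path)
        task["status"] = "TO DO"
        # Any task reading one of these targets must be cleaned as well.
        for other_id, other in tasks.items():
            if set(other["sources"]) & set(task["targets"]):
                stack.append(other_id)
    return seen


# Two chained tasks writing into a temporary directory.
workdir = tempfile.mkdtemp()
a_out = os.path.join(workdir, "a.out")
b_out = os.path.join(workdir, "b.out")
for path in (a_out, b_out):
    with open(path, "w") as fh:
        fh.write("data")

tasks = {
    1: {"sources": [], "targets": [a_out], "status": "DONE"},
    2: {"sources": [a_out], "targets": [b_out], "status": "DONE"},
}
cleaned = clean_targets(tasks, 1)
```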
-
destinationwalk_stepconcrete
(step_concrete_id, func_stored_entity, func_stored_sequence, func_step_concrete, func_storedentity_stepconcrete, func_stepconcrete_storedentity)[source]¶ Walk down the path.
-
destinationwalk_storedentity
(stored_entity, func_stored_entity, func_stored_sequence, func_step_concrete, func_storedentity_stepconcrete, func_stepconcrete_storedentity)[source]¶ Walk down the path.
-
provenancewalk_storedentity
(stored_entity, func_stored_entity, func_stored_sequence, func_step_concrete, func_storedentity_stepconcrete, func_stepconcrete_storedentity)[source]¶ Walk up the path.
Parameters:
- stored_entity – the stored entity to start from
- func_stored_entity – a callback called with each stored entity
- func_step_concrete – a callback called with each step concrete
- func_storedentity_stepconcrete – a callback called with each link between a stored entity and a step concrete
- func_stepconcrete_storedentity – a callback called with each link between a step concrete and a stored entity
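The callback-driven walk can be sketched on a small in-memory graph. This is a simplified illustration, not the library's implementation: the function `provenance_walk`, the two lookup dictionaries, and the asset names are hypothetical, and the callback for stored sequences is omitted for brevity.

```python
def provenance_walk(entity, producer_of, sources_of,
                    on_entity, on_step, on_entity_step, on_step_entity):
    """Walk upstream from `entity`, calling back on each vertex and link."""
    seen_entities = set()
    seen_steps = set()
    stack = [entity]
    while stack:
        ent = stack.pop()
        if ent in seen_entities:
            continue
        seen_entities.add(ent)
        on_entity(ent)
        step = producer_of.get(ent)
        if step is None:
            continue  # root asset: no task produced it
        on_step_entity(step, ent)  # link: step produces entity
        if step in seen_steps:
            continue
        seen_steps.add(step)
        on_step(step)
        for src in sources_of[step]:
            on_entity_step(src, step)  # link: entity feeds step
            stack.append(src)


# Hypothetical two-step pipeline.
producer_of = {"counts.csv": "count", "aligned.bam": "align"}
sources_of = {"count": ["aligned.bam"], "align": ["reads.fq", "index"]}

entities, steps, links = [], [], []
provenance_walk(
    "counts.csv", producer_of, sources_of,
    on_entity=entities.append,
    on_step=steps.append,
    on_entity_step=lambda e, s: links.append((e, s)),
    on_step_entity=lambda s, e: links.append((s, e)),
)
```

Passing the behaviour in as callbacks keeps the traversal itself generic: the same walk can collect file names, build a plot of the provenance, or check statuses, depending on the functions supplied.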
-
-
railroadtracks.hortator.
TaskStatusCount
¶ alias of
TaskStatus