fireworks.features package

Submodules

fireworks.features.background_task module

class fireworks.features.background_task.BackgroundTask(tasks, num_launches=0, sleep_time=60, run_on_finish=False)

Bases: fireworks.utilities.fw_serializers.FWSerializable, object

__init__(tasks, num_launches=0, sleep_time=60, run_on_finish=False)
Args:
tasks ([FireTask]): a list of FireTasks to perform
num_launches (int): the total number of times to run the process (0=infinite)
sleep_time (int): sleep time in seconds between background runs
run_on_finish (bool): always run this task upon completion of the Firework
classmethod from_dict(*args, **kwargs)
to_dict(*args, **kwargs)
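
The meaning of num_launches and sleep_time can be illustrated with a minimal standalone sketch. It does not import FireWorks: background_loop and its stop_event parameter are hypothetical names, and the library's actual scheduling may differ (for example, in whether it sleeps before or after each run).

```python
import threading


def background_loop(task, num_launches=0, sleep_time=60, stop_event=None):
    """Run `task` repeatedly; hypothetical illustration, not FireWorks code.

    num_launches=0 means "repeat until stop_event is set".
    """
    stop_event = stop_event or threading.Event()
    runs = 0
    while not stop_event.is_set():
        if num_launches and runs >= num_launches:
            break
        task()
        runs += 1
        # Wait between runs, but wake up early if a stop is requested.
        stop_event.wait(sleep_time)
    return runs


calls = []
runs = background_loop(lambda: calls.append("ping"), num_launches=3, sleep_time=0)
```

With num_launches=3 the task runs three times and the loop exits; with num_launches=0 it would keep running until the stop event is set.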

fireworks.features.dupefinder module

class fireworks.features.dupefinder.DupeFinderBase

Bases: fireworks.utilities.fw_serializers.FWSerializable

This serves as an abstract class for implementing duplicate finders.

__init__()
classmethod from_dict(m_dict)
query(spec)

Given a spec, returns a database query that gives potential candidates for duplicated FireWorks.

Args:
spec (dict): spec to check for duplicates
to_dict(*args, **kwargs)
verify(spec1, spec2)

Checks whether two specs are identical enough to be considered duplicates. Returns True if they are duplicates.

Args:
spec1 (dict)
spec2 (dict)

Returns:
bool
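
The division of labor between query() and verify() can be sketched as follows: query() cheaply narrows the candidates in the database, and verify() makes the final decision on a pair of specs. The class below is a hypothetical stand-in that does not import FireWorks; a real duplicate finder would subclass DupeFinderBase, and the "spec" field name in the query is an assumption about the stored document layout.

```python
class ExactSpecDupeFinderSketch:
    """Hypothetical stand-in; a real finder would subclass DupeFinderBase."""

    def verify(self, spec1, spec2):
        # Treat two specs as duplicates only when they are exactly equal.
        return spec1 == spec2

    def query(self, spec):
        # MongoDB-style query dict selecting candidate documents whose
        # "spec" field (assumed name) matches the given spec exactly.
        return {"spec": spec}


finder = ExactSpecDupeFinderSketch()
is_dupe = finder.verify({"formula": "H2O"}, {"formula": "H2O"})
candidates_query = finder.query({"formula": "H2O"})
```

A looser finder could compare only a subset of keys in verify() while keeping query() broad enough to return every plausible candidate.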

fireworks.features.fw_report module

class fireworks.features.fw_report.FWReport(lpad)
__init__(lpad)

Args:
lpad (LaunchPad)

get_stats(coll='fireworks', interval='days', num_intervals=5, additional_query=None)

Compile statistics of completed Fireworks/Workflows for the past <num_intervals> <interval>, e.g. the past 5 days.

Args:
coll (str): collection, either “fireworks”, “workflows”, or “launches”
interval (str): one of “minutes”, “hours”, “days”, “months”, “years”
num_intervals (int): number of intervals to go back in time from the present moment
additional_query (dict): additional constraints on reporting
Returns:
list, with each item being a dictionary of statistics for a given interval
get_stats_str(decorated_stat_list)

Convert the list of stats from FWReport.get_stats() to a string representation for viewing.

Args:
decorated_stat_list ([dict])
Returns:
str

fireworks.features.introspect module

class fireworks.features.introspect.Introspector(lpad)
__init__(lpad)
Args:
lpad (LaunchPad)
introspect_fizzled(coll='fws', rsort=True, threshold=10, limit=100)
print_report(table, coll)
fireworks.features.introspect.collect_stats(list_keys, filter_truncated=True)

Turns a list of keys (from flatten_to_keys) into a dict of <str>:count, i.e. counts the number of times each key appears.

Args:
list_keys
filter_truncated (bool)
Returns:
dict
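
The counting step described above can be sketched with the standard library. This is an illustration only: count_keys is a hypothetical name, and the sketch leaves out the filter_truncated option, whose exact behavior is not described here.

```python
from collections import Counter


def count_keys(list_keys):
    """Count how often each flattened key string appears (illustrative)."""
    return dict(Counter(list_keys))


stats = count_keys(["a.b:1", "c:2", "a.b:1"])
```

Here stats maps each key string to its number of occurrences, so "a.b:1" is counted twice and "c:2" once.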
fireworks.features.introspect.compare_stats(statsdict1, numsamples1, statsdict2, numsamples2, threshold=5)
fireworks.features.introspect.flatten_to_keys(curr_doc, curr_recurs=1, max_recurs=2)

Converts a dictionary into a list of keys, each a string of the form “key1.key2:val”.

Args:
curr_doc
curr_recurs (int)
max_recurs (int)
Returns:
list of str
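
A minimal sketch of this kind of flattening, assuming nested dictionaries are expanded until the recursion limit is reached and anything deeper is rendered with str(). The function name flatten_sketch and the treatment of values beyond the limit are assumptions, not the library's confirmed behavior.

```python
def flatten_sketch(doc, curr_recurs=1, max_recurs=2, prefix=""):
    """Flatten a nested dict into "key1.key2:val" strings (illustrative)."""
    keys = []
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict) and curr_recurs < max_recurs:
            # Descend one level, extending the dotted path.
            keys.extend(
                flatten_sketch(value, curr_recurs + 1, max_recurs, path + ".")
            )
        else:
            keys.append(f"{path}:{value}")
    return keys


flat = flatten_sketch({"a": {"b": 1}, "c": 2})
```

With the default limit of two levels, the nested entry becomes "a.b:1" and the top-level entry becomes "c:2".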

fireworks.features.multi_launcher module

fireworks.features.multi_launcher.launch_multiprocess(launchpad, fworker, loglvl, nlaunches, num_jobs, sleep_time, total_node_list=None, ppn=1, timeout=None, exclude_current_node=False)

Launch the jobs in the job packing mode.

Args:
launchpad (LaunchPad)
fworker (FWorker)
loglvl (str): level at which to output logs
nlaunches (int): 0 means ‘until completion’, -1 or “infinite” means to loop forever
num_jobs (int): number of sub jobs
sleep_time (int): secs to sleep between rapidfire loop iterations
total_node_list ([str]): contents of NODEFILE (doesn’t affect execution)
ppn (int): processors per node (doesn’t affect execution)
timeout (int): # of seconds after which to stop the rapidfire process
exclude_current_node (bool): Don’t use the script launching node as a compute node
fireworks.features.multi_launcher.ping_multilaunch(port, stop_event)

A single manager to ping all launches during multiprocess launches

Args:
port (int): Listening port number of the DataServer
stop_event (Thread.Event): stop event
fireworks.features.multi_launcher.rapidfire_process(fworker, nlaunches, sleep, loglvl, port, node_list, sub_nproc, timeout, running_ids_dict)

Initializes shared data with multiprocessing parameters and starts a rapidfire.

Args:
fworker (FWorker): object
nlaunches (int): 0 means ‘until completion’, -1 or “infinite” means to loop forever
sleep (int): secs to sleep between rapidfire loop iterations
loglvl (str): level at which to output logs to stdout
port (int): Listening port number of the shared object manager
password (str): security password to access the server
node_list ([str]): computer node list
sub_nproc (int): number of processors of the sub job
timeout (int): # of seconds after which to stop the rapidfire process
fireworks.features.multi_launcher.split_node_lists(num_jobs, total_node_list=None, ppn=24)

Parse node list and processor list from nodefile contents

Args:
num_jobs (int): number of sub jobs
total_node_list (list of str): the node list of the whole large job
ppn (int): number of processors per node
Returns:
(([int],[int])) the node list and processor list for each job
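
One plausible way to divide a node list evenly among sub jobs is sketched below. It is a standalone illustration, not the library's code: split_nodes_sketch is a hypothetical name, and the choices to deduplicate and sort the nodes, to require an even split, and to return None node lists when no node file is given are all assumptions.

```python
def split_nodes_sketch(num_jobs, total_node_list=None, ppn=24):
    """Return (node_lists, sub_nproc_list) for num_jobs sub jobs (illustrative)."""
    if not total_node_list:
        # No node file: no explicit nodes, ppn processors per sub job.
        return [None] * num_jobs, [ppn] * num_jobs
    nodes = sorted(set(total_node_list))
    if len(nodes) % num_jobs != 0:
        raise ValueError("number of nodes must be divisible by num_jobs")
    per_job = len(nodes) // num_jobs
    node_lists = [nodes[i:i + per_job] for i in range(0, len(nodes), per_job)]
    return node_lists, [per_job * ppn] * num_jobs


node_lists, nprocs = split_nodes_sketch(2, ["n1", "n2", "n3", "n4", "n1"], ppn=4)
```

In this example four distinct nodes are shared between two sub jobs, giving each sub job two nodes and eight processors.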
fireworks.features.multi_launcher.start_rockets(fworker, nlaunches, sleep, loglvl, port, node_lists, sub_nproc_list, timeout=None, running_ids_dict=None)

Create each sub job and start a rocket launch in each one

Args:
fworker (FWorker): object
nlaunches (int): 0 means ‘until completion’, -1 or “infinite” means to loop forever
sleep (int): secs to sleep between rapidfire loop iterations
loglvl (str): level at which to output logs to stdout
port (int): Listening port number
node_lists ([str]): computer node list
sub_nproc_list ([int]): number of processors for each sub job
timeout (int): # of seconds after which to stop the rapidfire process
running_ids_dict (dict): dict shared between processes to record IDs
Returns:
([multiprocessing.Process]) all the created processes

fireworks.features.stats module

class fireworks.features.stats.FWStats(lpad)
__init__(lpad)

Object to get Fireworks running stats from a LaunchPad.

Args:
lpad (LaunchPad): A LaunchPad object that manages the Fireworks database
get_daily_completion_summary(query_start=None, query_end=None, query=None, time_field=u'time_end', **args)

Get daily summary of fireworks for a specified time range.

Args:
query_start (str): The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
query_end (str): The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
query (dict): Additional PyMongo query to filter the entries to process.
time_field (str): The field to query time range. Default is “time_end”.
args (dict): Time difference to calculate query_start from query_end. Accepts the arguments of Python's datetime.timedelta. args and query_start cannot be given at the same time. Default is 30 days.
Returns:
(list) A summary of daily fireworks stats for the specified time range.

get_fireworks_summary(query_start=None, query_end=None, query=None, time_field=u'updated_on', **args)

Get fireworks summary for a specified time range.

Args:
query_start (str): The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
query_end (str): The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
query (dict): Additional PyMongo query to filter the entries to process.
time_field (str): The field to query time range. Default is “updated_on”.
args (dict): Time difference to calculate query_start from query_end. Accepts the arguments of Python's datetime.timedelta. args and query_start cannot be given at the same time. Default is 30 days.
Returns:
(list) A summary of fireworks stats for the specified time range.
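
The rules for query_start, query_end, and the timedelta keyword arguments can be sketched as a small helper. This is an illustration of the documented rules, not the library's implementation: resolve_time_window is a hypothetical name, and raising ValueError when both query_start and timedelta arguments are given is an assumption about how the conflict is reported.

```python
from datetime import datetime, timedelta


def resolve_time_window(query_start=None, query_end=None, **args):
    """Return (start, end) datetimes for a summary query (illustrative)."""
    if query_start is not None and args:
        raise ValueError("query_start and timedelta arguments are mutually exclusive")
    # Default end of the window is the current time.
    end = datetime.fromisoformat(query_end) if query_end else datetime.now()
    if query_start is not None:
        start = datetime.fromisoformat(query_start)
    else:
        # Default window length is 30 days unless timedelta kwargs are given.
        start = end - (timedelta(**args) if args else timedelta(days=30))
    return start, end


start, end = resolve_time_window(query_end="2020-01-31T00:00:00", days=7)
```

Passing days=7 together with an explicit query_end yields a window that starts exactly one week before the end time.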
get_launch_summary(query_start=None, query_end=None, time_field=u'time_end', query=None, runtime_stats=False, include_ids=False, **args)

Get launch summary for a specified time range.

Args:
query_start (str): The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
query_end (str): The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
time_field (str): The field to query time range. Default is “time_end”.
query (dict): Additional PyMongo query to filter the entries to process.
runtime_stats (bool): Whether to return runtime stats. Default is False.
include_ids (bool): Whether to return fw_ids. Default is False.
args (dict): Time difference to calculate query_start from query_end. Accepts the arguments of Python's datetime.timedelta. args and query_start cannot be given at the same time. Default is 30 days.
Returns:
(list) A summary of launch stats for the specified time range.
get_workflow_summary(query_start=None, query_end=None, query=None, time_field=u'updated_on', **args)

Get workflow summary for a specified time range.

Args:
query_start (str): The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
query_end (str): The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
query (dict): Additional PyMongo query to filter the entries to process.
time_field (str): The field to query time range. Default is “updated_on”.
args (dict): Time difference to calculate query_start from query_end. Accepts the arguments of Python's datetime.timedelta. args and query_start cannot be given at the same time. Default is 30 days.
Returns:
(list) A summary of workflow stats for the specified time range.

group_fizzled_fireworks(group_by, query_start=None, query_end=None, query=None, include_ids=False, **args)

Group fizzled fireworks for a specified time range by a specified key.

Args:
group_by (str): Database field used to group fireworks items.
query_start (str): The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
query_end (str): The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
query (dict): Additional PyMongo query to filter the entries to process.
include_ids (bool): Whether to return fw_ids. Default is False.
args (dict): Time difference to calculate query_start from query_end. Accepts the arguments of Python's datetime.timedelta. args and query_start cannot be given at the same time. Default is 30 days.
Returns:
(list) A summary of fizzled fireworks grouped by the specified key.

identify_catastrophes(error_ratio=0.01, query_start=None, query_end=None, query=None, time_field=u'time_end', include_ids=True, **args)

Get days whose failure ratio exceeds a threshold.

Args:
error_ratio (float): Threshold of error ratio above which a day is considered catastrophic.
query_start (str): The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
query_end (str): The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
query (dict): Additional PyMongo query to filter the entries to process.
time_field (str): The field to query time range. Default is “time_end”.
include_ids (bool): Whether to return fw_ids. Default is True.
args (dict): Time difference to calculate query_start from query_end. Accepts the arguments of Python's datetime.timedelta. args and query_start cannot be given at the same time. Default is 30 days.
Returns:
(list) Dates with a higher failure ratio, with optional failed fw_ids.
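
The role of the error_ratio threshold can be illustrated with a standalone sketch. The input layout (a dict of per-day "completed" and "fizzled" counts), the function name flag_catastrophic_days, and the definition of the ratio as fizzled / (completed + fizzled) are all assumptions made for illustration; the library's own summary format and ratio definition may differ.

```python
def flag_catastrophic_days(daily_counts, error_ratio=0.01):
    """Return the days whose failure ratio exceeds error_ratio (illustrative)."""
    flagged = []
    for day, counts in sorted(daily_counts.items()):
        total = counts["completed"] + counts["fizzled"]
        # Skip empty days to avoid dividing by zero.
        if total and counts["fizzled"] / total > error_ratio:
            flagged.append(day)
    return flagged


daily = {
    "2020-01-01": {"completed": 99, "fizzled": 1},
    "2020-01-02": {"completed": 90, "fizzled": 10},
}
bad_days = flag_catastrophic_days(daily, error_ratio=0.05)
```

With a 5% threshold only the second day is flagged, since 10 of its 100 runs fizzled while the first day had a 1% failure ratio.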

Module contents