fireworks.features package

Submodules

fireworks.features.background_task module

class fireworks.features.background_task.BackgroundTask(tasks, num_launches=0, sleep_time=60, run_on_finish=False)

Bases: fireworks.utilities.fw_serializers.FWSerializable, object

__init__(tasks, num_launches=0, sleep_time=60, run_on_finish=False)
Parameters:
  • tasks – ([FireTask]) a list of FireTasks to perform
  • num_launches – (int) the total number of times to run the process (0=infinite)
  • sleep_time – (int) sleep time in seconds between background runs
  • run_on_finish – (bool) always run this task upon completion of the Firework
classmethod from_dict(*args, **kwargs)
to_dict(*args, **kwargs)

fireworks.features.dupefinder module

class fireworks.features.dupefinder.DupeFinderBase

Bases: fireworks.utilities.fw_serializers.FWSerializable

This serves as an abstract class for implementing duplicate finders.

__init__()
classmethod from_dict(m_dict)
query(spec)

Given a spec, returns a database query that gives potential candidates for duplicated FireWorks.

Parameters:spec – spec to check for duplicates
to_dict(*args, **kwargs)
verify(spec1, spec2)

Checks whether two specs are identical enough to be considered duplicates. Returns True if they are duplicates.

Parameters:
  • spec1 – (dict)
  • spec2 – (dict)
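The division of labour between query() and verify() can be illustrated with a minimal sketch. It deliberately does not import fireworks: DupeFinderSketchBase is a local stand-in that only mirrors the two documented method names, and the "formula" key and the "spec.formula" filter field are hypothetical choices made for illustration.

```python
class DupeFinderSketchBase:
    """Local stand-in (not the library class) with the two documented methods."""

    def verify(self, spec1, spec2):
        raise NotImplementedError

    def query(self, spec):
        raise NotImplementedError


class FormulaDupeFinder(DupeFinderSketchBase):
    """Hypothetical finder: specs are duplicates when their 'formula' entries match."""

    def verify(self, spec1, spec2):
        # Final yes/no decision on a single candidate pair.
        return spec1.get("formula") == spec2.get("formula")

    def query(self, spec):
        # MongoDB-style filter that narrows the search to likely candidates.
        # The field name "spec.formula" is an assumption for this sketch.
        return {"spec.formula": spec.get("formula")}


finder = FormulaDupeFinder()
candidate_filter = finder.query({"formula": "Fe2O3", "run": 1})
is_dupe = finder.verify({"formula": "Fe2O3", "run": 1},
                        {"formula": "Fe2O3", "run": 2})
```

In this sketch the query is intentionally cheap and broad, while verify makes the final decision on each candidate.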

fireworks.features.fw_report module

class fireworks.features.fw_report.FWReport(lpad)
__init__(lpad)
Parameters:lpad – (LaunchPad)
get_stats(coll='fireworks', interval='days', num_intervals=5, additional_query=None)

Compile statistics of completed Fireworks/Workflows for the past <num_intervals> <interval>, e.g. the past 5 days.

Parameters:
  • coll – collection, either “fireworks”, “workflows”, or “launches”
  • interval – one of “minutes”, “hours”, “days”, “months”, “years”
  • num_intervals – number of intervals to go back in time from present moment
  • additional_query – additional constraints on reporting
Returns:

list, with each item being a dictionary of statistics for a given interval

get_stats_str(decorated_stat_list)

Convert the list of stats from FWReport.get_stats() to a string representation for viewing.

Parameters:decorated_stat_list – list of dict
Returns:String

fireworks.features.introspect module

class fireworks.features.introspect.Introspector(lpad)
__init__(lpad)
Parameters:lpad – (LaunchPad)
introspect_fizzled(coll='fws', rsort=True, threshold=10, limit=100)
print_report(table, coll)
fireworks.features.introspect.collect_stats(list_keys, filter_truncated=True)

Turns a list of keys (from flatten_to_keys) into a dict of <str>:count, i.e. counts the number of times each key appears.

Parameters:
  • list_keys
  • filter_truncated
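The counting step described here can be sketched with the standard library. This is an illustration only, not the library's code: count_keys is a hypothetical name, and the filter_truncated option is left out because its behaviour is not documented in this section.

```python
from collections import Counter


def count_keys(list_keys):
    """Count how many times each key string appears in the list."""
    return dict(Counter(list_keys))


# Keys in the "key1.key2:val" form produced by a flattening step.
keys = ["spec.a:1", "spec.b:2", "spec.a:1"]
counts = count_keys(keys)
```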

fireworks.features.introspect.compare_stats(statsdict1, numsamples1, statsdict2, numsamples2, threshold=5)
fireworks.features.introspect.flatten_to_keys(curr_doc, curr_recurs=1, max_recurs=2)

Converts a dictionary into a list of keys, with string values “key1.key2:val”

Parameters:
  • curr_doc
  • curr_recurs
  • max_recurs
Returns:

[<str>]
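A minimal sketch of this kind of flattening follows. It is written under assumptions: flatten_sketch is a hypothetical name, and how the real function treats lists, long values, or dictionaries below the recursion limit is not stated here, so this version simply stringifies whatever it finds at the limit.

```python
def flatten_sketch(doc, prefix="", depth=1, max_depth=2):
    """Flatten a nested dict into "key1.key2:val" strings, up to max_depth levels."""
    keys = []
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict) and depth < max_depth:
            # Descend one level and extend the dotted path.
            keys.extend(flatten_sketch(value, path, depth + 1, max_depth))
        else:
            # Leaf value, or recursion limit reached: record "path:value".
            keys.append(f"{path}:{value}")
    return keys


flat = flatten_sketch({"spec": {"a": 1, "b": {"c": 2}}, "state": "FIZZLED"})
```

With max_depth=2, the nested dictionary under "b" is not expanded and appears as a single stringified value.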

fireworks.features.multi_launcher module

fireworks.features.multi_launcher.launch_multiprocess(launchpad, fworker, loglvl, nlaunches, num_jobs, sleep_time, total_node_list=None, ppn=1, timeout=None, exclude_current_node=False)

Launch the jobs in the job packing mode.

Parameters:
  • launchpad – (LaunchPad) object
  • fworker – (FWorker) object
  • loglvl – (str) level at which to output logs
  • nlaunches – (int) 0 means ‘until completion’, -1 or “infinite” means to loop forever
  • num_jobs – (int) number of sub jobs
  • sleep_time – (int) secs to sleep between rapidfire loop iterations
  • total_node_list – ([str]) contents of NODEFILE (doesn’t affect execution)
  • ppn – (int) processors per node (doesn’t affect execution)
  • timeout – (int) # of seconds after which to stop the rapidfire process
  • exclude_current_node – Don’t use the script launching node as a compute node

fireworks.features.multi_launcher.ping_multilaunch(port, stop_event)

A single manager to ping all launches during multiprocess launches

Parameters:
  • port – (int) Listening port number of the DataServer
  • stop_event – (Thread.Event) stop event
fireworks.features.multi_launcher.rapidfire_process(fworker, nlaunches, sleep, loglvl, port, node_list, sub_nproc, timeout, running_ids_dict)

Initializes shared data with multiprocessing parameters and starts a rapidfire

Parameters:
  • fworker – (FWorker) object
  • nlaunches – (int) 0 means ‘until completion’, -1 or “infinite” means to loop forever
  • sleep – (int) secs to sleep between rapidfire loop iterations
  • loglvl – (str) level at which to output logs to stdout
  • port – (int) Listening port number of the shared object manager
  • password – (str) security password to access the server
  • node_list – ([str]) computer node list
  • sub_nproc – (int) number of processors of the sub job
  • timeout – (int) # of seconds after which to stop the rapidfire process
fireworks.features.multi_launcher.split_node_lists(num_jobs, total_node_list=None, ppn=24)

Parse node list and processor list from nodefile contents

Parameters:
  • num_jobs – (int) number of sub jobs
  • total_node_list – (list of str) the node list of the whole large job
  • ppn – (int) number of processors per node
Returns:

(([int],[int])) the node list and processor list for each job
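One way such a split could work is sketched below. The details are assumptions rather than the library's documented behaviour: split_nodes_sketch is a hypothetical name, the node file is assumed to repeat each node name (so names are de-duplicated first), and nodes are assumed to divide evenly among the sub jobs.

```python
def split_nodes_sketch(num_jobs, total_node_list, ppn):
    """Divide unique node names evenly among num_jobs sub jobs."""
    nodes = sorted(set(total_node_list))
    if not nodes or len(nodes) % num_jobs != 0:
        raise ValueError("node count must be a non-zero multiple of num_jobs")
    per_job = len(nodes) // num_jobs
    # One contiguous slice of nodes per sub job.
    node_lists = [nodes[i:i + per_job] for i in range(0, len(nodes), per_job)]
    # Each sub job gets (nodes per job) * (processors per node) processors.
    nproc_list = [per_job * ppn] * num_jobs
    return node_lists, nproc_list


node_lists, nproc_list = split_nodes_sketch(
    2, ["n1", "n1", "n2", "n3", "n4"], ppn=24
)
```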

fireworks.features.multi_launcher.start_rockets(fworker, nlaunches, sleep, loglvl, port, node_lists, sub_nproc_list, timeout=None, running_ids_dict=None)

Create each sub job and start a rocket launch in each one

Parameters:
  • fworker – (FWorker) object
  • nlaunches – (int) 0 means ‘until completion’, -1 or “infinite” means to loop forever
  • sleep – (int) secs to sleep between rapidfire loop iterations
  • loglvl – (str) level at which to output logs to stdout
  • port – (int) Listening port number
  • node_lists – ([str]) computer node list
  • sub_nproc_list – ([int]) list of the number of processes for each sub job
  • timeout – (int) # of seconds after which to stop the rapidfire process
  • running_ids_dict – Shared dict between processes to record IDs
Returns:

([multiprocessing.Process]) all the created processes

fireworks.features.stats module

class fireworks.features.stats.FWStats(lpad)
__init__(lpad)

Object to get Fireworks running stats from a LaunchPad.

Parameters:lpad – (LaunchPad) A LaunchPad object that manages the Fireworks database

get_daily_completion_summary(query_start=None, query_end=None, query=None, time_field=u'time_end', **args)

Get daily summary of fireworks for a specified time range.

Parameters:
  • query_start – (str) The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
  • query_end – (str) The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
  • query – (dict) Additional Pymongo queries to filter entries for process.
  • time_field – (str) The field to query time range. Default is “time_end”.
  • args – (dict) Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns:

(list) A summary of daily fireworks stats for the specified time range.
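The time-range rules shared by the FWStats summary methods (explicit query_start, or a timedelta-style offset back from query_end, defaulting to 30 days) can be sketched as follows. This is an illustration built only from the parameter descriptions: resolve_time_range is a hypothetical helper, not a FireWorks function, and the use of local time for "now" is an assumption.

```python
from datetime import datetime, timedelta


def resolve_time_range(query_start=None, query_end=None, **args):
    """Return (start, end) ISO strings following the documented defaults."""
    if query_start is not None and args:
        raise ValueError("args and query_start cannot be given at the same time")
    # End defaults to the current time (timezone handling is an assumption).
    end = datetime.fromisoformat(query_end) if query_end else datetime.now()
    if query_start is not None:
        start = datetime.fromisoformat(query_start)
    else:
        # Extra keyword arguments are passed straight to datetime.timedelta.
        delta = timedelta(**args) if args else timedelta(days=30)
        start = end - delta
    return start.isoformat(), end.isoformat()


start, end = resolve_time_range(query_end="2024-03-31T00:00:00", days=7)
```

Under these assumptions, passing days=7 yields a window covering the seven days before query_end, with the start inclusive and the end exclusive as described in the parameter list.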

get_fireworks_summary(query_start=None, query_end=None, query=None, time_field=u'updated_on', **args)

Get fireworks summary for a specified time range.

Parameters:
  • query_start – (str) The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
  • query_end – (str) The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
  • query – (dict) Additional Pymongo queries to filter entries for process.
  • time_field – (str) The field to query time range. Default is “updated_on”.
  • args – (dict) Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns:

(list) A summary of fireworks stats for the specified time range.

get_launch_summary(query_start=None, query_end=None, time_field=u'time_end', query=None, runtime_stats=False, include_ids=False, **args)

Get launch summary for a specified time range.

Parameters:
  • query_start – (str) The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
  • query_end – (str) The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
  • time_field – (str) The field to query time range. Default is “time_end”.
  • query – (dict) Additional Pymongo queries to filter entries for process.
  • runtime_stats – (bool) Whether to return runtime stats. Default is False.
  • include_ids – (bool) Whether to return fw_ids. Default is False.
  • args – (dict) Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start cannot be given at the same time. Default is 30 days.
Returns:

(list) A summary of launch stats for the specified time range.

get_workflow_summary(query_start=None, query_end=None, query=None, time_field=u'updated_on', **args)

Get workflow summary for a specified time range.

Parameters:
  • query_start – (str) The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
  • query_end – (str) The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
  • query – (dict) Additional Pymongo queries to filter entries for process.
  • time_field – (str) The field to query time range. Default is “updated_on”.
  • args – (dict) Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns:

(list) A summary of workflow stats for the specified time range.

group_fizzled_fireworks(group_by, query_start=None, query_end=None, query=None, include_ids=False, **args)

Group fizzled fireworks for a specified time range by a specified key.

Parameters:
  • group_by – (str) Database field used to group fireworks items.
  • query_start – (str) The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
  • query_end – (str) The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
  • query – (dict) Additional Pymongo queries to filter entries for process.
  • include_ids – (bool) Whether to return fw_ids. Default is False.
  • args – (dict) Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start cannot be given at the same time. Default is 30 days.
Returns:

(list) A summary of fizzled fireworks grouped by the specified key.

identify_catastrophes(error_ratio=0.01, query_start=None, query_end=None, query=None, time_field=u'time_end', include_ids=True, **args)

Get days with a high failure ratio.

Parameters:
  • error_ratio – (float) Threshold of error ratio to define as a catastrophic day
  • query_start – (str) The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
  • query_end – (str) The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
  • query – (dict) Additional Pymongo queries to filter entries for process.
  • time_field – (str) The field to query time range. Default is “time_end”.
  • include_ids – (bool) Whether to return fw_ids. Default is True.
  • args – (dict) Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start cannot be given at the same time. Default is 30 days.
Returns:

(list) Dates with a high failure ratio, optionally with the failed fw_ids.
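The thresholding idea behind error_ratio can be sketched on plain per-day counts. The specifics are assumptions, since this section does not define them: catastrophic_days_sketch is a hypothetical name, the ratio is taken as fizzled / (completed + fizzled), the comparison is strict, and the input dictionary layout is invented for the example.

```python
def catastrophic_days_sketch(daily_counts, error_ratio=0.01):
    """Return the days whose failure ratio exceeds error_ratio."""
    flagged = []
    for day, counts in sorted(daily_counts.items()):
        total = counts["completed"] + counts["fizzled"]
        # Skip days with no finished launches to avoid dividing by zero.
        if total and counts["fizzled"] / total > error_ratio:
            flagged.append(day)
    return flagged


daily = {
    "2024-03-01": {"completed": 99, "fizzled": 1},
    "2024-03-02": {"completed": 90, "fizzled": 10},
    "2024-03-03": {"completed": 0, "fizzled": 0},
}
bad_days = catastrophic_days_sketch(daily, error_ratio=0.05)
```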

Module contents