fireworks.features package

Submodules

fireworks.features.background_task module

class fireworks.features.background_task.BackgroundTask(tasks, num_launches=0, sleep_time=60, run_on_finish=False)

Bases: fireworks.utilities.fw_serializers.FWSerializable, object

classmethod from_dict(*args, **kwargs)
to_dict(*args, **kwargs)

fireworks.features.dupefinder module

class fireworks.features.dupefinder.DupeFinderBase

Bases: fireworks.utilities.fw_serializers.FWSerializable

This serves as an abstract class for implementing duplicate finders.

classmethod from_dict(m_dict)
query(spec)

Given a spec, returns a database query that gives potential candidates for duplicated Fireworks.

Parameters:spec (dict) – spec to check for duplicates
to_dict(*args, **kwargs)
verify(spec1, spec2)

Checks whether two specs are identical enough to be considered duplicates. Returns True if they are duplicates.

Parameters:
  • spec1 (dict) –
  • spec2 (dict) –

Returns:bool
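The query/verify split described above can be illustrated with a minimal sketch. The class name FormulaDupeFinder and the "formula" spec key are hypothetical and chosen only for illustration. In real use such a class would presumably subclass DupeFinderBase; it is kept standalone here so that it runs without FireWorks installed.

```python
class FormulaDupeFinder:
    """Hypothetical duplicate finder keyed on a 'formula' entry in the spec.

    Standalone stand-in for a DupeFinderBase subclass (assumption: a real
    implementation would inherit from DupeFinderBase).
    """

    def query(self, spec):
        # Cheap database-side filter: return a MongoDB-style query that
        # selects candidate Fireworks with the same 'formula' value.
        return {"spec.formula": spec.get("formula")}

    def verify(self, spec1, spec2):
        # Final check on each candidate: the specs count as duplicates
        # when both carry the same, non-missing 'formula' value.
        formula = spec1.get("formula")
        return formula is not None and formula == spec2.get("formula")


finder = FormulaDupeFinder()
print(finder.query({"formula": "Fe2O3", "user": "alice"}))
print(finder.verify({"formula": "Fe2O3", "user": "alice"},
                    {"formula": "Fe2O3", "user": "bob"}))
```

The design idea is that query() narrows the search inside the database, while verify() makes the final decision on each candidate.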

fireworks.features.fw_report module

class fireworks.features.fw_report.FWReport(lpad)
get_stats(coll='fireworks', interval='days', num_intervals=5, additional_query=None)

Compile statistics of completed Fireworks/Workflows for the past <num_intervals> <interval>, e.g. the past 5 days.

Parameters:
  • coll (str) – collection, either “fireworks”, “workflows”, or “launches”
  • interval (str) – one of “minutes”, “hours”, “days”, “months”, “years”
  • num_intervals (int) – number of intervals to go back in time from present moment
  • additional_query (dict) – additional constraints on reporting
Returns:

list, with each item being a dictionary of statistics for a given interval

get_stats_str(decorated_stat_list)

Convert the list of stats from FWReport.get_stats() to a string representation for viewing.

Parameters:decorated_stat_list ([dict]) –
Returns:str

fireworks.features.introspect module

class fireworks.features.introspect.Introspector(lpad)
introspect_fizzled(coll='fws', rsort=True, threshold=10, limit=100)
print_report(table, coll)
fireworks.features.introspect.collect_stats(list_keys, filter_truncated=True)

Turns a list of keys (from flatten_to_keys) into a dict of <str>:count, i.e. counts the number of times each key appears.

Parameters:
  • list_keys
  • filter_truncated (bool) –
Returns:

dict

fireworks.features.introspect.compare_stats(statsdict1, numsamples1, statsdict2, numsamples2, threshold=5)
fireworks.features.introspect.flatten_to_keys(curr_doc, curr_recurs=1, max_recurs=2)

Converts a dictionary into a list of keys, each a string of the form “key1.key2:val”.

Parameters:
  • curr_doc
  • curr_recurs (int) –
  • max_recurs (int) –
Returns:

str
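The flatten-then-count idea behind flatten_to_keys and collect_stats can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the library's code: the helper names flatten_sketch and count_keys_sketch are hypothetical, and the curr_recurs/max_recurs and filter_truncated behavior is not reproduced.

```python
from collections import Counter


def flatten_sketch(doc, prefix=""):
    """Flatten a nested dict into 'key1.key2:val' strings (illustrative only)."""
    keys = []
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, extending the dotted path.
            keys.extend(flatten_sketch(value, path))
        else:
            keys.append(f"{path}:{value}")
    return keys


def count_keys_sketch(list_keys):
    """Count how many times each flattened key string appears."""
    return dict(Counter(list_keys))


docs = [
    {"spec": {"code": "vasp", "nodes": 2}},
    {"spec": {"code": "vasp", "nodes": 4}},
]
all_keys = [k for d in docs for k in flatten_sketch(d)]
print(count_keys_sketch(all_keys))
```

Counting flattened key strings in this way makes it possible to see which key/value pairs occur most often across a set of documents.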

fireworks.features.multi_launcher module

fireworks.features.multi_launcher.launch_multiprocess(launchpad, fworker, loglvl, nlaunches, num_jobs, sleep_time, total_node_list=None, ppn=1, timeout=None, exclude_current_node=False, local_redirect=False)

Launch the jobs in the job packing mode.

Parameters:
  • launchpad (LaunchPad) –
  • fworker (FWorker) –
  • loglvl (str) – level at which to output logs
  • nlaunches (int) – 0 means ‘until completion’, -1 or “infinite” means to loop forever
  • num_jobs (int) – number of sub jobs
  • sleep_time (int) – secs to sleep between rapidfire loop iterations
  • total_node_list ([str]) – contents of NODEFILE (doesn’t affect execution)
  • ppn (int) – processors per node (doesn’t affect execution)
  • timeout (int) – # of seconds after which to stop the rapidfire process
  • exclude_current_node (bool) – Don’t use the script launching node as a compute node
  • local_redirect (bool) – redirect standard input and output to local file
fireworks.features.multi_launcher.ping_multilaunch(port, stop_event)

A single manager to ping all launches during multiprocess launches

Parameters:
  • port (int) – Listening port number of the DataServer
  • stop_event (Thread.Event) – stop event
fireworks.features.multi_launcher.rapidfire_process(fworker, nlaunches, sleep, loglvl, port, node_list, sub_nproc, timeout, running_ids_dict, local_redirect)

Initializes shared data with multiprocessing parameters and starts a rapidfire.

Parameters:
  • fworker (FWorker) – object
  • nlaunches (int) – 0 means ‘until completion’, -1 or “infinite” means to loop forever
  • sleep (int) – secs to sleep between rapidfire loop iterations
  • loglvl (str) – level at which to output logs to stdout
  • port (int) – Listening port number of the shared object manager
  • password (str) – security password to access the server
  • node_list ([str]) – computer node list
  • sub_nproc (int) – number of processors of the sub job
  • timeout (int) – # of seconds after which to stop the rapidfire process
  • local_redirect (bool) – redirect standard input and output to local file
fireworks.features.multi_launcher.split_node_lists(num_jobs, total_node_list=None, ppn=24)

Parse node list and processor list from nodefile contents

Parameters:
  • num_jobs (int) – number of sub jobs
  • total_node_list (list of str) – the node list of the whole large job
  • ppn (int) – number of processors per node
Returns:

(([int],[int])) the node list and processor list for each job
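One plausible way to split a node list among sub jobs is sketched below. It assumes that the nodes are de-duplicated and divided evenly, and that each sub job gets ppn processors per assigned node; the function name split_nodes_sketch is hypothetical and the actual split_node_lists may behave differently.

```python
def split_nodes_sketch(num_jobs, total_node_list=None, ppn=24):
    """Divide nodes evenly among sub jobs (illustrative assumption)."""
    if total_node_list:
        # The same host may appear several times; keep each one once.
        nodes = sorted(set(total_node_list))
        if len(nodes) % num_jobs != 0:
            raise ValueError("number of nodes must be divisible by num_jobs")
        per_job = len(nodes) // num_jobs
        node_lists = [nodes[i:i + per_job] for i in range(0, len(nodes), per_job)]
        sub_nproc_list = [per_job * ppn] * num_jobs
    else:
        # Without node information, every sub job simply gets ppn processors.
        node_lists = [None] * num_jobs
        sub_nproc_list = [ppn] * num_jobs
    return node_lists, sub_nproc_list


print(split_nodes_sketch(2, ["n1", "n2", "n3", "n4"], ppn=24))
```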

fireworks.features.multi_launcher.start_rockets(fworker, nlaunches, sleep, loglvl, port, node_lists, sub_nproc_list, timeout=None, running_ids_dict=None, local_redirect=False)

Create each sub job and start a rocket launch in each one

Parameters:
  • fworker (FWorker) – object
  • nlaunches (int) – 0 means ‘until completion’, -1 or “infinite” means to loop forever
  • sleep (int) – secs to sleep between rapidfire loop iterations
  • loglvl (str) – level at which to output logs to stdout
  • port (int) – Listening port number
  • node_lists ([str]) – computer node list
  • sub_nproc_list ([int]) – number of processors for each sub job
  • timeout (int) – # of seconds after which to stop the rapidfire process
  • running_ids_dict (dict) – dict shared between processes to record IDs
  • local_redirect (bool) – redirect standard input and output to local file
Returns:

([multiprocessing.Process]) all the created processes

fireworks.features.stats module

class fireworks.features.stats.FWStats(lpad)
get_daily_completion_summary(query_start=None, query_end=None, query=None, time_field=u'time_end', **args)

Get daily summary of fireworks for a specified time range.

Parameters:
  • query_start (str) – The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
  • query_end (str) – The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
  • query (dict) – Additional Pymongo queries to filter entries for process.
  • time_field (str) – The field to query time range. Default is “time_end”.
  • args (dict) – Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns:

(list) A summary of daily fireworks stats for the specified time range.

get_fireworks_summary(query_start=None, query_end=None, query=None, time_field=u'updated_on', **args)

Get fireworks summary for a specified time range.

Parameters:
  • query_start (str) – The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
  • query_end (str) – The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
  • query (dict) – Additional Pymongo queries to filter entries for process.
  • time_field (str) – The field to query time range. Default is “updated_on”.
  • args (dict) – Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns:

(list) A summary of fireworks stats for the specified time range.
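The interplay of query_start, query_end and the timedelta keyword arguments described above can be sketched as follows. This is a hedged illustration, not the FWStats implementation: the helper names resolve_time_range and time_filter and the now parameter are hypothetical, and storing times as ISO-format strings is an assumption.

```python
from datetime import datetime, timedelta


def resolve_time_range(query_start=None, query_end=None, now=None, **args):
    """Return (start, end) datetimes following the documented defaults."""
    if query_start is not None and args:
        raise ValueError("args and query_start cannot be given at the same time")
    # End defaults to the current time; 'now' is injectable for testing.
    end = datetime.fromisoformat(query_end) if query_end else (now or datetime.now())
    if query_start is not None:
        start = datetime.fromisoformat(query_start)
    else:
        # Start is derived from end: timedelta(**args), or 30 days by default.
        start = end - (timedelta(**args) if args else timedelta(days=30))
    return start, end


def time_filter(time_field, start, end):
    """MongoDB-style range filter: start inclusive, end exclusive."""
    return {time_field: {"$gte": start.isoformat(), "$lt": end.isoformat()}}


start, end = resolve_time_range(query_end="2024-03-31T00:00:00", days=7)
print(time_filter("updated_on", start, end))
```

With this sketch, passing days=7 selects the week before query_end, while omitting both query_start and the keyword arguments falls back to the 30-day window.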

get_launch_summary(query_start=None, query_end=None, time_field=u'time_end', query=None, runtime_stats=False, include_ids=False, **args)

Get launch summary for a specified time range.

Parameters:
  • query_start (str) – The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
  • query_end (str) – The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
  • time_field (str) – The field to query time range. Default is “time_end”.
  • query (dict) – Additional Pymongo queries to filter entries for process.
  • runtime_stats (bool) – Whether to return runtime stats. Default is False.
  • include_ids (bool) – Whether to return fw_ids. Default is False.
  • args (dict) – Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns:

(list) A summary of launch stats for the specified time range.

get_workflow_summary(query_start=None, query_end=None, query=None, time_field=u'updated_on', **args)

Get workflow summary for a specified time range.

Parameters:
  • query_start (str) – The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
  • query_end (str) – The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
  • query (dict) – Additional Pymongo queries to filter entries for process.
  • time_field (str) – The field to query time range. Default is “updated_on”.
  • args (dict) – Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns:

(list) A summary of workflow stats for the specified time range.

group_fizzled_fireworks(group_by, query_start=None, query_end=None, query=None, include_ids=False, **args)

Group fizzled fireworks for a specified time range by a specified key.

Parameters:
  • group_by (str) – Database field used to group fireworks items.
  • query_start (str) – The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
  • query_end (str) – The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
  • query (dict) – Additional Pymongo queries to filter entries for process.
  • include_ids (bool) – Whether to return fw_ids. Default is False.
  • args (dict) – Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns:

(list) A summary of fizzled fireworks grouped by the specified key.

identify_catastrophes(error_ratio=0.01, query_start=None, query_end=None, query=None, time_field=u'time_end', include_ids=True, **args)

Get days with a high failure ratio.

Parameters:
  • error_ratio (float) – Threshold of error ratio to define as a catastrophic day
  • query_start (str) – The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
  • query_end (str) – The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
  • query (dict) – Additional Pymongo queries to filter entries for process.
  • time_field (str) – The field to query time range. Default is “time_end”.
  • include_ids (bool) – Whether to return fw_ids. Default is True.
  • args (dict) – Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns:

(list) Dates with a high failure ratio, optionally with the failed fw_ids.
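The thresholding idea behind error_ratio can be sketched as below. Everything about the data shape is an assumption made for illustration: the per-day records with "date", "completed" and "fizzled" counts, the ratio being fizzled over the total, and the inclusive comparison are all hypothetical, as is the function name flag_catastrophes.

```python
def flag_catastrophes(daily_counts, error_ratio=0.01):
    """Return the days whose failure ratio reaches the threshold."""
    flagged = []
    for day in daily_counts:
        total = day["completed"] + day["fizzled"]
        if total == 0:
            # Nothing ran on this day, so no ratio can be computed.
            continue
        ratio = day["fizzled"] / total
        if ratio >= error_ratio:
            flagged.append({"date": day["date"], "ratio": ratio})
    return flagged


# Hypothetical per-day counts, not real FireWorks output.
days = [
    {"date": "2024-03-01", "completed": 1000, "fizzled": 1},
    {"date": "2024-03-02", "completed": 90, "fizzled": 10},
    {"date": "2024-03-03", "completed": 0, "fizzled": 0},
]
print(flag_catastrophes(days))
```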

Module contents