hstlc Package Modules¶
database.database_interface module¶
This module serves as the interface and connection module to the hstlc
database. The load_connection() function within allows the user
to connect to the database via the session, base, and
engine objects (described below). The classes within serve as the
object-relational mappings (ORMs) that define the individual tables of
the database, and are used to build the tables via the base object.
The engine object serves as the low-level database API and perhaps
most importantly contains the dialects that allow the sqlalchemy module
to communicate with the database.
The base object serves as a base class for class definitions. It
produces Table objects and constructs ORMs.
The session object manages operations on ORM-mapped objects, as
constructed by the base. These operations include querying, for
example.
Authors:
Matthew Bourque
Use:
This module is intended to be imported from various hstlc modules and scripts. The objects that are importable from this module are as follows:
from lightcurve_pipeline.database.database_interface import engine
from lightcurve_pipeline.database.database_interface import base
from lightcurve_pipeline.database.database_interface import session
from lightcurve_pipeline.database.database_interface import Metadata
from lightcurve_pipeline.database.database_interface import Outputs
from lightcurve_pipeline.database.database_interface import BadData
from lightcurve_pipeline.database.database_interface import Stats
Dependencies:
- Users must have access to the hstlc database
- Users must also have a config.yaml file located in the lightcurve_pipeline/utils/ directory with the following keys:
db_connection_string - The hstlc database connection string
- Other external library dependencies include:
pymysql, sqlalchemy, lightcurve_pipeline
-
lightcurve_pipeline.database.database_interface.load_connection(connection_string, echo=False)¶ Create and return a connection to the database given in the connection string.
Parameters: connection_string : str
A string that points to the database connection. The connection string is in the following form:
dialect+driver://username:password@host:port/database
echo : bool
Show all SQL produced
Returns: session : session object
Provides a holding zone for all objects loaded or associated with the database.
base : base object
Provides a base class for declarative class definitions.
engine : engine object
Provides a source of database connectivity and behavior.
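The connection string form given above can be pulled apart with the standard library to see which piece is which. This is only an illustrative sketch: the helper name `describe_connection_string` and the example credentials are hypothetical, and nothing here is meant to show how `load_connection()` itself handles the string.

```python
from urllib.parse import urlsplit


def describe_connection_string(connection_string):
    """Split a dialect+driver://username:password@host:port/database string.

    Hypothetical helper for illustration only; it is not part of hstlc.
    """
    parts = urlsplit(connection_string)
    # The scheme holds "dialect+driver"; the "+driver" part may be absent.
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


# Made-up credentials, shown only to illustrate the layout of the string.
info = describe_connection_string("mysql+pymysql://user:secret@localhost:3306/hstlc")
print(info["dialect"], info["driver"], info["host"], info["port"], info["database"])
```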
-
lightcurve_pipeline.database.database_interface.get_session()¶ Return the session object of the database connection.
In many cases, all that is needed is the session object to interact with the database. This function can be used just to establish a connection and retrieve the session object.
Returns: session : sqlalchemy.orm.session.Session
Provides a holding zone for all objects loaded or associated with the database.
database.update_database module¶
This module serves as an interface for updating the various tables of the hstlc database, either by inserting new records or by updating existing ones.
Authors:
Matthew Bourque
Use:
This module is intended to be imported from the various hstlc scripts, as such:
from lightcurve_pipeline.database.update_database import update_bad_data_table
from lightcurve_pipeline.database.update_database import update_metadata_table
from lightcurve_pipeline.database.update_database import update_stats_table
from lightcurve_pipeline.database.update_database import update_outputs_table
Dependencies:
- Users must have access to the hstlc database
- Users must also have a config.yaml file located in the lightcurve_pipeline/utils/ directory with the following keys:
db_connection_string - The hstlc database connection string
- Other external library dependencies include:
pymysql, sqlalchemy, lightcurve_pipeline
-
lightcurve_pipeline.database.update_database.update_bad_data_table(filename, reason)¶ Insert or update a record pertaining to the filename in the bad_data table
Parameters: filename : string
The filename of the file
reason : string
The reason that the data is bad. Can be one of No events, Bad EXPFLAG, Non-linear time, Singular event, Bad Proposal, or Short Exposure.
-
lightcurve_pipeline.database.update_database.update_metadata_table(metadata_dict)¶ Insert or update a record in the metadata table containing the metadata_dict information
Parameters: metadata_dict : dict
A dictionary containing metadata of the file. Each key of the metadata_dict corresponds to a column in the metadata table of the database.
-
lightcurve_pipeline.database.update_database.update_outputs_table(metadata_dict, outputs_dict)¶ Insert or update a record in the outputs table containing output product information
Parameters: metadata_dict : dict
A dictionary containing metadata of the file. Each key of the metadata_dict corresponds to a column in the metadata table of the database.
outputs_dict : dict
A dictionary containing output product information. Each key of the outputs_dict corresponds to a column in the outputs table of the database.
-
lightcurve_pipeline.database.update_database.update_stats_table(stats_dict, dataset)¶ Insert or update a record in the stats table for the given dataset containing the lightcurve product statistics given in the stats_dict
Parameters: stats_dict : dict
A dictionary containing the lightcurve statistics. Each key of stats_dict corresponds to a column in the stats table of the database.
dataset : string
The path to the lightcurve product
ingest.make_lightcurves module¶
ingest.resolve_target module¶
This module contains functions that attempt to resolve target names
(i.e. TARGNAME) to a more common option, if possible. The method
for doing this is as follows:
- Look up the targname in the hard-coded targname_dict dictionary. If it exists, then use that targname
- If no dictionary entry exists, look up the targname in the CDS web service [1]
- If the CDS web service returns resolved target names, and one of those target names already exists in the metadata table, then use that targname
- If the targname cannot be resolved through any of these steps, then use the original targname
The hard-coded targname_dict dictionary resides in the
utils.targname_dict module.
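The lookup order described above can be sketched in a few lines. This is a simplified illustration, not the module's implementation: the function name `get_targname_sketch`, the `resolve` callable standing in for the CDS query, the `known_targnames` argument standing in for the metadata table, and the choice of the alphabetically first match when several resolved names are known are all assumptions made for the example.

```python
def get_targname_sketch(targname, targname_dict, resolve, known_targnames):
    """Illustrative version of the lookup order described above.

    All arguments are stand-ins: ``resolve`` is any callable returning a set
    of alternative names, and ``known_targnames`` plays the role of the
    target names already present in the metadata table.
    """
    # 1. Prefer the hard-coded dictionary entry, if there is one.
    if targname in targname_dict:
        return targname_dict[targname]

    # 2. Otherwise ask the resolver and keep a name that is already known.
    #    Picking the first match in sorted order is an assumption.
    matches = sorted(resolve(targname) & set(known_targnames))
    if matches:
        return matches[0]

    # 3. Nothing resolved, so fall back to the original name.
    return targname


example_dict = {"SATURN1": "SATURN"}


def fake_resolve(name):
    # Stand-in for the CDS query; the aliases are made up.
    return {"ALIAS-ONE", "ALIAS-TWO"} if name == "SOME-TARGET" else set()


print(get_targname_sketch("SATURN1", example_dict, fake_resolve, []))
print(get_targname_sketch("SOME-TARGET", example_dict, fake_resolve, ["ALIAS-TWO"]))
print(get_targname_sketch("UNKNOWN", example_dict, fake_resolve, ["ALIAS-TWO"]))
```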
Authors:
Justin Ely, Matthew Bourque
Use:
This module is intended to be imported and used by the ingest_hstlc script as such:
from lightcurve_pipeline.ingest.resolve_target import get_targname
get_targname(targname)
Dependencies:
- Users must have access to the CDS web service
- Users must have access to the hstlc database
- Users must have a config.yaml file located in the lightcurve_pipeline/utils/ directory with the following keys:
db_connection_string - The hstlc database connection string
- Other external library dependencies include:
pymysql, sqlalchemy, lightcurve, lightcurve_pipeline
References:
[1] Centre de Donnees astronomiques de Strasbourg (http://cdsweb.u-strasbg.fr/)
-
lightcurve_pipeline.ingest.resolve_target.get_targname(targname)¶ Resolve the targname to a better option, if available. If the targname cannot be resolved, the original targname is returned.
Parameters: targname : str
The name of the target
Returns: new_targname : str
The resolved target name
-
lightcurve_pipeline.ingest.resolve_target.resolve(targname)¶ Resolve target name via the CDS web service
Parameters: targname : str
The name of the target
Returns: other_names : set
The set of other names that the target resolves to
quality.data_checks module¶
Perform data quality checks for the given dataset. The dataset is checked for a number of issues, which include:
- A non-normal EXPFLAG, indicating that something went wrong during the observation
- A TIME column in which time does not progress linearly
- A dataset not having any events
- A dataset in which all events occur at a single time
- A dataset that is part of a problematic proposal
- A dataset with an exposure time that is too short
Datasets that do not pass these checks are moved to the
bad_data_dir, as determined by the config file (see below).
Authors:
Justin Ely, Matthew Bourque
Use:
This module is intended to be imported and used by the ingest_hstlc script as such:
from lightcurve_pipeline.quality.data_checks import dataset_ok
dataset_ok(dataset)
Dependencies:
- Users must have access to the hstlc database
- Users must also have a config.yaml file located in the lightcurve_pipeline/utils/ directory with the following keys:
db_connection_string - The hstlc database connection string
bad_data_dir - The directory in which bad data files are stored
- Other external library dependencies include:
astropy, lightcurve_pipeline, pymysql, sqlalchemy
-
lightcurve_pipeline.quality.data_checks.check_bad_proposal(hdu)¶ Check that the proposal ID is not in a list of known ‘bad’ programs. Programs can be bad for a number of reasons, typically because of specialized calibration purposes like focus sweeps or high-voltage tests.
Parameters: hdu : astropy.io.fits.hdu.hdulist.HDUList
The hdulist of the dataset
Returns: success : boolean
True if events are not from a known bad proposal, False otherwise
reason : string
An empty string if success is True, Bad Proposal otherwise
-
lightcurve_pipeline.quality.data_checks.check_expflag(hdu)¶ Check that the EXPFLAG keyword is NORMAL
Parameters: hdu : astropy.io.fits.hdu.hdulist.HDUList
The hdulist of the dataset
Returns: success : boolean
True if the EXPFLAG is NORMAL, False otherwise
reason : string
An empty string if success is True, Bad EXPFLAG otherwise
-
lightcurve_pipeline.quality.data_checks.check_exptime(hdu)¶ Check that the dataset exptime is not too short. The threshold is initially set to 1 second to filter out a small subset of very short exposures.
Parameters: hdu : astropy.io.fits.hdu.hdulist.HDUList
The hdulist of the dataset
Returns: success : boolean
True if the exptime is greater than the threshold, False otherwise
reason : string
An empty string if success is True, Short Exposure otherwise
-
lightcurve_pipeline.quality.data_checks.check_linear(hdu)¶ Check that the time column linearly progresses
Parameters: hdu : astropy.io.fits.hdu.hdulist.HDUList
The hdulist of the dataset
Returns: success : boolean
True if time progresses linearly, False otherwise
reason : string
An empty string if success is True, Non-linear time otherwise
-
lightcurve_pipeline.quality.data_checks.check_no_events(hdu)¶ Check that the dataset has events
Parameters: hdu : astropy.io.fits.hdu.hdulist.HDUList
The hdulist of the dataset
Returns: success : boolean
True if the dataset has events, False otherwise
reason : string
An empty string if success is True, No events otherwise
-
lightcurve_pipeline.quality.data_checks.check_not_singular(hdu)¶ Check that the events in the dataset are not from a single time
Parameters: hdu : astropy.io.fits.hdu.hdulist.HDUList
The hdulist of the dataset
Returns: success : boolean
True if events are not from a single time, False otherwise
reason : string
An empty string if success is True, Singular event otherwise
-
lightcurve_pipeline.quality.data_checks.dataset_ok(filename, move=True)¶ Perform quality checks on the given dataset. If the dataset doesn’t pass, update the bad_data table and move the dataset to the bad_data directory.
Parameters: filename : string
The full path to the dataset
move : bool, optional
Whether or not to update the bad_data table and move the file
Returns: True if the dataset passes all of the quality checks, False if it doesn’t.
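The check functions documented above share a (success, reason) return convention, which makes them easy to chain. The sketch below illustrates that convention for three of the checks. It is a simplification under stated assumptions: the real functions take an astropy HDUList, whereas these take a plain list of event times and an exposure time in seconds, and both the helper `first_failure` and the order in which the checks run are hypothetical.

```python
def check_no_events(times):
    """Pass when the dataset contains at least one event."""
    success = len(times) > 0
    return success, "" if success else "No events"


def check_not_singular(times):
    """Pass when the events do not all share a single time."""
    success = len(set(times)) > 1
    return success, "" if success else "Singular event"


def check_exptime(exptime, threshold=1.0):
    """Pass when the exposure time (seconds) is greater than the threshold."""
    success = exptime > threshold
    return success, "" if success else "Short Exposure"


def first_failure(times, exptime):
    """Run the checks in order; return the first failing reason, or ''."""
    checks = [
        lambda: check_no_events(times),
        lambda: check_not_singular(times),
        lambda: check_exptime(exptime),
    ]
    for check in checks:
        success, reason = check()
        if not success:
            return reason
    return ""


print(repr(first_failure([], 100.0)))
print(repr(first_failure([5.0, 5.0], 100.0)))
print(repr(first_failure([1.0, 2.0], 0.5)))
print(repr(first_failure([1.0, 2.0], 100.0)))
```

A dataset passes in this sketch when `first_failure` returns an empty string; any other value is a reason string of the kind accepted by `update_bad_data_table`.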
-
lightcurve_pipeline.quality.data_checks.move_file(filename)¶ Move the given dataset to the bad_data directory
Parameters: filename : string
The full path to the dataset
utils.periodogram_stats module¶
Generate lomb-scargle periodogram statistics. The periodogram
statistics are used in the stats table in the hstlc database as
well as the periodogram plots generated by the make_hstlc_plots
script.
Authors:
Matthew Bourque
Use:
This module is intended to be imported and used by the build_stats_table script as such:
from lightcurve_pipeline.utils.periodogram_stats import get_periodogram_stats
periods, power, mean, three_sigma, significant_periods, significant_powers = get_periodogram_stats(dataset, freq_space)
Dependencies:
- External library dependencies include:
astropy, numpy, scipy
-
lightcurve_pipeline.utils.periodogram_stats.get_periodogram_stats(dataset, freq_space)¶ Find significant periods from the given dataset and frequency space using a lomb-scargle periodogram.
Parameters: dataset : string
The path to the lightcurve product.
freq_space : string
Can be short, med, or long. This defines the frequency space to look for significant periods. short is defined as the range (STEPSIZE, 10 minutes), med is (10 minutes, 1 hour), and long is (1 hour, 10 hours).
Returns: periods : numpy array
An array of the periods to check
power : numpy array
An array of the lomb-scargle powers corresponding to each period
mean : float
The mean of the lomb-scargle powers
std : float
The standard deviation of the lomb-scargle powers
three_sigma : float
Three standard deviations above the mean of the lomb-scargle powers
significant_periods : list
A list of the periods that have powers greater than 3-sigma above the mean
significant_powers : list
A list of the lomb-scargle powers that are greater than 3-sigma above the mean
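The relationship between the returned statistics can be shown with a small standalone calculation. This sketch assumes the periodogram powers have already been computed and only illustrates the thresholding step. The function name `significance_stats`, the use of the population standard deviation, and the sample values are assumptions made for the example.

```python
from statistics import mean as average, pstdev


def significance_stats(periods, power):
    """Return mean, std, three_sigma and the periods/powers above three_sigma."""
    mean = average(power)
    # Population standard deviation; whether hstlc uses this form or the
    # sample form is an assumption of this sketch.
    std = pstdev(power)
    three_sigma = mean + 3 * std
    significant = [(p, pw) for p, pw in zip(periods, power) if pw > three_sigma]
    significant_periods = [p for p, _ in significant]
    significant_powers = [pw for _, pw in significant]
    return mean, std, three_sigma, significant_periods, significant_powers


# Made-up values: a flat periodogram with a single strong peak.
periods = list(range(1, 21))
power = [1.0] * 19 + [10.0]
mean, std, three_sigma, sig_periods, sig_powers = significance_stats(periods, power)
print(sig_periods, sig_powers)
```

With these values only the last period exceeds the threshold, since the peak of 10.0 lies above the mean plus three standard deviations.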
utils.targname_dict module¶
Define a targname_dict that is used as a lookup table for
resolving target names. The targname_dict is composed of two
major sets of key/value pairs:
- Target names that need resolving
- Target names that do not need resolving
The target names that need resolving are typically those that use a common name, but have variations dealing with hyphens (e.g. AZV-148 instead of AZV148) or indexing (e.g. SATURN1 instead of SATURN). The target names that don’t need resolving are ones that already have established common names (e.g. CALLISTO), or incredibly unique names that will never be resolved to a common name (e.g. 1507476-162738).
Authors:
Matthew Bourque
Use:
This module is intended to be imported and used by the resolve_target module as such:
from lightcurve_pipeline.utils.targname_dict import targname_dict
utils.utils module¶
This module houses several functions that are key to several modules and scripts within the hstlc package. Please see the individual function documentation for more information.
Authors:
Matthew Bourque
Use:
The functions within this module are intended to be imported by the various hstlc scripts and modules, as such:
from lightcurve_pipeline.utils.utils import SETTINGS
from lightcurve_pipeline.utils.utils import insert_or_update
from lightcurve_pipeline.utils.utils import set_permissions
from lightcurve_pipeline.utils.utils import setup_logging
from lightcurve_pipeline.utils.utils import make_directory
Dependencies:
- External library dependencies include:
astropy, lightcurve_pipeline, numpy, pymysql, sqlalchemy
-
lightcurve_pipeline.utils.utils.get_settings()¶ Return the settings stored in the configuration file located in the lightcurve_pipeline/utils/ directory
Returns: data : dict
A dictionary containing the settings present in the config.yaml configuration file. The keys of this dictionary are expected to be:
db_connection_string, ingest_dir, filesystem_dir, outputs_dir, composite_dir, log_dir, download_dir, plot_dir, bad_data_dir, home_dir
The values of the keys are the user-supplied configurations
-
lightcurve_pipeline.utils.utils.insert_or_update(table, data, id_num)¶ Insert or update the given database table with the given data. This function performs the logic of inserting or updating an entry into the hstlc database; if an entry with the given id_num already exists, then the entry is updated, otherwise a new entry is inserted.
Parameters: table : sqlalchemy.ext.declarative.api.DeclarativeMeta
The table of the database to update
data : dict
A dictionary of the information to update. Each key of the dictionary must be a column in the given table
id_num : string
The row ID to update. If id_num is blank, then a new row is inserted instead.
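The insert-or-update decision can be illustrated without the hstlc database. The sketch below uses the standard-library sqlite3 module rather than sqlalchemy, so its signature differs from the function documented above: it takes a connection and a table name instead of an ORM class. The `bad_data` column names and the example filename are also assumptions.

```python
import sqlite3


def insert_or_update(conn, table, data, id_num):
    """Update the row with ``id_num`` if given, otherwise insert a new row.

    Sketch only: ``table`` and the keys of ``data`` are interpolated into the
    SQL text, so they must come from trusted code, never from user input.
    """
    columns = list(data)
    values = [data[c] for c in columns]
    if id_num == "":
        # A blank id_num means there is no existing row, so insert one.
        placeholders = ", ".join("?" for _ in columns)
        conn.execute(
            f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})",
            values,
        )
    else:
        # Otherwise overwrite the given columns of the existing row.
        assignments = ", ".join(f"{c} = ?" for c in columns)
        conn.execute(
            f"UPDATE {table} SET {assignments} WHERE id = ?", values + [id_num]
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bad_data (id INTEGER PRIMARY KEY, filename TEXT, reason TEXT)"
)
new_row = {"filename": "example_corrtag.fits", "reason": "No events"}
insert_or_update(conn, "bad_data", new_row, "")
insert_or_update(conn, "bad_data", {"reason": "Short Exposure"}, 1)
rows = conn.execute("SELECT id, filename, reason FROM bad_data").fetchall()
print(rows)
```

The first call inserts a row because `id_num` is blank; the second call updates that same row in place, leaving a single record in the table.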
-
lightcurve_pipeline.utils.utils.make_directory(directory)¶ Create a directory if it doesn’t already exist and set the hstlc permissions
Parameters: directory : string
The path to the directory
-
lightcurve_pipeline.utils.utils.set_permissions(path)¶ Set the permissions of the file path to hstlc permissions settings. The hstlc permissions settings are groupID = hstlc and permissions of rwxrwx---.
Parameters: path : string
The path to the file
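A minimal sketch of how `make_directory` and `set_permissions` might fit together on a POSIX system is shown below. It is an illustration under assumptions rather than the package's code: the optional `group` argument is hypothetical and is left unset in the example, because assigning the `hstlc` group only works where that group exists and the caller is allowed to use it.

```python
import os
import shutil
import stat
import tempfile


def set_permissions(path, group=None):
    """Give owner and group full access (rwxrwx---); optionally set the group."""
    os.chmod(path, stat.S_IRWXU | stat.S_IRWXG)
    if group is not None:
        # Requires that the group exists and that the caller may assign it.
        shutil.chown(path, group=group)


def make_directory(directory, group=None):
    """Create ``directory`` if it does not exist, then apply the permissions."""
    os.makedirs(directory, exist_ok=True)
    set_permissions(directory, group=group)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "outputs")
    make_directory(target)
    # Keep only the owner/group/other permission bits for display.
    mode = stat.S_IMODE(os.stat(target).st_mode) & 0o777
    print(oct(mode))
```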
-
lightcurve_pipeline.utils.utils.setup_logging(module)¶ This function will configure the logging for the execution of the given module. Logs are written out to the log_dir directory (as determined by the config.yaml file) with the filename <module>_<timestamp>.log.
Parameters: module : string
The name of the module to log
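The log file naming scheme described above could be reproduced along the following lines. This is a sketch with several assumptions: the timestamp layout and the log message format are invented for the example, and `log_dir` is passed in as an argument here, whereas the documented function takes only `module` and reads the directory from config.yaml.

```python
import datetime
import logging
import os
import tempfile


def setup_logging(module, log_dir):
    """Log to <log_dir>/<module>_<timestamp>.log and return the log path."""
    # The timestamp layout is an assumption; the docs only say "<timestamp>".
    timestamp = datetime.datetime.now().strftime("%Y-%m-%d-%H-%M")
    log_file = os.path.join(log_dir, f"{module}_{timestamp}.log")
    handler = logging.FileHandler(log_file)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s: %(message)s")
    )
    logger = logging.getLogger(module)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return log_file


log_dir = tempfile.mkdtemp()  # stand-in for the log_dir key in config.yaml
log_file = setup_logging("ingest_hstlc", log_dir)
logging.getLogger("ingest_hstlc").info("Logging configured")
```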