hstlc Scripts¶
ingest_hstlc script¶
build_stats_table script¶
make_hstlc_plots script¶
Create various plots that deal with the hstlc filesystem, database,
and output products. This script uses multiprocessing. Users can set
the number of cores used via the num_cores setting in the config
file (see below).
Authors:
Justin Ely, Matthew Bourque
Use:
This script is intended to be executed as part of the
hstlc_pipeline shell script. However, users can also execute this script via the command line as such:
>>> make_hstlc_plots
Outputs:
- hlsp_hstlc_*.png - static lightcurve plots for each composite lightcurve, placed in the composite_dir directory, as determined by the config file (see below)
- hlsp_hstlc_*.html - bokeh plots showing a ‘dashboard’ of various plots for each composite lightcurve, placed in the composite_dir directory, as determined by the config file (see below)
- interesting_hstlc.html, boring_hstlc.html, and null_hstlc.html - ‘exploratory’ tables, which are sortable tables that display statistics and plots for each dataset, placed in the plot_dir directory, as determined by the config file (see below)
- exptime_histogram.html - a histogram showing the cumulative exposure time by target in the form of a bokeh plot, placed in the plot_dir directory, as determined by the config file (see below)
- pie_config_cos_fuv.html, pie_config_cos_nuv.html, pie_config_stis_fuv.html, and pie_config_stis_nuv.html - ‘configuration’ pie charts that show the breakdown of datasets by grating/cenwave for each instrument/detector combination, placed in the plot_dir directory, as determined by the config file (see below)
- opt_elem.html - a histogram showing the number of datasets for each filter, placed in the plot_dir directory, as determined by the config file (see below)
- <dataset name>_periodgram.png - Lomb-Scargle periodograms for each dataset (both individual and composite), placed in the plot_dir directory, as determined by the config file (see below). Additionally, periodograms that are deemed interesting are saved in a separate periodogram_subset directory under the plot_dir directory.
- a log file in the log_dir directory, as determined by the config file (see below)
Dependencies:
- Users must have access to the hstlc database
- Users must also have a config.yaml file located in the lightcurve_pipeline/utils/ directory with the following keys:
  - db_connection_string - The hstlc database connection string
  - plot_dir - The path to where hstlc output plots are stored
  - composite_dir - The path to where hstlc composite output products are stored
  - log_dir - The path to where the log file will be stored
  - num_cores - The number of cores to use during multiprocessing
- Other external library dependencies include:
  - astropy
  - bokeh
  - lightcurve_pipeline
  - matplotlib
  - numpy
  - pymysql
  - sqlalchemy
-
lightcurve_pipeline.scripts.make_hstlc_plots.bar_opt_elem()¶ Create a bar chart showing the number of composite lightcurves for each COS & STIS optical element
-
lightcurve_pipeline.scripts.make_hstlc_plots.configuration_piechart()¶ Create a pie chart showing the distribution of configurations for each instrument/detector combination
-
lightcurve_pipeline.scripts.make_hstlc_plots.dataset_dashboard(filename, plot_file='')¶ Creates interactive bokeh ‘dashboard’ plot for the given filename
Parameters: filename : str
The path to the lightcurve
plot_file : str
The path to the PNG plot. The user can supply this argument if they wish to update the plot or save to a specific location.
-
lightcurve_pipeline.scripts.make_hstlc_plots.exploratory_tables()¶ Create html tables containing data from the stats table as well as plots, broken down into interesting, boring, and null results
-
lightcurve_pipeline.scripts.make_hstlc_plots.histogram_exptime()¶ Create a histogram showing the distribution of exposure times for the composite lightcurves
-
lightcurve_pipeline.scripts.make_hstlc_plots.main()¶ The main function of the make_hstlc_plots script
-
lightcurve_pipeline.scripts.make_hstlc_plots.make_exploratory_table(dataset_list, table_name)¶ Create html tables containing data from the stats table as well as plots
Parameters: dataset_list : list
A list of the paths to the datasets to process
table_name : str
The path to the output file
-
lightcurve_pipeline.scripts.make_hstlc_plots.periodogram(dataset)¶ Create a Lomb-Scargle periodogram for the given dataset
Parameters: dataset : string
The path to the dataset
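For readers unfamiliar with the technique, the following is a minimal pure-Python sketch of the classical Lomb-Scargle periodogram for unevenly sampled data. It is not the code used by periodogram(): the function name lomb_scargle, the frequency grid, and the synthetic lightcurve are all assumptions made for illustration, and the pipeline may instead rely on a library implementation (astropy is listed among its dependencies).

```python
import math
import random


def lomb_scargle(times, values, frequencies):
    """Return the classical Lomb-Scargle power at each trial frequency.

    Hypothetical helper for illustration; the power is normalized by the
    sample variance of ``values``.
    """
    n = len(values)
    mean = sum(values) / n
    resid = [v - mean for v in values]
    variance = sum(r * r for r in resid) / (n - 1)

    powers = []
    for freq in frequencies:
        omega = 2.0 * math.pi * freq
        # The time offset tau makes the sine and cosine terms orthogonal
        sin2 = sum(math.sin(2.0 * omega * t) for t in times)
        cos2 = sum(math.cos(2.0 * omega * t) for t in times)
        tau = math.atan2(sin2, cos2) / (2.0 * omega)

        cos_terms = [math.cos(omega * (t - tau)) for t in times]
        sin_terms = [math.sin(omega * (t - tau)) for t in times]
        yc = sum(r * c for r, c in zip(resid, cos_terms))
        ys = sum(r * s for r, s in zip(resid, sin_terms))
        cc = sum(c * c for c in cos_terms)
        ss = sum(s * s for s in sin_terms)
        powers.append((yc * yc / cc + ys * ys / ss) / (2.0 * variance))
    return powers


# Synthetic, unevenly sampled lightcurve with a 10-second period (made-up data)
rng = random.Random(0)
times = sorted(rng.uniform(0.0, 100.0) for _ in range(200))
counts = [50.0 + 5.0 * math.sin(2.0 * math.pi * 0.1 * t) for t in times]

frequencies = [0.01 + 0.005 * i for i in range(99)]  # trial grid, 0.01 to 0.5 Hz
powers = lomb_scargle(times, counts, frequencies)
best_frequency = frequencies[powers.index(max(powers))]
print(f"Strongest period: {1.0 / best_frequency:.1f} s")
```

The trial frequency with the highest power marks the most likely periodicity; a periodogram plot is simply this power drawn against frequency (or period).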
-
lightcurve_pipeline.scripts.make_hstlc_plots.plot_dataset(filename, plot_file='')¶ Create an interactive bokeh lightcurve plot for the given filename
Parameters: filename : str
The path to the lightcurve
plot_file : str
The path to the PNG plot. The user can supply this argument if they wish to update the plot or save to a specific location
-
lightcurve_pipeline.scripts.make_hstlc_plots.plot_dataset_static(filename, plot_file='')¶ Creates static PNG lightcurve plot for the given filename
Parameters: filename : str
The path to the lightcurve
plot_file : str
The path to the PNG plot. The user can supply this argument if they wish to update the plot or save to a specific location
reset_hstlc_filesystem script¶
Reset the hstlc filesystem by moving files back into the ingestion
directory. Files are moved from the filesystem_dir directory to
the ingest_dir directory, as determined by the config file (see
below). Additionally, output products located in the outputs_dir
directory, as determined by the config file (see below) are removed.
Authors:
Matthew Bourque
Use:
This script is intended to be executed via the command line as such:
>>> reset_hstlc_filesystem
Dependencies:
- Users must have a config.yaml file located in the lightcurve_pipeline/utils/ directory with the following keys:
  - ingest_dir - The path to where files to be ingested are stored
  - filesystem_dir - The path to the hstlc filesystem
  - outputs_dir - The path to where hstlc output products are stored
- Other external library dependencies include:
lightcurve_pipeline
-
lightcurve_pipeline.scripts.reset_hstlc_filesystem.main()¶ The main function of the reset_hstlc_filesystem script
-
lightcurve_pipeline.scripts.reset_hstlc_filesystem.move_files_to_ingest()¶ Move files from filesystem back to the ingestion directory. If the file already exists in the ingest directory, the file is removed rather than moved.
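The move-or-remove rule described for move_files_to_ingest() can be sketched as follows. This is an illustrative sketch only, not the pipeline's implementation: the helper name move_or_remove, its return values, and the example filenames are hypothetical, and temporary directories stand in for filesystem_dir and ingest_dir.

```python
import os
import shutil
import tempfile


def move_or_remove(src_path, ingest_dir):
    """Move src_path into ingest_dir, or delete it if a copy already exists.

    Hypothetical helper; returns 'moved' or 'removed' to make the outcome visible.
    """
    destination = os.path.join(ingest_dir, os.path.basename(src_path))
    if os.path.exists(destination):
        # A copy is already waiting to be ingested, so drop the duplicate
        os.remove(src_path)
        return 'removed'
    shutil.move(src_path, destination)
    return 'moved'


# Throwaway directories standing in for filesystem_dir and ingest_dir
with tempfile.TemporaryDirectory() as root:
    filesystem_dir = os.path.join(root, 'filesystem')
    ingest_dir = os.path.join(root, 'ingest')
    os.makedirs(filesystem_dir)
    os.makedirs(ingest_dir)

    names = ['example_a_x1d.fits', 'example_b_x1d.fits']
    for name in names:
        open(os.path.join(filesystem_dir, name), 'w').close()
    # Pretend the second file already exists in the ingest directory
    open(os.path.join(ingest_dir, names[1]), 'w').close()

    outcomes = [move_or_remove(os.path.join(filesystem_dir, name), ingest_dir)
                for name in names]
    remaining = os.listdir(filesystem_dir)
    ingested = sorted(os.listdir(ingest_dir))
```

Either way the source file leaves the filesystem directory, which is what allows the now-empty parent directories to be cleaned up afterwards.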
-
lightcurve_pipeline.scripts.reset_hstlc_filesystem.remove_filesystem_directories()¶ Remove parent directories from the filesystem if they are empty
-
lightcurve_pipeline.scripts.reset_hstlc_filesystem.remove_output_directories()¶ Remove all output products and output directories
reset_hstlc_database script¶
Reset all or specific tables in the hstlc database
Authors:
Matthew Bourque
Use:
This script is intended to be executed via the command line as such:
>>> reset_hstlc_database [table]
table (optional) - The table to reset. Can be any valid table that exists in the hstlc database, all, in which case all tables will be reset, or production, in which case only the metadata, outputs, and stats tables will be reset. If no argument is provided, the default value of production is used.
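An optional positional argument with a default, as described above, is commonly expressed with argparse as in the sketch below. The actual parse_args() function may be written differently; the function name build_parser and the help text are assumptions for illustration.

```python
import argparse


def build_parser():
    """Build a parser with one optional positional argument (illustrative only)."""
    parser = argparse.ArgumentParser(description='Reset tables in the hstlc database')
    parser.add_argument(
        'table',
        nargs='?',             # the argument may be omitted
        default='production',  # the documented default
        help='Table to reset, or "all" or "production"')
    return parser


parser = build_parser()
default_args = parser.parse_args([])        # no argument given
explicit_args = parser.parse_args(['all'])  # argument given explicitly
print(default_args.table, explicit_args.table)
```

With nargs='?', argparse falls back to the default when the argument is absent, so the calling code can always read args.table.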
Dependencies:
- Users must have access to the hstlc database
- Users must also have a config.yaml file located in the lightcurve_pipeline/utils/ directory with the following keys:
  - db_connection_string - The hstlc database connection string
  - home_dir - The home hstlc directory, where the bad_data table will be stored in a text file
- Other external library dependencies include:
  - lightcurve_pipeline
  - pymysql
  - sqlalchemy
-
lightcurve_pipeline.scripts.reset_hstlc_database.get_valid_tables()¶ Return a list of table names in the hstlc database
Returns: tables : list
A list of hstlc table names
-
lightcurve_pipeline.scripts.reset_hstlc_database.main()¶ The main function of the reset_hstlc_database script
-
lightcurve_pipeline.scripts.reset_hstlc_database.parse_args()¶ Parse command line arguments
Returns: args : argparse object
An argparse object containing the arguments
-
lightcurve_pipeline.scripts.reset_hstlc_database.rebuild_production_tables()¶ Rebuild the production tables of the hstlc database, which consist of the metadata, outputs, and stats tables. The bad_data table is treated separately: because the bad_data table cannot easily be reconstructed (bad data is not necessarily re-ingested), the data within the table is written out to a text file and re-ingested after the database is reset. This essentially results in a reset database for the production tables, while the bad_data table effectively remains untouched.
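The dump-and-restore idea behind rebuild_production_tables() can be sketched as below. This is a rough illustration under heavy assumptions: it uses an in-memory sqlite3 database rather than the pipeline's sqlalchemy/pymysql connection, and the column names, the tab-separated dump format, and the bad_data.txt filename are all hypothetical.

```python
import os
import sqlite3
import tempfile

PRODUCTION_TABLES = ['metadata', 'outputs', 'stats']


def create_tables(conn):
    """Create empty tables with placeholder (hypothetical) columns."""
    for table in PRODUCTION_TABLES:
        conn.execute(f'CREATE TABLE {table} (rootname TEXT)')
    conn.execute('CREATE TABLE bad_data (filename TEXT, reason TEXT)')


def reset_preserving_bad_data(conn, dump_path):
    """Drop and recreate every table, carrying bad_data across via a text file."""
    # 1. Write the bad_data rows to a text file
    with open(dump_path, 'w') as dump:
        for filename, reason in conn.execute('SELECT filename, reason FROM bad_data'):
            dump.write(f'{filename}\t{reason}\n')

    # 2. Drop and recreate all tables
    for table in PRODUCTION_TABLES + ['bad_data']:
        conn.execute(f'DROP TABLE {table}')
    create_tables(conn)

    # 3. Re-ingest the bad_data rows from the text file
    with open(dump_path) as dump:
        rows = [line.rstrip('\n').split('\t') for line in dump]
    conn.executemany('INSERT INTO bad_data VALUES (?, ?)', rows)
    conn.commit()


conn = sqlite3.connect(':memory:')
create_tables(conn)
conn.execute("INSERT INTO metadata VALUES ('example_rootname')")
conn.execute("INSERT INTO bad_data VALUES ('example_corrtag.fits', 'example reason')")

# A temporary directory stands in for home_dir from the config file
with tempfile.TemporaryDirectory() as home_dir:
    reset_preserving_bad_data(conn, os.path.join(home_dir, 'bad_data.txt'))

metadata_rows = conn.execute('SELECT * FROM metadata').fetchall()
bad_data_rows = conn.execute('SELECT * FROM bad_data').fetchall()
```

After the reset the production table is empty while the bad_data row survives, which is the behavior the description above calls "effectively untouched".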
download_hstlc script¶
This script retrieves COS & STIS TIMETAG data from the MAST archive by
submitting XML requests. The datasets to download are determined by
comparing the contents of the hstlc database to the contents of the
MAST database; any COS/STIS TIMETAG data that exists in MAST but does
not exist in the hstlc database is retrieved. Data is downloaded to
the ingest_dir directory determined by the config file (see below).
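The comparison described above amounts to a set difference between the two lists of rootnames. A minimal sketch, assuming both lists are already available as plain Python lists (the helper name rootnames_to_download and the rootnames themselves are made up for illustration):

```python
def rootnames_to_download(mast_rootnames, hstlc_rootnames):
    """Return rootnames present in MAST but absent from the hstlc database."""
    return sorted(set(mast_rootnames) - set(hstlc_rootnames))


# Made-up rootnames for illustration only
mast_rootnames = ['rootname_a', 'rootname_b', 'rootname_c']
hstlc_rootnames = ['rootname_a']
missing = rootnames_to_download(mast_rootnames, hstlc_rootnames)
print(missing)
```

Using sets keeps the comparison fast even for long lists and removes any duplicate rootnames along the way.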
Authors:
Matthew Bourque
Use:
This script is intended to be executed via the command line as such:
>>> download_hstlc
Outputs:
The following filetypes are retrieved (if available) and placed in the ingest_dir directory:
- *_x1d.fits - 1-dimensional extracted spectra
- *_tag.fits - STIS TIMETAG data
- *_corrtag.fits - COS NUV TIMETAG data
- *_corrtag_<a or b>.fits - COS FUV TIMETAG data
Submission results are also saved to an XML file and stored in the download_dir directory determined by the config file (see below). The submission results indicate if the XML request was successful or if there were errors.
Executing this script creates a log file in the log_dir directory as determined by the config file (see below)
Dependencies:
- As of early 2016, submission of XML requests to the MAST archive requires a special Python 2.6 environment with specific XML libraries installed. More information can be found here:
https://confluence.stsci.edu/display/STScISSOPublic/ArchiveXMLsubmitPKImaterial
- Additionally, tsql must be installed and the tsql executable must be placed at ~/freetds/bin/tsql. tsql can be downloaded using freetds (http://www.freetds.org/).
- Users must have access to the hstlc database
- Users must also have a config.yaml file located in the lightcurve_pipeline/utils/ directory with the following keys:
  - db_connection_string - The hstlc database connection string
  - ingest_dir - The path to where the files will be stored after retrieval
  - log_dir - The path to where the log file will be stored
  - download_dir - The path to where XML submission results will be stored
  - mast_server - The MAST server hostname
  - mast_database - The name of the MAST database
  - mast_account - The MAST account username
  - mast_password - The MAST account password
  - archive_user - The requester username
  - host - The hostname of the machine used for ftp
  - ftp_user - The username of the account of the machine used for ftp
  - dads_host - The hostname of the machine on which the MAST database resides
  - archive - The HTTPS connection hostname
- Other external library dependencies include:
lightcurve_pipeline
-
lightcurve_pipeline.scripts.download_hstlc.build_xml_request(datasets)¶ Build the XML request for the given datasets
Parameters: datasets : list
A list of rootnames to download from MAST.
Returns: xml_request : string
The XML request string.
-
lightcurve_pipeline.scripts.download_hstlc.everything_retrieved(tracking_id)¶ Check every 15 minutes to see if all submitted datasets have been retrieved. Based on code from J. Ely.
Parameters: tracking_id : string
A submission ID string.
Returns: done : bool
Boolean specifying if data is retrieved or not.
killed : bool
Boolean specifying if request was killed.
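A polling loop of the kind described for everything_retrieved() can be sketched as follows. How the work is actually divided between everything_retrieved() and its caller is not documented here, so the names wait_for_retrieval and check_status, the injectable sleep argument, and the example tracking ID are all assumptions for illustration.

```python
import time

POLL_INTERVAL_SECONDS = 15 * 60  # the 15-minute interval described above


def wait_for_retrieval(check_status, tracking_id,
                       interval=POLL_INTERVAL_SECONDS, sleep=time.sleep):
    """Poll check_status(tracking_id) until the request finishes or is killed.

    check_status is a hypothetical callable returning a (done, killed) pair.
    Returns the final (done, killed) pair.
    """
    while True:
        done, killed = check_status(tracking_id)
        if done or killed:
            return done, killed
        sleep(interval)


# Stand-in status check that reports completion on the third poll
responses = iter([(False, False), (False, False), (True, False)])
waits = []
result = wait_for_retrieval(lambda tracking_id: next(responses),
                            'example-tracking-id',
                            sleep=waits.append)  # record waits instead of sleeping
```

Passing the sleep function as an argument lets the loop be exercised without actually waiting 15 minutes between checks.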
-
lightcurve_pipeline.scripts.download_hstlc.get_filesystem_rootnames()¶ Return a list of the rootnames in the hstlc database.
This list is compared to the MAST database to determine which datasets to download.
Returns: filesystem_rootnames : list
A list of rootnames that are in the hstlc filesystem.
-
lightcurve_pipeline.scripts.download_hstlc.get_mast_rootnames()¶ Return a list of rootnames of all COS & STIS TIMETAG data in MAST.
The following target names are ignored: DARK, BIAS, DEUTERIUM, WAVE, ANY, NONE
Returns: mast_rootnames : list
A list of rootnames of COS/STIS TIMETAG data in the MAST archive.
-
lightcurve_pipeline.scripts.download_hstlc.main()¶ The main function of the download_hstlc script
-
lightcurve_pipeline.scripts.download_hstlc.save_submission_results(submission_results)¶ Save the submission results to an XML file.
Submission results are saved in a separate XML file and stored in the download_dir directory as determined by the config file.
Parameters: submission_results : httplib object
The submission results returned by MAST after the XML request is submitted.
-
lightcurve_pipeline.scripts.download_hstlc.submit_xml_request(xml_request)¶ Submit the XML request to the MAST archive.
Parameters: xml_request : string
The request XML string.
Returns: submission_results : httplib object
The XML request submission results.