ngcloud.report module

class ngcloud.report.Stage(job_info, report_root)[source]

Base class for a stage of an NGCloud report.

A stage in a report usually maps to an NGS tool used in a pipeline. The tool generates meaningful output or log files that are needed for further analysis.

Stage connects Jinja2’s template system with NGS results. One writes the logic for extracting information from a tool’s (stage’s) output and passes that information to a Jinja2 template as variables.

Examples

A minimal stage for rendering a single template, with no static file copying:

class SimpleStage(Stage):
    template_find_paths = ['path/to/templates']
    template_entrances = 'simple.html'
    # so it renders the template 'path/to/templates/simple.html'

With static file copying:

class CopyStaticStage(Stage):
    template_find_paths = ['path/to/templates']
    template_entrances = 'copy_static.html'

    # all related result files are under
    # <result_root>/copy_static/
    result_foldername = 'copy_static'
    embed_result_joint = [{
        'src': 'joint',
        'patterns': ['foo'],
        'dest': 'copy_static'
    }]
    # will copy <result_root>/copy_static/joint/foo
    #   -> <report_root>/static/copy_static/foo

    embed_result_persample = [{
        'src': 'each',
        'patterns': ['baz'],
        'dest': 'copy_static'
    }]
    # for every sample <sample>, copy
    # <result_root>/copy_static/each/<sample>/baz
    #   -> <report_root>/static/copy_static/<sample>/baz

Attributes

template_entrances ((list of) str) Template names passed to Jinja2 to call render()
template_find_paths ((list of) path-like object) Paths searched, in order, to load templates
report_root (Path object) Path under which the report will be written
embed_result_joint (list of dict) Embedded joint static file copying description
embed_result_persample (list of dict) Embedded per sample static file copying description
result_foldername (str) Folder name to the NGS result of this stage
job_info (JobInfo object) Information about how the NGS result was run
result_info (dict object) Key-value pairs storing parsed NGS result

Methods

copy_static() Copy stage-specific static files under report folder.
copy_static_joint() Copy static files that are jointly produced from all samples.
copy_static_persample() Copy static files that are separately produced for each sample.
parse() Parse the NGS result and store in self.result_info
render() Render the templates of this stage and return HTML output.

template_entrances = ['stage.html']

Names of the entry-point templates that render() will render.

In most cases there is only one entry point, so a stage corresponds to one HTML page in the report. However, if this attribute contains a list of template names, multiple HTML pages will be produced.

template_find_paths = ['report/templates']

Paths to the template roots.

These paths are passed to jinja2.FileSystemLoader in order. Generally, if one is going to extend an NGCloud pipeline, one should supply both NGCloud’s template root path and the custom templates path. See Extend Builtin Pipeline for more info.
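
The effect of path order can be seen with Jinja2 alone. The following is a minimal sketch, not NGCloud code: the folder names builtin and custom and the template contents are made up for illustration. It shows that when the same template name exists under several search paths, jinja2.FileSystemLoader returns the one from the earliest path.

```python
import tempfile
from pathlib import Path

import jinja2

with tempfile.TemporaryDirectory() as tmp:
    # Two hypothetical template roots: a "builtin" one and a "custom" one.
    builtin = Path(tmp) / 'builtin'
    custom = Path(tmp) / 'custom'
    builtin.mkdir()
    custom.mkdir()
    (builtin / 'base.html').write_text('builtin base')
    (builtin / 'stage.html').write_text('builtin stage')
    (custom / 'stage.html').write_text('custom stage')

    # Earlier paths win when a template name exists in several roots.
    loader = jinja2.FileSystemLoader([str(custom), str(builtin)])
    env = jinja2.Environment(loader=loader)
    stage_html = env.get_template('stage.html').render()
    base_html = env.get_template('base.html').render()

print(stage_html)  # found in the custom root
print(base_html)   # only exists in the builtin root
```

Which order suits a given NGCloud pipeline is a separate question that this sketch does not answer.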

embed_result_joint = []

List of descriptions of joint results embedded into the report.

Each element of the list (say desc) is a dict with the structure:

desc = {
    'src': 'in_result',
    'patterns': ['foo*', '**/bar'],
    'dest': 'in_report/deep'
}
embed_result_joint = [descA, descB, ...]

  • src: path appended after self.result_root
  • patterns: list of file patterns supporting the *, ** and ? globbing syntax
  • dest: path appended under the report static files folder, usually <report_root>/static/

All files matching patterns under src will be copied to dest.
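
A minimal sketch of this matching and copying, using only the standard library. The function copy_joint and the folder layout are hypothetical stand-ins, not NGCloud's implementation. Matched files are copied flat into dest here, in line with the sub_dir/bar mapping shown under copy_static_joint().

```python
import shutil
import tempfile
from pathlib import Path


def copy_joint(result_root, report_root, descriptions):
    """Copy files matched by each description and return the copied paths.

    Hypothetical stand-in for the behaviour described above.
    """
    copied = []
    for desc in descriptions:
        src_root = Path(result_root) / desc['src']
        dest_root = Path(report_root) / 'static' / desc['dest']
        dest_root.mkdir(parents=True, exist_ok=True)
        for pattern in desc['patterns']:
            # Path.glob understands the *, ** and ? wildcards.
            for path in sorted(src_root.glob(pattern)):
                if path.is_file():
                    copied.append(Path(shutil.copy(str(path), str(dest_root))))
    return copied


with tempfile.TemporaryDirectory() as tmp:
    result_root = Path(tmp) / 'result'
    report_root = Path(tmp) / 'report'
    (result_root / 'in_result' / 'sub').mkdir(parents=True)
    (result_root / 'in_result' / 'foo1.txt').write_text('a')
    (result_root / 'in_result' / 'sub' / 'bar').write_text('b')
    (result_root / 'in_result' / 'skip.txt').write_text('c')

    desc = {
        'src': 'in_result',
        'patterns': ['foo*', '**/bar'],
        'dest': 'in_report/deep',
    }
    copied = copy_joint(result_root, report_root, [desc])
    copied_names = sorted(p.name for p in copied)
    dest_dir = report_root / 'static' / 'in_report' / 'deep'
    dest_listing = sorted(p.name for p in dest_dir.iterdir())

print(copied_names)
```

The file skip.txt matches neither pattern, so it is left out of the report folder.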

copy_static_joint() accesses this attribute.

New in version 0.3.

embed_result_persample = []

List of descriptions of per-sample results embedded into the report.

Its structure is similar to embed_result_joint, but it works quite differently. Using the same example as above:

desc = {
    'src': 'in_result',
    'patterns': ['foo*', '**/bar'],
    'dest': 'in_report/deep'
}
embed_result_persample = [descA, descB, ...]

They differ in how NGCloud finds the paths:

  • src: path appended after self.result_root, followed by the sample’s full name
  • dest: path appended under the report static files folder, followed by the sample’s full name

The whole procedure is performed for each sample, so one gets one set of files matching patterns per sample.

copy_static_persample() accesses this attribute.

New in version 0.3.

result_foldername = ''

Folder name of the result of this stage.

The numeric prefix of the folder can be omitted. Therefore, setting “mystage” recognizes all of the following folders: mystage, 3_mystage, 05_mystage.

If a unique folder matching the pattern is found, the path to this folder is stored in self.result_root.

Otherwise, ValueError is raised if no folder or more than one folder matches.
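
The matching rule can be sketched as follows. This is a minimal sketch rather than NGCloud's actual implementation: the helper name find_result_root and the regular expression (an optional run of digits followed by an underscore) are assumptions made for illustration.

```python
import re
import tempfile
from pathlib import Path


def find_result_root(job_root, result_foldername):
    """Return the single sub-folder of job_root matching result_foldername.

    Hypothetical helper: a folder matches when its name equals
    result_foldername, optionally preceded by digits and an underscore.
    """
    pattern = re.compile(r'^(\d+_)?' + re.escape(result_foldername) + r'$')
    matched = [
        p for p in Path(job_root).iterdir()
        if p.is_dir() and pattern.match(p.name)
    ]
    if len(matched) != 1:
        raise ValueError(
            'expected exactly one folder for {!r}, found {}'.format(
                result_foldername, len(matched)))
    return matched[0]


# Demonstrate with a throwaway job directory.
with tempfile.TemporaryDirectory() as tmp:
    job_root = Path(tmp)
    (job_root / '05_mystage').mkdir()
    (job_root / '1_other').mkdir()
    found_name = find_result_root(job_root, 'mystage').name

    # A second matching folder makes the lookup ambiguous.
    (job_root / 'mystage').mkdir()
    try:
        find_result_root(job_root, 'mystage')
        ambiguous_raises = False
    except ValueError:
        ambiguous_raises = True

print(found_name, ambiguous_raises)
```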

New in version 0.3.

__init__(job_info, report_root)[source]

Initialize a Stage object.

The NGS result info and the path where the report is generated are passed here.

job_info = job_info

JobInfo object.

report_root = report_root

Path object.

result_info = dict()

A dict object to store NGS result info. See parse() for more information.

Note

Key names should be valid Python argument names, because render() passes them to the template as keyword arguments.
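
A quick way to check keys before rendering is sketched below. The helper invalid_keys is hypothetical and not part of NGCloud. Treating reserved keywords as invalid is this sketch's own reading of the naming rule.

```python
import keyword


def invalid_keys(result_info):
    """Return keys that cannot be used as Python argument names."""
    return [
        key for key in result_info
        if not (isinstance(key, str)
                and key.isidentifier()
                and not keyword.iskeyword(key))
    ]


# 'idfy-genes' contains a hyphen and 'class' is a reserved keyword.
result_info = {'map_rate': '0.556', 'idfy-genes': '633', 'class': 'x'}
bad = invalid_keys(result_info)
print(bad)
```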

result_root = job_info.root_path + result_foldername

Path object to the stage result root folder.

This attribute is set automatically by finding the folder that matches result_foldername.

render()[source]

Render the templates of this stage and return HTML output.

It calls each template’s render() function with the arguments job_info=self.job_info, result_info=self.result_info, and the unpacked self.result_info. So one can access the following variables in the templates:

  • job_info
  • result_info
  • keys of result_info

Internally, it calls jinja2.Template.render().

How it works can be conceptually viewed as:

return tpl.render(
    job_info=self.job_info, result_info=self.result_info,
    **self.result_info)

Each template specified in template_entrances will be rendered. The HTML output is stored in a dict keyed by the template filename:

{'stage.html': '<html>...</html>'}

Returns:

dict object

Key-value pairs that map each entrance template name to its rendered HTML content.
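
The shape of this return value can be reproduced with Jinja2 alone. The following is a minimal sketch under stated assumptions: an in-memory jinja2.DictLoader stands in for the templates found through template_find_paths, and a plain dict stands in for the JobInfo object. It is not how NGCloud builds its environment.

```python
import jinja2

# In-memory template standing in for a file found via template_find_paths.
env = jinja2.Environment(
    loader=jinja2.DictLoader({
        'stage.html': '<p>{{ map_rate }} / {{ result_info["idfy_genes"] }}</p>',
    })
)

job_info = {'id': 'demo'}  # placeholder; a real stage holds a JobInfo object
result_info = {'map_rate': '0.556', 'idfy_genes': '633'}

rendered = {}
for name in ['stage.html']:  # plays the role of template_entrances
    tpl = env.get_template(name)
    # Keys of result_info are also available as top-level variables.
    rendered[name] = tpl.render(
        job_info=job_info, result_info=result_info, **result_info)

print(rendered)
```

The template reads map_rate as a top-level variable and idfy_genes through result_info, showing both access paths.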

Examples

If a stage has NGS results parsed,

>>> mystage = Stage(job_info, report_root)
>>> mystage.result_info.update(
...     {'map_rate': '0.556', 'idfy_genes': '633'})
>>> mystage.render()

The arguments passed to Jinja2’s render() are:

tpl.render(job_info=mystage.job_info, result_info=mystage.result_info,
           map_rate='0.556', idfy_genes='633')

parse()[source]

Parse the NGS result and store in self.result_info

By default, no action is taken in this method.
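
A subclass would typically override parse() to fill self.result_info. The following is a minimal sketch: StageSketch is a hypothetical stand-in for Stage that holds only what parse() needs, and the file name summary.tsv and its tab-separated layout are invented for illustration.

```python
import tempfile
from pathlib import Path


class StageSketch:
    """Hypothetical stand-in for Stage, holding only what parse() needs."""

    def __init__(self, result_root):
        self.result_root = Path(result_root)
        self.result_info = {}

    def parse(self):
        pass  # default: no action


class MappingStage(StageSketch):
    def parse(self):
        # Read "key<TAB>value" lines from a hypothetical summary.tsv.
        summary = self.result_root / 'summary.tsv'
        for line in summary.read_text().splitlines():
            key, value = line.split('\t')
            self.result_info[key] = value


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, 'summary.tsv').write_text('map_rate\t0.556\nidfy_genes\t633\n')
    stage = MappingStage(tmp)
    stage.parse()
    parsed = dict(stage.result_info)

print(parsed)
```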

copy_static()[source]

Copy stage-specific static files under report folder.

It calls copy_static_joint() and copy_static_persample() to copy static files. Their behaviors differ slightly.

  • copy_static_joint() copies files that are jointly produced from all samples, so those files exist only once in this stage.

    For example, a differential expression comparison between each pair of samples: the comparison result no longer belongs to any single sample but to the whole stage.

    It takes the file descriptions from embed_result_joint.

  • copy_static_persample() copies the files of each sample. If there are 4 samples in total, there should be 4 sets of such files.

    For example, a quality check stage checks each sample and produces quality information for each of them.

    It takes the file descriptions from embed_result_persample.

By default, nothing is copied because both embed_result_joint and embed_result_persample default to an empty list.

Notes

If a stage does not need to copy any static files, or one type of static file is not required, one can simply assign an empty list to embed_result_joint or embed_result_persample to skip copying. There is no need to override this function.

Changed in version 0.3: Add default behavior

copy_static_joint()[source]

Copy static files that are jointly produced from all samples.

It reads the result information to be embedded from attribute embed_result_joint.

For each description dict, say desc, in the list embed_result_joint, it finds files with each globbing pattern:

<self.result_root>/desc['src']/desc['patterns']

and copies them to:

<self.report_root>/static/desc['dest']/

Examples

For the stage

class JointStage(Stage):
    embed_result_joint = [{
        'src': 'from',
        'patterns': ['*.jpg', 'sub_dir/bar'],
        'dest': 'to'
    }, {
        'src': 'deep/src',
        'patterns': ['foo'],
        'dest': 'deeper/alt'
    }]

the static files are mapped:

<result_root>/from/*.jpg       -> <report>/static/to/*.jpg
<result_root>/from/sub_dir/bar -> <report>/static/to/bar
<result_root>/deep/src/foo     -> <report>/static/deeper/alt/foo

New in version 0.3.

copy_static_persample()[source]

Copy static files that are separately produced for each sample.

It reads the result information to be embedded from attribute embed_result_persample.

For each description dict, say desc, in the list embed_result_persample, it finds files with each globbing pattern for each sample:

<self.result_root>/desc['src']/sample.full_name/desc['patterns']

and copies them to:

<self.report_root>/static/desc['dest']/sample.full_name/

Examples

Suppose the samples have the full names A_R1 and B. For the stage

class PerSampleStage(Stage):
    embed_result_persample = [{
        'src': 'from',
        'patterns': ['foo', 'bar'],
        'dest': 'to'
    }]

the static files are mapped:

<result_root>/from/A_R1/foo -> <report>/static/to/A_R1/foo
<result_root>/from/A_R1/bar -> <report>/static/to/A_R1/bar
<result_root>/from/B/foo    -> <report>/static/to/B/foo
<result_root>/from/B/bar    -> <report>/static/to/B/bar

New in version 0.3.
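
The per-sample path mapping of copy_static_persample() can be sketched without touching the file system. The function persample_mapping below is a hypothetical illustration, not NGCloud code. It only builds the source pattern and destination folder for every sample and pattern.

```python
from pathlib import PurePosixPath


def persample_mapping(result_root, report_root, desc, sample_names):
    """Return (source pattern, destination folder) pairs for every sample."""
    pairs = []
    for name in sample_names:
        src_root = PurePosixPath(result_root) / desc['src'] / name
        dest_root = PurePosixPath(report_root) / 'static' / desc['dest'] / name
        for pattern in desc['patterns']:
            pairs.append((str(src_root / pattern), str(dest_root)))
    return pairs


desc = {'src': 'from', 'patterns': ['foo', 'bar'], 'dest': 'to'}
pairs = persample_mapping('result', 'report', desc, ['A_R1', 'B'])
for src, dest in pairs:
    print(src, '->', dest)
```

Two samples and two patterns give four pairs, one per line of the mapping listed in the example.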

class ngcloud.report.Report[source]

NGCloud report base class of every pipeline.

To combine a custom pipeline with ngreport, the __init__() signature must match that of Report. Set up the custom logic in template_config().

Raises:

TypeError

When this class is instantiated directly, or a subclass does not implement template_config().
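
One common way to get this behaviour in Python is an abstract base class. Whether NGCloud implements it this way is an assumption. ReportSketch and its subclasses below are hypothetical stand-ins.

```python
from abc import ABC, abstractmethod


class ReportSketch(ABC):
    """Hypothetical stand-in for Report."""

    def __init__(self):
        self.template_config()

    @abstractmethod
    def template_config(self):
        """Set up stage classes and static roots."""


class IncompleteReport(ReportSketch):
    pass  # forgets to implement template_config()


class MyReport(ReportSketch):
    def template_config(self):
        self.static_roots = ['my/report/static']


errors = []
for cls in (ReportSketch, IncompleteReport):
    try:
        cls()
    except TypeError as exc:
        # Abstract classes cannot be instantiated.
        errors.append(type(exc).__name__)

report = MyReport()
print(errors, report.static_roots)
```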

Attributes

job_info (Path object)
out_dir (Path object)
report_root (Path object)
stage_classnames (list of class) List of stage classes, in order, used for this pipeline report.
static_roots (Path object) Path to the template static file dir

Methods

copy_static() Copy template static files to output dir.
generate(job_dir, out_dir) Render a report and output to given directory.
output_report() Output rendered HTML pages to output directory.
render_report() Put real results into report template and return rendered HTML.
template_config() Set up configuration for report templates.

stage_classnames = [<class 'ngcloud.report.Stage'>]

(List of class names) Stores the sequence of stages in use.

Specify the subclasses of Stage to use. Pass only the stage classes themselves; do not instantiate them.

stage_classnames = [IndexStage, QCStage]

See Stage for how to write a new stage class.

static_roots = ['']

(List of) path-like objects pointing to the root dirs of report static files, such as the JS and CSS files for making HTML pages.

For example,

static_roots = Path('my/report/static')

where my/report/static contains the needed JS, CSS and image files.

A common case is extending an existing pipeline. Both shared static files and custom static files can then be used by giving a list of paths to the static file roots.

from ngcloud.pipe import get_shared_static_root  # get builtin static
static_roots = [get_shared_static_root(), '/path/to/my/static']

See Extend Builtin Pipeline for more information.

__init__()[source]

Call template_config(). Don’t override me.

generate(job_dir, out_dir)[source]

Render a report and output to given directory.

The whole process breaks down into the following parts:

  1. read NGS result as JobInfo
  2. render report, covered by render_report()
  3. copy template-related static files such as JS and CSS into output dir, covered by copy_static()
  4. copy stage-related static files into output dir. Call each Stage.copy_static() respectively
  5. output rendered reports into output dir, covered by output_report()
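
The order of these steps can be pictured with the sketch below. ReportSketch and StageSketch are hypothetical stand-ins in which every step only records its name. None of this is NGCloud's real code, and how stages are actually created inside generate() is an assumption.

```python
class StageSketch:
    """Hypothetical stand-in for a Stage subclass."""

    def __init__(self, calls):
        self.calls = calls

    def copy_static(self):
        self.calls.append('stage.copy_static')


class ReportSketch:
    """Hypothetical stand-in for Report; each step only records its name."""

    def __init__(self):
        self.calls = []
        self.stages = []

    def generate(self, job_dir, out_dir):
        self.calls.append('read job_info')         # step 1
        self.stages = [StageSketch(self.calls)]
        self.render_report()                       # step 2
        self.copy_static()                         # step 3
        for stage in self.stages:                  # step 4
            stage.copy_static()
        self.output_report()                       # step 5

    def render_report(self):
        self.calls.append('render_report')

    def copy_static(self):
        self.calls.append('copy_static')

    def output_report(self):
        self.calls.append('output_report')


report = ReportSketch()
report.generate('job', 'out')
calls = report.calls
print(calls)
```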

Warning

Override this function with care. You might break the logic.

render_report()[source]

Put real results into the report template and return the rendered HTML.

copy_static()[source]

Copy template static files to output dir.

Files under each path specified by static_roots will be copied to the folder static below report_root.
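
A minimal sketch of merging several static roots into one folder, using only the standard library (Python 3.8 or later for dirs_exist_ok). The function copy_static_roots is hypothetical. Letting later roots overwrite files from earlier ones is an assumption of this sketch, not documented NGCloud behaviour.

```python
import shutil
import tempfile
from pathlib import Path


def copy_static_roots(static_roots, report_root):
    """Copy the content of every static root into <report_root>/static."""
    dest = Path(report_root) / 'static'
    for root in static_roots:
        # dirs_exist_ok lets later roots add to, or overwrite,
        # files copied from earlier roots.
        shutil.copytree(str(root), str(dest), dirs_exist_ok=True)
    return dest


with tempfile.TemporaryDirectory() as tmp:
    shared = Path(tmp) / 'shared'
    custom = Path(tmp) / 'custom'
    (shared / 'css').mkdir(parents=True)
    (custom / 'js').mkdir(parents=True)
    (shared / 'css' / 'base.css').write_text('body {}')
    (custom / 'js' / 'plot.js').write_text('// plot')

    dest = copy_static_roots([shared, custom], Path(tmp) / 'report')
    copied = sorted(
        p.relative_to(dest).as_posix() for p in dest.rglob('*') if p.is_file())

print(copied)
```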

output_report()[source]

Output rendered HTML pages to output directory.

No original data is involved, only file I/O.

template_config()[source]

Set up configuration for report templates.

Extra logic for a custom report can also go here, since this function is always called by __init__().

ngcloud.report.gen_report(pipe_report_cls, job_dir, out_dir)[source]

Generate a NGCloud report.

For normal usage, one can use the ngreport command instead of calling this Python function directly.

Parameters:

pipe_report_cls: str

Name of the Python class to generate the report of certain pipeline.

job_dir: path-like object

out_dir: path-like object

Notes

To extend NGCloud with your custom pipeline, inherit Report and call this function manually.

ngcloud.report.main()[source]

Holds the logic of the ngreport command.

To use ngreport’s functionality from Python, call gen_report() rather than this function.