Output Files

Outputs are specified in much the same way as options:

outputs=[
    [
        'test_output',
        'Main output file of this utility',
        {
            'validate': 'File/Output',
            'required': True,
            'default': 'ggo_out.complex',
            'data_format': 'text/tabular',
            'default_format': 'TSV_U',
        }
    ]
],

As you'll notice, two of the options specified, validate and required, are redundant for outputs. They must nevertheless be specified until a future release removes the need for them.

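The outputs variable must be a list of 3-member lists. The first member is the name of the command line flag, if it is used. The second member is the description of the file, used as the label in Galaxy and in the command line help. The third member is a dict of options, as in the example above. On the command line, this declaration produces the following help text: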

Output Files:
  --test_output [TEST_OUTPUT]
                        Main output file of this utility (Default:
                        ggo_out.complex) [Required]
  --test_output_format [TEST_OUTPUT_FORMAT]
                        Associated Format for test_output (Default: TSV_U)
                        [Required]

As shown above, an additional $ofname_format parameter is generated for every output file. The selected format handler determines the file extension internally, so the extension need not be included in --test_output.
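
The shape of the outputs variable can be checked before it is handed to the library. The following is a minimal sketch and not part of galaxygetopt: the helper name describe_outputs is hypothetical, and the flag names it returns are inferred only from the help text shown above (one flag for the file and one for its format).

```python
def describe_outputs(outputs):
    """Check that each entry is [name, description, options] and
    return the flags suggested by the help text above."""
    flags = []
    for entry in outputs:
        if not isinstance(entry, (list, tuple)) or len(entry) != 3:
            raise ValueError("each output must be a 3-member list: %r" % (entry,))
        name, description, options = entry
        if not isinstance(options, dict):
            raise ValueError("third member of %r must be a dict" % (name,))
        # One flag for the file itself, one for its format (assumed naming).
        flags.append("--" + name)
        flags.append("--" + name + "_format")
    return flags


outputs = [
    [
        'test_output',
        'Main output file of this utility',
        {
            'validate': 'File/Output',
            'required': True,
            'default': 'ggo_out.complex',
            'data_format': 'text/tabular',
            'default_format': 'TSV_U',
        },
    ],
]

print(describe_outputs(outputs))
```

A check like this fails early on a malformed declaration, rather than at the point where the output is written.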

These outputs are used through the galaxygetopt.outputfiles module:

from galaxygetopt.outputfiles import OutputFiles
# c is the GalaxyGetOpt object on which the outputs were declared
of = OutputFiles(name='test_output', GGO=c)
data = {
    'Sheet1': {
        'header': ['Key', 'Value'],
        'data': [
            ['A1', 'B1'],
            ['A2', 'B2'],
        ]
    }
}
of.CRR(data=data)

Here we see a tabular data structure used. The OutputFiles class requires access to the GalaxyGetOpt object in order to know which outputs have been pre-declared. In the invocation

of = OutputFiles(name='test_output', GGO=c)

the name parameter references the first member of a declared output's list, that is, its name. This gives OutputFiles access to the default format, path information, and two variables hidden from both user and developer: test_output_id and test_output_files_path, both of which are available in Galaxy.

Output Formats

Format	Name	Available Handlers
Plain text	`text/plain`	TXT, CONF
Tabular Data	`text/tabular`	TSV, TSV_U, CSV, CSV_U

Planned handlers/formats

Format	Name	Available Handlers
Tabular Data	`text/tabular`	XLS, XLSX, ODS, JSON, YAML
HTML Data	`text/html`	HTML (any via pandoc?)
Archives	`archive`	tar.gz, zip, tar
Genomic Data	TBD	BioPython supported formats

Plain Text

Plain text data is simply a string. Put whatever you want in it:

data = """Hello,
World"""

Tabular Data

The top level object is a dict whose keys are sheet names. It is recommended that each sheet name match r'[A-Za-z0-9-]+', though this is not currently enforced. This structure is used to represent NxMxO dimensional data, where each NxM slice is a single table or sheet.

Each sheet consists of a dict containing two keys: 'header' and 'data'. The header value should be a list of strings, and the data value should be a list of lists. Every value will be coerced into a string:

data = {
    'Sheet1': {
        'header': ['Key', 'Value'],
        'data': [
            ['some_key', 42],
        ]
    }
}
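
To make the structure concrete, the following sketch renders one sheet as tab-separated text with the standard csv module. It is an illustration only, not the library's TSV handler: sheet_to_tsv and SHEET_NAME are hypothetical names, and the sketch follows the 'header' key used in the examples.

```python
import csv
import io
import re

# Recommended (not enforced) pattern for sheet names, from the text above.
SHEET_NAME = re.compile(r'[A-Za-z0-9-]+')


def sheet_to_tsv(sheet):
    """Render one sheet ({'header': [...], 'data': [[...], ...]}) as TSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter='\t', lineterminator='\n')
    writer.writerow([str(h) for h in sheet['header']])
    for row in sheet['data']:
        # Every value is coerced into a string, as described above.
        writer.writerow([str(v) for v in row])
    return buf.getvalue()


data = {
    'Sheet1': {
        'header': ['Key', 'Value'],
        'data': [
            ['some_key', 42],
        ]
    }
}

for name, sheet in data.items():
    if not SHEET_NAME.fullmatch(name):
        print("warning: sheet name %r does not match the recommended pattern" % name)
    print(sheet_to_tsv(sheet), end='')
```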

Multiple Output Files

Sometimes you will not know how many files you need to produce until runtime. The following examples use this output declaration:

['test_output','Main output file of this utility',
    {
        'validate': 'File/Output',
        'required': True,
        'default': 'ggo_out.complex',
        'data_format': 'text/plain',
        'default_format': 'TXT',
    }
]

Separate History Items

This can be accomplished via varCRR:

of = OutputFiles(name='test_output', GGO=c)
for i in range(10):
    data = "file %s" % (i,)
    of.varCRR(data=data, filename="file-" + str(i))

It is the developer's responsibility to ensure that filename is specified and unique. At the CLI, this translates into files named file-0.txt to file-9.txt. Under the Galaxy environment, these translate into names like $__new_file_path__/primary_10000_file-2_visible_txt. Avoid underscores in filename, as they will break these generated names. This is not currently checked, though a check will likely be introduced in a later release.
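
Until such a check exists, a guard can be applied before calling varCRR. The sketch below is hypothetical and not Galaxy's or galaxygetopt's own code: galaxy_primary_name is an illustrative helper, and the underscore-separated layout it builds is inferred solely from the example name shown above.

```python
def galaxy_primary_name(output_id, filename, extension):
    """Compose a name of the form seen above:
    primary_<id>_<filename>_visible_<extension> (layout assumed from the example)."""
    if '_' in filename:
        # An extra underscore would add a field to the generated name.
        raise ValueError("underscores are not allowed in filename: %r" % (filename,))
    return "primary_%s_%s_visible_%s" % (output_id, filename, extension)


print(galaxy_primary_name(10000, "file-2", "txt"))
# primary_10000_file-2_visible_txt
```

Rejecting the name outright is a design choice. Replacing underscores with hyphens would also work, at the cost of possibly making two filenames collide.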

Single History Item

To group the files under a single history item, use subCRR:

of = OutputFiles(name='test_output', GGO=c)
for i in range(10):
    data = "file %s" % (i,)
    of.subCRR(data=data, filename="file-" + str(i))

Sub files are written to a folder whose name comes from --test_output_files_path. By default this is sub.files_path. The file names are available to you as a list in the return value of subCRR, in order to facilitate proper linking. Support for this will be improved in the future.
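
As an illustration of the linking use case, the sketch below turns a list of file names into an HTML list of links. Everything here is an assumption made for the example: build_index is a hypothetical helper, the names list is a stand-in for whatever subCRR returns, and whether links should include the folder name depends on how the page is served.

```python
import html
import posixpath


def build_index(names, files_path="sub.files_path"):
    """Build an HTML list linking to each sub file inside files_path."""
    lines = ["<ul>"]
    for name in names:
        href = posixpath.join(files_path, name)
        lines.append('<li><a href="%s">%s</a></li>'
                     % (html.escape(href, quote=True), html.escape(name)))
    lines.append("</ul>")
    return "\n".join(lines)


# Stand-in for the names returned by subCRR (assumed, not the real return value).
names = ["file-%d.txt" % i for i in range(3)]
print(build_index(names))
```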
