clitool 0.4.1 documentation

clitool Package

cli Module

This module provides three features:

  1. climain() decorator for main function to parse basic command line options.
  2. cliconfig() to load configuration file with multiple environments.
  3. clistream() to handle files or standard input as command line arguments.

A basic script looks like this:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from pprint import pprint

from clitool.cli import climain, cliconfig, clistream


@climain
def main(config, output, **kwargs):
    # Load all tab-separated data onto `data`.
    data = []
    clistream(data.append, delimiter='\t', **kwargs)

    # Dump all data into given output stream. (default is standard output)
    pprint(data, stream=output)

    # Get database connection from given configuration file.
    cfg = cliconfig(config)
    dsl = cfg.get("YOUR_DATABASE_CONFIG_KEY")
    # Implement database session factory code.
    session = SessionFactory(dsl).create()

    # Save all data on database
    for dt in data:
        # Implement mapping code from input to database model.
        e = mapping(dt)
        session.save(e)

    session.commit()


if __name__ == '__main__':
    main()

@climain

Decorator for a main function that parses basic command line options and arguments. It is a thin wrapper around parse_arguments() that expects multiple files as command line arguments and passes the parsed options and arguments to the wrapped function.

The wrapped function receives the keyword arguments defined in base_parser(), plus files, which you may ignore. The parameter order is up to you.

An example that accepts multiple input files, one output file-like object, input/output encodings, and other keywords looks like this:

from clitool.cli import climain

@climain
def main(files, input_encoding, output, output_encoding, **kwargs):
    # your main function goes here
    if files:
        print("Hello %d inputs with %s." % (len(files), input_encoding))

Command line utilities.

This module is also executable to create script boilerplate.

$ python -m clitool.cli -o your-script.py
$ ./your-script.py --help
clitool.cli.base_parser()[source]

Create an argument parser with basic options and no help message.

  • -c, --config: load a configuration file.
  • -v, --verbose: increase logging verbosity; -v, -vv, and -vvv.
  • -q, --quiet: quiet logging except the critical level.
  • -o, --output: output file. (default=sys.stdout)
  • --basedir: base directory. (default=os.getcwd())
  • --input-encoding: input data encoding. (default=utf-8)
  • --output-encoding: output data encoding. (default=utf-8)
  • --processes: number of worker processes.
  • --chunksize: size of each chunk submitted to the process pool.
Return type: argparse.ArgumentParser
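
For example, a script can reuse these options as a parent parser and add its own. This is a minimal sketch; the --name option is a hypothetical extra, and it assumes base_parser() builds the parser with add_help=False, as "no help message" above suggests:

import argparse

from clitool.cli import base_parser

# Build a script-specific parser on top of the basic options.
parser = argparse.ArgumentParser(parents=[base_parser()])
parser.add_argument('--name', default='world')  # hypothetical extra option

args = parser.parse_args(['--name', 'clitool'])
print(args.name)
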
clitool.cli.cliconfig(fp, env=None)[source]

Load configuration data. The given file pointer is closed internally. If None is given, the program is forced to exit.

More detailed information is available in the underlying module, clitool.config.

Parameters:
  • fp (FileType) – opened file pointer of the configuration file
  • env (str) – environment to load
Return type: dict
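
A minimal usage sketch, assuming a JSON configuration like the one shown in the config Module section below (the file name config.json and the "database" key are illustrative):

from clitool.cli import cliconfig

# Load the "development" section; the file pointer is closed internally.
fp = open("config.json")
cfg = cliconfig(fp, env="development")
print(cfg.get("database"))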

clitool.cli.clistream(reporter, *args, **kwargs)[source]

Handle stream data on the command line interface, and return statistics of success, error, and total counts.

More detailed information is available in the underlying module, clitool.processor.

Parameters:
  • Handler (object supporting a handle method) – [DEPRECATED] handler for file-like streams. (default: clitool.processor.CliHandler)
  • reporter (callable) – callback to report each processed value
  • delimiter (string) – column delimiter, e.g. '\t' [optional]
  • args – functions to parse each item in the stream.
  • kwargs – keywords, including files and input_encoding.
Return type: list
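
A minimal sketch that collects tab-separated rows, mirroring the basic script at the top of this page. It assumes that, with no files keyword given, standard input is read, as the module overview states:

from clitool.cli import clistream

rows = []
# Each reported value is appended to `rows`; the return value holds the
# processing statistics mentioned above.
stats = clistream(rows.append, delimiter='\t')
print(stats)
print(rows[:3])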

clitool.cli.parse_arguments(**kwargs)[source]

Parse command line arguments after setting up the basic options. If parsing succeeds, the logging verbosity is set accordingly. This function accepts arbitrary keyword arguments; their values are passed to ArgumentParser.add_argument(). If the special keyword flags is given, its value is used as the option flags.

Example - multiple files (possibly zero)

cliargs = parse_arguments(files=dict(nargs='*'))
if cliargs.files:
    for fp in cliargs.files:
        print(fp.name)

Example - exactly one file (the property is still a list of files)

cliargs = parse_arguments(files=dict(nargs=1))
fp = cliargs.files[0]
print(fp.name)

Example - mode switch over a fixed set of values

cliargs = parse_arguments(mode=dict(
            flags=('-m', '--mode'), required=True,
            choices=("A", "B", "C")))
print(cliargs.mode)
Parameters: kwargs – keyword arguments to pass to the add_argument() method.
Return type: argparse.Namespace object

config Module

Configuration loader that supports multiple file types. The environment to load is chosen by the environment variable PYTHON_CLITOOL_ENV; the default is clitool.DEFAULT_RUNNING_MODE (development).

Supported file types are:

  • ini/cfg
  • json
  • yaml (if “pyyaml” is installed)
class clitool.config.ConfigLoader(fp, filetype=None)[source]

Bases: builtins.object

Parameters:
  • fp – file pointer to load
  • filetype (string) – one of 'ini|cfg', 'json', or 'yml|yaml'. If not specified, the type is detected automatically from the file extension.
flip()[source]

Provide a flipped view to compare how each key/value pair is defined across environments, for administrative use.

Return type: dict
load(env=None)[source]

Load the section values of the given environment. If no environment is specified, the environment variable is used. If an unknown environment is specified, a warning is emitted on the logger.

Parameters: env (string) – environment key to load; forces this environment regardless of the environment variable
Return type: dict
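
A minimal usage sketch against the INI example below (the file name settings.ini is illustrative):

from clitool.config import ConfigLoader

with open("settings.ini") as fp:
    loader = ConfigLoader(fp)         # file type detected from the extension
    cfg = loader.load(env="staging")  # load the [staging] section
print(cfg)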

Configuration file examples

INI format

[development]
database.url=sqlite:///sample.sqlite

[staging]
database.url=postgresql+pypostgresql://user:pass@host/database

[production]
database.url=mysql://user:pass@host/database

JSON style

{
    "development": {
        "database": {
            "url": "sqlite:///sample.sqlite"
        }
    },
    "staging": {
        "database": {
            "url": "postgresql+pypostgresql://user:pass@host/database"
        }
    },
    "production": {
        "database": {
            "url": "mysql+mysqlconnector://user:pass@host/database"
        }
    }
}

YAML style

development:
  database:
    url: "sqlite:///sample.sqlite"

staging:
  database:
    url: "postgresql+pypostgresql://user:pass@host/database"

production:
  database:
    url: "mysql://user:pass@host/database"

processor Module

Stream processing utility.

class clitool.processor.CliHandler(streamer, delimiter=None)[source]

Bases: builtins.object

Simple command line arguments handler.

Parameters:
  • streamer (Streamer) – streaming object
  • delimiter (string) – column delimiter, such as " "
handle(files, encoding, chunksize=1)[source]

Handle given files with given encoding.

Parameters:
  • files (list) – opened files.
  • encoding (string) – encoding of opened file
  • chunksize (int) – chunk size for multiprocessing
Return type: list

reader(fp, encoding)[source]

Simple open wrapper for several file types. This supports .gz and .json.

Parameters:
  • fp (file pointer) – opened file
  • encoding (string) – encoding of opened file
Return type: file pointer

class clitool.processor.RowMapper(header=None, loose=False, *args, **kwargs)[source]

Bases: builtins.object

Map a list or tuple to a dict object using the given keys. If no keys are given, the first list or tuple is used as the keys. If the length of the given data differs from the number of keys, no valid data is returned.

Since this object is intended to be used with Streamer, the mapping is performed by calling the instance.

If you know the header names a priori, use the standard namedtuple instead. But when the keys contain non-ASCII characters, such as Japanese text, this class may be useful.

Parameters:
  • header (tuple) – list of header values.
  • loose (bool) – loose matching flag.
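
A minimal sketch, assuming the instance is called once per row as described above (the first row supplies the keys; the Japanese header values are illustrative):

from clitool.processor import RowMapper

mapper = RowMapper()
mapper(["名前", "都市"])             # first row is remembered as the header
record = mapper(["Alice", "Tokyo"])  # later rows are mapped onto those keys
print(record)
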
class clitool.processor.SimpleDictReporter(*args, **kwargs)[source]

Bases: builtins.object

Reporting class for the streamer API. Given processed data as a mapping object, it reports each key/value pair whose value is a string. Calling report() returns the result as a dict.

report()[source]
Return type: dict
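
A minimal sketch, assuming the instance is used as the reporter callback, i.e. called with each processed mapping (the keys and values are illustrative):

from clitool.processor import SimpleDictReporter

reporter = SimpleDictReporter()
reporter({"status": "ok", "detail": "first record"})  # string values are reported
reporter({"status": "ok", "size": 42})                # non-string values are ignored
print(reporter.report())
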
class clitool.processor.Streamer(callback=None, *args, **kwargs)[source]

Bases: builtins.object

Simple streaming class that accepts step-by-step procedures. The general steps are:

  1. check that the input value meets your requirements.
  2. parse something for your business.
  3. collect parsed value.

Step 1 and Step 2 have to follow these rules:

  • return True to skip parsing
  • return False to report error
  • return something valid to continue processing the item

Step 3 is an arbitrary function that accepts one argument, such as list.append().

Parameters:
  • callback (callable) – function to collect parsed value
  • args (list) – callables
consume(stream, source=None, chunksize=1)[source]

Consume the given stream object and return processing statistics.

Parameters:
  • stream (iterable) – streaming object to consume
  • source (string) – source of stream to consume
  • chunksize (integer) – chunk size for multiprocessing
Return type: dict
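
A minimal sketch of the three steps above (the check and parse functions and the sample lines are illustrative):

from clitool.processor import Streamer

collected = []

def check(line):
    # Step 1: skip blank lines, report lines without a tab as errors.
    if not line.strip():
        return True    # True -> skip parsing
    if '\t' not in line:
        return False   # False -> report an error
    return line        # anything else -> continue processing

def parse(line):
    # Step 2: split the line into columns.
    return line.rstrip('\n').split('\t')

# Step 3: collect parsed values with list.append().
s = Streamer(collected.append, check, parse)
stats = s.consume(["a\tb\n", "\n", "broken\n"])
print(stats)
print(collected)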

accesslog Module

Utilities to parse Apache access log.

To learn more about the access log format, see the official Apache HTTP Server documentation. [en] [ja]

This module is also executable to parse access log records.

$ tail -f /var/log/httpd/access_log | python -m clitool.accesslog

Output labels come from <http://ltsv.org/>.

clitool.accesslog.logparse(*args, **kwargs)[source]

Parse an access log in a terminal application. If a list of files is given, parse each file. Otherwise, parse standard input.

Parameters: args – supporting functions applied after each raw log line is processed
Type: list of callables
Return type: tuple of (statistics, key/value report)

A parsed record is a mapping object with the following properties.

  • host : Remote IP address.
  • time : Access date and time. (datetime; naive)
  • utcoffset : UTC offset of the access time. (timedelta)
  • path : HTTP request path, split off from the query string.
  • query : HTTP request query string, with the leading "?" removed.
  • method : HTTP request method.
  • protocol : HTTP request protocol version.
  • status : HTTP response status code. (int)
  • size : HTTP response size, if available. (int)
  • referer : Referer header. If "-" is given, this property does not exist.
  • ua : User agent. If "-" is given, this property does not exist.
  • ident : Remote logname.
  • user : Remote user.
  • trailing : Additional information when using a custom log format.

This module also works as a script. Simple usage:

$ tail -f /var/log/httpd/access_log | python -m clitool.accesslog

Two options are available.

  • --color : Colorize error records.
  • --status : Filter records by response status.

If you would like to check only error responses, set --status=500,503.

Since the script expands each record in key/value form, you can combine it with grep or other Unix tools. To get the top 10 requested paths, try:

$ python -m clitool.accesslog /var/log/httpd/access_log |
    grep request_path | sort | uniq -c | sort -nr | head -n 10

textio Module

Text I/O utilities.

Type mapping follows the rules of "Cerberus".

Example usage:

import logging
import sys

from clitool.textio import RowMapper, Sequential

FIELDS = (
    {'id': 'id', 'type': 'string'},
    {'id': 'updated', 'type': 'datetime', 'format': '%Y-%m-%dT%H:%M:%SZ'},
    {'id': 'name', 'type': 'string'},
    {'id': 'latitude', 'type': 'float'},
    {'id': 'longitude', 'type': 'float'},
    {'id': 'zipcode', 'type': 'string'},
    {'id': 'kind', 'type': 'string', 'default': 'UNKNOWN'},
    {'id': 'update_type', 'type': 'integer'}
)

s = Sequential()
s.callback(RowMapper(FIELDS))
s.errback(logging.error)
for row in sys.stdin:
    dt = s(row)
    print(dt)
class clitool.textio.DictMapper(fields)[source]

Bases: builtins.object

Convert a dictionary object to a list of string values.

class clitool.textio.RowMapper(fields, strict=True, *args, **kwargs)[source]

Bases: builtins.object

Map a list or tuple to a dict object using the given fields definition. If the length of the given data differs from the number of fields, a ValueError is raised. To accept loose input, set the strict flag to False.

If the input fields are all of string type, use namedtuple from the standard library instead. When the keys contain non-ASCII characters, such as Japanese text, however, this class may be useful.

Parameters:
  • fields (tuple) – list of field definitions.
  • strict (boolean) – strict matching flag.
Return type: callable

class clitool.textio.Sequential[source]

Bases: builtins.object

Apply callback functions sequentially.

If an error occurs in a callback, the errback functions are applied to the caught exception.

Return type: callable

_unicodecsv Module

This file is for Python 2.7. All classes are taken from the official Python documentation.

A more sophisticated implementation is available in the "csvkit" module.

class clitool._unicodecsv.UTF8Recoder(f, encoding)[source]

Bases: builtins.object

Iterator that reads an encoded stream and re-encodes the input to UTF-8.

next()[source]
class clitool._unicodecsv.UnicodeReader(f, dialect=<class 'csv.excel'>, encoding='utf-8', **kwds)[source]

Bases: builtins.object

A CSV reader which will iterate over lines in the CSV file “f”, which is encoded in the given encoding.

next()[source]
class clitool._unicodecsv.UnicodeWriter(f, dialect=<class 'csv.excel'>, encoding='utf-8', **kwds)[source]

Bases: builtins.object

A CSV writer which will write rows to CSV file “f”, which is encoded in the given encoding.

writerow(row)[source]
writerows(rows)[source]
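
A minimal sketch for Python 2.7 (the file name data.csv and the cp932 encoding are illustrative):

from clitool._unicodecsv import UnicodeReader

with open('data.csv', 'rb') as f:
    for row in UnicodeReader(f, encoding='cp932'):
        print(row)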
