clitool Package¶
cli Module¶
This module provides three features:
- climain() decorator for main function to parse basic command line options.
- cliconfig() to load configuration file with multiple environments.
- clistream() to handle files or standard input as command line arguments.
A basic script looks like this:
#!/usr/bin/env python
# -*- coding: utf-8 -*-

from pprint import pprint

from clitool.cli import climain, cliconfig, clistream


@climain
def main(config, output, **kwargs):
    # Load all tab-separated data into `data`.
    data = []
    clistream(data.append, delimiter='\t', **kwargs)

    # Dump all data to the given output stream (default is standard output).
    pprint(data, stream=output)

    # Get the database connection settings from the given configuration file.
    cfg = cliconfig(config)
    dsl = cfg.get("YOUR_DATABASE_CONFIG_KEY")

    # Implement your own database session factory (`SessionFactory` is a placeholder).
    session = SessionFactory(dsl).create()

    # Save all data to the database.
    for dt in data:
        # Implement your own mapping from input to database model (`mapping` is a placeholder).
        e = mapping(dt)
        session.save(e)
    session.commit()


if __name__ == '__main__':
    main()
- @climain¶
Decorator for a main function that parses basic command line options and arguments. It is a simple wrapper around parse_arguments() that expects multiple files as command line arguments and passes the parsed options and arguments to the wrapped function.
The wrapped function receives the keyword arguments defined in base_parser(), plus files; you can ignore any of them that you do not need. The order of the parameters is up to you.
An example that accepts multiple input files, one output file-like object, input/output encodings, and other keywords looks like this:
from clitool.cli import climain


@climain
def main(files, input_encoding, output, output_encoding, **kwargs):
    # your main function goes here
    if files:
        print("Hello %d inputs with %s." % (len(files), input_encoding))
Command line utilities.
This module can also be executed to generate a script boilerplate.
$ python -m clitool.cli -o your-script.py
$ ./your-script.py --help
- clitool.cli.base_parser()[source]¶
Create an argument parser with basic options and no help message.
- -c, --config: load configuration file.
- -v, --verbose: increase logging verbosity. -v, -vv, and -vvv.
- -q, --quiet: quiet logging except critical level.
- -o, --output: output file. (default=sys.stdout)
- --basedir: base directory. (default=os.getcwd)
- --input-encoding: input data encoding. (default=utf-8)
- --output-encoding: output data encoding. (default=utf-8)
- --processes: count of processes.
- --chunksize: a number of chunks submitted to the process pool.
Return type: argparse.ArgumentParser
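The option list above can be approximated with the standard argparse module. The following is only a sketch under assumptions, not clitool's actual implementation: the function name sketch_base_parser is hypothetical, and the argument types (FileType for the files, int for the process options) are guesses. It mainly shows why a parser created without a help message is convenient: it can be reused as a parent of another parser.

```python
import argparse
import os
import sys


def sketch_base_parser():
    """Hypothetical stand-in for base_parser(); types and actions are assumed."""
    parser = argparse.ArgumentParser(add_help=False)  # no -h/--help here
    parser.add_argument("-c", "--config", type=argparse.FileType("r"))
    parser.add_argument("-v", "--verbose", action="count", default=0)
    parser.add_argument("-q", "--quiet", action="store_true")
    parser.add_argument("-o", "--output", type=argparse.FileType("w"),
                        default=sys.stdout)
    parser.add_argument("--basedir", default=os.getcwd())
    parser.add_argument("--input-encoding", default="utf-8")
    parser.add_argument("--output-encoding", default="utf-8")
    parser.add_argument("--processes", type=int)
    parser.add_argument("--chunksize", type=int)
    return parser


# A child parser inherits the basic options and adds its own help message.
parser = argparse.ArgumentParser(parents=[sketch_base_parser()])
parser.add_argument("files", nargs="*")

args = parser.parse_args(["-vv", "--input-encoding", "cp932", "a.txt"])
print(args.verbose, args.input_encoding, args.files)
# prints: 2 cp932 ['a.txt']
```

If the base parser kept its own help option, passing it through parents would raise a conflict on -h, which is presumably why the help message is left out.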
- clitool.cli.cliconfig(fp, env=None)[source]¶
Load configuration data. The given file pointer is closed internally. If None is given, the program is forced to exit.
For more details, see the underlying module, clitool.config.
Parameters: - fp (FileType) – opened file pointer of configuration
- env (str) – environment to load
Return type: dict
- clitool.cli.clistream(reporter, *args, **kwargs)[source]¶
Handle stream data on the command line interface and return statistics of success, error, and total counts.
For more details, see the underlying module, clitool.processor.
Parameters: - Handler (object which supports handle method.) – [DEPRECATED] Handler for file-like streams. (default: clitool.processor.CliHandler)
- reporter (callable) – callback to report processed value
- delimiter (string) – line delimiter [optional]
- args – functions to parse each item in the stream.
- kwargs – keywords, including files and input_encoding.
Return type: list
- clitool.cli.parse_arguments(**kwargs)[source]¶
Parse command line arguments after setting the basic options. If parsing succeeds, the logging verbosity is set. This function accepts variable keyword arguments. Their values are passed to ArgumentParser.add_argument(). If the special keyword flags is given, the argument is added as a flag option.
Examples - multiple files including zero
cliargs = parse_arguments(files=dict(nargs='*'))
if cliargs.files:
    for fp in cliargs.files:
        print(fp.name)
Examples - only one file (but property is list of files)
cliargs = parse_arguments(files=dict(nargs=1))
fp = cliargs.files[0]
print(fp.name)
Examples - mode switch of defined values
cliargs = parse_arguments(mode=dict(
    flags=('-m', '--mode'),
    required=True,
    choices=("A", "B", "C")))
print(cliargs.mode)
Parameters: kwargs – keyword arguments to pass to the add_argument() method.
Return type: argparse.Namespace object
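To make the keyword convention concrete, here is a minimal sketch of how such keywords could be translated into add_argument() calls. It is an illustration under assumptions rather than clitool's code: sketch_parse_arguments is a hypothetical name, it takes an explicit argv list so that it can run anywhere, and it does not open files or configure logging as the real function is described to do.

```python
import argparse


def sketch_parse_arguments(argv, **kwargs):
    """Hypothetical illustration of the keyword convention described above."""
    parser = argparse.ArgumentParser()
    for name, options in kwargs.items():
        options = dict(options)             # copy so the caller's dict is untouched
        flags = options.pop("flags", None)  # special keyword: option strings
        if flags:
            # Flag option: the keyword becomes the attribute name.
            parser.add_argument(*flags, dest=name, **options)
        else:
            # Positional argument named after the keyword.
            parser.add_argument(name, **options)
    return parser.parse_args(argv)


cliargs = sketch_parse_arguments(
    ["-m", "B", "a.txt", "b.txt"],
    files=dict(nargs="*"),
    mode=dict(flags=("-m", "--mode"), required=True, choices=("A", "B", "C")),
)
print(cliargs.mode, cliargs.files)
# prints: B ['a.txt', 'b.txt']
```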
config Module¶
Configuration loader that supports multiple file types and selects the environment through the environment variable PYTHON_CLITOOL_ENV. The default value is clitool.DEFAULT_RUNNING_MODE (development).
Supported file types are:
- ini/cfg
- json
- yaml (if “pyyaml” is installed)
- class clitool.config.ConfigLoader(fp, filetype=None)[source]¶
Bases: builtins.object
Parameters: - fp – file pointer to load
- filetype (string) – one of ‘ini|cfg’, ‘json’, or ‘yml|yaml’. If not specified, the type is detected automatically from the file extension.
Configuration file examples¶
INI format
[development]
database.url=sqlite:///sample.sqlite
[staging]
database.url=postgresql+pypostgresql://user:pass@host/database
[production]
database.url=mysql://user:pass@host/database
JSON style
{
  "development": {
    "database": {
      "url": "sqlite:///sample.sqlite"
    }
  },
  "staging": {
    "database": {
      "url": "postgresql+pypostgresql://user:pass@host/database"
    }
  },
  "production": {
    "database": {
      "url": "mysql+mysqlconnector://user:pass@host/database"
    }
  }
}
YAML style
development:
  database:
    url: "sqlite:///sample.sqlite"
staging:
  database:
    url: "postgresql+pypostgresql://user:pass@host/database"
production:
  database:
    url: "mysql://user:pass@host/database"
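The idea behind these files, one top-level section per environment with the active one chosen by PYTHON_CLITOOL_ENV, can be sketched with the standard library alone. This is not ConfigLoader itself: load_environment is a hypothetical helper, YAML is left out to avoid the optional dependency, and whether clitool flattens nested keys is not covered here.

```python
import configparser
import io
import json
import os

DEFAULT_RUNNING_MODE = "development"  # mirrors the documented default


def load_environment(fp, filetype, env=None):
    """Return the settings of one environment as a dict (illustrative only)."""
    # An explicit env wins; otherwise fall back to the environment variable.
    env = env or os.environ.get("PYTHON_CLITOOL_ENV", DEFAULT_RUNNING_MODE)
    if filetype == "json":
        return json.load(fp)[env]
    if filetype in ("ini", "cfg"):
        parser = configparser.ConfigParser()
        parser.read_file(fp)
        return dict(parser[env])
    raise ValueError("unsupported file type: %s" % filetype)


INI_TEXT = """\
[development]
database.url=sqlite:///sample.sqlite
[production]
database.url=mysql://user:pass@host/database
"""
JSON_TEXT = '{"development": {"database": {"url": "sqlite:///sample.sqlite"}}}'

ini_cfg = load_environment(io.StringIO(INI_TEXT), "ini", env="production")
json_cfg = load_environment(io.StringIO(JSON_TEXT), "json", env="development")
print(ini_cfg["database.url"])
print(json_cfg["database"]["url"])
```

Note the difference in shape: the INI section yields a flat key such as database.url, while the JSON document yields a nested mapping.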
processor Module¶
Stream processing utility.
- class clitool.processor.CliHandler(streamer, delimiter=None)[source]¶
Bases: builtins.object
Simple command line arguments handler.
Parameters: - streamer (Streamer) – streaming object
- delimiter (string) – column delimiter such as ” “
- class clitool.processor.RowMapper(header=None, loose=False, *args, **kwargs)[source]¶
Bases: builtins.object
Map a list or tuple to a dict object using the given keys. If keys are not given, the first list or tuple is used as the keys. If the length of the given data differs from the number of keys, no valid data is returned.
Since this object is intended for use with Streamer, the mapping is performed by calling the instance.
If you know the header names a priori, use the standard namedtuple instead. However, when the keys contain non-ASCII characters, such as Japanese text, this class may be useful.
Parameters: - header (tuple) – list of header values.
- loose (bool) – loose match flag.
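The behaviour described for RowMapper can be sketched as a small callable class. This is a hypothetical re-creation, not clitool's code: SketchRowMapper is an invented name, the loose flag is omitted, and returning None for the header row and for mismatched rows is an assumption (the real class may signal these cases differently).

```python
class SketchRowMapper:
    """Hypothetical re-creation of the documented behaviour."""

    def __init__(self, header=None):
        self.header = tuple(header) if header else None

    def __call__(self, row):
        if self.header is None:
            # No keys given: the first row supplies them and yields no record.
            self.header = tuple(row)
            return None
        if len(row) != len(self.header):
            # Length mismatch: no valid data is returned.
            return None
        return dict(zip(self.header, row))


mapper = SketchRowMapper()
rows = [("名前", "年齢"), ("太郎", "20"), ("broken",), ("花子", "31")]
records = [r for r in map(mapper, rows) if r is not None]
print(records)
```

Non-ASCII keys such as the Japanese headers here work as dict keys, which is the case where namedtuple is less convenient.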
- class clitool.processor.SimpleDictReporter(*args, **kwargs)[source]¶
Bases: builtins.object
Reporting class for the streamer API. When processed data is passed as a mapping object, it reports each key/value pair whose value is a string. Call report() to get the result as a dict.
- class clitool.processor.Streamer(callback=None, *args, **kwargs)[source]¶
Bases: builtins.object
Simple streaming module to accept step-by-step procedures. General steps are:
- check input value meets your requirements.
- parse something for your business.
- collect parsed value.
Step 1 and Step 2 have to follow these rules:
- return True to skip parsing
- return False to report error
- return something valid to continue processing the item
Step 3 is an arbitrary function that accepts one argument, such as list.append().
Parameters: - callback (callable) – function to collect parsed value
- args (list) – callables
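The step rules above can be sketched as a plain function. This is a minimal illustration, not the Streamer class: run_steps, not_blank, and to_int are hypothetical names, and counting skipped items in the total is an assumption.

```python
def run_steps(items, callback, *steps):
    """Apply steps to each item; return (success, error, total) counts."""
    success = error = total = 0
    for item in items:
        total += 1
        value = item
        for step in steps:
            value = step(value)
            # Identity checks, so that values like 1 or 0 are not mistaken
            # for the True/False signals.
            if value is True or value is False:
                break
        if value is True:      # skip this item silently
            continue
        if value is False:     # report an error
            error += 1
            continue
        callback(value)        # step 3: collect the parsed value
        success += 1
    return success, error, total


def not_blank(line):
    """Step 1: check the input; blank lines are skipped."""
    line = line.strip()
    return line if line else True


def to_int(text):
    """Step 2: parse; unparsable text is an error."""
    try:
        return int(text)
    except ValueError:
        return False


collected = []
stats = run_steps(["1", "", "x", " 42 "], collected.append, not_blank, to_int)
print(collected, stats)
# prints: [1, 42] (2, 1, 4)
```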
accesslog Module¶
Utilities to parse Apache access log.
To learn about the access log format, see the official Apache HTTP Server documentation. [en] [ja]
This module can also be executed to parse access log records.
$ tail -f /var/log/httpd/access_log | python -m clitool.accesslog
Output labels come from <http://ltsv.org/>
- clitool.accesslog.logparse(*args, **kwargs)[source]¶
Parse an access log in a terminal application. If a list of files is given, each file is parsed. Otherwise, standard input is parsed.
Parameters: args (list of callables) – supporting functions applied after the raw log line has been processed
Return type: tuple of (statistics, key/value report)
A parsed record is a map object with the following properties.
- host : Remote IP address.
- time : Access date and time. (datetime; naive)
- utcoffset : UTC offset of access time (timedelta)
- path : HTTP request path, split from the query.
- query : HTTP request query string, with the leading "?" removed.
- method : HTTP request method.
- protocol : HTTP request protocol version.
- status : HTTP response status code. (int)
- size : HTTP response size, if available. (int)
- referer : Referer header. If “-” is given, this property does not exist.
- ua : User agent. If “-” is given, this property does not exist.
- ident : remote logname
- user : remote user
- trailing : Additional information if using custom log format.
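As a rough illustration of how a log line maps to these properties, the sketch below parses one line in Apache's combined format with a regular expression. It is an assumption-laden stand-in, not the module's parser: LOG_PATTERN and parse_line are hypothetical names, month names are parsed with strptime and therefore assume an English (C) locale, and an empty query is kept as an empty string.

```python
import re
from datetime import datetime, timedelta

# host ident user [time] "method target protocol" status size "referer" "ua" ...
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referer>[^"]*)" "(?P<ua>[^"]*)")?'
    r'\s*(?P<trailing>.*)'
)


def parse_line(line):
    """Return a dict of the documented properties, or None if unparsable."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    raw = match.groupdict()
    stamp = datetime.strptime(raw["time"], "%d/%b/%Y:%H:%M:%S %z")
    path, _, query = raw["target"].partition("?")
    record = {
        "host": raw["host"],
        "ident": raw["ident"],
        "user": raw["user"],
        "time": stamp.replace(tzinfo=None),  # naive datetime
        "utcoffset": stamp.utcoffset(),      # timedelta
        "method": raw["method"],
        "path": path,
        "query": query,
        "protocol": raw["protocol"],
        "status": int(raw["status"]),
    }
    if raw["size"] != "-":
        record["size"] = int(raw["size"])
    for key in ("referer", "ua"):
        if raw[key] and raw[key] != "-":     # "-" means the property is absent
            record[key] = raw[key]
    if raw["trailing"]:
        record["trailing"] = raw["trailing"]
    return record


sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif?x=1 HTTP/1.0" 200 2326 "-" "Mozilla/4.08"')
record = parse_line(sample)
print(record["path"], record["query"], record["status"], record["utcoffset"])
```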
This module also works as a script. Simple usage:
$ tail -f /var/log/httpd/access_log | python -m clitool.accesslog
And two options are available.
- --color : Set color on error record.
- --status : Filter condition along with response status.
If you would like to check only error responses, set --status=500,503.
Since the script expands each record into key/value pairs, you can combine it with grep or any other Unix-like tool. To get the top 10 accesses, try this:
$ python -m clitool.accesslog /var/log/httpd/access_log |
grep request_path | sort | uniq -c | sort -nr | head -n 10
textio Module¶
Text I/O utilities.
Type mapping follows the rule of “Cerberus”.
Example usage:
import logging
import sys

FIELDS = (
    {'id': 'id', 'type': 'string'},
    {'id': 'updated', 'type': 'datetime', 'format': '%Y-%m-%dT%H:%M:%SZ'},
    {'id': 'name', 'type': 'string'},
    {'id': 'latitude', 'type': 'float'},
    {'id': 'longitude', 'type': 'float'},
    {'id': 'zipcode', 'type': 'string'},
    {'id': 'kind', 'type': 'string', 'default': 'UNKNOWN'},
    {'id': 'update_type', 'type': 'integer'}
)

s = Sequential()
s.callback(RowMapper(FIELDS))
s.errback(logging.error)

for row in sys.stdin:
    dt = s(row)
    print(dt)
- class clitool.textio.DictMapper(fields)[source]¶
Bases: builtins.object
Convert dictionary object to list of strings of values.
- class clitool.textio.RowMapper(fields, strict=True, *args, **kwargs)[source]¶
Bases: builtins.object
Map a list or tuple to a dict object using the given field definitions. If the length of the given data differs from the number of fields, ValueError is raised. To accept loose inputs, set the strict flag to False.
If all input fields are of string type, use namedtuple from the standard library instead. However, when the keys contain non-ASCII characters, such as Japanese text, this class may be useful.
Parameters: - fields (tuple) – list of fields values.
- strict (boolean) – strict match flag.
Return type: callable
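A field-definition mapper of this kind can be sketched as a closure. The code below is an illustration under assumptions, not the textio implementation: make_row_mapper and CONVERTERS are hypothetical names, only the four types from the example are handled, and applying default when the input text is empty is a guess about how the default key behaves.

```python
from datetime import datetime

# Assumed converters for the type names used in the example fields.
CONVERTERS = {
    "string": str,
    "integer": int,
    "float": float,
}


def make_row_mapper(fields, strict=True):
    """Return a callable mapping a sequence of strings to a typed dict."""
    def convert(field, text):
        if text == "" and "default" in field:
            return field["default"]          # assumed meaning of 'default'
        if field["type"] == "datetime":
            return datetime.strptime(text, field["format"])
        return CONVERTERS[field["type"]](text)

    def mapper(row):
        if strict and len(row) != len(fields):
            raise ValueError(
                "expected %d columns, got %d" % (len(fields), len(row)))
        return {f["id"]: convert(f, v) for f, v in zip(fields, row)}

    return mapper


fields = (
    {'id': 'id', 'type': 'string'},
    {'id': 'updated', 'type': 'datetime', 'format': '%Y-%m-%dT%H:%M:%SZ'},
    {'id': 'latitude', 'type': 'float'},
    {'id': 'kind', 'type': 'string', 'default': 'UNKNOWN'},
    {'id': 'update_type', 'type': 'integer'},
)
mapper = make_row_mapper(fields)
typed = mapper(["a1", "2014-01-02T03:04:05Z", "35.5", "", "3"])
print(typed)
```

With strict=False, this sketch simply maps as many columns as zip() pairs up, which is one possible reading of accepting loose inputs.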
_unicodecsv Module¶
This file is for Python 2.7. All classes are taken from the official Python documentation.
A more sophisticated implementation is available in the “csvkit” module.
- class clitool._unicodecsv.UTF8Recoder(f, encoding)[source]¶
Bases: builtins.object
Iterator that reads an encoded stream and reencodes the input to UTF-8