>>> context=notch(
'yahi/test/biggersample.log',
'yahi/test/biggersample.log',
include="yahi/test/include.json",
silent=True,
exclude='{ "_country" : "US"}',
output_format="csv"
)
Had I been smart, it would have been called «aim», since you tell it your target and the parameters of your parsing (log_format...).
Warning
notch arguments always override arguments given on the command line.
It stores the filter used to filter the data; if nothing is specified, it will use include and exclude.
Given output_file / output_format, it writes a Mapping to the specified file in the specified format.
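As a rough illustration, here is one way a counter Mapping could be laid out as CSV rows; the (field, key, count) column layout is an assumption for the sketch, not yahi's actual output format:

```python
import csv
import io

# Hypothetical aggregation result: {field: {key: count}}
mapping = {'_country': {'BE': 11, 'FR': 20}}

out = io.StringIO()          # stands in for any output_file
writer = csv.writer(out)
for field, counters in mapping.items():
    for key, count in sorted(counters.items()):
        # one row per (field, key, count) triple -- layout is an assumption
        writer.writerow([field, key, count])
```

With a real file object in place of the StringIO, the same loop writes the aggregation to disk.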
Warning
The output routine closes output_file once it has written to it, so reusing it another time will raise an exception: you should notch once for every time you shoot if you use context.output for writing to a file.
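This warning boils down to ordinary Python file semantics: writing to a closed file object raises ValueError. A self-contained illustration:

```python
import io

out = io.StringIO()        # stands in for any output_file
out.write("first shoot\n")
out.close()                # what the output routine does after writing

try:
    out.write("second shoot\n")  # reusing the closed file...
except ValueError as e:
    error = e                    # ...raises "I/O operation on closed file"
```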
Given one or more contexts, you can now shoot your request at the context given back by notch.
Note
It is basically a way to GROUP BY, as in MySQL. Since my dict supports addition, we have the following logic for each line (given you request an aggregation on country and user_agent and you are at the 31st line):
>>> { '_country' : { 'BE' : 10, 'FR' : 20
... }, 'user_agent' : { 'mozilla' : 13, 'unknown' : 17 } } + {
... '_country' : { 'BE' : 1}, 'user_agent' : { 'safari': 1 } }
{ '_country' : { 'BE' : 11, 'FR' : 20 },
'user_agent' : { 'mozilla' : 13, 'unknown' : 17, 'safari': 1 } }
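The line-by-line accumulation above can be sketched with a small recursive add on nested count dicts; this helper is a stand-in for the library's additive dict, not its actual implementation:

```python
def madd(a, b):
    """Recursively add two nested {field: {key: count}} mappings."""
    out = dict(a)
    for k, v in b.items():
        if isinstance(v, dict):
            out[k] = madd(out.get(k, {}), v)
        else:
            out[k] = out.get(k, 0) + v
    return out

# running totals after 30 lines
total = {'_country': {'BE': 10, 'FR': 20},
         'user_agent': {'mozilla': 13, 'unknown': 17}}
# contribution of the 31st line
line31 = {'_country': {'BE': 1}, 'user_agent': {'safari': 1}}

total = madd(total, line31)
# total == {'_country': {'BE': 11, 'FR': 20},
#           'user_agent': {'mozilla': 13, 'unknown': 17, 'safari': 1}}
```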
First, since we use named captures in our log regexps, we directly transform a log line into a dict. You can give your captures any names you want, except for 3 special ones:
Once datetime is captured, since datetime objects are easier to use than strings, the value is parsed with date_pattern and stored as _datetime.
Once ip is captured, if geo_ip is enabled, _country will be set to the two-letter ISO code of the country.
Once agent is captured, it will be transformed (if user_agent is enabled) into
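The first of these transformations (named captures plus datetime parsing) can be sketched with plain re and datetime; the log line, the regexp, and the field handling here are illustrative assumptions, not yahi's shipped log_format:

```python
import re
from datetime import datetime

# Illustrative log line and pattern -- not yahi's actual log_format
line = '127.0.0.1 - [10/Oct/2023:13:55:36] "GET / HTTP/1.1"'
pattern = re.compile(
    r'(?P<ip>\S+) - \[(?P<datetime>[^\]]+)\] "(?P<request>[^"]+)"'
)
date_pattern = "%d/%b/%Y:%H:%M:%S"

# named captures turn the log line directly into a dict
record = pattern.match(line).groupdict()

# the special datetime capture becomes a real datetime object under _datetime
record['_datetime'] = datetime.strptime(record.pop('datetime'), date_pattern)
```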