Python: package json_to

json_to_csv (version 1.0.0)

Copyright (c) 2017 Timothy Savannah All Rights Reserved
Licensed under terms of LGPLv3.

json_to_csv - Module for converting json data to csv data, and various supplementary methods.

Resulting csv will conform to RFC 4180 "Common Format and MIME Type for Comma-Separated Values (CSV) Files".

May also be used to just extract data into lists.

Package Contents

_private
help

Classes



builtins.Exception(builtins.BaseException)

FormatStrParseError

builtins.object

JsonToCsv

class FormatStrParseError(builtins.Exception)

    FormatStrParseError - Raised if there is an error in parsing the format string.

Method resolution order:

FormatStrParseError

builtins.Exception

builtins.BaseException

builtins.object

Data descriptors defined here:

__weakref__

list of weak references to the object (if defined)

Methods inherited from builtins.Exception:

__init__(self, /, *args, **kwargs)
Initialize self.  See help(type(self)) for accurate signature.

__new__(*args, **kwargs) from builtins.type
Create and return a new object.  See help(type) for accurate signature.

Methods inherited from builtins.BaseException:

__delattr__(self, name, /)
Implement delattr(self, name).

__getattribute__(self, name, /)
Return getattr(self, name).

__reduce__(...)
helper for pickle

__repr__(self, /)
Return repr(self).

__setattr__(self, name, value, /)
Implement setattr(self, name, value).

__setstate__(...)

__str__(self, /)
Return str(self).

with_traceback(...)
Exception.with_traceback(tb) -- set self.__traceback__ to tb and return self.

Data descriptors inherited from builtins.BaseException:

__cause__

exception cause

__context__

exception context

__dict__

__suppress_context__

__traceback__

args

class JsonToCsv(builtins.object)

    JsonToCsv - Public class containing methods for converting json to csv data, merging data, etc.

    Designed to produce RFC 4180 csv output from json data using a meta language.

Methods defined here:

__init__(self, formatStr, nullValue='', debug=False)
    __init__ - Create a JsonToCsv object.

    @param formatStr <str> - The format string for the json data to be converted.

    @param nullValue <str> Default empty string - The value to assign to a "null" result.

    @param debug <bool> Default False - If True, will output some debug data on stderr.

convertToCsv(self, data, quoteFields='smart', lineSeparator='\r\n')
    convertToCsv - Convert given data to csv.

      Alias to calling:
          extractData
      and then passing those results to:
          dataToStr

    @param data <string/dict> - Either a string of json data, or a dict

    @param quoteFields <bool or 'smart'> Default 'smart' -
        If False, fields will not be quoted (thus a comma, newline, etc. will break the output, but it looks neater on screen)
        If True, fields will always be quoted (protecting against commas, allows values to contain newlines, etc.)
        If 'smart' (default), the need to quote fields will be auto-determined. This may take slightly longer on HUGE datasets,
          but is generally okay.

    @param lineSeparator <str> Default '\r\n' - This will separate the lines. RFC4180 defines CRLF as the preferred ending, but implementations
        can vary (i.e. unix generally just uses '\n'). If you plan to have newlines ('\n') in the data, I suggest using '\r\n' as
        the lineSeparator, as otherwise many implementations (like python's own csv module) will swallow the newline within the data.

    @return <str> - The csv data, as produced by dataToStr.
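The lineSeparator advice can be checked against Python's standard csv module. The following is a minimal sketch that does not use json_to_csv at all: the sample text is made up for illustration, and it only shows that a reader keeps a bare '\n' inside a quoted field when records are separated by '\r\n'.

```python
import csv
import io

# Made-up sample: records end in CRLF, one quoted field contains a bare LF.
csv_text = 'id,note\r\n1,"line one\nline two"\r\n'

# newline='' stops the text layer from translating line endings,
# so the csv module sees the original '\r\n' and '\n' characters.
reader = csv.reader(io.StringIO(csv_text, newline=''))
parsed = list(reader)

for row in parsed:
    print(row)
```

Passing newline='' when opening real files for the csv module is what the standard library documentation recommends for the same reason.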

extractData(self, data)
    extractData - Return a list of lists. The outer list represents lines, the inner list data points.

      e.g. returnData[0] is the first line, returnData[0][2] is the third data point of the first line.

    @param data <string/dict> - Either a string of JSON data, or a dict.

    NOTE: This is the recommended method to be used. You can pass the data to
      JsonToCsv.dataToStr to convert to csv, tsv, and other formats.

    @return list<list<str>> - List of lines, each line containing a list of datapoints.
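To make the "list of lists" return shape concrete, here is a rough stand-in written with only the standard json module. It is not how extractData works internally: the function name extract_rows_sketch, its keys parameter, and the sample records are all hypothetical, and it replaces the package's format string meta language with a plain list of key names.

```python
import json

def extract_rows_sketch(data, keys, null_value=''):
    """Hypothetical stand-in: one output line per JSON object, one str per key."""
    if isinstance(data, str):
        data = json.loads(data)
    rows = []
    for record in data:
        row = []
        for key in keys:
            value = record.get(key)
            # Mirrors the documented nullValue idea: JSON null becomes null_value.
            row.append(null_value if value is None else str(value))
        rows.append(row)
    return rows

sample = '[{"id": 1, "name": "a", "email": null}, {"id": 2, "name": "b", "email": "b@example.com"}]'
rows = extract_rows_sketch(sample, ['id', 'name', 'email'])

first_line = rows[0]      # the first line
third_point = rows[0][2]  # third data point of the first line
```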

Static methods defined here:

dataToStr(csvData, separator=',', quoteFields='smart', lineSeparator='\r\n')
    dataToStr - Convert a list of lists of csv data to a string.

    @param csvData list<list> - A list of lists, first list is lines, inner-list are values.
      This is the data returned by JsonToCsv.extractData

    @param separator <str> Default ',' - The separator used between fields (e.g. would be a tab in TSV format)

    @param quoteFields <bool or 'smart'> Default 'smart' -
        If False, fields will not be quoted (thus a comma, newline, etc. will break the output, but it looks neater on screen)
        If True, fields will always be quoted (protecting against commas, allows values to contain newlines, etc.)
        If 'smart' (default), the need to quote fields will be auto-determined. This may take slightly longer on HUGE datasets,
          but is generally okay. Quotes within a field (") will be replaced with two adjacent quotes ("") as per RFC4180.
          Use 'smart' unless you REALLY need to specify otherwise, as 'smart' will always produce RFC4180 csv files.

    @param lineSeparator <str> Default '\r\n' - This will separate the lines. RFC4180 defines CRLF as the preferred ending, but implementations
        can vary (i.e. unix generally just uses '\n'). If you plan to have newlines ('\n') in the data, I suggest using '\r\n' as
        the lineSeparator, as otherwise many implementations (like python's own csv module) will swallow the newline within the data.

    @return str - csv data
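The three quoteFields modes can be illustrated with a small standalone sketch. This is not the package's implementation: data_to_str_sketch is a hypothetical name, it decides quoting per field (the package may decide differently), and it assumes no trailing line separator is appended.

```python
def data_to_str_sketch(rows, separator=',', quote_fields='smart', line_separator='\r\n'):
    """Hypothetical sketch of RFC 4180 style serialization of a list of lists."""

    def needs_quotes(value):
        # A field needs quoting if it contains the separator, a quote, or a line break.
        return any(token in value for token in (separator, '"', '\r', '\n'))

    def render(value):
        if quote_fields is False:
            return value  # raw output; commas or newlines in the data will break it
        if quote_fields is True or needs_quotes(value):
            # RFC 4180: a quote inside a quoted field is written as two quotes.
            return '"' + value.replace('"', '""') + '"'
        return value

    return line_separator.join(
        separator.join(render(value) for value in row) for row in rows
    )

smart_output = data_to_str_sketch([['a', 'b,c'], ['say "hi"', 'x']])
tsv_output = data_to_str_sketch([['a', 'b']], separator='\t')
always_quoted = data_to_str_sketch([['a', 'b']], quote_fields=True)
```

In 'smart' mode only the fields that would otherwise be ambiguous are quoted, which keeps the output readable while still parseable.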

findDuplicates(csvData, fieldNum, flat=False)
    findDuplicates - Find lines with duplicate values in a specific field number.

    This is useful to strip duplicates before using JsonToCsv.joinCsv,
      which requires unique values in the join field.

      @see JsonToCsv.joinCsv for example code

    @param csvData list<list<str>> - List of lines, each line containing string field values.
        JsonToCsv.extractData returns data in this form.

    @param fieldNum int - Index of the field number in which to search for duplicates

    @param flat bool Default False - If False, return is a map of { "duplicateKey" : lines(copy) }.
                                     If True, return is a flat list of all duplicate lines

    @return :

      When #flat is False:

        dict { duplicateKeyValue[str] : lines[list<list<str>>] (copy) } -
          This dict has the values with duplicates as the key, and a COPY of the lines as each value.

      When #flat is True:

        lines[list<list<str>>] (copy)
          Copies of all lines with duplicate value in #fieldNum. Duplicates will be adjacent
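The two return shapes can be sketched in plain Python. This is an illustration of the documented behaviour under stated assumptions, not the package's code: find_duplicates_sketch is a hypothetical name, and the ordering of keys and lines simply follows first appearance in the input.

```python
from collections import defaultdict

def find_duplicates_sketch(rows, field_num, flat=False):
    """Hypothetical sketch: group lines by one field and keep groups with 2+ lines."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[field_num]].append(list(row))  # store a copy of each line

    duplicates = {key: lines for key, lines in grouped.items() if len(lines) > 1}
    if not flat:
        return duplicates
    # Flat form: lines sharing a value end up adjacent to each other.
    return [line for lines in duplicates.values() for line in lines]

sample_rows = [['1', 'a'], ['2', 'b'], ['1', 'c'], ['3', 'd'], ['2', 'e']]
by_key = find_duplicates_sketch(sample_rows, 0)
flat_lines = find_duplicates_sketch(sample_rows, 0, flat=True)
```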

joinCsv(csvData1, joinFieldNum1, csvData2, joinFieldNum2)
    joinCsv - Join two sets of csv data based on a common field value in the two sets.

      csvData should be a list of lists (1st is lines, second is items). Such data is gathered by using the JsonToCsv.extractData method.

      Combined data will append the fields of csvData2 to csvData1, omitting the common field from csvData2.

      @param csvData1 list<list> - The "primary" data set

      @param joinFieldNum1 <int> - The index of the common field in csvData1

      @param csvData2 list<list> - The secondary data set

      @param joinFieldNum2 <int> - The index of the common field in csvData2

      @return tuple( mergedData [list<list>], onlyCsvData1 [list<list>], onlyCsvData2 [list<list>] )

        Return is a tuple of 3 elements.
          The first is the merged csv data where a join field matched.
          The second is the elements only present in csvData1
          The third is the elements only present in csvData2

      @raises ValueError - If csvData1 or csvData2 are not in the right format (list of lists)

      @raises KeyError   - If there are duplicate keys preventing a proper merge

      NOTE: each csvData MUST have unique values in the "join field", or it cannot join.

        If the join field contains duplicates, consider the "multiJoinCsv" function instead.
        Use multiJoinCsv to link all matches in csvData1 to all matches in csvData2 where join fields match.

        JsonToCsv.findDuplicates will identify duplicate values for a given join field.

          So you can have something like:

          myCsvData = JsonToCsv.extractData(....)
          joinFieldNum = 3  # Example, 4th field is the field we will join on

          myCsvDataDuplicateLines = JsonToCsv.findDuplicates(myCsvData, joinFieldNum, flat=True)
          if myCsvDataDuplicateLines:
              myCsvDataUniq = [line for line in myCsvData if line not in myCsvDataDuplicateLines]
          else:
              myCsvDataUniq = myCsvData
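The three-part return value is easier to see on a small data set. The sketch below reimplements the documented semantics in plain Python for illustration only: join_rows_sketch and the sample data are hypothetical, the output ordering follows csvData1, and the ValueError format check is left out.

```python
def join_rows_sketch(rows1, field1, rows2, field2):
    """Hypothetical sketch of a unique-key join returning (merged, only1, only2)."""

    def index_by(rows, field):
        index = {}
        for row in rows:
            key = row[field]
            if key in index:
                # Documented behaviour: duplicate join values prevent a proper merge.
                raise KeyError('Duplicate join value: %r' % (key,))
            index[key] = list(row)
        return index

    index1 = index_by(rows1, field1)
    index2 = index_by(rows2, field2)

    merged, only1 = [], []
    for row in rows1:
        other = index2.get(row[field1])
        if other is None:
            only1.append(list(row))
        else:
            # Append the second set's fields, omitting its copy of the join field.
            merged.append(list(row) + other[:field2] + other[field2 + 1:])
    only2 = [list(row) for row in rows2 if row[field2] not in index1]
    return merged, only1, only2

people = [['1', 'Ann'], ['2', 'Bob'], ['3', 'Cy']]
emails = [['ann@example.com', '1'], ['dee@example.com', '4'], ['bob@example.com', '2']]
merged, only_people, only_emails = join_rows_sketch(people, 0, emails, 1)
```

Indexing both sides by the join value first keeps the join linear in the number of lines and makes the duplicate check fall out naturally.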

multiJoinCsv(csvData1, joinFieldNum1, csvData2, joinFieldNum2)
    multiJoinCsv - Join two sets of csv data based on a common field value. Unlike joinCsv, duplicate join values are allowed:
      every line in csvData1 is merged with every line in csvData2 that shares its join value,
      i.e. if key is repeated on A then you'd have: AA and AB.

      csvData should be a list of lists (1st is lines, second is items). Such data is gathered by using the JsonToCsv.extractData method.

      Combined data will append the fields of csvData2 to csvData1, omitting the common field from csvData2.

      @param csvData1 list<list> - The "primary" data set

      @param joinFieldNum1 <int> - The index of the common field in csvData1

      @param csvData2 list<list> - The secondary data set

      @param joinFieldNum2 <int> - The index of the common field in csvData2

      @return tuple( mergedData [list<list>], onlyCsvData1 [list<list>], onlyCsvData2 [list<list>] )

        Return is a tuple of 3 elements.
          The first is the merged csv data where a join field matched.
          The second is the elements only present in csvData1
          The third is the elements only present in csvData2

      @raises ValueError - If csvData1 or csvData2 are not in the right format (list of lists)
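A many-to-many join of this kind can be sketched as follows. As with the other examples, this is an illustration under assumptions rather than the package's code: multi_join_rows_sketch and the sample data are hypothetical, and the order of merged lines (csvData1 order, then csvData2 order within each match) is a choice made here.

```python
from collections import defaultdict

def multi_join_rows_sketch(rows1, field1, rows2, field2):
    """Hypothetical sketch: merge every line of rows1 with every matching line of rows2."""
    groups2 = defaultdict(list)
    for row in rows2:
        groups2[row[field2]].append(list(row))
    keys1 = {row[field1] for row in rows1}

    merged, only1 = [], []
    for row in rows1:
        matches = groups2.get(row[field1])  # .get avoids creating empty groups
        if not matches:
            only1.append(list(row))
            continue
        for other in matches:
            # Append the second set's fields, omitting its copy of the join field.
            merged.append(list(row) + other[:field2] + other[field2 + 1:])
    only2 = [list(row) for row in rows2 if row[field2] not in keys1]
    return merged, only1, only2

orders = [['u1', 'book'], ['u1', 'pen'], ['u2', 'ink']]
users = [['u1', 'Ann'], ['u1', 'Ann (alt)'], ['u3', 'Cy']]
multi_merged, only_orders, only_users = multi_join_rows_sketch(orders, 0, users, 0)
```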

Data descriptors defined here:

__dict__

dictionary for instance variables (if defined)

__weakref__

list of weak references to the object (if defined)

Data

__all__ = ('FormatStrParseError', 'JsonToCsv')
__version_tuple__ = (1, 0, 0)
