invoice¶
Classes for different invoice types
-
class
pdf2xlsx.invoice.CreditEntry(entry_tuple=None, invo=None)¶ Entry of a credit invoice. These entries contain negative prices because they belong to credit invoices. Currently a dummy class.
-
class
pdf2xlsx.invoice.CreditInvoice(no=0, orig_date='', pay_due='', total_sum=0, entries=None, orig_invo_no=0)¶ Credit invoice class
-
parse_line(line)¶ Parameters: line (str) – The actual line to parse Returns: True when the parsing of the Invoice was started Return type: bool
-
xlsx_write(worksheet, row, col)¶ Write the invoice information to a template xlsx file.
Parameters: - worksheet (Worksheet) – Worksheet class to write info
- row (int) – Row number to start writing
- col (int) – Column number to start writing
Returns: the next position of cursor row,col
Return type: tuple of (int,int)
-
-
class
pdf2xlsx.invoice.Entry(entry_tuple=None, invo=None)¶ Parse, store and write invoice entries to xlsx. The entry information is stored in the EntryTuple namedtuple. Parsing is controlled by a state variable (entry_found). Because each invoice entry is split across two lines, the tmp_str attribute stores the first part of the entry. The ME values are configurable, so they cannot be created at class level; they need to be recomputed at every instantiation.
Parameters: - entry_tuple (EntryTuple) – The invoice entry
- invo (Invoice) – The parent invoice containing this entry
-
line2entry(line)¶ Extracts entry information from the given line by searching it for nine different groups; see the implementation of entry_pattern. The line should match the following pattern:
NNNNNN-NNN STR+WSPACE PREDEFSTR INTEGER INTEGER-. INTEGER% INTEGER-. INTEGER-. INTEGER%
Where:
- N: a single digit, 0-9
- STR+WSPACE: string containing white space, numbers and special characters
- PREDEFSTR: predefined string without white space
- INTEGER: decimal number of unknown length
- INTEGER-.: decimal number with thousands grouped by ".", e.g. 1.589.674
- INTEGER%: an integer followed by a percent sign
Parameters: line (str) – Line to parse; this line should begin with NNNNNN-NNN Returns: The actual invoice entry Return type: EntryTuple
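The pattern described for line2entry can be sketched as a regular expression with nine capture groups. This is only an illustration: the real entry_pattern is not shown on this page, so the regex below, the helper names (build_entry_pattern, to_int, line2entry_sketch), the example ME values and the conversion of numeric fields to int are all assumptions.

```python
import re
from collections import namedtuple

EntryTuple = namedtuple(
    "EntryTuple",
    "kod nev ME mennyiseg BEgysegar Kedv NEgysegar osszesen AFA",
)

GROUPED = r"(\d{1,3}(?:\.\d{3})*)"  # INTEGER-.  e.g. 1.589.674


def build_entry_pattern(me_values):
    """Build the regex per call, since the ME values are configurable."""
    me = "|".join(re.escape(v) for v in me_values)
    return re.compile(
        r"(\d{6}-\d{3})\s+"   # NNNNNN-NNN
        r"(.+?)\s+"           # STR+WSPACE
        rf"({me})\s+"         # PREDEFSTR
        r"(\d+)\s+"           # INTEGER
        + GROUPED + r"\s+"    # INTEGER-.
        + r"(\d+)%\s+"        # INTEGER%
        + GROUPED + r"\s+"    # INTEGER-.
        + GROUPED + r"\s+"    # INTEGER-.
        + r"(\d+)%"           # INTEGER%
    )


def to_int(grouped):
    """Convert a '.'-grouped number such as '1.589.674' to an int."""
    return int(grouped.replace(".", ""))


def line2entry_sketch(line, pattern):
    """Return an EntryTuple for a matching line, or None (assumed behaviour)."""
    m = pattern.search(line)
    if m is None:
        return None
    kod, nev, me, qty, b_price, disc, n_price, total, vat = m.groups()
    return EntryTuple(kod, nev, me, int(qty), to_int(b_price), int(disc),
                      to_int(n_price), to_int(total), int(vat))


# "db" and "csomag" are hypothetical ME values chosen for the example.
pattern = build_entry_pattern(("db", "csomag"))
entry = line2entry_sketch(
    "123456-789 Widget blue 2x db 3 1.500 10% 1.350 4.050 27%", pattern
)
```

Building the pattern inside a function mirrors the note on the Entry class that the ME values cannot be fixed at class level.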
-
parse_line(line)¶ Parse raw text that is supplied line by line. This is the structure of the pdf (the parentheses () indicate what should be collected): n times: <uninteresting rubbish> (NNNNNN-NNN ...
...) <uninteresting rubbish> When the code (NNNNNN-NNN) is found, the parser waits for one additional line, and then the combined text is sent to the line2entry converter.
Parameters: line (str) – The actual line to parse Returns: True when an entry was found Return type: bool
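The two-line handling described above can be sketched as a small state machine. The attribute names entry_found and tmp_str come from the class description; the class name TwoLineCollector, the CODE_RE regex, the joined list and the choice to return True only once the second line has arrived are assumptions made for illustration.

```python
import re

# Assumed: an entry starts with a code of the form NNNNNN-NNN.
CODE_RE = re.compile(r"^\d{6}-\d{3}\b")


class TwoLineCollector:
    """Hypothetical helper that joins the two halves of an entry."""

    def __init__(self):
        self.entry_found = False  # state variable: first half was seen
        self.tmp_str = ""         # stores the first half of the entry
        self.joined = []          # completed entries, ready for line2entry

    def parse_line(self, line):
        line = line.strip()
        if self.entry_found:
            # Second half: combine it with the stored first half.
            self.joined.append(self.tmp_str + " " + line)
            self.entry_found = False
            self.tmp_str = ""
            return True
        if CODE_RE.match(line):
            # First half: remember it and wait for one more line.
            self.entry_found = True
            self.tmp_str = line
        return False


collector = TwoLineCollector()
results = [
    collector.parse_line(text)
    for text in (
        "header text",
        "123456-789 Widget db",
        "3 1.500 10% 1.350 4.050 27%",
        "footer",
    )
]
```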
-
xlsx_write(worksheet, row, col)¶ Write the entry information to a template xlsx file.
Parameters: - worksheet (Worksheet) – Worksheet class to write info
- row (int) – Row number to start writing
- col (int) – Column number to start writing
Returns: the next position of cursor row,col
Return type: tuple of (int,int)
-
class
pdf2xlsx.invoice.EntryTuple(kod, nev, ME, mennyiseg, BEgysegar, Kedv, NEgysegar, osszesen, AFA)¶ -
AFA¶ Alias for field number 8
-
BEgysegar¶ Alias for field number 4
-
Kedv¶ Alias for field number 5
-
ME¶ Alias for field number 2
-
NEgysegar¶ Alias for field number 6
-
kod¶ Alias for field number 0
-
mennyiseg¶ Alias for field number 3
-
nev¶ Alias for field number 1
-
osszesen¶ Alias for field number 7
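The "Alias for field number" descriptions are what collections.namedtuple generates, so an equivalent type can be sketched as below. The field names and their order are taken from the signature above; the sample values are made up.

```python
from collections import namedtuple

EntryTuple = namedtuple(
    "EntryTuple",
    ["kod", "nev", "ME", "mennyiseg", "BEgysegar",
     "Kedv", "NEgysegar", "osszesen", "AFA"],
)

# Made-up sample values, only to show the access styles.
entry = EntryTuple("123456-789", "Widget", "db", 3, 1500, 10, 1350, 4050, 27)

by_name = entry.osszesen   # access by field name
by_index = entry[7]        # the same value by field number
as_dict = entry._asdict()  # mapping of field name to value
```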
-
-
class
pdf2xlsx.invoice.Invoice(no=0, orig_date='', pay_due='', total_sum=0, entries=None)¶ Parse, store and write invoice information to xlsx, such as the invoice number, invoice date, payment date and total sum. It also contains a list of Entry objects, which are also extracted from the raw string. The parsing of the raw string is controlled by three state variables: no_parsed, orig_date_parsed and pay_due_parsed. These represent the structure of the pdf.
Parameters: - no (int) – Invoice number, default:0
- orig_date (str) – Invoice date stored as a string YYYY.MM.DD
- pay_due (str) – Payment Date stored as string YYYY.MM.DD
- total_sum (int) – Total price of invoice
- entries (list) – List of Entry objects, one for each entry in the invoice
[TODO] Implement state pattern for parsing?
[TODO] Implement _to_money as a mixin class
-
parse_line(line)¶ Parse raw text that is supplied line by line. This is the structure of the pdf (the parentheses () indicate what should be collected):
<uninteresting rubbish> Számla sorszáma: (NNNNNNNN) ...
<uninteresting rubbish> Számla kelte: (YYYY.MM.DD|DD.MM.YYYY) ...
<uninteresting rubbish> FIZETÉSI HATÁRIDŐ:(YYYY.MM.DD|DD.MM.YYYY) (NNN[.NNN.NNN]) <uninteresting rubbish>
(Számla sorszáma: invoice number; Számla kelte: invoice date; FIZETÉSI HATÁRIDŐ: payment deadline.) This structure is parsed using the three state variables, and the results are stored in the class attributes.
Parameters: line (str) – The actual line to parse Returns: True when the parsing of the Invoice was started Return type: bool
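A minimal sketch of how three state variables could drive this parsing is given below. The labels and the state variable names come from the description above; the class name InvoiceHeaderSketch, the regular expressions and the choice to return no_parsed are assumptions, and normalising DD.MM.YYYY dates to YYYY.MM.DD is left out.

```python
import re

DATE = r"(\d{4}\.\d{2}\.\d{2}|\d{2}\.\d{2}\.\d{4})"
NO_RE = re.compile(r"Számla sorszáma:\s*(\d+)")
DATE_RE = re.compile(r"Számla kelte:\s*" + DATE)
DUE_RE = re.compile(
    r"FIZETÉSI HATÁRIDŐ:\s*" + DATE + r"\s+(\d{1,3}(?:\.\d{3})*)"
)


class InvoiceHeaderSketch:
    """Hypothetical stand-in for the header parsing part of Invoice."""

    def __init__(self):
        self.no = 0
        self.orig_date = ""
        self.pay_due = ""
        self.total_sum = 0
        # The three state variables follow the order of the pdf.
        self.no_parsed = False
        self.orig_date_parsed = False
        self.pay_due_parsed = False

    def parse_line(self, line):
        if not self.no_parsed:
            m = NO_RE.search(line)
            if m:
                self.no = int(m.group(1))
                self.no_parsed = True
        elif not self.orig_date_parsed:
            m = DATE_RE.search(line)
            if m:
                self.orig_date = m.group(1)
                self.orig_date_parsed = True
        elif not self.pay_due_parsed:
            m = DUE_RE.search(line)
            if m:
                self.pay_due = m.group(1)
                # Drop the '.' thousands separators before converting.
                self.total_sum = int(m.group(2).replace(".", ""))
                self.pay_due_parsed = True
        # Assumed meaning of "parsing was started": the number was seen.
        return self.no_parsed


invoice = InvoiceHeaderSketch()
flags = [
    invoice.parse_line(text)
    for text in (
        "noise",
        "Számla sorszáma: 12345678 x",
        "Számla kelte: 2017.03.01 y",
        "FIZETÉSI HATÁRIDŐ:2017.03.31 1.589.674",
        "noise",
    )
]
```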
-
xlsx_write(worksheet, row, col)¶ Write the invoice information to a template xlsx file.
Parameters: - worksheet (Worksheet) – Worksheet class to write info
- row (int) – Row number to start writing
- col (int) – Column number to start writing
Returns: the next position of cursor row,col
Return type: tuple of (int,int)
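Because every xlsx_write returns the next cursor position, calls can be chained by feeding the returned (row, col) into the next call. The sketch below illustrates that pattern only: FakeWorksheet is a hypothetical stand-in with an assumed write(row, col, value) method, not the real Worksheet class, and moving to the start of the next row is an assumed cursor policy.

```python
class FakeWorksheet:
    """Hypothetical stand-in that records written cells in a dict."""

    def __init__(self):
        self.cells = {}

    def write(self, row, col, value):
        self.cells[(row, col)] = value


def write_values(worksheet, row, col, values):
    """Write values left to right and return the next cursor position."""
    for offset, value in enumerate(values):
        worksheet.write(row, col + offset, value)
    # Assumed policy: continue at the same column on the next row.
    return row + 1, col


worksheet = FakeWorksheet()
row, col = 0, 0
for record in (["123456-789", "Widget", 3], ["654321-987", "Gadget", 1]):
    # Each call starts where the previous one finished.
    row, col = write_values(worksheet, row, col, record)
```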
-
pdf2xlsx.invoice.get_invo_type(pdf_line)¶ TODO: add title parsing to decide between invoice types
-
pdf2xlsx.invoice.invo_parser(pdf_file, logger)¶ Factory to generate the appropriate invoice type based on the title in the PDF