Making Sense of Multitable CSVs

Challenge - Multitable CSVs Have No Common Format

Some CSV and XLSX files contain multiple tables. These are especially hard to read because there is no universal format, and the files are typically built for human rather than machine consumption.

This reader assumes that any line with a single field marks the beginning of a new section, with that field as the title of the section. The next line is assumed to be the header for that new section.

This is a file format reader for beancount_reds_importers that converts:

---- examples.csv -----
downloaded on: blah blah
section1
date,        transactions, amount
2020-02-02,  3,            5.00
2020-02-02,  3,            5.00
section2
account_num, balance,      date
123123,      1000,         2020-12-31
23048,       2000,         2020-12-31
end_of_file
-----------------------

to this data structure:

self.alltables = {'section1': <petl table of section 1>,
                  'section2': <petl table of section 2>,
                 }

where each value is a petl table.
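
The section-splitting rule described above can be sketched with the standard library alone. This is a minimal illustration, not the reader's actual code. The function name split_multitable_csv is hypothetical, each table is a plain list of dicts rather than a petl table, and dropping single-field lines that are not followed by a header (such as "downloaded on: blah blah" and "end_of_file") is an assumption made so the output matches the structure shown above.

```python
import csv
import io

SAMPLE = """\
downloaded on: blah blah
section1
date,        transactions, amount
2020-02-02,  3,            5.00
2020-02-02,  3,            5.00
section2
account_num, balance,      date
123123,      1000,         2020-12-31
23048,       2000,         2020-12-31
end_of_file
"""


def split_multitable_csv(text):
    """Split multitable CSV text into {section_title: [row_dict, ...]}.

    Hypothetical helper: rows are plain dicts here, not petl tables.
    """
    tables = {}
    title = None   # title of the section currently being read
    header = None  # header of that section, once it has been seen
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        fields = [field.strip() for field in row]
        if not any(fields):
            continue  # skip blank lines
        if len(fields) == 1:
            # A single-field line starts a new section.
            title, header = fields[0], None
        elif title is None:
            continue  # ignore multi-field lines before the first title
        elif header is None:
            # The first multi-field line after a title is the header.
            # Assumption: a section is only recorded once it has a header.
            header = fields
            tables[title] = []
        else:
            tables[title].append(dict(zip(header, fields)))
    return tables


alltables = split_multitable_csv(SAMPLE)
for name, rows in alltables.items():
    print(name, len(rows))
```

All values stay as strings in this sketch; converting dates and amounts is left to whatever consumes the tables.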

Code link
