Fixed width file data parser and generator

A data parser for fixed width files. The codebase has three main parts:

* dataparser.py defines the main parser class
* generator.py defines the methods used to generate the example fixed width files
* testcases.py contains all the test cases; the test data are in the tests directory

Run the Code

Prerequisites:

* Python 3.6
* Git

Download the repo

# Download the repo
git clone https://github.com/chuanwuliu/data-parser.git

Parse a fixed width file

The main data parser function is defined in dataparser.py. To convert a fixed width input_file and save the result to output_file:
```
python dataparser.py input_file output_file
```
For example:
```
python dataparser.py tests/test_input1.txt tests/_temp_output2.csv
```
The default delimiter is a comma. You can customise the delimiter with the -d argument. For example, to use @ as the delimiter:
```
python dataparser.py tests/test_input1.txt tests/_temp_output2.csv -d @
```
Full usage:

 usage: dataparser.py [-h] [-d DELIMITER] [-s SPEC_FILE] input_file output_file
 
 positional arguments:
   input_file    Path to the input (fixed width) file
   output_file   Path to save the output
 
 optional arguments:
   -h, --help    show this help message and exit
   -d DELIMITER  Delimiter for parsing the file
   -s SPEC_FILE  Path to specification (json) file
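
To illustrate the conversion itself, here is a minimal sketch. It is not the code in dataparser.py. It assumes the column widths are already available as a list of integers (the layout of the JSON specification file is not described here), and the names `parse_line` and `convert` are hypothetical.

```python
import csv
import io


def parse_line(line, widths):
    """Slice one fixed width line into fields and strip the space padding."""
    fields = []
    start = 0
    for width in widths:
        # Only plain spaces are stripped; tabs and carriage returns are kept.
        fields.append(line[start:start + width].strip(" "))
        start += width
    return fields


def convert(lines, widths, delimiter=","):
    """Return delimited text for an iterable of fixed width lines."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    for line in lines:
        writer.writerow(parse_line(line.rstrip("\n"), widths))
    return buffer.getvalue()


# Assumed layout: three columns of widths 5, 3 and 3.
sample = ["abc  12   x\n", "de     7yz\n"]
print(convert(sample, widths=[5, 3, 3], delimiter="@"))
```

Using the csv module for output means a field that happens to contain the delimiter is quoted rather than silently breaking the row.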

Test Cases:

Run the test cases

python testcases.py

The following cases have been tested:

* Parse an input file with fully filled fields
* Parse an input file with left aligned fields and blank fields
* Parse an input file with right aligned fields and blank fields
* Parse an input file in which all fields are blank

In each case, the sample input is parsed and the result is compared with a manually prepared expected output.
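
The four cases can be sketched as unit tests like the ones below. This is an illustration only, not the contents of testcases.py: `split_fixed_width` is a hypothetical stand-in for the parser, and the column widths of 4 and 3 are assumed.

```python
import unittest


def split_fixed_width(line, widths):
    """Hypothetical stand-in for the parser: slice by width, strip spaces."""
    fields, start = [], 0
    for width in widths:
        fields.append(line[start:start + width].strip(" "))
        start += width
    return fields


class FixedWidthCases(unittest.TestCase):
    WIDTHS = [4, 3]  # assumed column widths

    def test_fully_filled(self):
        self.assertEqual(split_fixed_width("abcd123", self.WIDTHS), ["abcd", "123"])

    def test_left_aligned_with_blank(self):
        self.assertEqual(split_fixed_width("ab     ", self.WIDTHS), ["ab", ""])

    def test_right_aligned_with_blank(self):
        self.assertEqual(split_fixed_width("     12", self.WIDTHS), ["", "12"])

    def test_all_blank(self):
        self.assertEqual(split_fixed_width("       ", self.WIDTHS), ["", ""])


# Run the cases explicitly so the snippet also works when pasted into a session.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FixedWidthCases)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```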

Currently, fields in the fixed width file may contain only letters, digits and plain space characters. Fields containing other whitespace characters such as \t and \r have not been considered or tested.

A helper script is provided for generating example files:

python generator.py
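
For a sense of what generating a fixed width line involves, here is a minimal sketch. It does not reproduce generator.py: the names `make_line` and `random_field`, the column widths and the seed are all assumptions made for illustration.

```python
import random
import string


def make_line(values, widths, align="left"):
    """Pad each value with spaces to its column width and join the columns."""
    pad = str.ljust if align == "left" else str.rjust
    # Values longer than the column are truncated so the line length stays fixed.
    return "".join(pad(value[:width], width) for value, width in zip(values, widths))


def random_field(max_len, rng):
    """Return a random alphanumeric string of length 0..max_len (0 means blank)."""
    length = rng.randint(0, max_len)
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


rng = random.Random(0)  # fixed seed so the example is repeatable
widths = [5, 3, 8]      # assumed column widths
for _ in range(3):
    values = [random_field(width, rng) for width in widths]
    print(repr(make_line(values, widths, align="right")))
```

Every generated line has length `sum(widths)`, which is the property a fixed width parser relies on.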

Contact:

Charles Liu: dr.liuchuanwu@gmail.com