Dynamic XML and JSON parser that transforms input files to CSV

XML_JSON_Parse

Dynamic XML_JSON parser that transforms XML and JSON files to CSV using multithreading.

Prerequisites

The module is written in Python 3 and uses the xml, json, threading and queue standard-library modules together with the third-party pandas package.
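
To illustrate how the threading and queue modules can work together, here is a minimal sketch of a worker-queue pattern. It is not taken from main.py; the names run_workers, transform and num_workers are hypothetical and chosen only for this example.

```python
import queue
import threading


def run_workers(items, transform, num_workers=4):
    """Apply transform to each item on worker threads and collect the results."""
    tasks = queue.Queue()
    results = queue.Queue()
    for item in items:
        tasks.put(item)

    def worker():
        while True:
            try:
                item = tasks.get_nowait()
            except queue.Empty:
                return  # no work left for this thread
            results.put(transform(item))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected  # completion order, not necessarily input order


# Hypothetical usage: upper-case three strings on worker threads.
rows = run_workers(["a", "b", "c"], str.upper)
print(sorted(rows))
```

Results collected this way arrive in completion order, so a parser built along these lines would not necessarily write records in the same order as the input file.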

Usage / Start Process

For XML parsing

python3 main.py -i INPUT_FILE -o OUTPUT_FILE -e Resident

For JSON parsing

python3 main.py -i INPUT_FILE -o OUTPUT_FILE -e fruit

Command Line Arguments:

usage: main.py [-h] -i INPUT_FILE -o OUTPUT_FILE -e ELEMENT

Convert XML or JSON file to csv

optional arguments:
  • -h, --help: show this help message and exit
  • -i INPUT_FILE, --input_file INPUT_FILE: source XML or JSON file (mandatory)
  • -o OUTPUT_FILE, --output_file OUTPUT_FILE: destination CSV file (mandatory)
  • -e ELEMENT, --element ELEMENT: element to parse (mandatory)
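
The argument list above could be produced by an argparse setup along these lines. This is a sketch reconstructed from the usage text, not the actual contents of main.py; the function name build_parser is an assumption.

```python
import argparse


def build_parser():
    """Build a parser that accepts the three mandatory options listed above."""
    parser = argparse.ArgumentParser(
        prog="main.py", description="Convert XML or JSON file to csv"
    )
    parser.add_argument("-i", "--input_file", required=True,
                        help="Source XML or JSON file (mandatory)")
    parser.add_argument("-o", "--output_file", required=True,
                        help="Destination csv file (mandatory)")
    parser.add_argument("-e", "--element", required=True,
                        help="element to parse (mandatory)")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(
    ["-i", "resident.xml", "-o", "resident_output.csv", "-e", "Resident"]
)
print(args.input_file, args.output_file, args.element)
```

Marking each option with required=True makes argparse exit with an error message when any of the three is missing, which matches the "(mandatory)" notes in the help text.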

Example for XML parsing

Input XML file: resident.xml, with Resident as the element to parse

<State>
<Resident>
	<Name>Sample Name</Name>
	<PhoneNumber>1234567891</PhoneNumber>
	<EmailAddress>sample_name@example.com</EmailAddress>
	<Address>
		<StreetLine1>Street Line1</StreetLine1>
		<City>City Name</City>
		<StateCode>AE</StateCode>
		<PostalCode>12345</PostalCode>
	</Address>
</Resident>
<Resident>
	<Name>Sample Name1</Name>
	<PhoneNumber>1234567891</PhoneNumber>
	<EmailAddress>sample_name1@example.com</EmailAddress>
	<Address>
		<StreetLine1>Current Address</StreetLine1>
		<City>Los Angeles</City>
		<StateCode>CA</StateCode>
		<PostalCode>56666</PostalCode>
	</Address>
</Resident>
</State>

python3 main.py -i resident.xml -o resident_output.csv -e Resident

Output CSV file: resident_output.csv

Resident##Address##City,Resident##Address##PostalCode,Resident##Address##StateCode,Resident##Address##StreetLine1,Resident##EmailAddress,Resident##Name,Resident##PhoneNumber
Los Angeles,56666,CA,Current Address,sample_name1@example.com,Sample Name1,1234567891
City Name,12345,AE,Street Line1,sample_name@example.com,Sample Name,1234567891
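
The column names in the output above join the tags on the path to each leaf with "##". One way such names could be derived, shown here as a sketch with xml.etree.ElementTree on a shortened sample, is a recursive walk over the selected element. The function name flatten_element is hypothetical, and this is not necessarily how main.py does it.

```python
import xml.etree.ElementTree as ET

# Shortened version of resident.xml, kept small for illustration.
SAMPLE = """<State>
<Resident>
    <Name>Sample Name</Name>
    <Address>
        <City>City Name</City>
        <StateCode>AE</StateCode>
    </Address>
</Resident>
</State>"""


def flatten_element(elem, prefix=""):
    """Return {path: text} for every leaf under elem, joining tags with '##'."""
    path = f"{prefix}##{elem.tag}" if prefix else elem.tag
    children = list(elem)
    if not children:
        return {path: (elem.text or "").strip()}
    row = {}
    for child in children:
        # Note: repeated sibling tags would overwrite each other in this sketch.
        row.update(flatten_element(child, path))
    return row


root = ET.fromstring(SAMPLE)
rows = [flatten_element(resident) for resident in root.iter("Resident")]
print(rows[0])
```

Each dictionary in rows corresponds to one CSV row, with the flattened paths serving as column headers.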

Example for JSON parsing

Input JSON file: sample_2.json, with fruit as the element to parse

{
    "fruit":[
        {
            "name":"Apple",
            "binomial name":"Malus domestica",
            "major_producers":[
                "China", 
                "United States", 
                "Turkey"
            ],
            "nutrition":{
                "carbohydrates":"13.81g",
                "fat":"0.17g",
                "protein":"0.26g"
            }
        },
        {
            "name":"Orange",
            "binomial name":"Citrus x sinensis",
            "major_producers":[
                "Brazil", 
                "United States", 
                "India"
            ],
            "nutrition":{
                "carbohydrates":"11.75g",
                "fat":"0.12g",
                "protein":"0.94g"
            }
        },
        {
            "name":"Mango",
            "binomial name":"Mangifera indica",
            "major_producers":[
                "India", 
                "China", 
                "Thailand"
            ],
            "nutrition":{
                "carbohydrates":"15g",
                "fat":"0.38g",
                "protein":"0.82g"
            }
        }
    ]
}

python3 main.py -i sample_2.json -o sample_2_output.csv -e fruit

Output CSV file: sample_2_output.csv

fruit##binomial name,fruit##major_producers,fruit##name,fruit##nutrition##carbohydrates,fruit##nutrition##fat,fruit##nutrition##protein
Mangifera indica,India||China||Thailand,Mango,15g,0.38g,0.82g
Citrus x sinensis,Brazil||United States||India,Orange,11.75g,0.12g,0.94g
Malus domestica,China||United States||Turkey,Apple,13.81g,0.17g,0.26g
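
In the output above, nested object keys are joined with "##" and list items with "||", and the columns appear in alphabetical order. The sketch below reproduces that shape for a shortened sample. It uses the standard csv module instead of pandas to stay dependency-free, and the function names flatten_value and json_to_csv_text are hypothetical, not taken from main.py.

```python
import csv
import io
import json

# Shortened version of sample_2.json, kept small for illustration.
SAMPLE = """{"fruit": [
  {"name": "Apple",
   "major_producers": ["China", "United States", "Turkey"],
   "nutrition": {"fat": "0.17g", "protein": "0.26g"}}
]}"""


def flatten_value(value, path, row):
    """Flatten nested dicts into '##' paths and join list items with '||'."""
    if isinstance(value, dict):
        for key, child in value.items():
            flatten_value(child, f"{path}##{key}", row)
    elif isinstance(value, list):
        # Lists of scalars only; nested objects inside lists are not expanded here.
        row[path] = "||".join(str(item) for item in value)
    else:
        row[path] = value
    return row


def json_to_csv_text(text, element):
    """Convert the records under the given top-level key to CSV text."""
    records = json.loads(text)[element]
    rows = [flatten_value(record, element, {}) for record in records]
    columns = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = json_to_csv_text(SAMPLE, "fruit")
print(csv_text)
```

Sorting the union of keys across all rows keeps the header stable even when some records lack a field; csv.DictWriter fills the missing cells with empty strings.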