JSON to CSV Converter
This Python script extracts data from a JSON file and writes it into a CSV file. The script is designed to handle specific fields from the JSON file, such as UID, Title, Abstract, Author Names:Affiliation, Journal Name, Volume, Published Date, Eissn, and DOI. Features
Parses large JSON files using ijson for efficient memory usage.
Extracts specific fields from nested JSON objects.
Handles missing or malformed data gracefully.
Outputs a well-formatted CSV file with customizable headers.
Requirements
Python 3.x
Required Python packages:
ijson
csv (part of Python's standard library)
rich (for pretty-printing, optional)
You can install the required packages using pip:
bash
pip install ijson rich
Usage
Prepare your JSON file:
Place your JSON file in the same directory as the script, or provide the path to the file.
Run the script:
You can run the script using the command line:
bash
python your_script_name.py
The script will convert the JSON data in export.json to a CSV file named data.csv.
Customize input/output:
If you want to use a different JSON input file or output CSV file, modify the extract_data_json_to_csv function call in the __main__ section:
python
extract_data_json_to_csv("path/to/your/input.json", "path/to/your/output.csv")
Example
bash
python your_script_name.py
This command will convert the JSON data from export.json into a CSV file named data.csv. Code Overview
The script defines a single function, extract_data_json_to_csv, which performs the following:
Opens the JSON file and initializes a CSV writer.
Iterates over JSON objects using ijson.items for efficient parsing.
Extracts relevant fields, handling missing data with default values (N/A).
Writes the extracted data into a CSV file with predefined headers.
Handling of Specific Fields
UID: Unique identifier.
Title: Extracted from the static_data.summary.titles.title field.
Abstract: Extracted from the static_data.fullrecord_metadata.abstracts.abstract.abstract_text.p field.
Author Names
: Combines author names with their affiliations.
Journal Name: Extracted from the static_data.summary.publishers.publisher.names.name.
Volume: Extracted from the static_data.summary.pub_info.vol.
Published Date: Extracted from the static_data.summary.pub_info.sortdate.
Eissn: Extracted from the dynamic_data.cluster_related.identifiers.identifier.
DOI: Extracted from the dynamic_data.cluster_related.identifiers.identifier.
License
This project is licensed under the MIT License. See the LICENSE file for details. Contact
For any questions or issues, please open an issue on this repository.