A Python command-line application that merges JSON files, with support for a maximum output file size.
Developed and tested on Linux, but it should also work on Windows and macOS, since Python's os module takes care of platform compatibility.
- Python3
- Python standard library
- Click package - used to build a robust command-line application (install with: pip install click)
Clone and cd into this repo
Help: python src.py --help
- -o (or) --output-base -> OUTPUT BASE FILE NAME (Default name: output)
- -m (or) --max-file-size -> MAXIMUM_OUTPUT_FILE_SIZE (Default size: 1MB)
By default, the program writes to the Output directory, but this can be changed in the function. It was done this way because the only permissible inputs to the program are the four listed in the first functional requirement.
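For illustration, the options above could be declared with Click roughly as follows. This is only a sketch, not the actual contents of src.py: the positional arguments folder_path and input_base, the choice of bytes as the size unit, and the reading of 1MB as 1024 * 1024 bytes are all assumptions.

```python
import click

# Assumption: "1MB" is interpreted as 1024 * 1024 bytes.
DEFAULT_MAX_BYTES = 1024 * 1024


@click.command()
@click.argument("folder_path", type=click.Path(exists=True, file_okay=False))  # hypothetical name
@click.argument("input_base")  # hypothetical name
@click.option("-o", "--output-base", default="output", show_default=True,
              help="Output base file name.")
@click.option("-m", "--max-file-size", default=DEFAULT_MAX_BYTES, type=int,
              show_default=True, help="Maximum output file size in bytes.")
def main(folder_path, input_base, output_base, max_file_size):
    """Merge JSON files in FOLDER_PATH whose names start with INPUT_BASE."""
    # A real implementation would call the merge logic here; this sketch
    # only echoes the parsed values.
    click.echo(f"{folder_path} {input_base} {output_base} {max_file_size}")
```

Declaring the defaults on the options means a user only has to pass -o or -m when the default output name or size limit does not fit.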
- utils.py - Holds all the utility functions
- src.py - Source file that needs to be executed
Accept folder path, input file base name, output file base name, maximum file size
Read all files in the folder path and process the files in increasing order of the suffix
Output files are named using the output file base name as prefix and a counter as a suffix
Merged files are never greater than the maximum file size
Each output file contains a proper JSON array
Any kind of JSON arrays can be merged
Supports non-English characters too
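The requirement to process files in increasing order of the suffix can be sketched as below. The helper name sorted_input_files and the file naming pattern (input base name, then a number, then .json) are assumptions made for illustration; the repo's utils.py may do this differently.

```python
import os
import re


def sorted_input_files(folder_path, input_base):
    """Return paths of files named <input_base><number>.json, ordered by number."""
    pattern = re.compile(rf"^{re.escape(input_base)}(\d+)\.json$")
    matches = []
    for name in os.listdir(folder_path):
        m = pattern.match(name)
        if m:
            # Keep the suffix as an int so that 10 sorts after 2.
            matches.append((int(m.group(1)), name))
    matches.sort()
    return [os.path.join(folder_path, name) for _, name in matches]
```

Sorting on the integer value rather than the file name avoids the lexicographic trap where data10.json would come before data2.json.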
Algorithmic Complexity
O(n log n), where n is the number of input files - mainly because of the sorting step used to process the files in increasing order of their suffixes
The merged files are as large as possible, without exceeding the maximum file size
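One way to keep each merged file within the size limit while filling it as far as possible is a greedy pass over the items in order. The sketch below assumes the array elements have already been loaded into one Python list, that sizes are measured in UTF-8 bytes, and that output is written compactly; pack_items and serialize are hypothetical names, not functions from this repo.

```python
import json


def encode(value):
    """Compact UTF-8 JSON; ensure_ascii=False keeps non-English characters as-is."""
    return json.dumps(value, ensure_ascii=False, separators=(",", ":")).encode("utf-8")


def pack_items(items, max_bytes):
    """Greedily group items so each group's JSON array fits in max_bytes."""
    chunks, current, size = [], [], 2  # 2 bytes for the enclosing "[" and "]"
    for item in items:
        item_len = len(encode(item))
        if 2 + item_len > max_bytes:
            raise ValueError("a single item exceeds the maximum file size")
        extra = item_len + (1 if current else 0)  # 1 byte for the "," separator
        if size + extra > max_bytes:
            # The current group is full: close it and start a new one.
            chunks.append(current)
            current, size = [], 2
            extra = item_len
        current.append(item)
        size += extra
    if current:
        chunks.append(current)
    return chunks


def serialize(chunk):
    """Bytes that would be written to one output file for this group."""
    return encode(chunk)
```

Because items are kept in their original order, each file is filled until the next item no longer fits. This makes every file as large as the ordering allows, though it is not a general bin-packing optimum.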