Python tools

Primary language: HTML. License: GNU General Public License v3.0 (GPL-3.0)

Parsing Microsoft XML-based formats with parse_interpret_various_formats

Version: 0.0.1 Development kit

Primary feature: parse, read, and separate questions and answers from QA documents in Microsoft's docx (Word) and xlsx (Excel) formats.

The tools use xml.etree.ElementTree to effectively parse the internal XML files that form the underlying markup of many of Microsoft's formats.
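As a rough illustration of the idea, a docx file is a zip archive whose main text lives in word/document.xml, so the standard library is enough to pull out paragraph text. The sketch below is not this package's API: the names read_docx_paragraphs and make_sample_docx are hypothetical, and the sample archive is demo data built in memory, not a complete Word document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_docx_paragraphs(source):
    """Return the text of each non-empty paragraph in a .docx file.

    `source` may be a file path or a binary file-like object.
    (Hypothetical helper, not part of parse_interpret_various_formats.)
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text can be split across several <w:t> elements.
        text = "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs


def make_sample_docx():
    """Build a tiny in-memory archive holding only word/document.xml.

    Demo data only: a real .docx contains further parts
    (for example [Content_Types].xml) that are omitted here.
    """
    xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Q: What is a docx file?</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>A: A zip archive </w:t></w:r>"
        "<w:r><w:t>of XML parts.</w:t></w:r></w:p>"
        "<w:p/>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for line in read_docx_paragraphs(make_sample_docx()):
        print(line)
```

How the package itself decides which paragraphs are questions and which are answers is not shown here; the sketch stops at extracting plain paragraph text.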

Installation of parse_interpret_various_formats

For macOS/UNIX

With python3 and pip:

Bash

git clone https://github.com/rojter-tech/parse_interpret_various_formats.git ~/parse_interpret_various_formats

Create a new environment inside the repo and source it

python -m venv ~/parse_interpret_various_formats
source ~/parse_interpret_various_formats/bin/activate
python -m pip install -U pip

Install dependencies

cd ~/parse_interpret_various_formats
pip install -r requirements/dev.txt
python setup.py bdist_wheel
pip install -e .

For Windows

With python3 and pip:

PowerShell

git clone https://github.com/rojter-tech/parse_interpret_various_formats.git $HOME\parse_interpret_various_formats

Create a new environment inside the repo and activate it

python -m venv $HOME\parse_interpret_various_formats
cd $HOME\parse_interpret_various_formats
.\Scripts\Activate.ps1
python -m pip install -U pip

Install dependencies

pip install -r .\requirements\dev.txt
python setup.py bdist_wheel
pip install -e .