- Create a virtual environment and activate it:
    python -m venv venv
    venv\Scripts\activate
- Install the required packages:
    pip install -r requirements.txt
- Ensure `Input.xlsx` is in the `data/` directory.
- Run the extraction script:
    python scripts/extract_articles.py
  Extracted articles will be saved in the `data/extracted_articles/` directory.
- Run the text analysis script:
    python scripts/text_analysis.py
  The results will be saved in the `output/Output Data Structure.xlsx` file.
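
The two run steps above could also be wrapped in a small helper that checks for the input file first and then runs the scripts in order. The following is only a sketch: the function names `missing_files` and `run_pipeline` are hypothetical and not part of this repository, and it assumes the scripts are launched from the project root with the virtual environment active.

```python
from __future__ import annotations

import subprocess
import sys
from pathlib import Path

# Paths taken from the steps above, relative to the project root.
INPUT_FILE = Path("data/Input.xlsx")
SCRIPTS = [Path("scripts/extract_articles.py"), Path("scripts/text_analysis.py")]


def missing_files(root: Path) -> list[Path]:
    """Return the required files that do not exist under root (hypothetical helper)."""
    required = [INPUT_FILE, *SCRIPTS]
    return [p for p in required if not (root / p).is_file()]


def run_pipeline(root: Path) -> int:
    """Run both scripts in order and stop at the first failure (hypothetical helper)."""
    missing = missing_files(root)
    if missing:
        for path in missing:
            print(f"Missing: {path}", file=sys.stderr)
        return 1
    for script in SCRIPTS:
        # sys.executable is the interpreter running this helper, so the
        # scripts use the same (virtual) environment.
        result = subprocess.run([sys.executable, str(script)], cwd=root)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling `run_pipeline(Path.cwd())` from the project root returns 0 when both scripts succeed, 1 when a required file is missing, and otherwise the exit code of the script that failed.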