Task for the ATD course - DWS MSc Spring 2022
- Create a crawler to get articles and save them in csv files
- Add them to Postgres
- Connect Postgres with Python using a connector (psycopg3)
- Read credentials from config file
- Add create directory if not exists in extract_body.py
- Fix article_path.csv
- Add threshold to relevant docs in text_query.py
- Add columns to show in text_query.py
- Show lines that have keywords (grep maybe?)
- Add requirements.txt
- Fix in text_query.py:301 -> check if list empty
- Move links.csv to csv_files
- Add show vector in text_query.py output
- Use GIN index on docvec column
- Displaying docvec is troublesome in the terminal
- Add comments
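
A few of the smaller tasks above (reading credentials from a config file, creating a directory if it does not exist, showing only lines that contain keywords, and guarding against an empty list) could be sketched roughly as below. This is only an illustration, not the project's actual code: the function names, the INI layout, and the `postgresql` section name are assumptions made for the example.

```python
import configparser
from pathlib import Path


def read_credentials(config_path, section="postgresql"):
    """Return one INI section as a dict (section name is an assumption)."""
    parser = configparser.ConfigParser()
    # ConfigParser.read() silently skips missing files, so check its result.
    if not parser.read(config_path):
        raise FileNotFoundError(f"config file not found: {config_path}")
    if not parser.has_section(section):
        raise KeyError(f"section [{section}] missing in {config_path}")
    return dict(parser[section])


def ensure_dir(path):
    """Create the directory (and any parents) if it does not exist yet."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


def lines_with_keywords(text, keywords):
    """Grep-like filter: keep lines containing any keyword, ignoring case."""
    lowered = [keyword.lower() for keyword in keywords]
    return [
        line
        for line in text.splitlines()
        if any(keyword in line.lower() for keyword in lowered)
    ]


def first_or_none(results):
    """Return the first result, or None when the list is empty."""
    if not results:
        return None
    return results[0]
```

If the section uses libpq-style keys (host, dbname, user, password), the returned dict could then be passed to the connector as keyword arguments, for example `psycopg.connect(**creds)`.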