This repository is a tool to support fact-checking. The application is a Flask website that uses an Elasticsearch database to retrieve relevant claims provided by ClaimsKG. It retrieves the best-matching claims for each sentence and for the text as a whole. The relevant claims are retrieved by a custom Sentence-BERT model trained on the CLEF 2020 training data.
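To make the matching step more concrete, the sketch below ranks candidate claims against one sentence by cosine similarity between embedding vectors, a common way to compare Sentence-BERT embeddings. It is an illustration only: the toy three-dimensional vectors, the claim IDs, and the function names `cosine` and `rank_claims` are assumptions, and the repository's own scoring inside Elasticsearch may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_claims(query_vec, claim_vecs, top_k=3):
    """Return the top_k (claim_id, score) pairs, best match first."""
    scored = [(cosine(query_vec, vec), claim_id) for claim_id, vec in claim_vecs.items()]
    scored.sort(reverse=True)
    return [(claim_id, score) for score, claim_id in scored[:top_k]]


# Toy stand-ins for real embeddings; a Sentence-BERT model would produce
# vectors with several hundred dimensions instead.
claims = {
    "claim_a": [1.0, 0.0, 0.0],
    "claim_b": [0.7, 0.7, 0.0],
    "claim_c": [0.0, 0.0, 1.0],
}
sentence = [0.9, 0.1, 0.0]

# claim_a ranks first because its vector points in almost the same
# direction as the sentence vector; claim_c is orthogonal and drops out.
ranked = rank_claims(sentence, claims, top_k=2)
for claim_id, score in ranked:
    print(f"{claim_id}: {score:.3f}")
```

The same ranking can be applied once per sentence and once to a vector for the whole text, which mirrors the two levels of matching described above.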
- Download and run an instance of Elasticsearch.
- Clone and navigate into the repository.
- Create a virtual environment.
pip install -r requirements.txt
python -m nltk.downloader 'punkt'
python merge/download_model.py
python merge/elastic_search_create.py
- merge/elastic_search_create.py can be run with parameters for the Elasticsearch instance (--connection <elasticsearch>), the index name (--index_name <string>), and an input file (--source <string>) containing relevant claims (for reference, see merge/bin/data/vclaims.tsv).
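As a concrete example of the parameters above, the following sketch assembles a full invocation of merge/elastic_search_create.py from Python. The flag names and the sample TSV path come from this README; the helper `build_create_command`, the connection value `localhost:9200` (the Elasticsearch default port), and the index name `vclaims` are assumptions, and the exact value format that --connection expects should be checked against the script.

```python
import shlex
import sys


def build_create_command(connection, index_name, source):
    """Assemble the argument list for merge/elastic_search_create.py."""
    return [
        sys.executable, "merge/elastic_search_create.py",
        "--connection", connection,
        "--index_name", index_name,
        "--source", source,
    ]


cmd = build_create_command(
    connection="localhost:9200",          # assumed value format
    index_name="vclaims",                 # hypothetical index name
    source="merge/bin/data/vclaims.tsv",  # sample file named in this README
)

# Print the command as it would be typed in a shell.
print(shlex.join(cmd))

# To actually execute it from the repository root:
#   import subprocess
#   subprocess.run(cmd, check=True)
```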
The application takes as input either the main text of a news article (parsed by news-fetch) or plain text, supplied as a .txt file or entered directly.
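The three input kinds can be pictured as a small dispatch step that turns whatever the user supplies into one string of text. The sketch below is hypothetical and not taken from the repository: `resolve_input` is an invented name, the mode names mirror the --mode values of merge/run.py listed further down, and the URL branch is deliberately left unimplemented rather than guessing at the news-fetch API.

```python
from pathlib import Path


def resolve_input(mode, value):
    """Return the text to check for a given input mode (hypothetical helper)."""
    if mode == "text":
        # Direct input: the value already is the text.
        return value
    if mode == "file":
        # Plain-text file: read it as UTF-8.
        return Path(value).read_text(encoding="utf-8")
    if mode == "url":
        # The README states that articles are parsed with news-fetch;
        # that call is omitted here instead of being guessed.
        raise NotImplementedError("article extraction is delegated to news-fetch")
    raise ValueError(f"unknown mode: {mode!r}")


text = resolve_input("text", "Some statement to check.")
print(text)
```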
- Run Elasticsearch.
python merge/web.py
- Navigate to localhost:5000 in your browser.
- Run Elasticsearch.
python merge/run.py --mode <url, file, text> --input <input>
- Other parameters are --index_name <string>, --connection <elasticsearch>, and --output_path <path>.
- Output is saved in merge/output as a .json file containing the retrieved claims.
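Once a run has finished, the result files can be inspected from Python. The sketch below loads every .json file from the output directory and reports only its top-level shape, because the README does not document the JSON schema. The function names `load_results` and `describe` are hypothetical, and the default directory assumes the script was run from the repository root without --output_path.

```python
import json
from pathlib import Path


def load_results(output_dir="merge/output"):
    """Return {file name: parsed JSON} for every .json file in output_dir."""
    directory = Path(output_dir)
    if not directory.is_dir():
        # Nothing has been written yet, or a different --output_path was used.
        return {}
    results = {}
    for path in sorted(directory.glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            results[path.name] = json.load(handle)
    return results


def describe(data):
    """Summarise the top-level shape without assuming a schema."""
    if isinstance(data, dict):
        return "object with keys: " + ", ".join(sorted(map(str, data)))
    if isinstance(data, list):
        return f"list with {len(data)} items"
    return type(data).__name__


for name, data in load_results().items():
    print(name, "->", describe(data))
```

Printing the top-level keys first is a cautious way to discover how the retrieved claims are organised before writing any code that depends on specific field names.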