softgitron/LeviRank

Relevant paper and code repository links


Hi All,

We can post all the paper links here to help each other, and also explain why each paper/code is important in later stages.


Comparative argument retrieval paper links:

* Combined Approaches: {Mixed Methods}
    - Mixes approaches and identifies bottlenecks for improvement. {Paper: 3, Code: 4}
    - paper: http://ceur-ws.org/Vol-2936/paper-215.pdf, code: https://github.com/JanNiklasWeder/Touche-21-Task-2

* Pre-trained Language Models and NLU: {Important paper**}
    - Seq2Seq, Transformer, and BERT embeddings. {Paper: 3, Code: 4.5**}
    - paper: http://ceur-ws.org/Vol-2696/paper_210.pdf, code: https://github.com/skoltech-nlp/touche

* DistilBERT-based Argumentation Retrieval: {Our Baseline Paper**}
    - query expansion, argument extraction, scoring, and sorting + BERT {Paper: 4, Code: 1}
    - paper: http://ceur-ws.org/Vol-2936/paper-209.pdf, code: https://github.com/jmriebold/BoilerPy3, https://github.com/adbar/trafilatura, https://github.com/UKPLab/sentence-transformers
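The scoring-and-sorting step of this baseline can be sketched as cosine-similarity ranking over embeddings. This is only an illustration: the toy 3-d vectors below stand in for real DistilBERT/sentence-transformers embeddings, and the passage ids are made up.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_passages(query_vec, passages):
    """Sort (passage_id, vector) pairs by similarity to the query, best first."""
    scored = [(pid, cosine(query_vec, vec)) for pid, vec in passages]
    return sorted(scored, key=lambda x: x[1], reverse=True)

# Toy vectors standing in for real sentence embeddings.
query = [1.0, 0.0, 1.0]
passages = [
    ("p1", [0.9, 0.1, 0.8]),   # nearly parallel to the query
    ("p2", [0.0, 1.0, 0.0]),   # orthogonal to the query
    ("p3", [0.5, 0.5, 0.5]),
]
print(rank_passages(query, passages))  # p1 first, p2 last
```

In the actual pipeline the vectors would come from a sentence-transformers model, but the sort-by-score logic is the same.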

* Ensemble-based retrieval methods + BERT re-ranking methodology:
    - Comparative argument retrieval at Touché {Paper: 3, Code: 2}
    - paper: http://ceur-ws.org/Vol-2936/paper-211.pdf, code: https://github.com/Georgetown-IR-Lab/OpenNIR
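One standard way to combine several retrievers into an ensemble is reciprocal rank fusion (RRF); the sketch below uses it as an illustration of the first-stage fusion idea, not as the exact method of the paper above. The ranked lists and doc ids are made up.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists: each doc scores sum(1 / (k + rank)).
    Docs ranked highly by many retrievers float to the top."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Toy output of two retrievers (e.g. BM25 and a dense model).
bm25 = ["d1", "d2", "d3"]
dense = ["d2", "d3", "d1"]
print(reciprocal_rank_fusion([bm25, dense]))  # d2 wins: ranked high by both
```

The fused top-k would then be handed to a BERT cross-encoder for re-ranking.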

* Search Engine Paper: usage of the whoosh and TARGER libraries {can be used to add an interpretability component in the last stage}
    - whoosh library usage for index creation: https://whoosh.readthedocs.io/en/latest/quickstart.html {Paper: 2, Code: 1}
    - paper: http://www.dei.unipd.it/~ferro/CLEF-WN-Drafts/CLEF2020/paper_178.pdf, code: https://github.com/uhh-lt/targer
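To show the indexing idea whoosh provides, here is a minimal pure-Python inverted index with AND-semantics search; the docs and queries are made up, and real whoosh adds schemas, analyzers, and persistent storage on top of this.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase token to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def search(index, query):
    """AND-query: return the doc ids containing every query token."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    results = index.get(tokens[0], set()).copy()
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results

docs = {
    1: "python is better than java for scripting",
    2: "java is faster than python",
    3: "arguments for and against python",
}
index = build_index(docs)
print(sorted(search(index, "python java")))  # docs mentioning both terms
```

whoosh's quickstart follows the same shape: create an index from a schema, add documents with a writer, then query with a searcher.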
    
* Baseline paper approaches: http://ceur-ws.org/Vol-2696/paper_207.pdf
* Interesting Approach: http://ceur-ws.org/Vol-2696/paper_174.pdf
* [X] Code quality is poor: https://github.com/touche-webis-de/gienapp21

Concepts Used:

https://github.com/touche-webis-de/weder21multi-stage
BERT ranker: https://arxiv.org/pdf/1910.14424.pdf
docT5query: https://github.com/castorini/docTTTTTquery
Document Query Expansion paper: https://arxiv.org/pdf/1904.08375.pdf

Query Expansion Methods discussion:

* Synonym and antonym generation method: {paper: http://ceur-ws.org/Vol-2936/paper-209.pdf, code: https://github.com/touche-webis-de/weder21/blob/main/src/preprocessing/QueryExpansion.py}
  - We should use synonyms and antonyms of comparative terms, plus alternative noun replacements, to construct alternative queries in CNF/DNF form.
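The replacement idea can be sketched as follows. The tiny lexicon here is hand-written for illustration only; a real implementation would pull synonyms and antonyms from a resource such as WordNet.

```python
# Toy lexicon standing in for a real synonym/antonym source (e.g. WordNet).
SYNONYMS = {"better": ["superior", "preferable"]}
ANTONYMS = {"better": ["worse"]}

def expand_query(tokens):
    """Build alternative queries (a disjunction over AND-queries, i.e. DNF)
    by swapping each comparative term for its synonyms and antonyms."""
    alternatives = [list(tokens)]  # the original query is always kept
    for i, tok in enumerate(tokens):
        for repl in SYNONYMS.get(tok, []) + ANTONYMS.get(tok, []):
            alt = list(tokens)
            alt[i] = repl
            alternatives.append(alt)
    return alternatives

print(expand_query(["python", "better", "java"]))
```

Antonym variants are useful here because a passage arguing "java is worse than python" is still evidence for the original comparative question.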
  
* More relevant approach, Doc2Query expando-mono-duo design-pattern (H-1): {paper: https://arxiv.org/pdf/2101.05667.pdf, code and Touché blog link: https://www.tira.io/t/expanded-passages-for-the-touche-22-task-2-argument-retrieval-for-comparative-questions/578}
  - MS MARCO-trained docT5query expansions are already available for the Touché 22 documents; fine-tuning may be needed for better performance (future work).
  - The implementation should follow this SOTA design.
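The shape of the expando-mono-duo pipeline can be sketched as below. All three scorers are crude stand-ins (term overlap instead of neural models): in the real design, docT5query generates the "expando" queries, monoT5 does the pointwise re-ranking, and duoT5 does the pairwise re-ranking. The query and corpus are made up.

```python
def expando(doc):
    """'Expando' stand-in: append predicted queries to the document.
    In the real pipeline docT5query generates these queries."""
    predicted_queries = "which is better"  # placeholder expansion text
    return doc + " " + predicted_queries

def mono_score(query, doc):
    """'Mono' stand-in: pointwise relevance via query-term overlap.
    The real pipeline uses a monoT5 cross-encoder here."""
    q, d = set(query.split()), set(doc.split())
    return len(q & d) / len(q)

def duo_rerank(query, docs):
    """'Duo' stand-in: pairwise comparison over the top candidates.
    A doc wins a pair if its score is higher; duoT5 in the paper."""
    wins = {d: sum(mono_score(query, d) > mono_score(query, o)
                   for o in docs if o is not d) for d in docs}
    return sorted(docs, key=wins.get, reverse=True)

query = "is python better than java"
corpus = ["java tutorial", "python better than java benchmarks", "cooking recipes"]
expanded = [expando(d) for d in corpus]                       # stage 1: expando
top = sorted(expanded, key=lambda d: mono_score(query, d),
             reverse=True)[:2]                                # stage 2: mono
print(duo_rerank(query, top))                                 # stage 3: duo
```

The key design point is the funnel: document expansion improves first-stage recall cheaply, then the expensive pairwise stage only sees a short candidate list.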

Papers that can be relevant for evaluation metrics:

* QMUL-SDS's "step-by-step" binary classification approach: https://github.com/XiaZeng0223/sciverbinary
* SciFact task and its evaluation metrics: https://arxiv.org/pdf/2004.14974.pdf
* VERT5ERINI System Approach: https://aclanthology.org/2021.louhi-1.11.pdf 

Approaches are finalized, closing the issue!