Pinned Repositories
ChatGPT-RetrievalQA
A dataset for training and evaluating question answering retrieval models on ChatGPT responses, with the option of training and evaluating on real human responses.
featureextraction
Feature extraction scripts for the DISCOSUMO project, to be used for extractive summarization of discussion threads.
anonymization
This script anonymizes the comment fields ('omschrijving') in Dutch bank transactions, by removing all person names.
CL
PairwisePreferenceLearning
Performs pairwise preference ranking for a given trainfile and testfile with binary class labels (1 and not 1). The binary classification on the pairwise test data gives a prediction from each pair of test items: which of the two should be ranked higher. From these pairwise preferences a ranking can be created using a greedy sort algorithm.
PFM
Summarization module for the project Patient Forum Miner (with TNO)
RIVM
Shared work for the RIVM project (subproject 3: Patient empowerment in online support communities)
termprofiling
Implementation of the term scoring algorithm in Tomokiyo & Hurst (2003), based on Kullback-Leibler Divergence (kldiv). Given a foreground and background corpus, it returns the most descriptive terms of the foreground corpus in the form of a term cloud.
textgen
Software that generates text in the style of the oeuvre that is added as argument (in plain text). Every run provides unique output, stored with a randomized integer in the output filename.
Women-in-IR
Scripts created in the context of Women in IR
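The term scoring described for termprofiling can be illustrated with a minimal sketch. This is not the repository's code: it is a simplified unigram-only version that scores each foreground term by its pointwise Kullback-Leibler contribution, p(w) * log(p(w) / q(w)), where p is the foreground and q the background distribution. The function name `kl_term_scores`, the whitespace tokenization, and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter


def kl_term_scores(foreground_tokens, background_tokens):
    """Rank foreground terms by pointwise KL divergence against a background.

    Hypothetical helper: add-one smoothing is an assumption, chosen so that
    terms missing from the background corpus do not cause division by zero.
    """
    fg = Counter(foreground_tokens)
    bg = Counter(background_tokens)
    vocab = set(fg) | set(bg)

    # Smoothed totals: every vocabulary term gets one extra pseudo-count.
    fg_total = sum(fg.values()) + len(vocab)
    bg_total = sum(bg.values()) + len(vocab)

    scores = {}
    for term in fg:
        p = (fg[term] + 1) / fg_total  # foreground probability
        q = (bg[term] + 1) / bg_total  # background probability
        scores[term] = p * math.log(p / q)

    # Highest score first: the most descriptive terms of the foreground.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


foreground = "forum patient patient patient the the".split()
background = "the the the the of of forum".split()
ranked = kl_term_scores(foreground, background)
```

Terms that are frequent in the foreground but rare in the background end up at the top of `ranked`, while common function words receive low or negative scores. The scores could then be used as weights when drawing a term cloud.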
suzanv's Repositories
suzanv/PairwisePreferenceLearning
Performs pairwise preference ranking for a given trainfile and testfile with binary class labels (1 and not 1). The binary classification on the pairwise test data gives a prediction from each pair of test items: which of the two should be ranked higher. From these pairwise preferences a ranking can be created using a greedy sort algorithm.
suzanv/termprofiling
Implementation of the term scoring algorithm in Tomokiyo & Hurst (2003), based on Kullback-Leibler Divergence (kldiv). Given a foreground and background corpus, it returns the most descriptive terms of the foreground corpus in the form of a term cloud.
suzanv/Women-in-IR
Scripts created in the context of Women in IR
suzanv/anonymization
This script anonymizes the comment fields ('omschrijving') in Dutch bank transactions, by removing all person names.
suzanv/textgen
Software that generates text in the style of the oeuvre that is added as argument (in plain text). Every run provides unique output, stored with a randomized integer in the output filename.
suzanv/CL
suzanv/PFM
Summarization module for the project Patient Forum Miner (with TNO)
suzanv/RIVM
Shared work for the RIVM project (subproject 3: Patient empowerment in online support communities)
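The last step described for PairwisePreferenceLearning, turning pairwise preferences into a ranking with a greedy sort, can be sketched as follows. This is an illustration under stated assumptions, not the repository's implementation: `greedy_rank` and the `prefers` callback are hypothetical names, and `prefers(a, b)` stands in for the binary classifier's prediction that item `a` should be ranked above item `b`. Items are assumed to be unique and hashable.

```python
from itertools import permutations


def greedy_rank(items, prefers):
    """Order items greedily from pairwise preferences.

    prefers(a, b) -> True if a should rank above b (hypothetical callback
    standing in for a classifier's prediction on the pair).
    """
    remaining = list(items)

    # Net score per item: pairs won minus pairs lost among remaining items.
    score = {item: 0 for item in remaining}
    for a, b in permutations(remaining, 2):
        if prefers(a, b):
            score[a] += 1
            score[b] -= 1

    ranking = []
    while remaining:
        # Pick the item with the highest net score and place it next.
        best = max(remaining, key=lambda item: score[item])
        ranking.append(best)
        remaining.remove(best)

        # Withdraw the chosen item's contribution from the other scores.
        for other in remaining:
            if prefers(best, other):
                score[other] += 1
            if prefers(other, best):
                score[other] -= 1
    return ranking


# Toy preference function: larger numbers are preferred.
result = greedy_rank([3, 1, 2], lambda a, b: a > b)
```

With a consistent preference function the result is an ordinary sort. The greedy procedure matters when the predicted preferences contain cycles or contradictions, in which case it still returns a total order.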