marthuis's Stars
gruns/icecream
🍦 Never use print() to debug again.
guidance-ai/guidance
A guidance language for controlling large language models.
squat/drae
A RESTful API for el Diccionario de la Real Academia Española
PySimpleGUI/PySimpleGUI
Python GUIs for Humans! PySimpleGUI is the top-rated Python application development environment. Launched in 2018 and actively developed, maintained, and supported in 2024. Transforms tkinter, Qt, WxPython, and Remi into a simple, intuitive, and fun experience for both hobbyists and expert users.
allenai/scientific-claim-generation
Generating claims for zero-shot scientific fact checking
nlpfromscratch/nlp-llms-resources
Master list of curated resources on NLP and LLMs
hltfbk/E3C-Corpus
E3C is a freely available multilingual corpus (Italian, English, French, Spanish, and Basque) of semantically annotated clinical narratives to allow for the linguistic analysis, benchmarking, and training of information extraction systems. It consists of two types of annotations: (i) clinical entities: pathologies, symptoms, procedures, body parts, etc., according to standard clinical taxonomies (i.e. SNOMED-CT, ICD-10); and (ii) temporal information and factuality: events, time expressions, and temporal relations according to the THYME standard. The corpus is organised into three layers, with different purposes. Layer 1: about 25K tokens per language with full manual annotation of clinical entities, temporal information and factuality, for benchmarking and linguistic analysis. Layer 2: 50-100K tokens per language with semi-automatic annotations of clinical entities, to be used to train baseline systems. Layer 3: about 1M tokens per language of non-annotated medical documents to be exploited by semi-supervised approaches. Researchers can use the benchmark training and test splits of our corpus to develop and test their own models. We trained several deep learning based models and provide baselines using the benchmark. Both the corpus and the built models will be available through the ELG platform.
facebookresearch/DPR
Dense Passage Retriever is a set of tools and models for open-domain Q&A tasks.
getalp/wikIR
A python tool for building large scale Wikipedia-based Information Retrieval datasets
attardi/wikiextractor
A tool for extracting plain text from Wikipedia dumps
harpribot/awesome-information-retrieval
A curated list of awesome information retrieval resources
allenai/scifact
Data and models for the SciFact verification task.
wikifactcheck-english/wikifactcheck-english
Data and download script to accompany LREC2020 paper "Automated Fact-Checking of Claims from Wikipedia"
wikifactcheck-english/wfc-en-crawl
Repository housing web-crawling and scraping code for WikiFactCheck-en evidence
jiho283/FactKG
Official repository of FactKG
orai-nlp/SpanishGLUE
Spanish NLU Evaluation Framework / Marco de Evaluación para NLU en Castellano
allenai/scitail
Given a pair of sentences (premise, hypothesis), the decomposed graph entailment model (DGEM) predicts whether the premise can be used to infer the hypothesis.
kay-wong/Wiki-Reliability
Wiki-Reliability: A Large Scale Dataset for Content Reliability on Wikipedia
artetxem/esxnli
A bilingual NLI dataset annotated in Spanish and human translated into English
XInfoTabS/dataset
The official dataset for "XINFOTABS: Evaluating Multilingual Tabular Natural Language Inference", containing tables and corresponding hypotheses in 10 languages.
facebookresearch/XNLI
Evaluating Cross-lingual Sentence Representations
salesforce/factCC
Resources for the "Evaluating the Factual Consistency of Abstractive Text Summarization" paper
allenai/gooaq
Question-answers, collected from Google
allenai/OpenBookQA
Code for experiments on OpenBookQA from the EMNLP 2018 paper "Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering"
hadyelsahar/RE-NLG-Dataset
T-Rex: A Large Scale Alignment of Natural Language with Knowledge Base Triples
StonyBrookNLP/musique
Repository for MuSiQue: Multi-hop Questions via Single-hop Question Composition, TACL 2022
promptfoo/promptfoo
Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
google-research/true
Code and data accompanying the paper "TRUE: Re-evaluating Factual Consistency Evaluation".
cambridge-wtwt/emnlp2020-stander-news
explosion/spacy-llm
🦙 Integrating LLMs into structured NLP pipelines