VenkteshV's Stars
UKPLab/sentence-transformers
State-of-the-Art Text Embeddings
NVIDIA/FasterTransformer
Transformer-related optimization, including BERT, GPT
lxe/simple-llm-finetuner
Simple UI for LLM Model Finetuning
AkariAsai/self-rag
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
openai/automated-interpretability
yuchenlin/LLM-Blender
[ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the diverse strengths of multiple open-source LLMs. LLM-Blender cuts the weaknesses through ranking and integrates the strengths through fusing generation to enhance the capability of LLMs.
princeton-nlp/ALCE
[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627
Blealtan/RWKV-LM-LoRA
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it combines the best of RNN and transformer: great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.
allenai/OLMo-Eval
Evaluation suite for LLMs
microsoft/MSMARCO-Passage-Ranking
MS MARCO (Microsoft Machine Reading Comprehension) is a large-scale dataset focused on machine reading comprehension, question answering, and passage ranking. A variant of this task will be part of TREC and AFIRM 2019. For updates about TREC 2019, please follow this repository. Passage Reranking task: given a query q and the 1000 most relevant passages P = p1, p2, p3, ..., p1000, as retrieved by BM25, a successful system is expected to rerank the most relevant passage as high as possible. For this task, not all 1000 relevant items have a human-labeled relevant passage. Evaluation will be done using MRR.
openjournals/joss-papers
Accepted JOSS papers
shuyanzhou/docprompting
Data and code for "DocPrompting: Generating Code by Retrieving the Docs" @ICLR 2023
asmeurer/removestar
Tool to automatically replace 'import *' in Python files with explicit imports
AndrewMayneProjects/Whisper
Whisper applications
neulab/retomaton
PyTorch code for the RetoMaton paper: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022)
openjournals/jose-reviews
Reviews for the Journal of Open Source Education (JOSE)
rmit-ir/polyfuse
Fusion for TREC run files with popular fusion techniques
keunwoochoi/tokenizer-vs-tokenizer
cognitivefactory/interactive-clustering
Python package used to apply NLP interactive clustering methods.
mrjleo/ranking-utils
Miscellaneous utilities for ranking models
llm-efficiency-challenge/llm-efficiency-challenge.github.io
Website for NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day
VenkteshV/DEXTER
Data-Science-for-Linguists-2022/Child-Vocab-Development
This project was originally my term project for a computational linguistics course at Pitt. It was turned into a research project later and I am working on publishing the work.
kevin-rn/Efficient-Fact-checking
Master thesis on supporting fact extraction on large data collections for a more efficient fact-checking process in real-world applications.
avishekanand/al-folio-homepage
A beautiful, simple, clean, and responsive Jekyll theme for academics
rg089/SCANING
[CIKM'23] Code and data for our paper 'James ate 5 oranges = Steve bought 5 pencils': Structure-Aware Denoising for Paraphrasing Word Problems
tomhanlei/20cikm-behavior
VenkteshV/IR_project_19
JSALT2023-FSM/asr-scoring
Common scripts for scoring JSALT 2023 ASR systems
VenkteshV/RAG_NLI_demo