ymurong's Stars
Azure-Samples/azure-search-openai-demo
A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
cjhutto/vaderSentiment
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.
rasbt/LLMs-from-scratch
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
princeton-nlp/tree-of-thought-llm
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
google-research-datasets/swim-ir
SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 languages, generated using PaLM 2 and summarize-then-ask prompting.
d223302/A-Closer-Look-To-LLM-Evaluation
Code for the EMNLP 2023 Findings paper "A Closer Look into Using Large Language Models for Automatic Evaluation"
microsoft/generative-ai-for-beginners
18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
maastrichtlawtech/gdsr
🔗 A graph-augmented dense statute retriever. (EACL 2023)
UKPLab/sentence-transformers
Multilingual Sentence & Image Embeddings with BERT
maastrichtlawtech/lleqa
🤖 Long-form question answering in the legal domain. (AAAI 2024)
sebastian-hofstaetter/teaching
Open-Source Information Retrieval Courses @ TU Wien
sebastian-hofstaetter/neural-ir-explorer
Neural-IR-Explorer: A Content-Focused Tool to Explore Neural Re-Ranking Results
sebastian-hofstaetter/matchmaker
Training & evaluation library for text-based neural re-ranking and dense retrieval models built with PyTorch
Law-AI/ecir2023tutorial
This repository contains the relevant materials for the tutorial "Legal IR and NLP: the History, Challenges, and State-of-the-Art", held at ECIR 2023, 6th April, 2023.
abdoelsayed2016/Legal-Question-Answering-Review
facebookresearch/ELI5
Scripts and links to recreate the ELI5 dataset.
project-miracl/miracl
A large-scale multilingual dataset for Information Retrieval. Thorough human annotations across 18 diverse languages.
maastrichtlawtech/bsard
🔍 A statutory article retrieval dataset in French. (ACL 2022)
unicamp-dl/mMARCO
A multilingual version of the MS MARCO passage ranking dataset
nyu-dl/dl4ir-doc2query
Amsterdam-Internships/Automatic-Answering-of-City-Council-Questions
jingtaozhan/disentangled-retriever
An easy-to-use Python toolkit for flexibly adapting various neural ranking models to any target domain.
RUC-GSAI/YuLan-IR
YuLan-IR: Information Retrieval Boosted LMs
yuanpaner/Recommender_prj
Collaborative filtering, matrix factorization by SVD, NN
ZhangShiyue/EmailSum
The data and code for EmailSum
microsoft/causica
microsoft/responsible-ai-toolbox
Responsible AI Toolbox is a suite of tools providing model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly, and take better data-driven actions.
chdb-io/chdb
chDB is an in-process OLAP SQL Engine 🚀 powered by ClickHouse
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.