schen149

I process language.

Microsoft, Redmond

schen149's Stars

vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python 31.2k 253 5.5k 4.8k
qdrant/qdrant
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
Language: Rust 20.8k 126 1.3k 1.4k
google-deepmind/pysc2
StarCraft II Learning Environment
Language: Python 8k 347 2821.2k
mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
Language: Python 6.7k 65 83369
allenai/OLMo
Modeling, training, eval, and inference code for OLMo
Language: Python 4.8k 48 202492
srush/Tensor-Puzzles
Solve puzzles. Improve your pytorch.
Language: Jupyter Notebook 3.3k 13 20282
google/BIG-bench
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
Language: Python 2.9k 51 151592
castorini/pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
Language: Python 1.7k 18 549379
allenai/natural-instructions
Expanding natural instructions
Language: Python 962 21 161190
jxmorris12/vec2text
Utilities for decoding deep representations (like sentence embeddings) back to text
Language: Python 750 12 6285
EdinburghNLP/awesome-hallucination-detection
List of papers on hallucination detection in LLMs.
691 25 457
mlfoundations/task_vectors
Editing Models with Task Arithmetic
Language: Python 433 9 1836
luyug/GradCache
Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint
Language: Python 363 9 2924
swj0419/detect-pretrain-code
This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, Luke Zettlemoyer.
Language: Python 209 1 1623
GFNOrg/gfn-lm-tuning
Language: Jupyter Notebook 153 3 1020
chentong0/factoid-wiki
Dense X Retrieval: What Retrieval Granularity Should We Use?
136 9 109
chaitanyamalaviya/ExpertQA
[Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers
Language: Python 122 6 613
ludwigwinkler/JaxLightning
Running Jax in PyTorch Lightning
Language: Python 82 4 11
schen149/sub-sentence-encoder
The official code repo for "Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations".
Language: Python 75 1 30
danieldeutsch/repro
Repro is a library for easily running code from published papers via Docker.
Language: Python 40 1 106
ryokamoi/wice
This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023.
Language: Python 40 2 41
shadowkiller33/Contrast-Instruction
Language: Python 19 3 03
google-research-datasets/PropSegmEnt
PropSegmEnt is an annotated dataset for segmenting English text into propositions, and recognizing proposition-level entailment relations - whether a different, related document entails each proposition, contradicts it, or neither. It consists of clusters of closely related documents from the news and Wikipedia domains.
18 3 12
CogComp/MultiOpEd
MULTIOPED: A Corpus of Multi-Perspective News Editorials.
Language: Python 10 6 02
naimenz/inverse-scaling-eval-pipeline
Basic pipeline for running different sized GPT models and plotting the results
Language: Python 9 2 016
TRUMANCFY/MixGR
Language: Jupyter Notebook 5 1 00
JHU-CLSP/Cost-Effective-Experiment
Scripts and docs that help us run cost-effective experiments with OpenAI APIs
Language: Python 4 6 02
CogComp/transformer-lm-demo
A simple demo of transformer language models, mostly for our internal use: http://dickens.seas.upenn.edu:4001
Language: Python 3 1 12
schen149/PropSegmEnt
PropSegmEnt is an annotated dataset for segmenting English text into propositions, and recognizing proposition-level entailment relations - whether a different, related document entails each proposition, contradicts it, or neither. It consists of clusters of closely related documents from the news and Wikipedia domains.
2 0 00
stevenysw/causal_gfl
Language: R 1 1 00
