innerfirexy

San Diego

innerfirexy's Stars

karpathy/llm.c
LLM training in simple, raw C/CUDA
Language:Cuda24.9k 252 1412.8k
eugeneyan/open-llms
📋 A list of open LLMs available for commercial use.
11.4k 244 37765
RUCAIBox/LLMSurvey
The official GitHub page for the survey paper "A Survey of Large Language Models".
Language:Python10.7k 159 65833
statsmodels/statsmodels
Statsmodels: statistical modeling and econometrics in Python
Language:Python10.3k 287 5.5k3.2k
karpathy/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
Language:Python9.3k 85 38877
lancopku/pkuseg-python
The pkuseg toolkit for multi-domain Chinese word segmentation
Language:Python6.6k 208 167988
CLUEbenchmark/CLUEDatasetSearch
Search all Chinese NLP datasets, with commonly used English NLP datasets included
Language:Python4.2k 62 12614
baidu/lac
Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, word importance
Language:C++3.9k 105 248595
LnL7/nix-darwin
nix modules for darwin
Language:Nix3.4k 35 587472
ownthink/Jiagu
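Jiagu: a deep learning natural language processing tool covering knowledge graph relation extraction, Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, new word discovery, keyword extraction, text summarization, and text clustering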
Language:Python3.3k 87 71614
hankcs/pyhanlp
Chinese word segmentation
Language:Python3.1k 85 0810
google/BIG-bench
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
Language:Python2.9k 51 151595
EleutherAI/pythia
The hub for EleutherAI's work on interpretability and learning dynamics
Language:Jupyter Notebook2.3k 33 109176
thunlp/UltraChat
Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)
Language:Python2.3k 40 30117
HillZhang1999/llm-hallucination-survey
Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"
960 12 352
rowanz/grover
Code for Defending Against Neural Fake News, https://rowanzellers.com/grover/
Language:Python918 35 58222
stanfordnlp/pyvene
Stanford NLP Python Library for Understanding and Improving PyTorch Models via Interventions
Language:Python671 9 6169
mapull/chinese-dictionary
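Chinese Hanyu Pinyin dictionary: pinyin dictionary of Chinese characters, word dictionary, idiom dictionary, and a dictionary database of common and polyphonic characters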
514 6 12123
likenneth/honest_llama
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
Language:Python489 9 3838
jlko/semantic_uncertainty
Codebase for reproducing the experiments of the semantic uncertainty paper (short-phrase and sentence-length experiments).
Language:Python264 3 1025
FreedomIntelligence/Huatuo-26M
The Largest-scale Chinese Medical QA Dataset: with 26,000,000 question answer pairs.
230 9 1024
gmftbyGMFTBY/Copyisallyouneed
[ICLR 2023] Codebase for Copy-Generator model, including an implementation of kNN-LM
Language:Python184 4 1422
ttzHome/AnchiBERT
AnchiBERT: A Pre-Trained Model for Ancient Chinese Language Understanding and Generation (pre-trained model for Classical Chinese)
62 1 34
gpoesia/minbert-default-final-project
CS 224N Winter 2023 Default Final Project: Multitask BERT
Language:Python25 4 049
GongFuXiong/Chinese-Medical-Question-Answering-System
TensorFlow for Chinese Medical Question Answering (question-answer matching) by LSTM/CNN/LSTM_ATTENTION/IARNN-GATE
Language:Python23 3 29
dayihengliu/a2m_chineseNMT
Dataset for TALLIP2019 paper "Ancient-Modern Chinese Translation with a New Large Training Dataset"
224
viking-sudo-rm/rusty-dawg
Rust library for indexing and quickly searching large pretraining corpora
Language:Rust22 3 163
Andrea-de-Varda/surprisal-across-languages
Code to calculate surprisal values from multilingual XGLM models.
Language:Python4 1 01
bstee615/shared-hf-cache
Language:Shell4 2 00
tpimentelms/probability-of-a-word
Code to compute a word's probability using the fixes from "How to Compute the Probability of a Word"
Language:Python4 1 02
