innerfirexy's Stars
GrowingGit/GitHub-Chinese-Top-Charts
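:cn: GitHub Chinese Top Charts, with separate "Software | Resources" rankings for each language, to pinpoint good Chinese-language projects. Take what you need and learn efficiently.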
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
vpncn/vpncn.github.io
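2024 VPN recommendations for getting over the firewall, with pitfalls to avoid; stable and easy to use. Compares SSR subscription services ("airports"), Lantern, V2ray, Laowang VPN, self-hosted VPS proxies, and other circumvention tools. Latest VPN download recommendations, including for accessing ChatGPT.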
pwxcoo/chinese-xinhua
:orange_book: Chinese Xinhua Dictionary database. Includes xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.
statsmodels/statsmodels
Statsmodels: statistical modeling and econometrics in Python
crownpku/Awesome-Chinese-NLP
A curated list of resources for Chinese NLP (Chinese natural language processing)
esbatmop/MNBVC
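MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wikis, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.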
princeton-nlp/SimCSE
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
CLUEbenchmark/SuperCLUE
SuperCLUE: A comprehensive benchmark for general-purpose Chinese large models | A Benchmark for Foundation Models in Chinese
rguthrie3/DeepLearningForNLPInPytorch
An IPython Notebook tutorial on deep learning for natural language processing, including structure prediction.
NiuTrans/Classical-Modern
A very comprehensive parallel corpus of Classical Chinese (ancient texts) and Modern Chinese
Ruzim/NSFC-application-template-latex
LaTeX template for the main text of a National Natural Science Foundation of China grant application (General Program), unofficial
mathiasuy/Soluciones-Klenberg
Algorithm Design (Kleinberg Tardos 2005) - Solutions
inverse-scaling/prize
A prize for finding tasks that cause large language models to show inverse scaling
sgrvinod/a-PyTorch-Tutorial-to-Sequence-Labeling
Empower Sequence Labeling with Task-Aware Neural Language Model | a PyTorch Tutorial to Sequence Labeling
garywill/cc-visualize
For programmers as well as editors and compilers of Chinese electronic texts (in beta). Visualizes relations among Chinese characters and their variants (traditional, simplified, variant forms, and others). Inspector for unusual Chinese characters, potential Unihan/UCD homograph, punycode, and phishing attacks, and non-printable invisible characters. Combines data from OpenCC, Unicode, and other sources.
gentaiscool/code-switching-papers
A curated list of research papers and resources on code-switching
floriankark/cs224n-win2223
Code and written solutions of the assignments of the Stanford CS224N: Natural Language Processing with Deep Learning course from winter 2022/2023
baoguangsheng/fast-detect-gpt
Code base for "Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature".
Anwarvic/Dan-Jurafsky--Chris-Manning--NLP
My solutions to the Natural Language Processing course taught by Dan Jurafsky and Chris Manning in Winter 2012.
ari-holtzman/degen
Official Repository for "The Curious Case of Neural Text Degeneration"
matthen/dstc
Dialog State Tracking Challenge 2 & 3 Data
garywill/vert-cjk-web
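(in alpha) Vertical layout for web pages, written in columns starting from the right, as in ancient times. Lays out web pages in vertical lines, like the traditional CJK writing style of the East Asian cultural sphere. (Seeking Japanese, Korean, Mongolian, and Vietnamese translators.)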
danielgrittner/nanoGPT-LoRA
The simplest, fastest repository for training/finetuning medium-sized GPTs with LoRA support.
gpoesia/minbert-default-final-project
CS 224N Winter 2023 Default Final Project: Multitask BERT
daandouwe/neural-ngram
Neural ngram language model in PyTorch.
baaraban/pytorch_ner
LSTM based model for Named Entity Recognition Task using pytorch and GloVe embeddings
devSuchit/nlp-cky-PCFG
This repository contains an implementation of CKY parsing for English. (NLP)
almostimplemented/alg_design
Java implementations of algorithms and structures from "Algorithm Design" by Kleinberg and Tardos.
shawntai/NLP-HW2