ZXXSG

...

ZXXSG's Stars

SmirkCao/Lihang
Statistical Learning Methods (2nd edition) by Li Hang [notes, code, notebook, references, errata, lihang]
Language: Python 6k 1.6k
hktxt/Learn-Statistical-Learning-Method
Implementation of the algorithms in Statistical Learning Method, Second Edition.
Language: Jupyter Notebook 822 272
RUC-GSAI/Yulan-GARDEN
Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models"
Language: Python 538
LLMBook-zh/LLMBook-zh.github.io
《大语言模型》 (Large Language Models). Authors: Zhao Xin, Li Junyi, Zhou Kun, Tang Tianyi, Wen Jirong
2.2k 149
npubird/KnowledgeGraphCourse
Graduate course "Knowledge Graph" at Southeast University
3.9k 1.1k
AlibabaResearch/AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Language: C++ 1.4k 165
chaoswork/sft_datasets
A collection of open-source SFT datasets, updated continually
41533
RUCAIBox/LLMSurvey
The official GitHub page for the survey paper "A Survey of Large Language Models".
Language: Python 10.1k 797
chakki-works/seqeval
A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)
Language: Python 1.1k 129
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
Language: Python 7k 511
ChatGPTNextWeb/ChatGPT-Next-Web
A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini app in one click.
Language: TypeScript 75.4k 58.9k
MuQiuJun-AI/bert4pytorch
An ultra-lightweight PyTorch version of BERT, with extensive Chinese comments and an easily modifiable structure; continuously updated
Language: Python 40368
apple/corenet
CoreNet: A library for training deep neural networks
Language: Python 6.9k 540
airaria/TextBrewer
A PyTorch-based knowledge distillation toolkit for natural language processing
Language: Python 1.6k 240
datawhalechina/hugging-llm
HuggingLLM, Hugging Future.
Language: Jupyter Notebook 2.7k 351
AimeeLee77/keyword_extraction
Chinese text keyword extraction implemented in Python, using three methods: TF-IDF, TextRank, and Word2Vec word clustering.
Language: Python 1.1k 378
dongrixinyu/JioNLP
A Chinese NLP preprocessing and parsing toolkit: accurate, efficient, and easy to use. www.jionlp.com
Language: Python 3.3k 400
km1994/AwesomeNLP
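This project completes all tasks of NLP-Beginner, a set of introductory natural language processing exercises (text classification, information extraction, knowledge graphs, machine translation, question answering, text generation, Text-to-SQL, text error correction, text mining, knowledge distillation, model acceleration, OCR, TTS, Prompt, embedding, etc.). All code has been tested and runs correctly.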
16718
wmathor/nlp-tutorial
Natural Language Processing Tutorial for Deep Learning Researchers
Language: Jupyter Notebook 1.1k 356
crownpku/Awesome-Chinese-NLP
A curated list of resources for Chinese NLP
7.8k 1.7k
X-PLUG/ChatPLUG
A Chinese Open-Domain Dialogue System
Language: Python 31027
nl8590687/ASRT_SpeechRecognition
A Deep-Learning-Based Chinese Speech Recognition System
Language: Python 7.8k 1.9k
PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Language: Python 11k 1.8k
double22a/speech_dataset
The dataset of Speech Recognition
38372
InternLM/InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Language: Python 2.5k 153
ayaka14732/bert-tokenizer-cantonese
BERT Tokenizer with vocabulary tailored for Cantonese
Language: Python 183
esbatmop/MNBVC
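MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, aiming to match the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) text. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki pages, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.
3.4k 233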
thu-coai/COLDataset
The official repository of the paper: COLD: A Benchmark for Chinese Offensive Language Detection
20718
HIT-SCIR/ltp
Language Technology Platform
Language: Python 4.9k 1k
iflytek/HFL-Anthology
Collections of resources from Joint Laboratory of HIT and iFLYTEK Research (HFL)
Language: Markdown 35940
