RucLuke's Stars
microsoft/LLM2CLIP
LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever.
microsoft/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
baidubce/bce-qianfan-sdk
Provides best practices for LMOps, as well as elegant and convenient access to the features of the Qianfan MaaS Platform.
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
qdrant/qdrant
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
yanqiangmiffy/Chinese-LangChain
Chinese LangChain project | 小必应, Q.Talk, 强聊, QiangTalk
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
PaddlePaddle/PaddleMIX
Paddle Multimodal Integration and eXploration, supporting mainstream multimodal tasks, including end-to-end large-scale multimodal pretrained models and a diffusion model toolbox. Equipped with high performance and flexibility.
OFA-Sys/Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Hannibal046/Awesome-LLM
Awesome-LLM: a curated list of Large Language Models
Mooler0410/LLMsPracticalGuide
A curated list of practical guide resources for LLMs (LLMs Tree, Examples, Papers)
gaussic/tf-idf-keyword
Chinese keyword extraction based on TF-IDF over a specific corpus.
yanyiwu/simhash
Simhash computation for Chinese documents
mli/paper-reading
Paragraph-by-paragraph close reading of classic and recent deep learning papers
idealo/imagededup
😎 Finding duplicate images made easy!
PaddlePaddle/PaddleSlim
PaddleSlim is an open-source library for deep model compression and architecture search.
ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
harvardnlp/annotated-transformer
An annotated implementation of the Transformer paper.
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
HqWu-HITCS/Awesome-Chinese-LLM
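A collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are cheaper to train, including base models, vertical-domain fine-tuning and applications, datasets, tutorials, and more.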
PaddlePaddle/PaddleRec
Large-scale recommendation algorithm library, covering classic and recent recommender-system algorithms: LR, Wide&Deep, DSSM, TDM, MIND, Word2Vec, Bert4Rec, DeepWalk, SSR, AITM, DSIN, SIGN, IPREC, GRU4Rec, Youtube_dnn, NCF, GNN, FM, FFM, DeepFM, DCN, DIN, DIEN, DLRM, MMOE, PLE, ESMM, ESCMM, MAML, xDeepFM, DeepFEFM, NFM, AFM, RALM, DMR, GateNet, NAML, DIFM, Deep Crossing, PNN, BST, AutoInt, FGCNN, FLEN, Fibinet, ListWise, DeepRec, ENSFM, TiSAS, AutoFIS, and others. Also includes classic recommender-system datasets such as criteo and movielens.
PaddlePaddle/ERNIE
Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.
PaddlePaddle/PaddleNLP
👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, and more.
baidu/lac
Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, and word importance
DA-southampton/NLP_ability
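A summary of the knowledge a natural language processing (NLP) engineer needs to build up, including interview questions, fundamentals, and engineering skills, to strengthen core competitiveness.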
wzhe06/Ad-papers
Papers on Computational Advertising
metarank/lightgbm4j
Java LightGBM binding
shap/shap
A game theoretic approach to explain the output of any machine learning model.
PhantomGrapes/MGeo
MGeo: Multi-Modal Geographic Language Model Pre-Training