Pinned Repositories
bert_distill_lstm
Distilling Task-Specific Knowledge from BERT into Simple Neural Networks.
ChineseSemanticKB
ChineseSemanticKB, a Chinese semantic knowledge base: 12 categories of million-scale semantic dictionaries of common words for Chinese-language processing, including a 340k abstract-semantics lexicon, a 340k antonym lexicon, and a 430k synonym lexicon; supports sentence expansion, paraphrasing, event abstraction and generalization, and other applications.
dssm
A BiGRU-Attention DSSM implementation with tensorflow estimator.
FastDFA
DFA: an efficient string matching algorithm implementation.
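The DFA approach above is commonly implemented as a character trie walked over the input text (the usual sensitive-word-filter pattern); the repo's actual data structures are assumed, not shown. A minimal sketch:

```python
# Minimal DFA-style multi-pattern matcher: a nested-dict trie walked
# from every start position. No failure links (unlike Aho-Corasick),
# so this is a simplified sketch of the technique, not the repo's code.

def build_dfa(words):
    """Build a nested-dict trie; the key '\0' marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["\0"] = True
    return root

def find_matches(text, dfa):
    """Scan text, returning (start_index, matched_word) pairs."""
    hits = []
    for i in range(len(text)):
        node, j = dfa, i
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if "\0" in node:
                hits.append((i, text[i:j]))
    return hits

dfa = build_dfa(["he", "she", "his"])
print(find_matches("she said", dfa))  # → [(0, 'she'), (1, 'he')]
```

Because the trie is walked character by character, matching is linear in the text length per start position regardless of how many patterns the dictionary holds.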
keras_bert_classification
BERT classification and BERT-DSSM implementations with Keras.
KnowledgeDistillation
Knowledge distillation for text classification with PyTorch: Chinese text classification with BERT and XLNet teacher models and a biLSTM student model.
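The teacher-to-student setup above typically trains the student on a mix of hard labels and the teacher's temperature-softened logits (the soft-label objective of Hinton et al.). A plain-NumPy sketch of that loss, with the BERT/XLNet teacher and biLSTM student assumed rather than shown:

```python
# Sketch of the soft-label distillation loss: hard cross-entropy on the
# true label plus temperature-softened cross-entropy against the teacher.
# The T**2 factor keeps gradient magnitudes comparable across temperatures.
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distill_loss(student_logits, teacher_logits, true_label, T=2.0, alpha=0.5):
    """alpha * hard CE + (1 - alpha) * T^2 * soft CE against the teacher."""
    hard = -np.log(softmax(student_logits)[true_label])
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    soft = -(p_teacher * np.log(p_student)).sum()
    return alpha * hard + (1 - alpha) * T**2 * soft
```

In a real training loop the same expression would be written with `torch` tensors so gradients flow into the student; the hyperparameters `T` and `alpha` here are illustrative defaults, not the repo's settings.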
mvdssm
A Multi-View DSSM for Recommendation System with tensorflow estimator.
paddledssm
DSSM code with PaddlePaddle.
tf-ncf
A Neural Collaborative Filtering implementation with tensorflow estimator.
two_tower_recommendation_system
A two tower recommendation system implementation with tensorflow estimator, for CTR or Recall.
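The two-tower architecture above encodes user and item features with separate networks into a shared embedding space and scores relevance by their inner product, which lets item embeddings be precomputed for fast recall. A hedged NumPy sketch with random weights standing in for trained towers (dimensions are illustrative, not the repo's):

```python
# Two-tower scoring sketch: each tower is a tiny two-layer MLP mapping
# its feature vector into a shared 4-d space; relevance is the cosine
# similarity of the two L2-normalized embeddings.
import numpy as np

rng = np.random.default_rng(0)

def tower(x, w1, w2):
    """Features -> embedding via ReLU hidden layer, then L2-normalize."""
    h = np.maximum(x @ w1, 0.0)
    e = h @ w2
    return e / np.linalg.norm(e)

# Hypothetical sizes: 8 user features, 6 item features, 16 hidden units.
uw1, uw2 = rng.normal(size=(8, 16)), rng.normal(size=(16, 4))
iw1, iw2 = rng.normal(size=(6, 16)), rng.normal(size=(16, 4))

user_emb = tower(rng.normal(size=8), uw1, uw2)
item_emb = tower(rng.normal(size=6), iw1, iw2)
score = float(user_emb @ item_emb)  # cosine similarity in [-1, 1]
```

Because the towers share no weights and only meet at the dot product, all item embeddings can be indexed offline (e.g. in an ANN index) and serving reduces to one user-tower forward pass plus a nearest-neighbor lookup.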
cdj0311's Repositories
cdj0311/llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
cdj0311/self-rewarding-lm-pytorch
Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI
cdj0311/2021-GAIIC-phase3-idea
cdj0311/autogen
Enable Next-Gen Large Language Model Applications. Join our Discord: https://discord.gg/pAbnFJrkgZ
cdj0311/CodeR
cdj0311/devika
Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika aims to be a competitive open-source alternative to Devin by Cognition AI.
cdj0311/EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
cdj0311/fastllm
A pure-C++ cross-platform LLM acceleration library, callable from Python; ChatGLM-6B-class models reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile devices.
cdj0311/fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash attention v2.
cdj0311/free-programming-books
:books: Freely available programming books
cdj0311/grok-1
Grok open release
cdj0311/InfiniTransformer
Unofficial PyTorch/🤗Transformers(Gemma/Llama3) implementation of Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
cdj0311/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
cdj0311/LLaMA-Efficient-Tuning
Easy-to-use fine-tuning framework using PEFT (PT+SFT+RLHF with QLoRA)
cdj0311/LLM-Shearing
Preprint: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
cdj0311/long-context-attention
Sequence Parallel Attention for Long Context LLM Model Training and Inference
cdj0311/long_llama
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
cdj0311/LongLM
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
cdj0311/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
cdj0311/Megatron-LLM
Distributed trainer for LLMs.
cdj0311/mergekit
Tools for merging pretrained large language models.
cdj0311/MergeLM
Codebase for Merging Language Models
cdj0311/MNBVC
MNBVC (Massive Never-ending BT Vast Chinese corpus), an ultra-large-scale Chinese corpus benchmarked against the 40T of data used to train ChatGPT. MNBVC covers not only mainstream culture but also niche subcultures and even "Martian script" text. It includes plain-text Chinese data of every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki articles, classical poetry, lyrics, product descriptions, jokes, embarrassing-story posts, chat logs, and more.
cdj0311/OpenMoE
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
cdj0311/Pai-Megatron-Patch
The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
cdj0311/Qwen-7B
cdj0311/S-LoRA
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
cdj0311/self-rag
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
cdj0311/streaming-llm
Efficient Streaming Language Models with Attention Sinks
cdj0311/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.