Pinned Repositories
Langchain-Chatchat
Langchain-Chatchat (formerly langchain-ChatGLM): a local-knowledge-base RAG and Agent application built with Langchain and LLMs such as ChatGLM, Qwen, and Llama
cn-llm-codes
A collection of code for Chinese LLMs
CUDA-notes
debuged-Evolve-GCN
Debugged version of the evolve-gcn source code
ds-chat-bloom
ds-chat debugged to work with BLOOM
hzg0601.github.io
langchain-ChatGLM-annotation
Annotations for each module of the langchain-ChatGLM project, with some new features added and some bugs fixed
LLM-Notes
An overview of the large language model technology stack
speedai
Notes on large-scale AI acceleration methods
ZeQLoRA
ZeQLoRA: Efficient Finetuning of Quantized LLMs with ZeRO and LoRA
hzg0601's Repositories
hzg0601/LLM-Notes
An overview of the large language model technology stack
hzg0601/langchain-ChatGLM-annotation
Annotations for each module of the langchain-ChatGLM project, with some new features added and some bugs fixed
hzg0601/cn-llm-codes
A collection of code for Chinese LLMs
hzg0601/speedai
Notes on large-scale AI acceleration methods
hzg0601/CUDA-notes
hzg0601/finetune-embedding
hzg0601/mii-dev
Development fork of deepspeed-mii
hzg0601/qwen-trt-llm-notion
hzg0601/chat-gpt-langchain-fork
Fork of https://huggingface.co/spaces/JavaFXpert/Chat-GPT-LangChain
hzg0601/ds-chat-bloom
ds-chat debugged to work with BLOOM
hzg0601/hzg0601.github.io
hzg0601/lit-llama-cn-annotated
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4-bit quantization, LoRA and LLaMA-Adapter fine-tuning, and pre-training. Apache 2.0-licensed.
hzg0601/peft-cn-annotated
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
hzg0601/qlora-zero-cn
QLoRA combined with the ZeRO algorithm to speed up model training and reduce GPU memory requirements; Chinese annotations for each QLoRA module
hzg0601/ZeQLoRA
ZeQLoRA: Efficient Finetuning of Quantized LLMs with ZeRO and LoRA
hzg0601/api-for-open-llm-fork
OpenAI-style API for open large language models, so you can use open LLMs just like ChatGPT. Supports LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, ChatGLM, ChatGLM2, ChatGLM3, etc. A unified backend interface for open-source LLMs.
hzg0601/CHFS_clean
hzg0601/clash-for-linux-backup
Backup repository for clash for linux
hzg0601/cs-224w-cn
Chinese notes for the CS224W course
hzg0601/DeepKE-fork
An open toolkit for knowledge graph extraction and construction, published at EMNLP 2022 System Demonstrations.
hzg0601/Fast-Chatchat
Chinese annotations for FastChat are in the cn_annotation branch; see the README for new features
hzg0601/fastllm-fork
A pure C++ cross-platform LLM acceleration library with Python bindings; ChatGLM-6B-scale models can reach 10,000+ tokens/s on a single GPU. Supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile devices.
hzg0601/gnn-translation-books
hzg0601/GPTCache-dev
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
hzg0601/graphrag-fork
A modular graph-based Retrieval-Augmented Generation (RAG) system
hzg0601/inference-dev
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
hzg0601/LightChat
A lightweight tool for serving LLMs
hzg0601/Megatron-LM-fork
Ongoing research training transformer models at scale
hzg0601/TensorRT-LLM-dev
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
hzg0601/weaviate-abc
Getting started with Weaviate