Pinned Repositories
AdaLoRA
AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning (ICLR 2023).
Agent-Attention
Official repository of Agent Attention
DeepLearningExamples
State-of-the-art deep learning scripts, organized by model, that are easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
dynamic-sparse-flash-attention
ECSE6320-Advanced-computer-Systems-Fall2023
Projects for the RPI ECSE 6320/4320 Advanced Computer Systems course, Fall 2023.
ECSE6680-VLSI-Course-Project
flash-attention
Fast and memory-efficient exact attention
LLM
💻 Experiments with LLMs
LLMs-Acceleration
📕 Large Language Model Acceleration Paper List
Paper-Reading
📖 One Paper a Day, Unemployment Away!
Zhenyu001225's Repositories
Zhenyu001225/LLMs-Acceleration
📕 Large Language Model Acceleration Paper List
Zhenyu001225/AdaLoRA
AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning (ICLR 2023).
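As a quick illustration of the idea, here is a minimal sketch of AdaLoRA's adaptive rank budgeting using Hugging Face peft's AdaLoraConfig port (an assumption: this repo bundles its own loralib, so the config below is peft's interface; the model name and budget numbers are placeholders):

```python
# Illustrative only: AdaLoRA via peft's AdaLoraConfig, not this repo's
# bundled loralib. Model name and budget numbers are placeholders.
from transformers import AutoModelForSequenceClassification
from peft import AdaLoraConfig, get_peft_model

model = AutoModelForSequenceClassification.from_pretrained("roberta-base")

config = AdaLoraConfig(
    init_r=12,        # initial rank for every adapted matrix
    target_r=8,       # average rank after budget reallocation
    tinit=200,        # warmup steps before pruning starts
    tfinal=1000,      # final steps of fine-tuning at the target budget
    deltaT=10,        # prune/reallocate every deltaT steps
    total_step=5000,  # total training steps (required by recent peft)
    lora_alpha=32,
    target_modules=["query", "value"],
    task_type="SEQ_CLS",
)

model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the SVD-factorized adapters train
```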
Zhenyu001225/Agent-Attention
Official repository of Agent Attention
Zhenyu001225/DeepLearningExamples
State-of-the-art deep learning scripts, organized by model, that are easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
Zhenyu001225/dynamic-sparse-flash-attention
Zhenyu001225/ECSE6320-Advanced-computer-Systems-Fall2023
Projects for the RPI ECSE 6320/4320 Advanced Computer Systems course, Fall 2023.
Zhenyu001225/ECSE6680-VLSI-Course-Project
Zhenyu001225/flash-attention
Fast and memory-efficient exact attention
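For orientation, a minimal sketch of the library's flash_attn_func entry point (tensor sizes are placeholders; fp16/bf16 tensors on CUDA are the kernel's requirement):

```python
# Minimal sketch of the flash_attn kernel API; sizes are placeholders.
import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 2, 1024, 16, 64
q = torch.randn(batch, seqlen, nheads, headdim,
                dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# Exact softmax attention, computed tile-by-tile in SRAM so the full
# seqlen x seqlen score matrix is never materialized in HBM.
out = flash_attn_func(q, k, v, causal=True)  # (batch, seqlen, nheads, headdim)
```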
Zhenyu001225/LLM
💻 Experiments with LLMs
Zhenyu001225/Paper-Reading
📖 One Paper a Day, Unemployment Away!
Zhenyu001225/Learn2Perturb
Zhenyu001225/llama
Inference code for LLaMA models
Zhenyu001225/LLAMA-FACTORY
Zhenyu001225/llm-action
This project aims to share the technical principles behind large language models, along with hands-on practical experience.
Zhenyu001225/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
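A minimal loralib sketch, assuming the library's standard pattern of swapping nn.Linear for lora.Linear (layer sizes are placeholders):

```python
# Swap an nn.Linear for lora.Linear so the frozen weight W gets a
# trainable low-rank update B @ A. Sizes are placeholders.
import torch.nn as nn
import loralib as lora

class Block(nn.Module):
    def __init__(self):
        super().__init__()
        # r is the rank of the update; W stays frozen, only A and B train
        self.proj = lora.Linear(768, 768, r=16, lora_alpha=32)

    def forward(self, x):
        return self.proj(x)

model = Block()
lora.mark_only_lora_as_trainable(model)  # freeze everything but LoRA params
ckpt = lora.lora_state_dict(model)       # checkpoint only the adapters
```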
Zhenyu001225/on-the-adversarial-robustness-of-visual-transformer
Code for the paper "On the Adversarial Robustness of Visual Transformers"
Zhenyu001225/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
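For context, a minimal peft sketch assuming a causal-LM base model (the model name and target modules are placeholders):

```python
# Wrap a pretrained model with a LoraConfig so only the injected
# adapters train. Model name and target modules are placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
config = LoraConfig(r=8, lora_alpha=16,
                    target_modules=["q_proj", "v_proj"],
                    task_type="CAUSAL_LM")
model = get_peft_model(base, config)
model.print_trainable_parameters()  # a small fraction of the base model
```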
Zhenyu001225/streaming-llm
Efficient Streaming Language Models with Attention Sinks
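As a conceptual sketch of the paper's attention-sink cache policy, not the repo's actual API (the n_sink and window values are placeholders):

```python
# Conceptual sketch only: keep the first few "sink" tokens plus a
# sliding window of recent tokens, evicting everything in between,
# so the KV cache stays bounded during unbounded streaming.
def evict(kv_cache, n_sink=4, window=2044):
    """kv_cache: list of per-token KV entries, oldest first."""
    if len(kv_cache) <= n_sink + window:
        return kv_cache
    return kv_cache[:n_sink] + kv_cache[-window:]
```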
Zhenyu001225/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
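A minimal transformers pipeline sketch (the model name and prompt are placeholders):

```python
# One-liner inference via the pipeline API; model name is a placeholder.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
print(generator("Attention is", max_new_tokens=20)[0]["generated_text"])
```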
Zhenyu001225/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
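A minimal vLLM sketch (the model name and sampling settings are placeholders):

```python
# Offline batched inference with vLLM; the engine's PagedAttention
# KV-cache management is what delivers the throughput. Model name
# and sampling settings are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["The key idea of FlashAttention is"], params)
print(outputs[0].outputs[0].text)
```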
Zhenyu001225/Zhenyu001225.github.io