Pinned Repositories
aaa
abc
alpaca-lora
Instruct-tune LLaMA on consumer hardware
d2l-zh
Dive into Deep Learning (《动手学深度学习》): written for Chinese readers, runnable, and open for discussion. The Chinese and English editions are used for teaching at more than 400 universities in over 60 countries.
gitdemo
1
lm-evaluation-harness
A framework for few-shot evaluation of language models.
mergekit
Tools for merging pretrained large language models.
MergeLM
Codebase for Merging Language Models
ModuleFormer
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language Models (MoLM) ranging in scale from 4 billion to 8 billion parameters.
pr
sasgkhgw's Repositories
sasgkhgw/aaa
sasgkhgw/abc
sasgkhgw/alpaca-lora
Instruct-tune LLaMA on consumer hardware
sasgkhgw/d2l-zh
Dive into Deep Learning (《动手学深度学习》): written for Chinese readers, runnable, and open for discussion. The Chinese and English editions are used for teaching at more than 400 universities in over 60 countries.
sasgkhgw/gitdemo
1
sasgkhgw/lm-evaluation-harness
A framework for few-shot evaluation of language models.
sasgkhgw/mergekit
Tools for merging pretrained large language models.
sasgkhgw/MergeLM
Codebase for Merging Language Models
sasgkhgw/ModuleFormer
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language Models (MoLM) ranging in scale from 4 billion to 8 billion parameters.
sasgkhgw/pr
sasgkhgw/Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
sasgkhgw/sparsegpt
Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".
sasgkhgw/stanford_alpaca
Code and documentation to train Stanford's Alpaca models and generate the data.
sasgkhgw/TransportationNetworks
Transportation Networks for Research
sasgkhgw/wanda
A simple and effective LLM pruning approach.