FasterDecoding/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Jupyter Notebook · Apache-2.0
Issues
- PPL compute (#106, opened by yuyangxie96, 2 comments; see the perplexity sketch after this list)
- Token-wise the same generalization? (#99, opened by Ageliss, 2 comments)
- Some questions about sampling strategy (#63, opened by qianxiao1111, 2 comments)
- Using Medusa with Whisper (#100, opened by AvivSham, 0 comments)
- Containerization with a Dockerfile to set up Medusa (#104, opened by gangooteli, 0 comments)
- Medusa training loss (#95, opened by TomYang-TZ, 0 comments; see the loss sketch after this list)
- [Bug] Fix preprocess function (#101, opened by xiezipeng-ML, 1 comment)
- ImportError: cannot import name 'is_flash_attn_available' from 'transformers.utils' (#98, opened by imneov, 3 comments)
- Is there no way to run inference without training? (#77, opened by MoOo2mini, 1 comment)
- Is there a bug in gen_model_answer_baseline.py? (#96, opened by qspang, 1 comment)
- Training Medusa stage 2 (#94, opened by smartliuhw, 0 comments)
- mistral.json (#93, opened by Git-L1, 0 comments)
- Training Medusa heads (#70, opened by mmilunovic-mdcs, 2 comments)
- Can it support ChatGLM? (#91, opened by PeterXiaTian, 0 comments)
- HYDRA support? (#90, opened by arunpatala, 0 comments)
- Misleading LLM name: MEDUSA (#89, opened by Pittconnect, 0 comments)
- About Medusa mask details (#88, opened by dhcode-cpp, 1 comment; see the tree-mask sketch after this list)
- Release medusa-llm v1.0 (#84, opened by zhyncs, 0 comments)
- [Dynamic Batching] Concerns about features that may not be supported when using Medusa (#82, opened by Ageliss, 0 comments)
- Encountered a CUDA error when setting the Medusa heads (#81, opened by 1649759610, 2 comments)
- DeepSpeed support (#78, opened by jiangix-paper, 1 comment)
- CUBLAS_STATUS_EXECUTION_FAILED when training Medusa heads with the base model set to Llama 2 7B (#45, opened by void-main, 2 comments)
- Medusa-1 and Medusa-2 speedup (#73, opened by LotuSrc, 3 comments)
- About changing the LLM from LLaMA to Llama 2 (#68, opened by dydrkfl06, 2 comments)
- Sparse candidate generation confusion (#64, opened by zankner, 1 comment)
- [New feature] More sampling schemes (#39, opened by Jokoe66, 1 comment)
- Question about heads warmup (#74, opened by eloooooon, 11 comments)
- vLLM support (#41, opened by MichaelJayW, 5 comments)
- Clarifications on models and batch size (#66, opened by RonanKMcGovern, 1 comment)
- Can I apply AWQ quantization? (#65, opened by RonanKMcGovern, 3 comments)
- Results for different configs (#62, opened by zankner, 0 comments)
- FasterTransformer support (#57, opened by niyunsheng, 1 comment)
- [Feature request] Qwen model support (#52, opened by JianbangZ, 3 comments)
- How to measure latency of Medusa vs. the baseline (#49, opened by YixinSong-e, 2 comments; see the benchmark sketch after this list)
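Notes on a few recurring technical questions

For #106 (PPL compute): a minimal sketch of perplexity computation with Hugging Face Transformers, assuming a standard causal LM. The model name below is illustrative (Medusa's released heads were trained on Vicuna bases), not something prescribed by this repo.

```python
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative base model choice, not taken from the repo's scripts.
model_name = "lmsys/vicuna-7b-v1.3"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

def perplexity(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        # Passing labels=input_ids makes the model return the mean
        # cross-entropy over the shifted next-token targets.
        loss = model(ids, labels=ids).loss
    return math.exp(loss.item())

print(perplexity("Medusa accelerates decoding with extra heads."))
```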
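For #95 (Medusa training loss): the Medusa paper trains each extra head with cross-entropy against a further-shifted target and sums the heads with a decaying weight. A minimal sketch, assuming head i (0-indexed) predicts the token i+2 positions ahead of its input position, since the base LM head already covers the next token; the decay value is illustrative.

```python
import torch
import torch.nn.functional as F

def medusa_loss(head_logits, labels, decay=0.8):
    """head_logits: one [batch, seq, vocab] tensor per Medusa head (0-indexed).
    labels: [batch, seq] token ids, with -100 marking positions to ignore."""
    total = torch.zeros((), device=labels.device)
    for i, logits in enumerate(head_logits):
        shift = i + 2  # assumption: head i targets the token i+2 positions ahead
        ce = F.cross_entropy(
            logits[:, :-shift, :].reshape(-1, logits.size(-1)).float(),
            labels[:, shift:].reshape(-1),
            ignore_index=-100,  # skip padding / prompt tokens
        )
        total = total + (decay ** i) * ce  # later (harder) heads weighted down
    return total
```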
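For #88 (Medusa mask details): Medusa scores many candidate continuations in a single forward pass using tree attention, where each candidate token may attend only to the tokens on its own root-to-node path (the shared prefix attends causally as usual). A minimal sketch of building such a mask; the tuple-based path encoding is an illustrative assumption, not the repo's exact buffer layout.

```python
import torch

def build_tree_mask(paths):
    """paths: each candidate-tree node encoded as a tuple of branch choices
    from the root, e.g. (0,), (1,), (0, 0), (0, 1). Returns an [n, n] boolean
    mask where mask[i, j] is True iff node j is node i or one of its ancestors."""
    nodes = sorted(set(paths), key=lambda p: (len(p), p))
    index = {p: i for i, p in enumerate(nodes)}
    mask = torch.zeros(len(nodes), len(nodes), dtype=torch.bool)
    for p in nodes:
        for k in range(1, len(p) + 1):
            mask[index[p], index[p[:k]]] = True  # every prefix of p, including p
    return mask

# Depth-2 example: two candidates at step 1, two children under the first.
print(build_tree_mask([(0,), (1,), (0, 0), (0, 1)]).int())
```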
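For #49 (latency vs. the baseline): a minimal wall-clock tokens-per-second harness. `generate_fn` is a hypothetical callable returning generated token ids; you would run the harness once wrapped around the baseline model's generate call and once around the Medusa-enabled one.

```python
import time

import torch

def tokens_per_second(generate_fn, prompt, n_runs=5):
    generate_fn(prompt)  # warmup run to exclude one-time allocation cost
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    total_tokens, total_time = 0, 0.0
    for _ in range(n_runs):
        start = time.perf_counter()
        out = generate_fn(prompt)
        if torch.cuda.is_available():
            torch.cuda.synchronize()  # flush async CUDA kernels before stopping the clock
        total_time += time.perf_counter() - start
        total_tokens += len(out)
    return total_tokens / total_time

# The speedup ratio is then simply
# tokens_per_second(medusa_generate, p) / tokens_per_second(baseline_generate, p),
# where both generate functions are placeholders for your own wrappers.
```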