FasterDecoding/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Jupyter Notebook · Apache-2.0
Issues
- 0
- 0
Question about the Tree Attention Mechanism
#127 opened by chansonzhang - 0
About Code compatibility
#126 opened by kimjoohyungsd - 0
Ask for data recipe to reproduce Medusa-2
#125 opened by Achazwl - 0
- 2
ImportError: cannot import name 'is_flash_attn_available' from 'transformers.utils'
#98 opened by imneov - 0
- 2
- 0
About the Tree Sparsity
#122 opened by PineTreeWss - 2
Training code is not working
#118 opened by ksajan - 1
[Retraining] Use Liger Kernel to avoid multi-head logits materialization and scale the context length by N times
#119 opened by ByronHsu - 6
Training Medusa heads
#70 opened by mmilunovic-mdcs - 0
Instruct data format
#117 opened by orhan6116 - 0
Are Medusa Heads computed in parallel or serially?
#116 opened by userljz - 0
updated medusa models in huggingface?
#114 opened by hustxiayang - 0
[ISSUE] The Pull Request at https://github.com/FasterDecoding/Medusa/pull/97 from Narsil/medusa2 needs to be rolled back.
#112 opened by super-ahn - 0
do you support Amd gpu -- rocm ??
#111 opened by amd-maheshs3 - 2
- 5
Using Medusa with Whisper
#100 opened by AvivSham - 0
Does Medusa support beam search decoding strategy?
#108 opened by xs229 - 0
The implementation of stage 2 with axolotl
#107 opened by boxiaowave - 0
PPL compute
#106 opened by yuyangxie96 - 2
Token-wise the same generalization?
#99 opened by Ageliss - 0
Containerization with Dockerfile to setup medusa
#104 opened by gangooteli - 0
- 7
- 5
Medusa Training Loss
#95 opened by TomYang-TZ - 0
[bug] fix preprocess function
#101 opened by xiezipeng-ML - 3
Is there no way to inference without training?
#77 opened by MoOo2mini - 1
Is there a bug in gen_model_answer_baseline.py?
#96 opened by qspang - 1
train medusa stage-2
#94 opened by smartliuhw - 0
mistral.json
#93 opened by Git-L1 - 0
- 2
- 0
Can't it support chatgllm?
#91 opened by PeterXiaTian - 0
HYDRA support?
#90 opened by arunpatala - 0
Misleading Name LLM Name MEDUSA
#89 opened by Pittconnect - 0
about Medusa mask details
#88 opened by dhcode-cpp - 1
release medusa-llm v1.0
#84 opened by zhyncs - 0
[Dynamic Batching] Concerns about whether features are not supported using Medusa
#82 opened by Ageliss - 0
Encounter a CUDA error when setting Medusa head
#81 opened by 1649759610 - 2
- 0
deepspeed support
#78 opened by jiangix-paper - 1
- 2
Medusa 1 and 2 speed up
#73 opened by LotuSrc - 3
- 2
About changing LLM from LLAMA to LLAMA-2
#68 opened by dydrkfl06 - 2
- 1
Question about Heads warmup
#74 opened by eloooooon - 5
Clarifications on Models + Batch Size
#66 opened by RonanKMcGovern