Does this work for other models, such as GPT-2?
anhdang000 opened this issue · 1 comment
anhdang000 commented
Does this work for other models, such as GPT-2?
Thanks
Eric-mingjie commented
We haven't tried smaller models like GPT-2. However, the method, and in particular our pruning metric, should be general to Transformer-based LLMs.