Pinned Repositories
destiny
NoTrainNoGain
Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023)
cramming
Cramming the training of a (BERT-type) language model into limited compute.
DeepLearning-500-questions
Deep Learning 500 Questions: a question-and-answer treatment of commonly asked topics in probability, linear algebra, machine learning, deep learning, computer vision, and other areas, written to help the author and interested readers. The book spans 18 chapters and over 500,000 characters. Given the author's limited expertise, readers are kindly asked to point out any shortcomings. To be continued… For collaboration inquiries, contact scutjy2015@163.com. All rights reserved; infringement will be pursued. Tan 2018.06
elc-bert
evaluation-pipeline-2024
The evaluation pipeline for the 2024 BabyLM Challenge.
Probing
Probing BERT-like models
shiwenqin.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
shiwenqin's Repositories
shiwenqin/cramming
Cramming the training of a (BERT-type) language model into limited compute.
shiwenqin/DeepLearning-500-questions
Deep Learning 500 Questions: a question-and-answer treatment of commonly asked topics in probability, linear algebra, machine learning, deep learning, computer vision, and other areas, written to help the author and interested readers. The book spans 18 chapters and over 500,000 characters. Given the author's limited expertise, readers are kindly asked to point out any shortcomings. To be continued… For collaboration inquiries, contact scutjy2015@163.com. All rights reserved; infringement will be pursued. Tan 2018.06
shiwenqin/elc-bert
shiwenqin/evaluation-pipeline-2024
The evaluation pipeline for the 2024 BabyLM Challenge.
shiwenqin/Probing
Probing BERT-like models
shiwenqin/shiwenqin.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes