hpi-xnor/BNext

Help with model size and test speed


Sorry to disturb you again. When I use the provided script "run_distributed_on_disk_a6k5_AdamW_Curicullum_Large_assistant_teacher_num_3_aa.sh" for training, the saved checkpoint is 3715.30 MB, while the pretrained Bnext_large model is 1246.96 MB. Am I doing something wrong? Can you help me? Furthermore, the table in the paper says that BNext-L has 106.1M parameters. How do these numbers fit together?

There is one other problem: how can I test the quantized model's speed on the CPU? Can you give me some advice?
Thank you so much!

Dear Shan Yang,

I will try to answer the second question. You cannot easily test the speed with any existing open-source toolkit, but we are working on it. We plan to support both CPU and GPU hardware, so please stay tuned.
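In the meantime, a plain PyTorch timing loop can at least give a baseline. Keep in mind that this measures the float32 emulation of the binary layers that the released code runs, not real 1-bit kernels, so it will not reflect the speedup a dedicated BNN runtime would achieve. The benchmark_cpu helper below is a minimal sketch, not part of the BNext repository:

import time
import torch

def benchmark_cpu(model, input_shape=(1, 3, 224, 224), warmup=10, iters=50):
    """Return the average forward-pass latency in milliseconds on CPU."""
    model.eval()
    x = torch.randn(input_shape)
    with torch.no_grad():
        for _ in range(warmup):  # warm up allocator and caches
            model(x)
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        return (time.perf_counter() - start) / iters * 1000.0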

Hi Shan Yang,

For your first question: please check this code in the training script:

if not args.multiprocessing_distributed or args.local_rank == 0:
    save_checkpoint({
        'epoch': epoch,
        'train_loss': training_loss,
        'train_top1': training_top1,
        'train_top5': training_top5,
        'test_loss': testing_loss,
        'test_top1': testing_top1,
        'test_top5': testing_top5,
        'state_dict': model_student.state_dict(),
        'best_top1_acc': best_top1_acc,
        'optimizer': optimizer.state_dict(),
        'temp': training_temperature,
        'alpha': alpha,
    }, is_best, args.save + "_" + "{}_optimizer_{}_mixup_{}_cutmix_{}_aug_repeats_{}_KD_{}_assistant_{}_{}_HK_{}_{}_aa_{}__elm_{}_recoup_{}_{}_amp".format(args.model, args.optimizer, args.mixup, args.cutmix, args.aug_repeats, args.teacher_num, args.assistant_teacher_num, args.weak_teacher, args.hard_knowledge, args.hard_knowledge_grains, args.aa, args.elm_attention, args.infor_recoupling, args.gpu, args.epochs), epoch=epoch)
As you can see, we save not only the model state_dict but also the optimizer state_dict and other training-procedure information, which explains why the checkpoint is considerably larger than the model itself.
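You can verify this yourself by stripping the checkpoint down to the model weights and comparing file sizes. Here is a minimal sketch, assuming the checkpoint keys from the save_checkpoint call above and a placeholder file name:

import os
import torch

ckpt_path = "checkpoint.pth.tar"  # placeholder: adjust to your saved checkpoint
ckpt = torch.load(ckpt_path, map_location="cpu")

# Keep only the student model weights; drop optimizer state and training metadata.
torch.save(ckpt["state_dict"], "weights_only.pth")

print(f"full checkpoint: {os.path.getsize(ckpt_path) / 1e6:.2f} MB")
print(f"weights only:    {os.path.getsize('weights_only.pth') / 1e6:.2f} MB")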

For your second question: the existing model is still saved with the torch.save() function, which stores every weight in a 32-bit representation. It is therefore impossible to directly obtain a 106.1M BNext-L using the torch library alone, even though all weights in HardBinaryConv are represented as +1/-1. We plan to release a BNN-specific torch extension toolkit in the near future, so please stay tuned.
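To make the arithmetic concrete, here is a back-of-the-envelope sketch (not repository code): under float32 serialization every parameter costs 4 bytes, while an ideal 1-bit packing of the binary weights would cost 1/8 byte each.

import torch

def report_sizes(model):
    # Count all parameters, as reported in the paper's parameter column.
    n = sum(p.numel() for p in model.parameters())
    print(f"parameters:      {n / 1e6:.1f} M")
    # torch.save() serializes float32 tensors at 4 bytes per weight ...
    print(f"float32 on disk: ~{n * 4 / 1e6:.1f} MB")
    # ... whereas packing binary weights at 1 bit each would shrink those
    # layers by 32x; only the remaining float layers would stay at fp32.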

Thanks for your answers!

Please check the binary layers implemented in bitorch-engine.