aespresso/a_journey_into_math_of_ml

Hello, in the second transformer lesson, how was the figure of 20 million parameters calculated?


As the title says. Thanks, @aespresso

@aespresso I strongly suggest making a separate video on how the parameters are counted~

Some of the parameters are listed below (L: 12, H: 768, A: 12). Only the first block is shown.
INFO:tensorflow: name = bert/embeddings/word_embeddings:0, shape = (21154, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/token_type_embeddings:0, shape = (2, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/position_embeddings:0, shape = (512, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/query/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/query/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/key/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/key/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/value/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/value/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
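
For reference, the counts implied by the shapes in this log can be worked out by multiplying each shape's dimensions and summing. The sketch below does that for the variables listed above. The extrapolation to 12 layers is an assumption: it takes L: 12 from the post and assumes every encoder block has the same shapes as layer_0. Variables that do not appear in this excerpt (for example a pooler or a task head) are not counted, so the final figure is a lower bound, not the model's exact size.

```python
from math import prod

# Shapes copied from the log above (embeddings plus encoder layer_0 only).
SHAPES = {
    "bert/embeddings/word_embeddings": (21154, 768),
    "bert/embeddings/token_type_embeddings": (2, 768),
    "bert/embeddings/position_embeddings": (512, 768),
    "bert/embeddings/LayerNorm/beta": (768,),
    "bert/embeddings/LayerNorm/gamma": (768,),
    "bert/encoder/layer_0/attention/self/query/kernel": (768, 768),
    "bert/encoder/layer_0/attention/self/query/bias": (768,),
    "bert/encoder/layer_0/attention/self/key/kernel": (768, 768),
    "bert/encoder/layer_0/attention/self/key/bias": (768,),
    "bert/encoder/layer_0/attention/self/value/kernel": (768, 768),
    "bert/encoder/layer_0/attention/self/value/bias": (768,),
    "bert/encoder/layer_0/attention/output/dense/kernel": (768, 768),
    "bert/encoder/layer_0/attention/output/dense/bias": (768,),
    "bert/encoder/layer_0/attention/output/LayerNorm/beta": (768,),
    "bert/encoder/layer_0/attention/output/LayerNorm/gamma": (768,),
    "bert/encoder/layer_0/intermediate/dense/kernel": (768, 3072),
    "bert/encoder/layer_0/intermediate/dense/bias": (3072,),
    "bert/encoder/layer_0/output/dense/kernel": (3072, 768),
    "bert/encoder/layer_0/output/dense/bias": (768,),
    "bert/encoder/layer_0/output/LayerNorm/beta": (768,),
    "bert/encoder/layer_0/output/LayerNorm/gamma": (768,),
}


def count(shapes, prefix=""):
    """Sum the element counts of all variables whose name starts with prefix."""
    return sum(prod(shape) for name, shape in shapes.items() if name.startswith(prefix))


embedding_params = count(SHAPES, "bert/embeddings/")
layer_params = count(SHAPES, "bert/encoder/layer_0/")

# Assumption: L: 12 from the post, and all 12 blocks match layer_0.
NUM_LAYERS = 12
total_listed = embedding_params + NUM_LAYERS * layer_params

print(f"embeddings:          {embedding_params:,}")  # 16,642,560
print(f"one encoder layer:   {layer_params:,}")      # 7,087,872
print(f"embeddings + 12 layers: {total_listed:,}")   # 101,697,024
```

Most of the embedding total comes from the (21154, 768) word embedding table, so the overall count is sensitive to the vocabulary size as well as to H and L.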


Hello, both PyTorch and TensorFlow have functions that can count a model's parameters directly.
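
As an illustration of the reply above, here is a minimal sketch of the usual PyTorch idiom: iterate over `model.parameters()` and sum `numel()`. The helper name `count_parameters` and its `trainable_only` flag are made up for this example and are not part of PyTorch. The demo at the bottom only runs if `torch` is installed. In Keras, `model.count_params()` plays the same role.

```python
def count_parameters(model, trainable_only=False):
    """Count parameters of any object exposing PyTorch's .parameters() API.

    count_parameters and trainable_only are hypothetical names for this
    sketch; only .parameters(), .numel() and .requires_grad come from PyTorch.
    """
    return sum(
        p.numel()
        for p in model.parameters()
        if p.requires_grad or not trainable_only
    )


# Optional demo, skipped when PyTorch is not available.
try:
    import torch.nn as nn
except ImportError:
    nn = None

if nn is not None:
    # Same shape as one attention projection in the log:
    # 768 * 768 weights + 768 biases = 590,592 parameters.
    layer = nn.Linear(768, 768)
    print(count_parameters(layer))
```

Passing `trainable_only=True` skips parameters whose `requires_grad` is false, which matters when part of the model is frozen.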