/bert-pretrainer

BERT paper implementation from scratch in PyTorch


BERT Paper Implementation

A PyTorch implementation of BERT.

Architecture

Embedding

  • Positional Encoding
  • Word Embedding
  • Segment Embedding
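As a rough illustration of how the three embeddings above combine, here is a minimal sketch. It assumes learned position embeddings as in the BERT paper (this repository may use a fixed sinusoidal encoding instead), and the class name `BertEmbeddingSketch` and all default sizes are hypothetical, not taken from the repository.

```python
import torch
import torch.nn as nn


class BertEmbeddingSketch(nn.Module):
    """Hypothetical sketch: sum of word, position, and segment embeddings."""

    def __init__(self, vocab_size=30522, hidden=768, max_len=512,
                 n_segments=2, dropout=0.1):
        super().__init__()
        # Assumption: token id 0 is the padding token.
        self.word = nn.Embedding(vocab_size, hidden, padding_idx=0)
        # Assumption: learned position embeddings (BERT paper style).
        self.position = nn.Embedding(max_len, hidden)
        self.segment = nn.Embedding(n_segments, hidden)
        self.norm = nn.LayerNorm(hidden)
        self.dropout = nn.Dropout(dropout)

    def forward(self, token_ids, segment_ids):
        # token_ids, segment_ids: (batch, seq_len) integer tensors
        seq_len = token_ids.size(1)
        positions = torch.arange(seq_len, device=token_ids.device).unsqueeze(0)
        # The (1, seq_len, hidden) position term broadcasts over the batch.
        x = (self.word(token_ids)
             + self.position(positions)
             + self.segment(segment_ids))
        return self.dropout(self.norm(x))


# Usage: two sequences of five tokens, all in segment 0.
embedding = BertEmbeddingSketch(vocab_size=100, hidden=16, max_len=32)
token_ids = torch.randint(1, 100, (2, 5))
segment_ids = torch.zeros(2, 5, dtype=torch.long)
embedded = embedding(token_ids, segment_ids)  # shape: (2, 5, 16)
```

Summing the three embeddings keeps the hidden size fixed, so the result can be fed directly into the encoder stack.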

Transformer Encoder

  • Multi-head Attention
  • Position-wise Feed-Forward Network
  • Residual Connection + Layer Normalization
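The three encoder components listed above can be sketched as a single layer. This is only an illustration: it swaps in PyTorch's built-in `nn.MultiheadAttention` for brevity, whereas a from-scratch implementation would define attention by hand. The class name `EncoderLayerSketch` and the default sizes are hypothetical.

```python
import torch
import torch.nn as nn


class EncoderLayerSketch(nn.Module):
    """Hypothetical sketch of one post-norm Transformer encoder layer."""

    def __init__(self, hidden=768, heads=12, ff_hidden=3072, dropout=0.1):
        super().__init__()
        # Swapped-in built-in attention; not a from-scratch implementation.
        self.attn = nn.MultiheadAttention(hidden, heads, dropout=dropout,
                                          batch_first=True)
        # Position-wise feed-forward network, applied to each token separately.
        self.ff = nn.Sequential(
            nn.Linear(hidden, ff_hidden),
            nn.GELU(),
            nn.Linear(ff_hidden, hidden),
        )
        self.norm1 = nn.LayerNorm(hidden)
        self.norm2 = nn.LayerNorm(hidden)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, padding_mask=None):
        # x: (batch, seq_len, hidden)
        # padding_mask: (batch, seq_len) bool tensor, True marks padding.
        attn_out, _ = self.attn(x, x, x, key_padding_mask=padding_mask)
        # Residual connection followed by layer normalization.
        x = self.norm1(x + self.dropout(attn_out))
        x = self.norm2(x + self.dropout(self.ff(x)))
        return x


# Usage: the output keeps the input shape, so layers can be stacked.
layer = EncoderLayerSketch(hidden=16, heads=4, ff_hidden=32)
hidden_states = layer(torch.randn(2, 5, 16))
```

Because each sub-layer adds its output back to its input before normalizing, the layer maps `(batch, seq_len, hidden)` to the same shape.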

Pre-training on Two Tasks

  • Next Sentence Prediction
  • Masked Language Model
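To make the two objectives concrete, here is a small data-preparation sketch in plain Python. It follows the recipe described in the BERT paper (select about 15% of tokens; of those, 80% become `[MASK]`, 10% a random token, 10% stay unchanged; half of the sentence pairs are true next sentences). The function names `mask_tokens` and `make_nsp_pair` are hypothetical, and drawing the negative sentence from the same list is a simplification of the paper's sampling from the corpus.

```python
import random

MASK, CLS, SEP = "[MASK]", "[CLS]", "[SEP]"


def mask_tokens(tokens, vocab, rng, mask_prob=0.15):
    """Return (masked_tokens, labels); a label is None where nothing is predicted."""
    masked, labels = [], []
    for tok in tokens:
        # Special tokens are never selected for prediction.
        if tok in (CLS, SEP) or rng.random() >= mask_prob:
            masked.append(tok)
            labels.append(None)
            continue
        labels.append(tok)  # the model must recover the original token
        roll = rng.random()
        if roll < 0.8:
            masked.append(MASK)               # 80%: replace with [MASK]
        elif roll < 0.9:
            masked.append(rng.choice(vocab))  # 10%: replace with a random token
        else:
            masked.append(tok)                # 10%: keep the original token
    return masked, labels


def make_nsp_pair(sentences, index, rng):
    """Build one NSP example; index must not point at the last sentence."""
    first = sentences[index]
    if rng.random() < 0.5:
        second, is_next = sentences[index + 1], 1
    else:
        # Simplification: pick any sentence except the true next one.
        candidates = [s for i, s in enumerate(sentences) if i != index + 1]
        second, is_next = rng.choice(candidates), 0
    tokens = [CLS] + first + [SEP] + second + [SEP]
    # Segment 0 covers [CLS], the first sentence and its [SEP]; segment 1 the rest.
    segment_ids = [0] * (len(first) + 2) + [1] * (len(second) + 1)
    return tokens, segment_ids, is_next


# Usage with a toy corpus of pre-tokenized sentences.
rng = random.Random(0)
corpus = [["the", "cat", "sat"], ["it", "purred"], ["rain", "fell"]]
tokens, segment_ids, is_next = make_nsp_pair(corpus, 0, rng)
masked_tokens, mlm_labels = mask_tokens(tokens, ["cat", "rain", "it"], rng)
```

The segment ids produced here are what the segment embedding consumes, and the `None` labels mark positions that would be excluded from the masked language model loss.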

References