
Wakong: A mathematically-rigorous and robust masking algorithm for generating the training objective of text infilling

Primary language: Python. License: Creative Commons Zero v1.0 Universal (CC0-1.0).

Wakong

Wakong: An appropriate and robust masking algorithm for generating the training objective of text infilling

This project is the Python library for ARP 1: The Wakong Algorithm and Its Python Implementation.

This project is supported by Cloud TPUs from Google's TPU Research Cloud (TRC) as a part of my project on large-scale language model pre-training.

Installation

Wakong requires Python 3.10 or later. Install it with pip:

pip install wakong

You can also install from source:

flit install

Usage

from wakong import Wakong
wakong = Wakong(seed=42)
sentence = 'I can eat glass , it does not hurt me .'.split(' ')
print(wakong(sentence))

Output:

['I', '<mask>', 'eat', 'glass', '<mask>', ',', 'it', 'does', 'not', 'hurt', 'me', '.']
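
Comparing this output against the input, the first mask stands in for 'can', while the second sits between 'glass' and ',' with no input token removed. As a minimal sketch of how such a list might be inspected, the snippet below joins the tokens back into a string and counts the masks. It works on the list literal shown above rather than calling the library, and `summarize_masked` is a hypothetical helper, not part of the wakong package.

```python
# Hypothetical helper for illustration; not part of the wakong package.
MASK = '<mask>'  # mask token as it appears in the output above


def summarize_masked(tokens, mask=MASK):
    """Return the space-joined text and the number of mask tokens."""
    text = ' '.join(tokens)
    n_masks = sum(1 for token in tokens if token == mask)
    return text, n_masks


# The list printed in the usage example above, copied verbatim.
masked = ['I', '<mask>', 'eat', 'glass', '<mask>', ',', 'it', 'does', 'not', 'hurt', 'me', '.']

text, n_masks = summarize_masked(masked)
print(text)     # I <mask> eat glass <mask> , it does not hurt me .
print(n_masks)  # 2
```

In practice the same helper could be applied directly to the return value of `wakong(sentence)`, assuming it is a list of string tokens as in the output shown above.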