Tokenization Principle
Why are some tokenizations better than others?
Read the paper here, watch a 3-minute video about it, or take a look at the poster.
Cite as:
@inproceedings{tokenization_noiseless,
  title={Tokenization and the Noiseless Channel},
  author={Zouhar, Vilém and Meister, Clara and Gastaldi, Juan Luis and Sachan, Mrinmaya and Cotterell, Ryan},
  booktitle={Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics},
  year={2023},
  url={https://aclanthology.org/2023.acl-long.284/},
}