text-sed (wip) 📚🔥💭

An implementation of Strudel et al.'s "Self-conditioned Embedding Diffusion for Text Generation" in PyTorch, plus some other goodies 🍭.
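
The core trick in the paper (borrowed from Chen et al.'s Analog Bits, cited below) is self-conditioning: the denoiser receives its own previous estimate of the clean embeddings as an extra input. Here is a minimal sketch of the training-time call, assuming a hypothetical `model(x_noisy, t, x_self_cond)` that predicts clean token embeddings; the names are illustrative, not this repo's API:

```python
import torch

def self_conditioned_denoise(model, x_noisy, t, p_self_cond=0.5):
    """One denoiser call with self-conditioning (sketch, not repo API).

    With probability `p_self_cond`, first estimate the clean embeddings
    without self-conditioning, then feed that (detached) estimate back
    in as an extra input; otherwise condition on zeros.
    """
    # Start from a zero self-conditioning signal.
    x_self_cond = torch.zeros_like(x_noisy)
    if torch.rand(()) < p_self_cond:
        # First pass: estimate x_0 without gradient flow, then reuse it.
        with torch.no_grad():
            x_self_cond = model(x_noisy, t, x_self_cond).detach()
    return model(x_noisy, t, x_self_cond)
```

Detaching the first estimate keeps the extra pass from doubling the backward cost while still letting the model learn to refine its own predictions.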

Install 💻

pip install -e ".[dev, train]"

Experimental Samples

Non-cherry-picked samples from various experiments, tracking progress toward the eventual goal of reproducing the paper's results with large-scale pre-training.

  • Unconditionally generated samples obtained from training text-sed on the simple E2E dataset for 17k steps with the config here.
    ➜ The Golden Palace is a mid priced restaurant that has a rating of 1 out of 5.
    ➜ There is a children friendly priced restaurant that offers English food called The Twenty Two.
    ➜ Taste of Cambridge is a nice family friendly fast food pub in the riverside area, near The Sorrento.
    ➜ In the city centre lies The Golden Palace, a above average coffee shop serving French cuisine. Previous its customers,, unfortunately because it 3 out of 5.
    ➜ Strada specializes in Chinese food. They are a pub near Yippee Noodle Bar and has a rating of 1 out of 5.
    ➜ The Vaults is a high priced restaurant serving Indian food. It is kid friendly and is moderately priced.
    ➜ The Waterman is a kid friendly restaurant that serves Japanese food near the city center. They are moderately priced.
    ➜ The Punter is a Chinese restaurant, with an average rating. The food is cheap.

TODOs

  • Add span masking and classifier-free guidance (CFG) for conditional generation.
  • Add EMA warmup.
  • Add Karras samplers.
  • Try linearly increasing the guidance scale over the course of sampling to reduce the hit to sample diversity (see Section 2.7 of Muse and the sketch after this list).
  • Add conditional generation examples/samples.
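
As a concrete picture of the guidance items above, here is a hedged sketch of classifier-free guidance with a linearly increasing guidance scale, in the spirit of Muse §2.7. `model`, `cond`, and the endpoint weights are assumptions for illustration, not this repo's interface:

```python
def guided_prediction(model, x_noisy, t, cond, step, num_steps,
                      w_start=1.0, w_end=4.0):
    """CFG with a guidance weight ramped linearly over sampling (sketch)."""
    # A small weight early in sampling preserves diversity; a large
    # weight late strengthens adherence to the condition.
    w = w_start + (w_end - w_start) * step / max(num_steps - 1, 1)
    pred_cond = model(x_noisy, t, cond=cond)    # conditional branch
    pred_uncond = model(x_noisy, t, cond=None)  # "null"/dropped condition
    # Standard classifier-free guidance combination of the two branches.
    return pred_uncond + w * (pred_cond - pred_uncond)
```

With `w_start=1.0`, the first step reduces to the plain conditional prediction, so the ramp only trades diversity away gradually rather than from the start.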

Citations

@article{strudel2022self,
  title={Self-conditioned Embedding Diffusion for Text Generation},
  author={Strudel, Robin and Tallec, Corentin and Altch{\'e}, Florent and Du, Yilun and Ganin, Yaroslav and Mensch, Arthur and Grathwohl, Will and Savinov, Nikolay and Dieleman, Sander and Sifre, Laurent and others},
  journal={arXiv preprint arXiv:2211.04236},
  year={2022}
}
@article{dieleman2022continuous,
  title={Continuous diffusion for categorical data},
  author={Dieleman, Sander and Sartran, Laurent and Roshannai, Arman and Savinov, Nikolay and Ganin, Yaroslav and Richemond, Pierre H and Doucet, Arnaud and Strudel, Robin and Dyer, Chris and Durkan, Conor and others},
  journal={arXiv preprint arXiv:2211.15089},
  year={2022}
}
@article{Chen2022AnalogBG,
  title={Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning},
  author={Chen, Ting and Zhang, Ruixiang and Hinton, Geoffrey E.},
  journal={arXiv preprint arXiv:2208.04202},
  year={2022}
}
@article{Li-2022-DiffusionLM,
  title={Diffusion-LM Improves Controllable Text Generation},
  author={Li, Xiang Lisa and Thickstun, John and Gulrajani, Ishaan and Liang, Percy and Hashimoto, Tatsunori},
  journal={arXiv preprint arXiv:2205.14217},
  year={2022}
}
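@article{chang2023muse,
  title={Muse: Text-to-Image Generation via Masked Generative Transformers},
  author={Chang, Huiwen and Zhang, Han and Barber, Jarred and Maschinot, AJ and Lezama, Jose and Jiang, Lu and Yang, Ming-Hsuan and Murphy, Kevin and Freeman, William T and Rubinstein, Michael and Li, Yuanzhen and Krishnan, Dilip},
  journal={arXiv preprint arXiv:2301.00704},
  year={2023}
}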