
Inference code in PyTorch for GPT-like models such as PAGnol, a family of models with up to 1.5B parameters trained on French datasets.


LairGPT


A Python package in PyTorch by LightOn AI Research for performing inference with PAGnol models. You can try the generation capabilities of PAGnol on our interactive demo website.

Install

Requirements

The package is tested with Python 3.9. After cloning this repository, you can create a conda environment with the necessary dependencies from its root by running

conda env create --file=environment.yml

If you prefer to manage your environment yourself, the dependencies are

pytorch==1.8.1
tokenizers==0.10
python-wget==3.2

pip

Simply run pip install . from the root of this repository.

Text generation

The simplest way to generate text with PAGnol using lairgpt is

from lairgpt.models import PAGnol

pagnol = PAGnol.small()
pagnol("Salut PAGnol, comment ça va ?")

We include a demo script, main.py, in this repository that takes the paths to a model and a tokenizer, together with an input text, and generates sentences from it. To use it:

python main.py --size large --text "LightOn est une startup technologique"

Text generation relies on the infer method of the TextGenerator class, which takes the usual decoding parameters:

  • mode: (default: "nucleus")
    • "greedy": always select the most likely word as the next word.
    • "top-k": filter to the K most likely next words and redistribute the probability mass among only those K next words.
    • "nucleus": filter to the smallest possible set of words whose cumulative probability exceeds the probability p and redistribute the probability mass among this set of words.
  • temperature: a control over randomness. As this value approaches zero, the model becomes more deterministic. (default: 1.0)
  • k: size of the set of words to consider for "top-k" sampling (default: 5)
  • p: a control over diversity in nucleus sampling. A value of 0.5 means that only the words accounting for half of the total probability mass are considered. (default: 0.9)
  • max_decoding_steps: number of tokens to generate. (default: 32)
  • skip_eos: when True, generation does not stop at the end-of-sentence token. (default: True)
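To make the decoding modes concrete, here is a minimal, self-contained sketch in plain NumPy of how each mode filters the next-token distribution before sampling. This is an illustration of the general technique only, not lairgpt's actual implementation; the function name and signature are hypothetical.

```python
import numpy as np

def filter_logits(logits, mode="nucleus", temperature=1.0, k=5, p=0.9):
    """Illustrative sketch of greedy / top-k / nucleus filtering.

    Returns the renormalized probabilities of the tokens kept by `mode`.
    """
    logits = np.asarray(logits, dtype=float) / temperature
    # softmax (shifted for numerical stability)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    if mode == "greedy":
        # keep only the single most likely token
        keep = {int(probs.argmax())}
    elif mode == "top-k":
        # keep the k most likely tokens
        keep = set(np.argsort(probs)[-k:].tolist())
    else:  # "nucleus"
        # keep the smallest set of tokens whose cumulative probability exceeds p
        order = np.argsort(probs)[::-1]
        cum = np.cumsum(probs[order])
        cutoff = int(np.searchsorted(cum, p)) + 1
        keep = set(order[:cutoff].tolist())

    # zero out filtered tokens and redistribute the mass among the rest
    mask = np.zeros_like(probs)
    for i in keep:
        mask[i] = probs[i]
    return mask / mask.sum()
```

Lowering temperature sharpens the distribution before filtering, which is why the model becomes more deterministic as it approaches zero.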

More on LightOn

LightOn is a company that produces hardware for machine learning. To lease a LightOn Appliance, please visit: https://lighton.ai/lighton-appliance/

To request access to LightOn Cloud and try our photonic co-processor, please visit https://cloud.lighton.ai/. For researchers, we also have a LightOn Cloud for Research program; please visit https://cloud.lighton.ai/lighton-research/ for more information.

Citation

The preprint will be on arXiv soon; in the meantime, you can find it here.