
Aim for the moon. If you miss, you may hit a star.

Laion dreams

Chat on discord

Overarching goal: Enable the open source community to openly build datasets, papers, models and tools in order to let AGI benefit humankind even faster.

Intro

LAION was initiated with the LAION-5B project, which successfully produced a dataset of 5B (image, text) pairs by processing Common Crawl and filtering with CLIP. That method proved that it is cheap to collect large-scale datasets from the web using models like CLIP that give the similarity between items from two modalities.
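The filtering step described above can be sketched in a few lines. This is only a minimal illustration, not the actual LAION pipeline: it assumes that embeddings for both items of each pair have already been computed by some CLIP-like model, and the names `cosine_similarity`, `filter_pairs`, the toy embeddings, and the threshold value of 0.3 are all assumptions chosen for the example.

```python
import math
from typing import Iterable, List, Sequence, Tuple

Embedding = Sequence[float]
# Hypothetical record layout: (url, caption, image embedding, text embedding).
Candidate = Tuple[str, str, Embedding, Embedding]


def cosine_similarity(a: Embedding, b: Embedding) -> float:
    """Cosine similarity between two vectors of the same dimension."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat degenerate embeddings as non-matching
    return dot / (norm_a * norm_b)


def filter_pairs(
    pairs: Iterable[Candidate], threshold: float = 0.3
) -> List[Tuple[str, str]]:
    """Keep only the pairs whose two embeddings are similar enough.

    The default threshold is an illustrative assumption, not a value
    taken from this page.
    """
    kept = []
    for url, caption, image_emb, text_emb in pairs:
        if cosine_similarity(image_emb, text_emb) >= threshold:
            kept.append((url, caption))
    return kept


# Toy 3-dimensional embeddings standing in for real model outputs.
candidates = [
    ("https://example.com/cat.jpg", "a photo of a cat",
     [0.9, 0.1, 0.0], [0.8, 0.2, 0.1]),
    ("https://example.com/banner.jpg", "click here to subscribe",
     [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]),
]
kept = filter_pairs(candidates)
print(kept)
```

Because the filter only looks at embeddings, the same sketch would apply to any two modalities (text/audio, text/video, and so on), provided a model exists that embeds both into a shared space.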

Many models have been trained on LAION-400M, proving the value of this method. In particular, OpenCLIP reproduced the results of the original OpenAI CLIP.

Let’s extend that method to more modalities!

Overall rationale

These are projects and directions that we would like to promote and support. As an organization, we do not claim ownership of them: the people who build the projects own them.

Directions

Methods

  • Open source: releasing everything openly
    • Code: on github with an open license
    • Model: freely distributed models
    • Dataset: freely distributed datasets
  • Open development: development is done in public on GitHub and Discord; everyone is encouraged to participate, whatever their nationality, age, or diploma

Axes of work:

  • Open tools
    • Dataset collection
    • Dataset preparation
    • Distributed inference
    • Distributed training
    • Evaluation
  • Datasets
    • Open distribution
    • Papers
  • Models
    • Open training
    • Open distribution

Scientific domains

  • All modalities dataset building
    • Text image
    • Text audio
    • Text video
    • Text 3d
  • Contrastive and generative
    • Contrastive
      • Text image
      • Text audio
      • Text video
    • Generative
      • Text to image
      • Image to text

Projects

These projects are collaborations between many people. To find out who is involved, check the links and ask on Discord. We are open to new collaborators!

Dataset

| Name | Modality | Status | Notes |
|---|---|---|---|
| Laion400m | image/text | Done | > 10 papers using it |
| Laion5B | image/text | Done | Largest open text/image dataset |
| Laion5B high-resolution | image/text | Done | Largest open high-resolution text/image dataset |
| Laion5B balanced | image/text | Just started | Balanced LAION-5B dataset for more efficient training |
| laion3d | 3d/image/text | Just started | Trying to expand the laion idea to 3d |
| Audio dataset | text/audio | Started | Started to be used to train an audio clip |

Model

| Name | Modality | Kind | Status | Notes |
|---|---|---|---|---|
| Openclip B/16 | image/text | contrastive | Released | Reproduced openai clip |
| Dalle2 prior/decoder | image/text | generative | Just started | Trying to reproduce dalle2 |
| Clipcap | image/text | generative | Works | Generate text from embedding |
| Audio clip | audio/text | contrastive | Training ongoing | |
| Video clip | video/text | contrastive | Just started | |
| Mclip vit-l/14 | image/text | contrastive | Just started | Aligning a text encoder to be in clip space. Collaboration with mclip author |
| Super-resolution | image->image | generative | Just started | Using a high-resolution subset of LAION-5B for the training |
| Medical CLIP | image/text | contrastive | Just started | Using CLIP to improve MRI -> image synthesis (see project outline). |
| NSFW detection | image/text | contrastive | Done | Using CLIP to detect NSFW in images. |
| Watermark detection | image/text | contrastive | Done | Using CLIP to detect watermarks in images. |
| electric sheep | image/text/audio/video | contrastive/generative | Just started | Train contrastive and generative models on all modalities. |

Tools

| Name | Modality | Status | Notes |
|---|---|---|---|
| img2dataset | image/text | Working | Used to download laion5B in a week, twice |
| Clip retrieval | image/text | Working | Used to compute 5B Vit-L/14 embeddings |
| Crawlingathome-gpu-hcloud | image/text | Done | Filtering common crawl using clip |
| clip benchmark | image/text | WIP | Evaluating clip performance easily |

Papers

| Name | Modality | Status | Notes |
|---|---|---|---|
| Laion400m | image/text | On arXiv | Cited many times |
| laion5B | image/text | Started | |