Laion dreams

Overarching goal: Enable the open source community to openly build datasets, papers, models and tools in order to let AGI benefit humankind even faster.

Intro

Laion started with the Laion5B project, which produced a dataset of 5B (image, text) pairs by processing Common Crawl and filtering with CLIP. That method showed that it is cheap to collect large-scale datasets from the web using models like CLIP, which score the similarity between items from two modalities.
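
A minimal sketch of that filtering step, assuming the image and text embeddings have already been computed by a CLIP-like model. The function name `filter_pairs`, the toy 2-d embeddings and the 0.3 threshold are illustrative assumptions, not the project's actual code or settings.

```python
import numpy as np

def filter_pairs(image_emb, text_emb, threshold=0.3):
    """Keep (image, text) pairs whose embeddings are similar enough.

    image_emb, text_emb: arrays of shape (n_pairs, dim), row i of each
    belonging to the same candidate pair.
    threshold: illustrative cutoff on cosine similarity (an assumption).
    Returns (keep_mask, similarities).
    """
    # Normalize rows so the dot product equals cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    sims = np.sum(image_emb * text_emb, axis=1)
    return sims >= threshold, sims

# Toy embeddings standing in for real model outputs.
image_emb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
text_emb = np.array([[1.0, 0.1], [1.0, 0.0], [1.0, 0.9]])

keep, sims = filter_pairs(image_emb, text_emb)
print(keep)  # the second pair is orthogonal, so it is dropped
```

In a real pipeline the embeddings would come from the model's image and text encoders, and the threshold would be tuned by inspecting samples around the cutoff.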

Many models have been trained on laion400m, proving the value of this method; in particular, OpenCLIP reproduced the results of the original OpenAI CLIP.

Let’s extend that method to more modalities!

Overall rationale

We would like to promote and help the projects and directions listed here. We do not claim ownership of them as an organization; the people who build the projects own them.

Directions

Methods

Open source: releasing everything openly
- Code: on github with an open license
- Model: freely distributed models
- Dataset: freely distributed datasets
Open development: development is done in public on github and discord; everyone is encouraged to participate, regardless of nationality, age, or diploma

Axes of work:

Open tools
- Dataset collection
- Dataset preparation
- Distributed inference
- Distributed training
- Evaluation
Datasets
- Open distribution
- Papers
Models
- Open training
- Open distribution

Scientific domains

All modalities dataset building
- Text image
- Text audio
- Text video
- Text 3d
Contrastive and generative
- Contrastive
  - Text image
  - Text audio
  - Text video
- Generative
  - Text to image
  - Image to text

Projects

These projects are collaborations between many people. If you want to know who, check the links and ask in discord. We are open to new collaborators!

Dataset

Name	Modality	Status	Notes
Laion400m	image/text	Done	> 10 papers using it
Laion5B	image/text	Done	Largest open text/image dataset
Laion5B high-resolution	image/text	Done	Largest open high-resolution text/image dataset
Laion5B balanced	image/text	Just started	Balanced LAION-5B dataset for more efficient training
laion3d	3d/image/text	Just started	Trying to expand the laion idea to 3d
Audio dataset	text/audio	Started	Started to be used to train an audio clip

Model

Name	Modality	Kind	Status	Notes
Openclip B/16	image/text	contrastive	released	Reproduced openai clip
Dalle2 prior/decoder	image/text	generative	Just started	Trying to reproduce dalle2
Clipcap	image/text	generative	works	Generate text from embedding
Audio clip	audio/text	contrastive	Training ongoing
Video clip	video/text	contrastive	Just started
Mclip vit-l/14	image/text	contrastive	Just started	Aligning a text encoder to be in clip space. Collaboration with mclip author
Super-resolution	image->image	generative	Just started	Using a high-resolution subset of LAION-5B for the training
Medical CLIP	image/text	contrastive	Just started	Using CLIP to improve MRI -> image synthesis (see project outline).
NSFW detection	image/text	contrastive	Done	Using CLIP to detect NSFW in images.
Watermark detection	image/text	contrastive	Done	Using CLIP to detect watermarks in images.
electric sheep	image/text/audio/video	contrastive/generative	Just started	Train contrastive and generative models on all modalities.
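
One way a contrastive model can be used for detection tasks such as the NSFW and watermark rows above is zero-shot classification: compare an image embedding against the embeddings of a few text prompts and take a softmax over the similarities. The sketch below assumes precomputed embeddings; the function name `zero_shot_scores`, the toy vectors, the prompt labels and the temperature of 100 are illustrative assumptions, and the actual detectors in these projects may work differently.

```python
import numpy as np

def zero_shot_scores(image_emb, prompt_embs, temperature=100.0):
    """Softmax over cosine similarities between one image and several prompts.

    image_emb: array of shape (dim,)
    prompt_embs: array of shape (n_prompts, dim), one row per text prompt
    temperature: scale applied to similarities before the softmax (assumed).
    """
    image_emb = image_emb / np.linalg.norm(image_emb)
    prompt_embs = prompt_embs / np.linalg.norm(prompt_embs, axis=1, keepdims=True)
    logits = temperature * (prompt_embs @ image_emb)
    logits = logits - logits.max()  # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Toy embeddings standing in for encoded prompts and an encoded image.
labels = ["clean image", "image with a watermark"]  # hypothetical prompts
prompt_embs = np.array([[1.0, 0.0], [0.0, 1.0]])
image_emb = np.array([0.2, 0.9])

probs = zero_shot_scores(image_emb, prompt_embs)
print(labels[int(np.argmax(probs))])
```

A small classifier trained on top of the embeddings is a common alternative when labeled examples are available.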

Tools

Name	Modality	Status	Notes
img2dataset	image/text	working	Used to download laion5B in a week, twice
Clip retrieval	image/text	working	Used to compute 5B Vit-L/14 embeddings
Crawlingathome-gpu-hcloud	image/text	done	Filtering common crawl using clip
clip benchmark	image/text	wip	Evaluating CLIP performance easily

Papers

Name	Modality	Status	Notes
Laion400m	image/text	On arXiv	Cited many times
laion5B	image/text	started

LAION-AI/laion-dreams