rom1504
Interested in machine learning (computer vision, natural language processing, deep learning), Node.js (network, bots, web), and programming in general
@google, Paris
Pinned Repositories
minecraft-data
Language-independent module providing Minecraft data for Minecraft clients, servers, and libraries.
mineflayer
Create Minecraft bots with a powerful, stable, and high-level JavaScript API.
node-minecraft-protocol
Parse and serialize Minecraft packets, plus authentication and encryption.
awesome-semantic-search
Semantic search with embeddings: index anything
cc2dataset
Easily convert Common Crawl into a dataset of captions and documents: image/text, audio/text, video/text, ...
clip-retrieval
Easily compute CLIP embeddings and build a CLIP retrieval system with them
image_embeddings
Using EfficientNet to provide embeddings for retrieval
img2dataset
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20h on one machine.
laion-prepro
Get hundreds of millions of image+url pairs from the crawling at home dataset and preprocess them
MinecraftChat
Web-based Minecraft chat client
rom1504's Repositories
rom1504/img2dataset
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20h on one machine.
rom1504/clip-retrieval
Easily compute CLIP embeddings and build a CLIP retrieval system with them
rom1504/cc2dataset
Easily convert Common Crawl into a dataset of captions and documents: image/text, audio/text, video/text, ...
rom1504/laion-prepro
Get hundreds of millions of image+url pairs from the crawling at home dataset and preprocess them
rom1504/image_embeddings
Using EfficientNet to provide embeddings for retrieval
rom1504/embedding-reader
Efficiently stream embeddings from any filesystem
rom1504/gpu-tester
gpu-tester detects broken and slow GPUs in a cluster
rom1504/any2dataset
Turn any collection of files into a dataset
rom1504/CLIP
Contrastive Language-Image Pretraining
rom1504/python-template
Simple python template
rom1504/audio2dataset
Easily turn large sets of audio URLs into an audio dataset.
rom1504/slurm-tracking-bot
Simple Slurm tracking bot to check usage
rom1504/word_knn
Quickly find the closest words using an efficient kNN and word embeddings
rom1504/ideas
Ideas
rom1504/open_clip
An open source implementation of CLIP.
rom1504/rom1504.github.io
Personal website
rom1504/accelerate
🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision
rom1504/distributed-shuffle
A simple implementation of distributed shuffle, intended for learning
rom1504/k-diffusion
Karras et al. (2022) diffusion models for PyTorch
rom1504/rom1504
Profile readme
rom1504/video2numpy
Optimized library for large-scale extraction of frames and audio from video.
rom1504/rom1504.fr
My site
rom1504/task_adaptation
rom1504/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
rom1504/aria2
aria2 is a lightweight multi-protocol & multi-source, cross platform download utility operated in command-line. It supports HTTP/HTTPS, FTP, SFTP, BitTorrent and Metalink.
rom1504/embedbase
The native Software 3.0 stack
rom1504/EnMicroMsg.db-Password-Cracker
Crack the password of EnMicroMsg.db with brute-force attack.
rom1504/prismarine-web-client
mineflayer, running in your browser
rom1504/v-diffusion-pytorch
v objective diffusion inference code for PyTorch.
rom1504/wechat-dump
Dump WeChat messages from Android