minienglish1's Stars
google-research/google-research
Google Research
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
qarmin/czkawka
Multi-functional app to find duplicates, empty folders, similar images, etc.
voxel51/fiftyone
Refine high-quality datasets and visual AI models
Fanghua-Yu/SUPIR
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
philz1337x/clarity-upscaler
Clarity AI | AI Image Upscaler & Enhancer - free and open-source Magnific Alternative
layerdiffusion/sd-forge-layerdiffuse
[WIP] Layer Diffusion for WebUI (via Forge)
xinyu1205/recognize-anything
Open-source and strong foundation image recognition models.
KichangKim/DeepDanbooru
AI based multi-label girl image classification system, implemented by using TensorFlow.
Bionus/imgbrd-grabber
Very customizable imageboard/booru downloader with powerful filenaming features.
toriato/stable-diffusion-webui-wd14-tagger
Labeling extension for Automatic1111's Web UI
ermig1979/AntiDupl
A program to search similar and defect pictures on the disk
derrian-distro/LoRA_Easy_Training_Scripts
A UI made in Pyside6 to make training LoRA/LoCon and other LoRA type models in sd-scripts easy
kevinhendricks/KindleUnpack
python based software to unpack Amazon / Kindlegen generated ebooks
jiayev/GPT4V-Image-Captioner
pythongosssss/ComfyUI-WD14-Tagger
A ComfyUI extension allowing for the interrogation of booru tags from images.
mosaicml/diffusion
picobyte/stable-diffusion-webui-wd14-tagger
Labeling extension for Automatic1111's Web UI
snap-research/Panda-70M
[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
SalesforceAIResearch/DiffusionDPO
Code for "Diffusion Model Alignment Using Direct Preference Optimization"
mihirp1998/AlignProp
AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample and compute efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion.
HybridShivam/Pokemon
The highest quality Pokemon Images and Assets.
johnoneil/MangaTextDetection
Experiments in text localization and detection in raw manga scans. Mostly using OpenCV python API.
djghosh13/geneval
GenEval: An object-focused framework for evaluating text-to-image alignment
pedrovgs/DeepPanel
Finding a panel inside a comic page is the hardest thing I've ever done in computer science!
thu-ml/low-bit-optimizers
Low-bit optimizers for PyTorch
SmilingWolf/SW-CV-ModelZoo
Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset
joshua-stone/DerPyBooru
Python bindings for Derpibooru's API
LexCybermac/smlr
A Simple Image Clustering Script using CLIP and Hierarchical Clustering
KutsuyaYuki/WD14Tagger
Automatically tag images with booru tags