seruva19

Sergey Gornostaev

N-H-Labs
St. Petersburg, Russia

seruva19's Stars

omidsakhi/lorakit
A simple SDXL fine-tuning toolkit based on the DreamBooth branch of AutoTrain Advanced from 🤗, inspired by the way ai-toolkit approaches configuration.
Language:Python11
unclecode/crawl4ai
🔥🕷️ Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper
Language: Python · 11.1k stars · 776 forks
genforce/ctrl-x
Official implementation of "Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance" (NeurIPS 2024)
Language:Python2047
mindee/doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
Language: Python · 3.7k stars · 425 forks
instantX-research/InstantStyle
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥
Language: Jupyter Notebook · 1.6k stars · 100 forks
facefusion/facefusion
Industry leading face manipulation platform
Language: Python · 18.5k stars · 2.8k forks
NirDiamant/GenAI_Agents
This repository provides tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive AI systems.
Language: Jupyter Notebook · 3.2k stars · 319 forks
facebookresearch/AudioMAE
This repo hosts the code and models of "Masked Autoencoders that Listen".
Language:Python52644
ivcylc/qa-mdt
SOTA Text-to-music (TTM) Generation (OpenMusic)
Language:Python38738
NexaAI/nexa-sdk
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), auto-speech-recognition (ASR), and text-to-speech (TTS) capabilities.
Language: Python · 1.6k stars · 221 forks
QiuYannnn/Local-File-Organizer
An AI-powered file management tool that ensures privacy by organizing local texts, images. Using Llama3.2 3B and Llava v1.6 models with the Nexa SDK, it intuitively scans, restructures, and organizes files for quick, seamless access and easy retrieval.
Language: Python · 1.3k stars · 86 forks
zml/zml
High performance AI inference stack. Built for production. @ziglang / @openxla / MLIR / @bazelbuild
Language: Zig · 1.5k stars · 51 forks
av/klmbr
klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs
Language:TeX483
kyutai-labs/moshi
Language: Python · 6k stars · 446 forks
dvlab-research/ControlNeXt
Controllable video and image Generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
Language: Python · 1.3k stars · 60 forks
VectorSpaceLab/OmniGen
65812
jishengpeng/WavTokenizer
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
Language:Python68639
aigc-apps/CogVideoX-Fun
📹 A more flexible CogVideoX that can generate videos at any resolution and creates videos from images.
Language:Python30720
JusperLee/Apollo
Music repair method to convert lossy MP3 compressed music to lossless music.
Language:Python968
minzwon/sota-music-tagging-models
Language:Python39764
ALEEEHU/Awesome-Text2X-Resources
This is an open collection of state-of-the-art (SOTA), novel Text to X (X can be everything) methods (papers, codes and datasets).
1407
pinokiofactory/cogstudio
Language:Python21013
voideditor/void
Language: TypeScript · 7.3k stars · 332 forks
kousw/experimental-consistory
Language:Python926
turbo-llm/turbo-alignment
Library for industrial alignment.
Language:Python652
Vchitect/Vchitect-2.0
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models
Language:Python59916
microsoft/Graphormer
Graphormer is a general-purpose deep learning backbone for molecular modeling.
Language: Python · 2.1k stars · 334 forks
Tencent/DepthCrafter
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
Language:Python66026
OutofAi/OutofFocus
An AI focused photo manipulation tool based on Gradio
Language:Python17313
aredden/flux-fp8-api
Flux diffusion model implementation using quantized fp8 matmul & remaining layers use faster half precision accumulate, which is ~2x faster on consumer devices.
Language:Python12712
