aroslanov's Stars
LuminanceHDR/LuminanceHDR
A complete workflow for HDR imaging
fallenshock/FlowEdit
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
TencentARC/StereoCrafter
A framework to convert any 2D video to immersive stereoscopic 3D
LucipherDev/ComfyUI-AniDoc
ComfyUI Custom Nodes for "AniDoc: Animation Creation Made Easier". This approach automates line art video colorization using a novel model that aligns color information from references, ensures temporal consistency, and reduces manual effort in animation production.
akatz-ai/ComfyUI-Environment-Manager
A Pinokio application to manage ComfyUI environments.
nerlfield/iss-urine-tank-monitor-bot
browser-use/browser-use
Make websites accessible for AI agents
iamxym/Deep-Fourier-based-Arbitrary-scale-Super-resolution-for-Real-time-Rendering
SIGGRAPH 2024 Conference Paper: Deep Fourier-based Arbitrary-scale Super-resolution for Real-time Rendering
colmap/colmap
COLMAP - Structure-from-Motion and Multi-View Stereo
DS4SD/docling
Get your documents ready for gen AI
Genesis-Embodied-AI/Genesis
A generative world for general-purpose robotics & embodied AI learning.
Helicone/helicone
🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓
janreges/siteone-crawler
SiteOne Crawler is a cross-platform website crawler and analyzer for SEO, security, accessibility, and performance optimization—ideal for developers, DevOps, QA engineers, and consultants. Supports Windows, macOS, and Linux (x64 and arm64).
janreges/siteone-crawler-gui
SiteOne Crawler GUI is a cross-platform website crawler and analyzer for SEO, security, accessibility, and performance optimization—ideal for developers, DevOps, QA engineers, and consultants. Supports Windows, macOS, and Linux (x64 and arm64).
google-gemini/cookbook
Examples and guides for using the Gemini API
snap-research/InstantRestore
Official Implementation for "InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention"
NexaAI/nexa-sdk
Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), audio-language models, automatic speech recognition (ASR), and text-to-speech (TTS) capabilities.
Purfview/whisper-standalone-win
Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.
kijai/ComfyUI-MMAudio
arifyaman/Face-Depth-Frame-Mancer
Face Depth Frame Mancer Documentation
Shubhamsaboo/awesome-llm-apps
Collection of awesome LLM apps with RAG using OpenAI, Anthropic, Gemini and open-source models.
hkchengrex/MMAudio
[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Francis-Rings/StableAnimator
We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference image and a sequence of poses.
souzatharsis/podcastfy
An Open Source Python alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI
Automattic/harper
The Grammar Checker for Developers
DroneSplat/anonymous_code
DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery
microsoft/TRELLIS
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation".
cline/cline
Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.
jiah-cloud/Align3R
[arXiv'24] Align3R: Aligned Monocular Depth Estimation for Dynamic Videos
yformer/EfficientTAM
Efficient Track Anything