Pinned Repositories
anything-llm
A multi-user ChatGPT for any LLM and vector database. Unlimited documents, messages, and storage in one privacy-focused app. Now available as a desktop application!
audioseal
Localized watermarking for AI-generated speech audio, with state-of-the-art robustness and a very fast detector
AVT
Code release for ICCV 2021 paper "Anticipative Video Transformer"
dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
LLaVA
Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities.
mysqlc
MySQL C example
pytorch-image-models
PyTorch image models, scripts, pretrained weights -- (SE)ResNet/ResNeXT, DPN, EfficientNet, MixNet, MobileNet-V3/V2, MNASNet, Single-Path NAS, FBNet, and more
StreamMultiDiffusion
Official code for the paper "StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control."
udptunnel-1.1
udptunnel-1.1
UDT
UDT: Breaking the Data Transfer Bottleneck. UDT is a reliable, UDP-based, application-level data transport protocol for distributed, data-intensive applications over wide-area high-speed networks. UDT uses UDP to transfer bulk data with its own reliability control and congestion control mechanisms. The protocol can transfer data at a much higher speed than TCP. UDT is also a highly configurable framework that can accommodate various congestion control algorithms.
gary109's Repositories
gary109/anything-llm
A multi-user ChatGPT for any LLM and vector database. Unlimited documents, messages, and storage in one privacy-focused app. Now available as a desktop application!
gary109/StreamMultiDiffusion
Official code for the paper "StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control."
gary109/AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
gary109/AnyGPT
Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"
gary109/AnyTool
gary109/ATLAS
A principled instruction benchmark on formulating effective queries and prompts for large language models (LLMs). Our paper: https://arxiv.org/abs/2312.16171
gary109/bark
🔊 Text-Prompted Generative Audio Model
gary109/ChatDev
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
gary109/distil-whisper
gary109/distrifuser
[CVPR 2024] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
gary109/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
gary109/EMO
gary109/EmoDance
EmoDance
gary109/faster-whisper
Faster Whisper transcription with CTranslate2
gary109/GenAI-Hw5
Repo for Introduction to GenAI Hw5
gary109/GenAI_hw6_dataset
gary109/llama3
The official Meta Llama 3 GitHub site
gary109/LLM-Agent-Paper-List
The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
gary109/MemGPT
Building persistent LLM agents with long-term memory 📚🦙
gary109/mtg-jamendo-dataset
Metadata, scripts and baselines for the MTG-Jamendo dataset
gary109/OOTDiffusion
Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
gary109/Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
gary109/SALMONN
SALMONN: Speech Audio Language Music Open Neural Network
gary109/Seeing-and-Hearing
[CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners
gary109/songcomposer
gary109/SoraReview
The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".
gary109/symusic
A cross-platform note-level MIDI decoding library with lightning speed, based on minimidi.
gary109/tango
Hosts a family of diffusion models for text-to-audio generation.
gary109/Voyager
An Open-Ended Embodied Agent with Large Language Models
gary109/yolov9
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information