jneygor8504's Stars
landing-ai/vision-agent
Vision agent
HVision-NKU/StoryDiffusion
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
6drf21e/ChatTTS_colab
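🚀 One-click deployment (offline bundle included)! Built on ChatTTS, with support for streaming output, random voice-timbre draws, long audio generation, and multi-role reading. Easy to use, with no complex installation required.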
microsoft/Phi-3CookBook
A Phi-3 cookbook for getting started with Phi-3, a family of open-source AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and the next size up across a variety of language, reasoning, coding, and math benchmarks.
dcrebbin/meta-vision-api
Hacky Meta Glasses API with GPT4 Vision Integration
EllAchE/llama-out-loud
AR (Augmented Reading) for the meta llama 3 h4ckathon 🦙
comfyanonymous/ComfyUI
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
lllyasviel/IC-Light
More relighting!
microsoft/LLMLingua
[EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
zhaoolee/pi
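Raspberry Pi tutorials from the "keep your Raspberry Pi from gathering dust" squad, so your Raspberry Pi no longer sits unused~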
xlang-ai/OSWorld
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
TMElyralab/MuseTalk
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
TMElyralab/MuseV
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
pybind/pybind11
Seamless operability between C++11 and Python
lean-dojo/LeanCopilot
LLMs as Copilots for Theorem Proving in Lean
harry0703/MoneyPrinterTurbo
Generate high-definition short videos with one click using AI large language models (LLMs).
AlexanderKoch-Koch/low_cost_robot
yodaos-project/yodaos
Yet another Linux distribution for voice-enabled IoT that embraces Web standards
facebookresearch/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
advaitpaliwal/insight
stanford-oval/storm
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
netease-youdao/QAnything
Question and Answer based on Anything.
nicedouble/StreamlitAntdComponents
A Streamlit component to display Antd-Design
datawhalechina/self-llm
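"A Guide to Consuming Open-Source Large Models": a tutorial tailored for ** beginners on quickly fine-tuning (full-parameter/LoRA) and deploying open-source large language models (LLMs) and multimodal large models (MLLMs), both domestic and international, in a Linux environment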
karpathy/llm.c
LLM training in simple, raw C/CUDA
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
lobehub/lobe-chat
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Qwen / DeepSeek), Knowledge Base (file upload / knowledge management / RAG), and Multi-Modals (Vision/TTS/Plugins/Artifacts). One-click FREE deployment of your private ChatGPT/Claude application.
stitionai/devika
Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika aims to be a competitive open-source alternative to Devin by Cognition AI.
All-Hands-AI/OpenHands
🙌 OpenHands: Code Less, Make More