vision

There are 1,902 repositories under the vision topic.

BVLC/caffe
Caffe: a fast open framework for deep learning.
Language: C++ 34.6k 2.1k 4.8k 18.6k
XTLS/Xray-core
Xray, Penetrates Everything. Also the best v2ray-core. Where the magic happens. An open platform for various uses.
Language: Go 31.2k 367 2.6k 4.6k
danny-avila/LibreChat
Enhanced ChatGPT Clone: Features Agents, MCP, DeepSeek, Anthropic, AWS, OpenAI, Responses API, Azure, Groq, o1, GPT-5, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code Interpreter, langchain, DALL-E-3, OpenAPI Actions, Functions, Secure Multi-User Auth, Presets, open-source for self-hosting. Active.
Language: TypeScript 30k 172 3.4k 5.7k
bytedance/UI-TARS-desktop
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
Language: TypeScript 18.8k 155 4591.8k
mediar-ai/screenpipe
AI app store powered by 24/7 desktop history. open source | 100% local | dev friendly | 24/7 screen, mic recording
Language: TypeScript 15.6k 91 1k 1.2k
Skyvern-AI/skyvern
Automate browser-based workflows with LLMs and Computer Vision
Language: Python 14.3k 78 1941.2k
mrousavy/react-native-vision-camera
📸 A powerful, high-performance React Native Camera library.
Language: Swift 8.8k 57 2k 1.3k
Dooy/chatgpt-web-midjourney-proxy
One UI is all done with chatgpt web, midjourney, gpts, suno, luma, runway, viggle, flux, ideogram, realtime, pika, udio; Simultaneous support Web / PWA / Linux / Win / MacOS platform
Language: JavaScript 6.4k 39 5891.6k
autorope/donkeycar
Open source hardware and software platform to build a small scale self driving car.
Language: Python 3.3k 159 4671.3k
artemnovichkov/iOS-11-by-Examples
👨🏻‍💻 Examples of new iOS 11 APIs
Language: Swift 3.3k 119 17308
VainF/Torch-Pruning
[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.
Language: Python 3.1k 34 427363
sightmachine/SimpleCV
The Open Source Framework for Machine Vision
Language: Python 2.7k 196 327797
NextLevel/NextLevel
⬆️ Media Capture in Swift
Language: Swift 2.3k 63 233284
GoogleCloudPlatform/java-docs-samples
Java and Kotlin Code samples used on cloud.google.com
Language: Java 1.8k 290 2.2k 2.9k
andyzeng/tsdf-fusion-python
Python code to fuse multiple RGB-D images into a TSDF voxel volume.
Language: Python 1.4k 24 37229
roatienza/Deep-Learning-Experiments
Videos, notes and experiments to understand deep learning
Language: Jupyter Notebook 1.2k 98 20765
jenly1314/MLKit
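🌝 MLKit is a powerful, easy-to-use toolkit. With ML Kit you can easily implement text recognition, barcode recognition, image labeling, face detection, object detection, and more.
Language: Java 1.1k 13 57187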
AravisProject/aravis
A vision library for genicam based cameras
Language: C 1.1k 45 599374
valentinfrlch/ha-llmvision
Let Home Assistant see!
Language: Python 1k 9 19883
aheze/OpenFind
An app to find text in real life.
Language: Swift 1k 11 570
lucidrains/mlp-mixer-pytorch
An All-MLP solution for Vision, from Google AI
Language: Python 1k 11 13108
KevinGong2013/ChineseIDCardOCR
[Deprecated] 🇨🇳 Optical recognition for second-generation Chinese ID cards
Language: Swift 1k 46 18186
andyzeng/visual-pushing-grasping
Train robotic agents to learn to plan pushing and grasping actions for manipulation with deep reinforcement learning.
Language: Python 1k 33 90317
Celebrandil/CudaSift
A CUDA implementation of SIFT for NVidia GPUs (1.2 ms on a GTX 1060)
Language: Cuda 917 42 86289
deepdrive/deepdrive
Deepdrive is a simulator that allows anyone with a PC to push the state-of-the-art in self-driving
Language: Python 917 61 64151
anupamchugh/iowncode
A curated collection of iOS, ML, AR resources sprinkled with some UI additions
Language: Swift 910 32 9322
2013fangwentao/Multi_Sensor_Fusion
Multi-Sensor Fusion (GNSS, IMU, Camera): multi-source, multi-sensor fusion positioning; GPS/INS integrated navigation; PPP/INS tightly coupled integration
Language: C++ 893 41 16253
andyzeng/3dmatch-toolbox
3DMatch - a 3D ConvNet-based local geometric descriptor for aligning 3D meshes and point clouds.
Language: C++ 879 46 34188
onmyway133/awesome-machine-learning
🎰 A curated list of machine learning resources, preferably CoreML
811 46 1106
jasmcaus/caer
High-performance Vision library in Python. Scale your research, not boilerplate.
Language: Python 804 20 15103
andyzeng/tsdf-fusion
Fuse multiple depth frames into a TSDF voxel volume.
Language: Cuda 783 31 28135
evilgix/Evil
Optical Character Recognition in Swift for iOS & macOS. Optical recognition of bank cards, ID cards, and house numbers.
Language: Swift 697 20 598
lucidrains/bottleneck-transformer-pytorch
Implementation of Bottleneck Transformer in Pytorch
Language: Python 676 17 1683
mees/calvin
CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
Language: Python 673 6 10581
mostafasadeghi97/design2code
Convert any web design screenshot to clean HTML/CSS code
Language: TypeScript 657 18 5102
cary-sas/v2ray_bin
A heavily modified proxy (firewall circumvention) plugin for Merlin 380 firmware
Language: Classic ASP 624 15 59134