vision
There are 1657 repositories under the vision topic.
BVLC/caffe
Caffe: a fast open framework for deep learning.
danny-avila/LibreChat
Enhanced ChatGPT Clone: Features Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. Actively in public development.
PaddlePaddle/PaddleHub
Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving) [Security hardening in progress; interaction is paused, please be patient]
Skyvern-AI/skyvern
Automate browser-based workflows with LLMs and Computer Vision
mediar-ai/screenpipe
rewind.ai x cursor.com = your AI assistant that has all the context. 24/7 screen & voice recording for the age of super intelligence. get your data ready or be left behind
mrousavy/react-native-vision-camera
📸 A powerful, high-performance React Native Camera library.
Dooy/chatgpt-web-midjourney-proxy
One UI for ChatGPT web, Midjourney, GPTs, Suno, Luma, Runway, Viggle, Flux, Ideogram, Realtime, Pika, and Udio; simultaneous support for the Web, PWA, Linux, Windows, and macOS platforms
artemnovichkov/iOS-11-by-Examples
👨🏻‍💻 Examples of new iOS 11 APIs
autorope/donkeycar
Open source hardware and software platform to build a small-scale self-driving car.
sightmachine/SimpleCV
The Open Source Framework for Machine Vision
NextLevel/NextLevel
⬆️ Media Capture in Swift
GoogleCloudPlatform/java-docs-samples
Java and Kotlin Code samples used on cloud.google.com
TEN-framework/TEN-Agent
TEN Agent is the world’s first real-time multimodal agent integrated with the OpenAI Realtime API, RTC, and features weather checks, web search, vision, and RAG capabilities.
andyzeng/tsdf-fusion-python
Python code to fuse multiple RGB-D images into a TSDF voxel volume.
roatienza/Deep-Learning-Experiments
Videos, notes and experiments to understand deep learning
KevinGong2013/ChineseIDCardOCR
[Deprecated] 🇨🇳 Optical recognition of second-generation ID cards
lucidrains/mlp-mixer-pytorch
An All-MLP solution for Vision, from Google AI
aheze/OpenFind
An app to find text in real life.
jenly1314/MLKit
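🌝 MLKit is a powerful, easy-to-use toolkit. With ML Kit you can easily implement text recognition, barcode recognition, image labeling, face detection, object detection, and more.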
andyzeng/visual-pushing-grasping
Train robotic agents to learn to plan pushing and grasping actions for manipulation with deep reinforcement learning.
AravisProject/aravis
A vision library for GenICam-based cameras
deepdrive/deepdrive
Deepdrive is a simulator that allows anyone with a PC to push the state-of-the-art in self-driving
anupamchugh/iowncode
A curated collection of iOS, ML, AR resources sprinkled with some UI additions
Celebrandil/CudaSift
A CUDA implementation of SIFT for NVIDIA GPUs (1.2 ms on a GTX 1060)
andyzeng/3dmatch-toolbox
3DMatch - a 3D ConvNet-based local geometric descriptor for aligning 3D meshes and point clouds.
2013fangwentao/Multi_Sensor_Fusion
Multi-Sensor Fusion (GNSS, IMU, Camera): multi-source, multi-sensor fusion positioning, GPS/INS integrated navigation, PPP/INS tightly coupled integration
onmyway133/awesome-machine-learning
🎰 A curated list of machine learning resources, preferably CoreML
jasmcaus/caer
High-performance Vision library in Python. Scale your research, not boilerplate.
andyzeng/tsdf-fusion
Fuse multiple depth frames into a TSDF voxel volume.
evilgix/Evil
Optical Character Recognition in Swift for iOS & macOS. Optical recognition of bank cards, ID cards, and house numbers
lucidrains/bottleneck-transformer-pytorch
Implementation of Bottleneck Transformer in PyTorch
cary-sas/v2ray_bin
A modified censorship-circumvention proxy plugin for Merlin 380 firmware
OvidijusParsiunas/myvision
Computer vision based ML training data generation tool 🚀
mostafasadeghi97/design2code
Convert any web design screenshot to clean HTML/CSS code
google-research/ravens
Train robotic agents to learn pick and place with deep learning for vision-based manipulation in PyBullet. Transporter Nets, CoRL 2020.
anki/vector-python-sdk
Anki Vector Python SDK