michaelyuancb

Now Intern@MoonshotAI; Graduate@IIIS, Tsinghua University; EmbodiedAI & Agent; Simple+Elegant leads to AGI

Tsinghua University, Beijing, China

michaelyuancb's Stars

Junyi42/monst3r
Official Implementation of paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"
Language:Python55417
yatengLG/ISAT_with_segment_anything
Labeling tool with SAM (Segment Anything Model); supports SAM, SAM2, sam-hq, MobileSAM, EdgeSAM, etc. An interactive, semi-automatic image annotation tool.
Language: Python 1.3k 133
xiongyiheng/ARKit-Scanner
The scanner app acquires RGB-D scans using iPhone LiDAR sensor and ARKit API, stores color, depth and IMU data on local memory and then uploads to PC for processing.
Language:Swift232
YanjieZe/Paper-List
A list of papers I have read. Robotics, Learning, Vision.
23910
suyukun666/UFO
Official PyTorch implementation of the “A Unified Transformer Framework for Co-Segmentation, Co-Saliency Detection and Video Salient Object Detection”. (TMM2023)
Language:Python29149
TEN-framework/TEN-Agent
TEN Agent is the world’s first real-time multimodal agent integrated with the OpenAI Realtime API, RTC, and features weather checks, web search, vision, and RAG capabilities.
Language:Python948114
michaelyuancb/ego_hoi_model
A model combining 100DoH, Semantic-SAM and EgoHOS for hand-object state classification, detection, and segmentation.
Language:Python3
lucidrains/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
Language: Python 20.1k 3k
THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
Language: Python 4.9k 408
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language: Python 19.8k 2.2k
lyhue1991/torchkeras
Pytorch❤️ Keras 😋😋
Language: Jupyter Notebook 1.7k 224
tejpshah/interview-pilot-ai
Ace interviews with AI practice. Our agent role-plays a personalized interview tailored to your background, listening and replying like a real interviewer. Train across personas for any situation.
Language:Python10118
jaidevshriram/realmdreamer
Code for RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion [Arxiv 2024]
1986
stitionai/devika
Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika aims to be a competitive open-source alternative to Devin by Cognition AI.
Language: Python 18.4k 2.4k
lavague-ai/LaVague
Large Action Model framework to develop AI Web Agents
Language: Python 5.4k 489
RoboFlamingo/RoboFlamingo
Code for RoboFlamingo
Language:Python29724
prejudice666/whu-thesis-latex-template
LaTeX template for Wuhan University undergraduate theses (class of 2019)
Language:TeX2
leptonai/search_with_lepton
Building a quick conversation-based search demo with Lepton AI.
Language: TypeScript 7.8k 994
michaelyuancb/general_flow
Repository for "General Flow as Foundation Affordance for Scalable Robot Learning"
Language:Python34
GengYiran/GengYiran.github.io
my blog
Language:HTML5
jonbarron/website
Language: HTML 2.6k 2.1k
voxposer/voxposer.github.io
Language:HTML347
ddshan/hand_object_detector
Project and dataset webpage:
Language:Python23067
idejie/ego_hand_detecor
Pretrained model from Shan et al., “Understanding Human Hands in Contact at Internet Scale” (CVPR 2020, Oral).
Language:Python5
real-stanford/diffusion_policy
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Language: Python 1.5k 277
ematvey/pybacktest
Vectorized backtesting framework in Python / pandas, designed to make your backtesting easier — compact, simple and fast
Language:Python808241
jingyi0000/VLM_survey
Collection of AWESOME vision-language models for vision tasks
2.3k 208
hassony2/useful-computer-vision-phd-resources
Lists of resources useful for my PhD in computer vision
53599
luca-medeiros/lang-segment-anything
SAM with text prompt
Language: Python 1.6k 174
xlang-ai/xlang-paper-reading
Paper collection on building and evaluating language model agents via executable language grounding
33612
