Pinned Repositories
android_world
AndroidWorld is an environment and benchmark for autonomous agents
ADS-Cap
A Framework for Accurate and Diverse Stylized Captioning with Unpaired Stylistic Corpora
KnowCap
Code for "Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model"
MM-Self-Improve
A Self-Training Framework for Vision-Language Reasoning
nlpSummerCamp2022_caption
An Image Caption Framework
SeeClick
The model, data and code for the visual GUI Agent SeeClick
Awesome-Code-Intelligence
Neural Code Intelligence Survey 2024; Reading lists and resources
Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
webarena
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
OSWorld
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
njucckevin's Repositories
njucckevin/SeeClick
The model, data and code for the visual GUI Agent SeeClick
njucckevin/MM-Self-Improve
A Self-Training Framework for Vision-Language Reasoning
njucckevin/KnowCap
Code for "Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model"
njucckevin/ADS-Cap
A Framework for Accurate and Diverse Stylized Captioning with Unpaired Stylistic Corpora
njucckevin/nlpSummerCamp2022_caption
An Image Caption Framework