bowen-upenn
Ph.D. candidate in Computer and Information Science
GRASP Lab, University of Pennsylvania
Philadelphia, United States
Pinned Repositories
Agent_Rationality
This is the official repository of the paper "Towards Rationality in Language and Multimodal Agents: A Survey"
AnyText
Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>
Awesome-LLM-Reasoning
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓
CCD
[ICCV2023] Self-supervised Character-to-Character Distillation for Text Recognition
CFR_VQA
Coarse-to-Fine Reasoning for Visual Question Answering (CVPRW'22)
llm_token_bias
[EMNLP 2024] This is the official implementation of the paper "A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners" in PyTorch.
Multi-Agent-VQA
[CVPR 2024 CVinW] Multi-Agent VQA: Exploring Multi-Agent Foundation Models on Zero-Shot Visual Question Answering
Rethinking-Text-Segmentation
[CVPR 2021] Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach
scene_graph_commonsense
[WACV 2025] This is the official implementation of the paper "Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge" in PyTorch.
WildfireGPT
bowen-upenn's Repositories
bowen-upenn/scene_graph_commonsense
[WACV 2025] This is the official implementation of the paper "Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge" in PyTorch.
bowen-upenn/MMMA_Rationality
This is the official repository of the paper "Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey"
bowen-upenn/llm_token_bias
[EMNLP 2024] This is the official implementation of the paper "A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners" in PyTorch.
bowen-upenn/Multi-Agent-VQA
[CVPR 2024 CVinW] Multi-Agent VQA: Exploring Multi-Agent Foundation Models on Zero-Shot Visual Question Answering
bowen-upenn/AnyText
Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>
bowen-upenn/Awesome-LLM-Reasoning
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓
bowen-upenn/CCD
[ICCV2023] Self-supervised Character-to-Character Distillation for Text Recognition
bowen-upenn/CFR_VQA
Coarse-to-Fine Reasoning for Visual Question Answering (CVPRW'22)
bowen-upenn/Rethinking-Text-Segmentation
[CVPR 2021] Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach
bowen-upenn/SeeAct
[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).
bowen-upenn/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
bowen-upenn/VLSAT
[CVPR 2023] VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud