holen-zhang

Waterloo, Canada

holen-zhang's Stars

sola-st/wasm-r3
Record-Reduce-Replay for Realistic and Standalone WebAssembly Benchmarks
Language:Jupyter Notebook203
iMeanAI/WebCanvas
Connect agents to live web environments for evaluation.
Language:Python1849
ultrafunkamsterdam/undetected-chromedriver
Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
Language: Python · 9.7k · 1.1k
landing-ai/vision-agent
Vision agent
Language: Python · 1.2k · 125
stanfordnlp/dspy
DSPy: The framework for programming (not prompting) foundation models
Language: Python · 17.1k · 1.3k
fuzz4all/fuzz4all
🌌️Fuzz4All: Universal Fuzzing with Large Language Models
Language:Python16224
seketeam/EvoCodeBench
An Evolving Code Generation Benchmark Aligned with Real-world Code Repositories
Language:Python392
THUDM/ChatGLM3
ChatGLM3 series: Open Bilingual Chat LLMs | 开源双语对话语言模型
Language: Python · 13.3k · 1.6k
phlippe/uvadlc_notebooks
Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2023
Language: Jupyter Notebook · 2.5k · 564
Troyanovsky/Local-LLM-Comparison-Colab-UI
Compare the performance of different LLMs that can be deployed locally on consumer hardware. Run it yourself with the Colab WebUI.
Language:Jupyter Notebook960142
OSU-NLP-Group/Mind2Web
[NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web"
Language:Jupyter Notebook65794
THUDM/AutoWebGLM
An LLM-based Web Navigating Agent (KDD'24)
Language:Python58146
shulin16/MMInA
Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents"
Language:Python363
VisualWebBench/VisualWebBench
Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
Language:Python391
MinorJerry/WebVoyager
Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models"
Language:Python24330
jun0wanan/awesome-large-multimodal-agents
30219
mnotgod96/AppAgent
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
Language: Python · 4.9k · 524
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Language: Python · 4.8k · 369
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language: Python · 19.5k · 2.1k
cooelf/Auto-GUI
Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024)
Language:Python18215
Leolty/repobench
✨ RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems - ICLR 2024
Language:Python1307
princeton-nlp/SWE-bench
[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?
Language: Python · 1.8k · 310
NLP-Core-Team/RealCode_eval
Language:Python9
FlagOpen/TACO
Language:Python1357
IBM/Project_CodeNet
This repository is to support contributions for tools for the Project CodeNet dataset hosted in DAX
Language: Python · 1.5k · 191
daniel-furman/sft-demos
Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.
Language:Jupyter Notebook648
zzxslp/MM-Navigator
GPT-4V in Wonderland: LMMs as Smartphone Agents
Language:Python1232
FudanSELab/ClassEval
Benchmark ClassEval for class-level code generation.
Language:Python1208
magicgh/Self-MAP
[ACL 2024] On the Multi-turn Instruction Following for Conversational Web Agents
Language:Python11
xai-org/grok-1
Grok open release
Language: Python · 49.4k · 8.3k
