qcwthu's Stars
microsoft/autogen
A programming framework for agentic AI 🤖. PyPI: autogen-agentchat | Discord: https://aka.ms/autogen-discord | Office Hour: https://aka.ms/autogen-officehour
e2b-dev/awesome-ai-agents
A list of AI autonomous agents
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
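A minimal sketch of how loralib is typically applied, following the repo's README; the layer sizes and rank below are illustrative, not tied to any particular model:

```python
# Minimal loralib sketch: swap an nn.Linear for a LoRA-augmented layer,
# then freeze everything except the low-rank adapter weights.
# Dimensions and rank r are illustrative.
import torch
import torch.nn as nn
import loralib as lora

class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # lora.Linear adds trainable low-rank matrices A and B on top of
        # a frozen full-rank weight; r controls the adapter rank.
        self.proj = lora.Linear(768, 768, r=8)
        self.head = nn.Linear(768, 2)

    def forward(self, x):
        return self.head(self.proj(x))

model = TinyClassifier()
# Freeze all parameters except the LoRA adapters before training.
lora.mark_only_lora_as_trainable(model)
# After training, save only the (small) LoRA weights.
torch.save(lora.lora_state_dict(model), "lora_ckpt.pt")
```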
huggingface/trl
Train transformer language models with reinforcement learning.
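A minimal supervised fine-tuning sketch with trl's SFTTrainer; exact keyword arguments vary across trl versions, and the model and dataset names here are placeholders:

```python
# Minimal SFT sketch with trl. API details differ between trl versions;
# this follows the older README-style usage.
from datasets import load_dataset
from trl import SFTTrainer

dataset = load_dataset("imdb", split="train")
trainer = SFTTrainer(
    "facebook/opt-350m",        # model name or a preloaded model
    train_dataset=dataset,
    dataset_text_field="text",  # column holding the raw training text
    max_seq_length=512,
)
trainer.train()
```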
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
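A minimal sketch of encoding text with FlagEmbedding's BGE models, following the repo's README; the model name and sentences are illustrative:

```python
# Minimal FlagEmbedding sketch: embed sentences with a BGE model and
# score similarity by inner product. Model name is illustrative.
from FlagEmbedding import FlagModel

model = FlagModel("BAAI/bge-large-en-v1.5", use_fp16=True)
sentences = ["retrieval-augmented generation", "dense passage retrieval"]
embeddings = model.encode(sentences)
# BGE embeddings are normalized, so the inner product is cosine similarity.
print(embeddings[0] @ embeddings[1])
```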
OpenBMB/ToolBench
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
open-compass/opencompass
OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.
OpenNMT/CTranslate2
Fast inference engine for Transformer models
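A minimal CTranslate2 sketch for batched translation; the model directory is a placeholder for a model already converted with one of the ct2 converter tools, and the token pieces are illustrative:

```python
# Minimal CTranslate2 sketch: run batched translation with a converted model.
# "ende_ctranslate2/" is a placeholder path to a converted model directory.
import ctranslate2

translator = ctranslate2.Translator("ende_ctranslate2/", device="cpu")
# CTranslate2 operates on pre-tokenized input (e.g., SentencePiece pieces).
results = translator.translate_batch([["▁Hello", "▁world", "!"]])
print(results[0].hypotheses[0])
```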
google/BIG-bench
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
allenai/open-instruct
embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
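A minimal MTEB sketch following the repo's README usage: evaluate a SentenceTransformers model on one benchmark task. The task and model names are illustrative:

```python
# Minimal MTEB sketch: run a single embedding benchmark task against a
# SentenceTransformers model. Task/model names are illustrative.
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(model, output_folder="mteb_results")
```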
stanford-crfm/helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in HEIM (https://arxiv.org/abs/2311.04287) and vision-language models in VHELM (https://arxiv.org/abs/2410.07112).
MineDojo/MineDojo
Building Open-Ended Embodied Agents with Internet-Scale Knowledge
WisdomShell/codeshell
A series of code large language models developed by PKU-KCL
THUDM/AgentTuning
AgentTuning: Enabling Generalized Agent Abilities for LLMs
princeton-nlp/MeZO
[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333
hendrycks/math
The MATH Dataset (NeurIPS 2021)
GanjinZero/RRHF
[NeurIPS 2023] RRHF & Wombat
ruixiangcui/AGIEval
haonan-li/CMMLU
CMMLU: Measuring massive multitask language understanding in Chinese
sail-sg/lorahub
[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
OpenLemur/Lemur
[ICLR 2024] Lemur: Open Foundation Models for Language Agents
declare-lab/instruct-eval
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
onejune2018/Awesome-LLM-Eval
Awesome-LLM-Eval: a curated list of tools, datasets/benchmarks, demos, leaderboards, papers, docs, and models, mainly for evaluation of foundation LLMs, aiming to explore the technical frontier of generative AI.
suzgunmirac/BIG-Bench-Hard
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
FlagAI-Open/Aquila2
The official repo of the Aquila2 series proposed by BAAI, including pretrained and chat large language models.
GAIR-NLP/abel
SOTA open-source math LLM
microsoft/SmartPlay
SmartPlay is a benchmark for large language models (LLMs) that uses a variety of games to test important agent capabilities. It is designed to be easy to use and to support future development of LLMs.
abhishekpanigrahi1996/Skill-Localization-by-grafting
srhthu/LM-CompEval-Legal
Code for the paper "A Comprehensive Evaluation of Large Language Models on Legal Judgment Prediction"