yuanrr
Ph.D. student focusing on image and video understanding, e.g., visual question answering and video question answering.
yuanrr's Stars
f/awesome-chatgpt-prompts
A curated collection of ChatGPT prompts for getting better results from ChatGPT.
chatanywhere/GPT_API_free
Free ChatGPT API keys: a free ChatGPT API with GPT-4 support (free tier), and a free forwarding API usable from within mainland China with a direct connection and no proxy required. Works with software/plugins such as ChatBox, greatly reducing API usage costs and enabling unrestricted chat from within China.
Hannibal046/Awesome-LLM
Awesome-LLM: a curated list of Large Language Model resources.
01-ai/Yi
A series of large language models trained from scratch by developers @01-ai
Lightning-AI/lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
pytorch-labs/gpt-fast
Simple and efficient PyTorch-native transformer text generation in under 1000 lines of Python.
Lightning-AI/lit-gpt
Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
lyogavin/Anima
A 33B Chinese LLM with DPO and QLoRA fine-tuning, 100K context length, and AirLLM 70B inference on a single 4 GB GPU.
PKU-YuanGroup/Video-LLaVA
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
frgfm/torch-cam
Class activation maps for your PyTorch models (CAM, Grad-CAM, Grad-CAM++, Smooth Grad-CAM++, Score-CAM, SS-CAM, IS-CAM, XGrad-CAM, Layer-CAM)
AGI-Edgerunners/LLM-Adapters
Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models"
yunlong10/Awesome-LLMs-for-Video-Understanding
🔥🔥🔥 The latest papers, code, and datasets on video LLMs (Vid-LLMs).
CircleRadon/Osprey
[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
dvlab-research/LLaMA-VID
Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
OpenGVLab/OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
csuhan/OneLLM
[CVPR 2024] OneLLM: One Framework to Align All Modalities with Language
shenyunhang/APE
[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception
Yangyi-Chen/Multimodal-AND-Large-Language-Models
A paper list on multimodal and large language models, maintained as a personal record of papers read in the daily arXiv feed.
zhenyingfang/Awesome-Temporal-Action-Detection-Temporal-Action-Proposal-Generation
Temporal Action Detection & Weakly Supervised Temporal Action Detection & Temporal Action Proposal Generation
llava-rlhf/LLaVA-RLHF
Aligning LMMs with Factually Augmented RLHF
taesiri/ArXivQA
Work in progress: automated question answering for arXiv papers with large language models (https://arxiv.taesiri.xyz/).
mlpc-ucsd/BLIVA
(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
nbasyl/LLM-FP4
The official implementation of the EMNLP 2023 paper "LLM-FP4".
OPPOMKLab/u-LLaVA
u-LLaVA: Unifying Multi-Modal Tasks via Large Language Model
AGI-Edgerunners/LLM-Continual-Learning-Papers
Must-read Papers on Large Language Model (LLM) Continual Learning
j-min/HiREST
Hierarchical Video-Moment Retrieval and Step-Captioning (CVPR 2023)
minghangz/cpl
CPL: Weakly Supervised Temporal Sentence Grounding with Gaussian-based Contrastive Proposal Learning
jayleicn/VideoLanguageFuturePred
[EMNLP 2020] What is More Likely to Happen Next? Video-and-Language Future Event Prediction
RenShuhuai-Andy/TESTA
[EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
yl3800/TranSTR