jhCOR's Stars
X-PLUG/mPLUG-Owl
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
findalexli/mllm-dpo
[ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model
mzbac/llama2-fine-tune
Scripts for fine-tuning Llama2 via SFT and DPO.
opendatalab/HA-DPO
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
nyunAI/Faster-LLM-Survey
Vision-CAIR/MiniGPT-4
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
mytechnotalent/Gemini
Google Gemini AI model w/speech recognition and voice.
tsb0601/MMVP
GuyTevet/diversity-eval
Official GitHub repo for the paper "Evaluating the Evaluation of Diversity in Natural Language Generation"
HadiZayer/eyenerf
palchenli/VL-Instruction-Tuning
geuk-hub/-Dacon-Multimodal-vqa
LLaVA-VL/LLaVA-NeXT
chuangchuangtan/LLaVA-NeXT-Image-Llama3-Lora
LLaVA-NeXT-Image-Llama3-Lora, Modified from https://github.com/arielnlee/LLaVA-1.6-ft
teddysum/Korean_DCS_2024
facebookresearch/dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
mlfoundations/open_clip
An open source implementation of CLIP.
RevenueCat/purchases-android
Android in-app purchases and subscriptions made easy.
google/oboe
Oboe is a C++ library that makes it easy to build high-performance audio apps on Android.
ThuCCSLab/FigStep
Jailbreaking Large Vision-language Models via Typographic Visual Prompts
Unispac/Visual-Adversarial-Examples-Jailbreak-Large-Language-Models
Repository for the Paper (AAAI 2024, Oral) --- Visual Adversarial Examples Jailbreak Large Language Models
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
AttentionX/InstructBLIP_PEFT
lioryariv/idr
bluer555/CR-GAN
Yu Tian et al. "CR-GAN: Learning Complete Representations for Multi-view Generation", IJCAI 2018
HRI-UESTC/CFM-HRI-RGB-D-action-database
UESTC RGB-D Varying-view action database. This multi-view action database was captured with Kinect v2.0 and provides three modalities: RGB video, 3D skeleton sequences, and depth map sequences.
Totoro97/NeuS
Code release for NeuS
googlearchive/android-Camera2Raw
Migrated:
junyangwang0410/AMBER
An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation
za-cheng/WildLight
Official implementation of our CVPR 2023 paper "In-the-wild Inverse Rendering with a Flashlight"