chrisx599
Hi, I'm chr1ce, an undergraduate student at BUPT majoring in computer science and technology. If you like my repositories, please star them. Thank you!
Beijing University of Posts and Telecommunications, Beijing
chrisx599's Stars
ugorsahin/Generative-Negative-Mining
Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining, WACV 2024
HumanSignal/label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
phohenecker/switch-cuda
A simple bash script for switching between installed versions of CUDA.
BAAI-DCAI/Multimodal-Robustness-Benchmark
tdurieux/anonymous_github
Anonymous Github is a proxy server to support anonymous browsing of Github repositories for open-science code and data.
iamadamdev/bypass-paywalls-chrome
Bypass Paywalls web browser extension for Chrome and Firefox.
mertyg/vision-language-models-are-bows
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR 2023
zilliztech/GPTCache
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
friedrichor/Awesome-Multimodal-Papers
A curated list of awesome Multimodal studies.
jy0205/LaVIT
LaVIT: Empower the Large Language Model to Understand and Generate Visual Content
chrisx599/DSMD
ZPdesu/Barbershop
Barbershop: GAN-based Image Compositing using Segmentation Masks (SIGGRAPH Asia 2021)
labelmeai/labelme
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
HumanSignal/labelImg
LabelImg is now part of the Label Studio community. The popular image annotation tool created by Tzutalin is no longer actively being developed, but you can check out Label Studio, the open source data labeling tool for images, text, hypertext, audio, video and time-series data.
01-ai/Yi
A series of large language models trained from scratch by developers @01-ai
lijiannuist/Efficient-Multimodal-LLMs-Survey
Efficient Multimodal Large Language Models: A Survey
OpenBMB/MiniCPM-V
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
microsoft/Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
luban-agi/Awesome-Domain-LLM
Collects and organizes open-source models, datasets, and evaluation benchmarks for vertical domains.
Jittor/JittorLLMs
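Jittor large-model inference library, featuring high performance, low hardware requirements, good Chinese-language support, and portability.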
HqWu-HITCS/Awesome-Chinese-LLM
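A collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, including base models, vertical-domain fine-tunes and applications, datasets, and tutorials.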
FuxiaoLiu/LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
dvlab-research/LISA
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
facebookresearch/llm-transparency-tool
LLM Transparency Tool (LLM-TT), an open-source interactive toolkit for analyzing internal workings of Transformer-based language models. *Check out demo at* https://huggingface.co/spaces/facebook/llm-transparency-tool-demo
showlab/Awesome-MLLM-Hallucination
📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).
OpenRL-Lab/DeepFakeFace
DeepFake Face Datasets. Code accompanying the paper "Robustness and Generalizability of Deepfake Detection: A Study with Diffusion Models".
iperov/DeepFaceLab
DeepFaceLab is the leading software for creating deepfakes.