wyczzy

wyczzy's Stars

Rapisurazurite/FFDN
Implementation for Enhancing Tampered Text Detection Through Frequency Feature Fusion and Decomposition
Language: C 5
chuangchuangtan/C2P-CLIP-DeepfakeDetection
C2P-CLIP-DeepfakeDetection
Language: Python 372
grip-unina/AdversarialRobustnessCLIP
Language: Jupyter Notebook 4
modelscope/awesome-deep-reasoning
Collect every awesome work about r1!
Language: Python 1694
Horizonll/AI-Detector
2024 6th Global Campus Artificial Intelligence Algorithm Elite Competition: AI-generated face image detection
Language: Python 7
manjaryp/MCE-ViT
A Robust Approach Towards Distinguishing Natural and Computer Generated Images using Multi-Colorspace fused and Enriched Vision Transformer
Language: Jupyter Notebook 11
dzhng/deep-research
An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models. The goal of this repo is to provide the simplest implementation of a deep research agent, i.e. an agent that can refine its research direction over time and deep dive into a topic.
Language: TypeScript 12.2k 1.2k
open-webui/open-webui
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
Language: JavaScript 75.4k 8.9k
VARGPT-family/VARGPT
VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model
Language: Python 16110
JCruan519/PETRobustness
(ARXIV24) This is the official code repository for "Understanding Robustness of Parameter-Efficient Tuning for Image Classification".
Language: Python 7
freddiewalchwmf25/JieMa
Latest 2024 recommendations for overseas SMS verification-code receiving platforms (free and paid)
35131
dwgoon/jpegio
A python package for accessing the internal variables of JPEG file format such as DCT coefficients and quantization tables
Language: C 7518
forever208/DCTdiff
Official code for the paper 'DCTdiff: Intriguing Properties of Image Generative Modeling in the DCT Space'
Language: Python 19
PKU-Alignment/align-anything
Align Anything: Training All-modality Model with Feedback
Language: Python 2k 278
dummerchen/HFST
Language: Python 1
speedlab-git/SimCLIP
Language: Jupyter Notebook 5
HashmatShadab/Robust-LLaVA
Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models
Language: Python 15
Deep-Agent/R1-V
Witness the aha moment of VLM with less than $3.
Language: Python 2.5k 195
yoziru/nextjs-vllm-ui
Fully-featured, beautiful web interface for vLLM - built with NextJS.
Language: TypeScript 10615
lucidrains/transfusion-pytorch
Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI
Language: Python 94242
Alpha-VLLM/Lumina-mGPT
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
Language: Python 54124
AILab-CVC/SEED-X
Multimodal Models in Real World
Language: Jupyter Notebook 43620
FoundationVision/Infinity
Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
Language: Python 95640
daixiangzi/VAR-CLIP
Implements VAR+CLIP for text-to-image (T2I) generation
Language: Python 1192
baaivision/Emu3
Next-Token Prediction is All You Need
Language: Python 2k 78
deepseek-ai/Janus
Janus-Series: Unified Multimodal Understanding and Generation Models
Language: Python 15.9k 2.1k
facebookresearch/chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
Language: Python 1.9k 113
showlab/Show-o
[ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
Language: Python 1.2k 51
FoundationVision/LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Language: Python 1.6k 70
gq-max/AdvDiffVLM
Language: Jupyter Notebook 17