cjl09's Stars
facebookresearch/faiss
A library for efficient similarity search and clustering of dense vectors.
lucidrains/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
WooooDyy/LLM-Agent-Paper-List
The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
wenda-LLM/wenda
Wenda (闻达): an LLM invocation platform. It aims for efficient content generation in specialized settings while accounting for the limited compute resources of individuals and small-to-medium enterprises, as well as knowledge security and privacy concerns
THUDM/CogVLM
A state-of-the-art open visual language model | multimodal pretrained model
InternLM/MindSearch
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
developersdigest/llm-answer-engine
Build a Perplexity-Inspired Answer Engine Using Next.js, Groq, Llama-3, Langchain, OpenAI, Upstash, Brave & Serper
kimiyoung/transformer-xl
QwenLM/Qwen-Agent
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
rashadphz/farfalle
🔍 AI search engine - self-host with local or cloud LLMs
dandelin/ViLT
Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
BangguWu/ECANet
Code for ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks
pprp/awesome-attention-mechanism-in-cv
Awesome List of Attention Modules and Plug&Play Modules in Computer Vision
thunlp/WebCPM
Official codes for ACL 2023 paper "WebCPM: Interactive Web Search for Chinese Long-form Question Answering"
Xnhyacinth/Awesome-LLM-Long-Context-Modeling
📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥
jokieleung/awesome-visual-question-answering
A curated list of Visual Question Answering (VQA, including image/video question answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.
gaopengcuhk/CLIP-Adapter
Zhaozixiang1228/MMIF-CDDFuse
[CVPR 2023] Official implementation for "CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion."
yuanmaoxun/Awesome-RGBT-Fusion
A collection of deep learning based RGB-T-Fusion methods, codes, and datasets. The main directions involved are Multispectral Pedestrian Detection, RGB-T Aerial Object Detection, RGB-T Semantic Segmentation, RGB-T Crowd Counting, RGB-T Fusion Tracking.
jiawen-zhu/ViPT
[CVPR23] Visual Prompt Multi-Modal Tracking
YuchenLiu98/COMM
PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models"
Wilson-ZheLin/GPT-4-Web-Browsing
GPT-4 Enhanced with Real-Time Web Browsing 🔗
ptonlix/LangChain-SearXNG
AI Q&A Search Engine ➡️ an open-source AI search engine built on LangChain and SearXNG
songrise/CLIP-Count
[ACM MM23] CLIP-Count: Towards Text-Guided Zero-Shot Object Counting
Ruru-Xu/chapter5-learning_CSRNet
code-visualization-pytorch
vmarinowski/infini-attention
An unofficial PyTorch implementation of "Efficient Infinite Context Transformers with Infini-attention"
cha15yq/CUT
Segmentation assisted U-shaped multi-scale transformer for crowd counting
IndigoPurple/CrowdCount-MCNN
Single-Image Crowd Counting via Multi-Column Convolutional Neural Network
dreaming-coder/nuist-thesis
The LaTeX thesis template of NUIST
nguyen1312/LoViTCrowd