zl535320706
Li Zhang is currently pursuing the Ph.D. degree with the Harbin Institute of Technology. His current research interest is multi-modal learning.
zl535320706's Stars
fudan-generative-vision/hallo
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
zbezj/HEU_KMS_Activator
Charmve/Surface-Defect-Detection
📈 The largest industrial defect detection database and paper collection to date. Constantly summarizing open-source datasets and critical papers of great importance in the field of surface defect research.
sarahESL/PubMedCLIP
Fine-tuning CLIP using the ROCO dataset, which contains image-caption pairs from PubMed articles.
jaysonlong/webvideo-downloader
Web video downloader, mainly supporting Bilibili, iQIYI, Tencent Video, MGTV (Mango TV), WeTV, and the iQIYI Taiwan site.
linhandev/dataset
A list of medical imaging datasets (An Index for Medical Imaging Datasets)
ultralytics/ultralytics
Ultralytics YOLO11 🚀
pyg-team/pytorch_geometric
Graph Neural Network Library for PyTorch
josStorer/RWKV-Runner
An RWKV management and startup tool, fully automated, only 8MB. It also provides an interface compatible with the OpenAI API. RWKV is a large language model that is fully open source and available for commercial use.
ai-shifu/ChatALL
Concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, Claude, ChatGLM, MOSS, iFlytek Spark (讯飞星火), ERNIE Bot (文心一言), and more, and discover the best answers
hiroi-sora/Umi-OCR
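Free, open-source, offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and QR code scanning and generation. Includes built-in multilingual libraries.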
fei-aiart/courses
Courseware: digital image processing, deep learning, computer vision, machine learning
jianchang512/pyvideotrans
Translate a video from one language to another and add dubbing. Also supports speech recognition transcription, speech synthesis, and subtitle translation.
Arthurzhangsheng/CodeFormer_GUI
GUI version of the CodeFormer face restoration (sharpening) tool; ships with its own environment and runs right after unzipping
sunlicai/EMT-DLFR
Efficient Multimodal Transformer with Dual-Level Feature Restoration for Robust Multimodal Sentiment Analysis (TAC 2023)
thuiar/MMSA-FET
A tool for extracting multimodal features from videos.
thuiar/MMSA
MMSA is a unified framework for Multimodal Sentiment Analysis.
wujieliulan/forum
Z-Siqi/Clash-for-Windows_Chinese
Chinese-localized version of Clash for Windows. Provides the Chinese-localized build, a localization patch, and a localized installer.
binary-husky/gpt_academic
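Provides a practical interactive interface for GPT, GLM, and other large language models, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other projects; PDF/LaTeX paper translation and summarization; parallel queries to multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, moss, and others.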
yaoyz96/els-cas-templates
Elsevier template 'els-cas-templates'.
nickchen121/cyd-selected-journal
microsoft/vcpkg
C++ Library Manager for Windows, Linux, and macOS
naganandy/graph-based-deep-learning-literature
Links to conference publications in graph-based deep learning
VL-Group/PENET
[CVPR 2023] Official PyTorch code for the paper "Prototype-based Embedding Network for Scene Graph Generation"
CrossmodalGroup/CMCAN
Implementation of our AAAI2022 paper, Show Your Faith: Cross-Modal Confidence-Aware Network for Image-Text Matching.
CrossmodalGroup/NAAF
Implementation of our CVPR2022 paper, Negative-Aware Attention Framework for Image-Text Matching.
naver-ai/eccv-caption
Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)
visual-text-QA/VTQA-Demo
visual-text-QA/Visuak-Text-QA-Challenge