zl535320706

Li Zhang is currently pursuing a Ph.D. degree at the Harbin Institute of Technology. His current research interest is multi-modal learning.

zl535320706's Stars

fudan-generative-vision/hallo
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
Language: Python, 9.5k stars, 1.3k forks
zbezj/HEU_KMS_Activator
30.3k stars, 3.1k forks
Charmve/Surface-Defect-Detection
📈 The largest industrial defect detection database and paper collection to date. Continuously summarizes important open-source datasets and key papers in the field of surface defect research.
Language: Python, 3.2k stars, 530 forks
sarahESL/PubMedCLIP
Fine-tuning CLIP using the ROCO dataset, which contains image-caption pairs from PubMed articles.
Language: Python, 140 stars, 27 forks
jaysonlong/webvideo-downloader
Web video downloader for Bilibili, iQIYI, Tencent Video, MGTV, WeTV, and iQIYI Taiwan.
Language: Python, 1.1k stars, 244 forks
linhandev/dataset
A list of medical imaging datasets (An Index for Medical Imaging Datasets)
2.7k stars, 370 forks
ultralytics/ultralytics
Ultralytics YOLO11 🚀
Language: Python, 33.3k stars, 6.4k forks
pyg-team/pytorch_geometric
Graph Neural Network Library for PyTorch
Language: Python, 21.5k stars, 3.7k forks
josStorer/RWKV-Runner
An RWKV management and startup tool, fully automated, only 8MB. It also provides an interface compatible with the OpenAI API. RWKV is a large language model that is fully open source and available for commercial use.
Language: TypeScript, 5.3k stars, 506 forks
ai-shifu/ChatALL
Concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, Claude, ChatGLM, MOSS, iFlytek Spark, ERNIE Bot and more, and discover the best answers
Language: JavaScript, 15.3k stars, 1.6k forks
hiroi-sora/Umi-OCR
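Free, open-source, offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and QR code scanning and generation. Includes built-in multilingual libraries.
Language: Python, 27.6k stars, 2.8k forks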
fei-aiart/courses
Courseware: digital image processing, deep learning, computer vision, machine learning
Language: HTML 30267
jianchang512/pyvideotrans
Translate a video from one language to another and add dubbing. Also supports speech recognition and transcription, speech synthesis, and subtitle translation.
Language: Python, 10.9k stars, 1.2k forks
Arthurzhangsheng/CodeFormer_GUI
GUI version of the CodeFormer face restoration tool; it bundles its own environment, so it runs right after unzipping
72877
sunlicai/EMT-DLFR
Efficient Multimodal Transformer with Dual-Level Feature Restoration for Robust Multimodal Sentiment Analysis (TAC 2023)
Language: Python 535
thuiar/MMSA-FET
A Tool for extracting multimodal features from videos.
Language: Python 14121
thuiar/MMSA
MMSA is a unified framework for Multimodal Sentiment Analysis.
Language: Python, 702 stars, 110 forks
wujieliulan/forum
29571
Z-Siqi/Clash-for-Windows_Chinese
Chinese-localized version of Clash for Windows. Provides the Chinese-localized build, the localization patch, and a localized installer.
Language: JavaScript, 21.7k stars, 2.8k forks
binary-husky/gpt_academic
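Provides a practical interaction interface for LLMs such as GPT and GLM, with particular optimization for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins, project analysis and self-explanation for Python, C++, and other codebases, PDF/LaTeX paper translation and summarization, parallel queries to multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, moss, and more.
Language: Python, 66k stars, 8.1k forks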
yaoyz96/els-cas-templates
Elsevier template 'els-cas-templates'.
Language: TeX 16318
nickchen121/cyd-selected-journal
Language: Python 21825
microsoft/vcpkg
C++ Library Manager for Windows, Linux, and MacOS
Language: CMake, 23.4k stars, 6.5k forks
naganandy/graph-based-deep-learning-literature
Links to conference publications in graph-based deep learning
Language: Jupyter Notebook, 4.8k stars, 780 forks
VL-Group/PENET
[CVPR 2023] Official PyTorch code for the paper "Prototype-based Embedding Network for Scene Graph Generation"
Language: Jupyter Notebook 477
CrossmodalGroup/CMCAN
Implementation of our AAAI2022 paper, Show Your Faith: Cross-Modal Confidence-Aware Network for Image-Text Matching.
Language: Python 364
CrossmodalGroup/NAAF
Implementation of our CVPR2022 paper, Negative-Aware Attention Framework for Image-Text Matching.
Language: Python 11111
naver-ai/eccv-caption
Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)
Language: Python 562
visual-text-QA/VTQA-Demo
Language: Python 41
visual-text-QA/Visuak-Text-QA-Challenge
5
