FuLy2002

Northwestern Polytechnical University

FuLy2002's Stars

wangzhifengharrison/HTNet
Language:Python649
wangyuchi369/LaDiC
[NAACL 2024] LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-text Generation?
Language:Python372
krahets/hello-algo
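"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports code in Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; English version ongoing.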
Language: Java, 105k stars, 13.2k forks
drboog/Shifted_Diffusion
Code for Shifted Diffusion for Text-to-image Generation (CVPR 2023)
Language:Python16011
jianjieluo/PCM-Net
[ECCV24] Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning
Language:Python42
YehLi/xmodaler
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
Language: Python, 1k stars, 111 forks
ailab-kyunghee/CM2_DVC
[CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval
Language:Python472
SatyamGaba/image_captioning
Image Captioning with CNN, LSTM and RNN using PyTorch on COCO Dataset
Language:Python142
jianjieluo/SCD-Net
[CVPR23] A cascaded diffusion captioning model with a novel semantic-conditional diffusion process that upgrades conventional diffusion model with additional semantic prior.
Language:Python585
aanna0701/SPT_LSA_ViT
Implementation of Visual Transformer for Small-size Datasets
Language:Python11715
whwu95/Cap4Video
【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Language:Python24920
liujf69/EPP-Net-Action
[TRIT 2024] Implementation of the paper “Explore Human Parsing Modality for Action Recognition”.
Language:Python372
YashiGoyal-02/Smart-Bridge-Yoga-Pose-Classification
The project "Yoga Pose Classification" aims to develop a system which can classify various yoga poses from static images and real-time poses captured through a camera.
Language:Jupyter Notebook1
frank-1150/frank-1150.github.io
try to clone Tesla.com using HTML, JS and CSS
Language:HTML31
rishikksh20/ViViT-pytorch
Implementation of ViViT: A Video Vision Transformer
Language:Python52066
uark-cviu/Micron-BERT
[CVPR 2023] Micron-BERT: BERT-based Facial Micro-Expression Recognition
Language:Python13610
mingyuefusu/library_manage_system
A full-featured library system built with JSP, layui, and MySQL, including interfaces for user book borrowing, librarians, and system administrators.
Language:Java697
GroverZhu/Online-Library-System
An online library management system based on the MVC design pattern.
Language:Java14338
ahangchen/torch_base
Quickly bring up your PyTorch project (a skeleton)
Language:Python695120
songquanpeng/pytorch-template
To be the world's best PyTorch project template.
Language:Python47765
leftatrium2/AIDemo
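Deep learning and CV example project focused on the human body, including pose skeletons, hand gesture skeletons, pose recognition, gesture recognition, face detection, and more.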
Language:Java6621
IW276/IW276WS20-P12
2D Pose Based Action Recognition
Language:Python51
kenshohara/3D-ResNets-PyTorch
3D ResNets for Action Recognition (CVPR 2018)
Language: Python, 3.9k stars, 930 forks
Rahul5430/Speech-Emotion-Recognition-System
It is a system through which various audio speech files are classified into different emotions such as happy, sad, anger and neutral by computer. SER can be used in areas such as the medical field or customer call centers.
Language:Jupyter Notebook9
CS-BAOYAN/CS-BAOYAN-2023
Language: HTML, 1k stars, 103 forks
