Pinned Repositories
.tmux
Oh My Tmux! My pretty + versatile tmux configuration that just works (imho the best tmux configuration)
2019-CCF-BDCI-OCR-MCZJ-fake_data_generator
Source code of the synthetic data generation approach by team 天晨破晓, first place in the OCR track of the 2019 CCF-BDCI competition
activityrecognition
Information about activity recognition
AlphaTree-graphic-deep-neural-network
Unified diagrams of several deep neural network models, to make the models easier to understand
chinese-ocr
Natural-scene text detection implemented in TensorFlow, plus variable-length Chinese OCR recognition using CRNN + CTC implemented in Keras/PyTorch
chromium-for-android-56-debug-video
chromium_org
Chromium source code for Android 5.0
pose-residual-network
Code for 'MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network' paper
st-gcn
Spatial Temporal Graph Convolutional Networks (ST-GCN) for Skeleton-Based Action Recognition in PyTorch
TIES_DataGeneration
Dataset Generation Code for: S.R. Qasim, H. Mahmood, and F. Shafait, Rethinking Table Parsing using Graph Neural Networks (2019)
rkshuai's Repositories
rkshuai/chromium_org
Chromium source code for Android 5.0
rkshuai/TIES_DataGeneration
Dataset Generation Code for: S.R. Qasim, H. Mahmood, and F. Shafait, Rethinking Table Parsing using Graph Neural Networks (2019)
rkshuai/Awesome-LLMs-on-device
Awesome LLMs on Device: A Comprehensive Survey
rkshuai/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
rkshuai/awesome-multimodal-ml
Reading list for research topics in multimodal machine learning
rkshuai/Awesome-Multimodal-Research
A curated list of Multimodal Related Research.
rkshuai/BlueLM
rkshuai/CapsFusion
CapsFusion: Rethinking Image-Text Data at Scale
rkshuai/chatglm.cpp
C++ implementation of ChatGLM-6B & ChatGLM2-6B & more LLMs
rkshuai/Dewarping-Document-Image-By-Displacement-Flow-Estimation
Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network
rkshuai/DocTr
The official code for “DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction”, ACM MM, Oral Paper, 2021.
rkshuai/Document-Dewarping-with-Control-Points
rkshuai/improved-aesthetic-predictor
CLIP+MLP Aesthetic Score Predictor
rkshuai/llama.cpp
Port of Facebook's LLaMA model in C/C++
rkshuai/minigpt4.cpp
Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML)
rkshuai/MMBench
Official Repo of "MMBench: Is Your Multi-modal Model an All-around Player?"
rkshuai/movenet
Unofficial implementation of MoveNet from Google
rkshuai/prompt-to-prompt
rkshuai/seq2seq-ocr-analysis
End-to-end layout analysis based on seq2seq
rkshuai/stable-diffusion
A latent text-to-image diffusion model
rkshuai/stable-diffusion-webui
Stable Diffusion web UI
rkshuai/stablediffusion-infinity
Outpainting with Stable Diffusion on an infinite canvas
rkshuai/TaiSu
TaiSu (太素): a large-scale Chinese multimodal dataset (a hundred-million-scale Chinese vision-language pre-training dataset)
rkshuai/Text2Poster-ICASSP-22
Official implementation of the ICASSP-2022 paper "Text2Poster: Laying Out Stylized Texts on Retrieved Images"
rkshuai/torch-fidelity
High-fidelity performance metrics for generative models in PyTorch
rkshuai/VisCPM
Chinese and English Multimodal Large Model Series (Chat and Paint), based on the CPM foundation models
rkshuai/visual-chatgpt
VisualChatGPT
rkshuai/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks
rkshuai/waveCorrection
OCR document image deformation correction. A reproduction of Alibaba's OCR deformation correction for crumpled document images
rkshuai/yapf
A formatter for Python files