beargolden
My research interests are document analysis and recognition, scene text detection and recognition, computer vision, and pattern recognition.
Hubei University of Technology, Wuhan 430068, P. R. China
beargolden's Stars
3b1b/manim
Animation engine for explanatory math videos
ManimCommunity/manim
A community-maintained Python framework for creating mathematical animations.
convdepth/ConvDepth
ConvDepth: Self-Supervised Monocular Depth Estimation for Autonomous Driving
DepthAnything/Depth-Anything-V2
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
LiheYoung/Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
0nutation/SpeechGPT
SpeechGPT Series: Speech Large Language Models
zcablii/LSKNet
(ICCV 2023) Large Selective Kernel Network for Remote Sensing Object Detection
assafelovic/gpt-researcher
LLM-based autonomous agent that does comprehensive online research on any given topic
zbezj/HEU_KMS_Activator
stanford-crfm/helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
hiyouga/LLaMA-Factory
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
zhaominyiz/STIRER
STIRER: A Unified Model for Low-Resolution Scene Text Image Recovery and Recognition -- ACMMM 2023
MagicMirrorOrg/MagicMirror
MagicMirror² is an open source modular smart mirror platform. With a growing list of installable modules, the MagicMirror² allows you to convert your hallway or bathroom mirror into your personal assistant.
ChatGPTNextWeb/ChatGPT-Next-Web
A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini app in one click.
NITR098/Awesome-U-Net
Official repo for Medical Image Segmentation Review: The Success of U-Net
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.
rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
wzpan/wukong-robot
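🤖 wukong-robot is a simple, flexible, and elegant Chinese voice dialogue robot / smart speaker project. It supports ChatGPT multi-turn conversation and may also be the first open-source smart speaker project to support brain-computer interaction.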
XingangPan/DragGAN
Official Code for DragGAN (SIGGRAPH 2023)
xszyou/fay-ue5
A UE5 project that can connect to the Fay digital human
NEU-Gou/awesome-reid-dataset
Collection of publicly available person re-identification datasets
RSL-NEU/person-reid-benchmark
A Systematic Evaluation and Benchmark for Person Re-Identification: Features, Metrics, and Datasets
binary-husky/gpt_academic
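Provides a practical interactive interface for GPT/GLM and other large language models (LLMs), specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other projects; PDF/LaTeX paper translation and summarization; parallel querying of multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, moss, and more.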
GaiZhenbiao/ChuanhuChatGPT
GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.
THUDM/ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
xszyou/Fay
Fay is an open-source digital human framework integrating language models and digital characters. It offers retail, assistant, and agent versions for diverse applications like virtual shopping guides, broadcasters, assistants, waiters, teachers, and voice or text-based mobile assistants.
szad670401/HyperLPR
High Performance Chinese License Plate Recognition Framework based on deep learning.
YimianDai/open-atac
Code and trained models for "Attention as Activation"
xmu-xiaoma666/External-Attention-pytorch
🍀 PyTorch implementations of various attention mechanisms, MLP, re-parameterization, and convolution modules, helpful for further understanding the papers.⭐⭐⭐
Fangyh09/pytorch-receptive-field
Compute CNN receptive field size in PyTorch in one line