leeguandong

RS Image Processing | Deep Learning | CV | OCR | AIGC | LLM | LMM

Suning | Nanjing, China

Pinned Repositories

3D-DenseNet-for-HSI
Paper: Three-dimensional densely connected convolutional network for hyperspectral remote sensing image classification
Language: Python | 46 1 115
Awesome-Chinese-Stable-Diffusion
A collection of Chinese text-to-image Stable Diffusion models
262 1 115
DL-data-processing-methods
Some basic data-processing operations for deep learning
Language: Jupyter Notebook | 33 1 011
FSKNet-for-HSI
Paper: Faster hyperspectral image classification based on selective kernel mechanism using deep convolutional networks
Language: Python | 32 1 06
How-to-make-high-resolution-remote-sensing-image-dataset
Building a high-resolution remote sensing image dataset
54 5 218
Interview-code-practice-python
Interview questions
Language: Python | 1.6k 62 10558
learn_python
Python study notes
Language: Python | 242 6 0114
Multi-Scale-Dense-Networks-for-Hyperspectral-Remote-Sensing-Image-Classification
Paper: Multi-Scale Dense Networks for Hyperspectral Remote Sensing Image Classification
Language: Python | 31 1 114
Paper-Learning
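Paper study notes, mainly on deep learning for remote sensing imagery, place-name recognition, document tampering detection, OCR, and visual generation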
24 1 04
Parking
Parking: a parking-space app
Language: Java | 21 1 09

leeguandong's Repositories

leeguandong/Awesome-Chinese-Stable-Diffusion
A collection of Chinese text-to-image Stable Diffusion models
262 1 115
leeguandong/Paper-Learning
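Paper study notes, mainly on deep learning for remote sensing imagery, place-name recognition, document tampering detection, OCR, and visual generation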
24 1 04
leeguandong/ComfyUI_InternVL2
An InternVL2 plugin for ComfyUI. InternVL2 is a strong open-source multimodal large language model that performs well on document VQA.
Language: Python | 12 1 21
leeguandong/ComfyUI_M3Net
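An M3Net plugin for ComfyUI. M3Net is a good salient object detection model that works well for image matting. I have open-sourced a model trained on e-commerce data for everyone to try.
Language: Python | 10 2 02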
leeguandong/XrayQwenVL
A multimodal large model for X-ray recognition, fine-tuned from Qwen-VL
Language: Python | 10 1 21
leeguandong/XrayLLaVA
A multimodal large model for X-ray recognition, fine-tuned from LLaVA 1.6
Language: Python | 6 1 10
leeguandong/ComfyUI_VisualAttentionMap
Visualizes the feature maps for the text prompt, self-attention, and cross-attention in Stable Diffusion.
Language: Python | 5 3 01
leeguandong/EcommerceLLMQwen2.5
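An e-commerce large language model from the Qwen2.5 series, obtained by supervised fine-tuning (SFT) on e-commerce data. It is an upgraded version of https://github.com/leeguandong/EcommerceLLM. Qwen2.5 performs very well.
Language: Python | 5 1 0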
leeguandong/ComfyUI_AliControlnetInpainting
Alimama's inpainting method for the e-commerce domain
Language: Python | 4 1 0
leeguandong/ComfyUI_CrossImageAttention
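CrossImageAttention is a zero-shot method that, given a specified appearance image and structure, generates images with consistent structure and appearance. It works at the QKV level.
Language: Python | 4 2 01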
leeguandong/ComfyUI_MasaCtrl
Keeps the image subject fixed across multiple inference runs for consistency control; works at the QKV level.
Language: Python | 4 1 12
leeguandong/OCRDetInternVL2
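OCR Large Multi-modal Model: a multimodal large model for OCR text detection, fine-tuned from InternVL2 on 4 A800 GPUs, based on the internvl2-8b model. It works not only for OCR text detection but also for most object detection tasks.
Language: Python | 4 1 0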
leeguandong/ComfyUI_CompareModelWeights
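Compares the deviations between the weights of Stable Diffusion models that share the same architecture, mainly to give an intuitive view of how the weights differ in model merging.
Language: Python | 3 1 01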
leeguandong/ComfyUI_Diffusers
Model and parameter loading for diffusers, plus shared data-processing operations; updated on an ongoing basis.
Language: Python | 3
leeguandong/ComfyUI_LLaSM
A speech-text multimodal large model. The speech side is based on Whisper and the text side on LLaMA; general performance is good.
Language: Python | 3 2 01
leeguandong/ComfyUI_Style_Aligned
style_aligned: a reference-based method that shares QKV to produce similar, style-consistent images zero-shot.
Language: Python | 3 2 01
leeguandong/ComfyUI_VideoEditing
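Video generation: ControlNet + SD apply consistency control to the input video, with the QKV of the UNet self-attention referencing the first frame and the previous frame.
Language: Python | 3 2 02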
leeguandong/EcommerceOCRBench
An OCR benchmark for multimodal large models on e-commerce text recognition, modeled on OCRBench but with more evaluation data.
Language: Python | 3 1 0
leeguandong/ComfyUI_SelfGuidance
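Helps lock specific objects in the prompt so they stay unchanged during a second edit, by applying loss guidance to the cross-attention maps of the two inference passes.
Language: Python | 2 1 2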
leeguandong/MaskControlnet
A ControlNet generation model conditioned on masks, trained on massive e-commerce matting data (salient object detection data).
Language: Python | 2 1 01
leeguandong/XrayQwen2VL
X-ray Large Multi-modal Model: a large multimodal model fine-tuned from Qwen2VL for X-ray analysis, trained on 4 A800 GPUs based on the qwen2-vl-7b-instruct model.
Language: Python | 2 1 01
leeguandong/EcommerceSD
Stable Diffusion models for e-commerce scenarios, including an e-commerce base model, LoRA components, ControlNet, and a range of related applications
Language: Python | 1 1 0
leeguandong/MiniLLaMA3
A mini version of LLaMA 3 covering the full pipeline from scratch: data construction, tokenizer training, pretraining (PT), SFT, and DPO
Language: Python | 1 1 02
leeguandong/OCRDetPaliGemma
A multimodal large language model based on PaliGemma, focused on OCR text detection and conventional object detection.
Language: Python | 1 1 0
leeguandong/OCRInternVL2
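OCR Large Multi-modal Model: a multimodal large model for OCR, fine-tuned from InternVL2 on 4 A800 GPUs, based on the internvl2-8b model. internvl2-8b performed very well in our own tests on OCR VQA scenarios; after further fine-tuning on OCR data, it achieves good results on general OCR VQA tasks.
Language: Python | 1 1 0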
leeguandong/sd_webui_instantid
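An InstantID plugin for the Stable Diffusion web UI. InstantID is a good choice for style transfer and face swapping, and it preserves facial identity information well.
Language: Python | 1 1 0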
leeguandong/sd_webui_ZeST
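ZeST is a zero-shot material transfer model. It is essentially a combination of IP-Adapter, ControlNet, and inpainting, with some changes to the image fed into the inpainting image-to-image step, including changes to the image and mask.
Language: Python | 1 2 0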
leeguandong/XrayLLama3.2Vision
X-ray Large Multi-modal Model: a multimodal large model for X-ray analysis, fine-tuned from llama3.2-vision on 4 A800 GPUs, based on the llama3_2-11b-vision-instruct model.
Language: Python | 11
leeguandong/leeguandong
2 0
leeguandong/leeguandong.github.io
Personal website
Language: JavaScript | 1 0