seefun

Thinking, Walking and Coding

Shanghai, China

seefun's Stars

infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Language: Python 26.2k 144 1.9k2.5k
KwaiVGI/LivePortrait
Bring portraits to life!
Language: Python 13.4k 123 3931.4k
InstantID/InstantID
InstantID : Zero-shot Identity-Preserving Generation in Seconds 🔥
Language: Python 10.7k 125 217785
OpenTalker/video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
Language: Python 6.8k 75 246994
Tencent/HunyuanDiT
Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Language: Jupyter Notebook 3.7k 42 182319
TMElyralab/MuseTalk
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
Language: Python 3.2k 52 227400
CosmosShadow/gptpdf
Using GPT to parse PDF
Language: Python 3.1k 12 40229
nerfies/nerfies.github.io
Language: JavaScript 2.7k 37 5992
a312863063/generators-with-stylegan2
Here is a series of face generators based on StyleGAN2
Language: Python 2.4k 51 53551
kvcache-ai/Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Language: C++ 2.3k 27 29130
facebookresearch/chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
Language: Python 1.9k 26 51112
PixArt-alpha/PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
Language: Python 1.7k 40 12884
AlibabaResearch/AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Language: C++ 1.6k 38 186182
OpenGVLab/VisionLLM
VisionLLM Series
Language: Python 960 45 1529
mlfoundations/MINT-1T
MINT-1T: A one trillion token multimodal interleaved dataset.
785 25 1120
haofanwang/inswapper
One-click Face Swapper and Restoration powered by insightface 🔥
Language: Python 551 10 1885
Yuliang-Liu/MultimodalOCR
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
Language: Python 485 15 3032
dvlab-research/Step-DPO
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
Language: Python 310 2 2211
OpenGVLab/OmniCorpus
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Language: Python 289 12 97
lmas/opensimplex
This repo has been migrated to https://code.larus.se/lmas/opensimplex
Language: Python 242 9 2329
Global-Chem/global-chem
A Knowledge Graph of Common Chemical Names to their Molecular Definition
Language: Jupyter Notebook 156 11 24021
nttmdlab-nlp/InstructDoc
InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024)
Language: Python 151 3 86
UniModal4Reasoning/DocGenome
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models
Language: Jupyter Notebook 145 5 75
materials-data-facility/matchem-llm
A public repository collecting links to state of the art QA and evaluation sets for various ML and LLM applications
83 7 19
OpenGVLab/LCL
Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning
Language: Python 68 1 43
yuyq96/TextHawk
Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models
Language: Python 55 5 23
MengLcool/DeepStack-VL
[NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs".
Language: Python 34 2 12
Kohulan/OCSR_Review
This repository contains the information related to the benchmark study on openly available OCSR tools
Language: Python 33 4 212
OpenGVLab/De-focus-Attention-Networks
Learning 1D Causal Visual Representation with De-focus Attention Networks
Language: Python 30 2 00
jiachengxiong/alpha-Extractor
Test data for paper “αExtractor: a web server for automatic extraction of chemical structure from literature”
12 2 00
