Pinned Repositories
awesome
😎 Awesome lists about all kinds of interesting topics
awesome-vlm-architectures
Famous Vision Language Models and Their Architectures
CogVLM2
Second-generation CogVLM multimodal pretrained dialogue model
ComfyUI-Depth-Visualization
Image viewer with depth map applied, inside ComfyUI
ComfyUI-Dream-Interpreter
Dream Interpreter inside ComfyUI
ComfyUI-Texture-Simple
Visualize your textures inside ComfyUI
ComfyUI_VLM_nodes
Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
dspy-ollama-colab
DSPy with Ollama and llama.cpp on Google Colab
HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
gokayfem's Repositories
gokayfem/ComfyUI_VLM_nodes
Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
gokayfem/awesome-vlm-architectures
Famous Vision Language Models and Their Architectures
gokayfem/ComfyUI-Dream-Interpreter
Dream Interpreter inside ComfyUI
gokayfem/ComfyUI-Depth-Visualization
Image viewer with depth map applied, inside ComfyUI
gokayfem/ComfyUI-Texture-Simple
Visualize your textures inside ComfyUI
gokayfem/HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
gokayfem/dspy-ollama-colab
DSPy with Ollama and llama.cpp on Google Colab
gokayfem/Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
gokayfem/awesome
😎 Awesome lists about all kinds of interesting topics
gokayfem/CogVLM2
Second-generation CogVLM multimodal pretrained dialogue model
gokayfem/DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
gokayfem/flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
gokayfem/lectures
Material for cuda-mode lectures
gokayfem/graph_websearch_agent
Websearch agent built on the LangGraph framework
gokayfem/img2txt-comfyui-nodes
Implements some of the most popular img2txt models on HF into ComfyUI nodes. Uses questions/conditional-prompts to get descriptions that are suited for being fed back into a txt2img node.
gokayfem/Reka-Torch
Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch
gokayfem/siglip
Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗
gokayfem/Vitron
A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing