codescracker

codescracker's Stars

Orfium/bytecover
Implementation of "Bytecover: Cover song identification via multi-loss training" paper (ICASSP 2021)
Language: Python · 255
huggingface/pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
Language: Python · 32.1k · 4.8k
collabora/WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
Language: Jupyter Notebook · 3.9k · 213
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
Language: Python · 70.6k · 8.4k
roboflow/notebooks
Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models like Grounding DINO and SAM.
Language: Jupyter Notebook · 5.5k · 869
yigitkonur/swift-ocr-llm-powered-pdf-to-markdown
An open-source OCR API that leverages OpenAI's powerful language models with optimized performance techniques like parallel processing and batching to deliver high-quality text extraction from complex PDF documents. Ideal for businesses seeking efficient document digitization and data extraction solutions.
Language: Python · 67949
hanouticelina/deformable-DETR
Implementation of the paper "Deformable DETR: Deformable Transformers for End-to-End Object Detection" (ICLR 2021)
Language: Python · 257
adensur/blog
Language: Jupyter Notebook · 303
KimRass/DETR
PyTorch implementation of 'DETR' (Carion et al., 2020) from scratch.
Language: Python · 4
fundamentalvision/Deformable-DETR
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Language: Python · 3.2k · 520
DS4SD/DocLayNet
DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis
26015
LynnHaDo/Document-Layout-Analysis
Object Detection Model for Scanned Documents
Language: Jupyter Notebook · 8211
IDEA-Research/GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Language: Python · 6.7k · 680
NielsRogge/Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
Language: Jupyter Notebook · 9.4k · 1.4k
facebookresearch/segment-anything
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language: Jupyter Notebook · 47.4k · 5.6k
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language: Jupyter Notebook · 12.1k · 1.1k
rasbt/LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Language: Jupyter Notebook · 30.4k · 3.6k
gokayfem/awesome-vlm-architectures
Famous Vision Language Models and Their Architectures
Language: Markdown · 38822
opendatalab/MinerU
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
Language: JavaScript · 13.5k · 1k
karpathy/LLM101n
LLM101n: Let's build a Storyteller
29.6k · 1.6k
lucidrains/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
Language: Python · 20.4k · 3k
tintn/vision-transformer-from-scratch
A Simplified PyTorch Implementation of Vision Transformer (ViT)
Language: Jupyter Notebook · 13725
AviSoori1x/seemore
From scratch implementation of a vision language model in pure PyTorch
Language: Jupyter Notebook · 16114
ollama/ollama
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
Language: Go · 96k · 7.6k
evintunador/minLlama3
A simplified version of Meta's Llama 3 model to be used for learning
Language: Jupyter Notebook · 3211
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Language: Python · 12.4k · 873
ml-explore/mlx-examples
Examples in the MLX framework
Language: Python · 6.1k · 870
Lightning-AI/litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Language: Python · 10.5k · 1k
neobundy/Deep-Dive-Into-AI-With-MLX-PyTorch
"Deep Dive into AI with MLX and PyTorch" is an educational initiative designed to help anyone interested in AI, specifically in machine learning and deep learning, using Apple's MLX and Meta's PyTorch frameworks.
Language: Python · 37251
FareedKhan-dev/Building-llama3-from-scratch
LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner.
Language: Jupyter Notebook · 9628