Idate96's Stars
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
bitsandbytes-foundation/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
ikostrikov/walk_in_the_park
NVlabs/MinVIS
facebookresearch/msn
Masked Siamese Networks for Label-Efficient Learning (https://arxiv.org/abs/2204.07141)
carbon-language/carbon-lang
Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)
wjf5203/VNext
Next-generation video instance recognition framework on top of Detectron2, which supports InstMove (CVPR 2023), SeqFormer (ECCV Oral), and IDOL (ECCV Oral)
hkchengrex/MiVOS
[CVPR 2021] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion. Semi-supervised VOS as well!
hkchengrex/XMem
[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
MasterBin-IIAU/Unicorn
[ECCV'22 Oral] Towards Grand Unification of Object Tracking
lucidrains/mixture-of-experts
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
libigl/libigl
Simple MPL-2.0-licensed C++ geometry processing library.
chaytonmin/Occupancy-MAE
Official implementation of our TIV'23 paper: Occupancy-MAE: Self-supervised Pre-training Large-scale LiDAR Point Clouds with Masked Occupancy Autoencoders
CodedotAl/gpt-code-clippy
Full description can be found here: https://discuss.huggingface.co/t/pretrain-gpt-neo-for-open-source-github-copilot-model/7678?u=ncoop57
clementchadebec/benchmark_VAE
Unifying Variational Autoencoder (VAE) implementations in PyTorch (NeurIPS 2022)
facebookresearch/omnivore
Omnivore: A Single Model for Many Visual Modalities
leggedrobotics/open3d_slam
Point-cloud-based graph SLAM written in C++ using the Open3D library.
facebookresearch/detr
End-to-End Object Detection with Transformers
voxel51/fiftyone
Refine high-quality datasets and visual AI models
mit-han-lab/bevfusion
[ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
facebookresearch/pytorch3d
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
traveller59/spconv
Spatial Sparse Convolution Library
sshaoshuai/PV-RCNN
PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection, CVPR 2020.
ADLab-AutoDrive/BEVFusion
Official PyTorch implementation of "BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework"
patrick-kidger/equinox
Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/
lucidrains/PaLM-jax
Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in JAX (Equinox framework)
lucidrains/imagen-pytorch
Implementation of Imagen, Google's Text-to-Image Neural Network, in PyTorch
facebookresearch/SlowFast
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
implus/UM-MAE
Official Codes for "Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality"
google/ml_collections
ML Collections is a library of Python Collections designed for ML use cases.