Hiroki11x
PhD Candidate at Université de Montréal, Mila / Student Researcher at Google DeepMind / HPC, Deep Learning, LLM / ex-Tokyo Tech, Microsoft Research, IBM Research
Mila, Université de Montréal, Montreal, QC, Canada
Hiroki11x's Stars
youssefHosni/Data-Science-Interview-Questions-Answers
Curated list of data science interview questions and answers
google-research/jestimator
Amos optimizer with JEstimator lib.
HIPS/hypergrad
Exploring differentiation with respect to hyperparameters
ssnl/distributed_shampoo
For optimization algorithm research and development. Submodule and Python 3.8 ready.
MedicineToken/Medical-Graph-RAG
Medical Graph RAG: Graph RAG for medical data
team-approx-bayes/ivon-experiments
prov-gigapath/prov-gigapath
Prov-GigaPath: A whole-slide foundation model for digital pathology from real-world data
facebookresearch/lingua
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
ankur-98/BERT_GLUE
Exploring the GLUE benchmark and fine-tuning tasks on a pre-trained BERT model using Hugging Face with PyTorch.
jong980812/Slurm_MultiNode_DDP
Helps you submit multi-node, multi-GPU jobs on Slurm with torchrun (a minimal launch sketch appears after this list)
tanganke/peta
Code for paper "Parameter Efficient Multi-task Model Fusion with Partial Linearization"
hushon/JAX-ResNet-CIFAR10
Simple CIFAR10 ResNet example with JAX.
mariaref/nonconvex-lr
Code for a theoretical study of learning-rate scheduling in non-convex problems.
KellerJordan/cifar10-airbench
94% on CIFAR-10 in 2.6 seconds 💨 96% in 27 seconds
ibm-granite-community/granite-timeseries-cookbook
Granite Time Series Cookbook
soyflourbread/cifar10-tiny
A tiny neural network for the CIFAR-10 dataset
vict0rsch/tips-research-mila
General tips to drive your research at Mila
bharathgs/Awesome-Distributed-Deep-Learning
A curated list of awesome Distributed Deep Learning resources.
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
ChenRocks/UNITER
Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
Liuhong99/Sophia
The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
microsoft/Phi-3CookBook
A cookbook for getting started with Phi-3, a family of open-source AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and the next size up across a variety of language, reasoning, coding, and math benchmarks.
SakanaAI/AI-Scientist
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
VITA-Group/TENAS
[ICLR 2021] "Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective" by Wuyang Chen, Xinyu Gong, Zhangyang Wang
katoro8989/IRM_Variants_Calibration
Towards Understanding Variants of Invariant Risk Minimization through the Lens of Calibration (TMLR 2024)
AndreasMadsen/faithfulness-measurable-models
Implementation of faithfulness-measurable masked language models
yandex-research/DeDLOC
Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021)
pytorch/torchtitan
A native PyTorch Library for large model training
prateeky2806/ties-merging
naver-ai/model-stock
Model Stock: All we need is just a few fine-tuned models
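
A note on the jong980812/Slurm_MultiNode_DDP entry above: torchrun exports RANK, WORLD_SIZE, LOCAL_RANK, MASTER_ADDR, and MASTER_PORT to every worker it launches, so the per-rank training script only has to read them. Below is a minimal sketch of that per-rank setup in PyTorch; the toy linear model and script layout are illustrative assumptions, not code from the starred repo.

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK/WORLD_SIZE/MASTER_ADDR/MASTER_PORT, so the default
    # env:// rendezvous of init_process_group needs no extra arguments.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model standing in for a real network; DDP all-reduces gradients
    # across every rank on backward().
    model = torch.nn.Linear(128, 128).cuda(local_rank)
    ddp_model = DDP(model, device_ids=[local_rank])

    opt = torch.optim.SGD(ddp_model.parameters(), lr=0.1)
    x = torch.randn(32, 128, device=f"cuda:{local_rank}")
    loss = ddp_model(x).square().mean()
    loss.backward()
    opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()

On a Slurm cluster this would typically be launched once per node, e.g. srun torchrun --nnodes=$SLURM_NNODES --nproc_per_node=<gpus per node> train.py with a c10d rendezvous endpoint on the first node; the exact flags are an assumption here, since the repo ships its own submit scripts.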