Hiroki11x
PhD Candidate at Université de Montréal, Mila / Student Researcher at Google DeepMind / HPC, Deep Learning, LLM / ex-Tokyo Tech, Microsoft Research, IBM Research
Mila, Université de Montréal, Montreal, QC, Canada
Hiroki11x's Stars
youssefHosni/Data-Science-Interview-Questions-Answers
Curated list of data science interview questions and answers
google-research/jestimator
Amos optimizer with JEstimator lib.
HIPS/hypergrad
Exploring differentiation with respect to hyperparameters
ssnl/distributed_shampoo
For optimization algorithm research and development. Submodule and Python 3.8 ready.
MedicineToken/Medical-Graph-RAG
Medical Graph RAG: Graph RAG for medical data
team-approx-bayes/ivon-experiments
prov-gigapath/prov-gigapath
Prov-GigaPath: A whole-slide foundation model for digital pathology from real-world data
facebookresearch/lingua
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
ankur-98/BERT_GLUE
Exploring the GLUE benchmark and fine-tuning tasks on a pre-trained BERT model using Hugging Face with PyTorch.
jong980812/Slurm_MultiNode_DDP
Helps you submit multi-node, multi-GPU jobs on Slurm with torchrun (a minimal launch sketch appears after this list)
tanganke/peta
Code for paper "Parameter Efficient Multi-task Model Fusion with Partial Linearization"
hushon/JAX-ResNet-CIFAR10
Simple CIFAR10 ResNet example with JAX.
mariaref/nonconvex-lr
Code for a theoretical study of learning-rate scheduling in non-convex problems.
KellerJordan/cifar10-airbench
94% on CIFAR-10 in 2.6 seconds 💨 96% in 27 seconds
ibm-granite-community/granite-timeseries-cookbook
Granite Time Series Cookbook
soyflourbread/cifar10-tiny
A tiny neural network for the CIFAR-10 dataset
vict0rsch/tips-research-mila
General tips to drive your research at Mila
bharathgs/Awesome-Distributed-Deep-Learning
A curated list of awesome Distributed Deep Learning resources.
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
ChenRocks/UNITER
Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
Liuhong99/Sophia
The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
microsoft/Phi-3CookBook
A cookbook for getting started with Phi-3, a family of open-source AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and the next size up across a variety of language, reasoning, coding, and math benchmarks.
SakanaAI/AI-Scientist
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
VITA-Group/TENAS
[ICLR 2021] "Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective" by Wuyang Chen, Xinyu Gong, Zhangyang Wang
katoro8989/IRM_Variants_Calibration
Towards Understanding Variants of Invariant Risk Minimization through the Lens of Calibration (TMLR 2024)
AndreasMadsen/faithfulness-measurable-models
Implementation of faithfulness-measurable masked language models
yandex-research/DeDLOC
Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021)
pytorch/torchtitan
A native PyTorch Library for large model training
prateeky2806/ties-merging
naver-ai/model-stock
Model Stock: All we need is just a few fine-tuned models
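
A note on the jong980812/Slurm_MultiNode_DDP entry above: torchrun exports RANK, WORLD_SIZE, LOCAL_RANK, MASTER_ADDR, and MASTER_PORT to every worker it launches, so the per-rank training script only has to read them. Below is a minimal sketch of that per-rank setup in PyTorch; the toy linear model and script layout are illustrative assumptions, not code from the starred repo.

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK/WORLD_SIZE/MASTER_ADDR/MASTER_PORT, so the default
    # env:// rendezvous of init_process_group needs no extra arguments.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model standing in for a real network; DDP all-reduces gradients
    # across every rank on backward().
    model = torch.nn.Linear(128, 128).cuda(local_rank)
    ddp_model = DDP(model, device_ids=[local_rank])

    opt = torch.optim.SGD(ddp_model.parameters(), lr=0.1)
    x = torch.randn(32, 128, device=f"cuda:{local_rank}")
    loss = ddp_model(x).square().mean()
    loss.backward()
    opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()

On a Slurm cluster this would typically be launched once per node, e.g. srun torchrun --nnodes=$SLURM_NNODES --nproc_per_node=<gpus per node> train.py with a c10d rendezvous endpoint on the first node; the exact flags are an assumption here, since the repo ships its own submit scripts.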