YOKOTA Laboratory at Tokyo Tech
YOKOTA Laboratory, Global Scientific Information and Computing Center, Tokyo Institute of Technology
Tokyo, Japan
Pinned Repositories
caffe
Extended version of Caffe, forked from NVIDIA
DeepSpeedFugaku
main: microsoft/Megatron-DeepSpeed; cpu: stable branch for running on Fugaku
gpu_lecture
hpsc-2022
hpsc-2023
hpsc-2024
im2col
Converts a 4D image/filter tensor into a 2D matrix (see the sketch after this list)
Megatron-Llama2
2023 ABCI Llama-2 continual learning project
PixPro-with-OpticalFlow
Pixel-level Contrastive Learning of Driving Videos with Optical Flow, CVPR 2023 Workshop
polaris
Polaris is a hyperparameter optimization library
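The im2col repository pinned above unrolls a 4D image tensor into a 2D matrix so that convolution reduces to a single matrix multiply (GEMM). Below is a minimal NumPy sketch of that idea; the function name, argument layout, and shapes are illustrative assumptions, not the repository's actual API.

import numpy as np

def im2col(images, kh, kw, stride=1):
    """Unroll (N, C, H, W) images into a 2D matrix so that convolution
    with a (K, C, kh, kw) filter becomes one GEMM (no padding, for brevity)."""
    n, c, h, w = images.shape
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    # One row per (channel, kernel offset), one column per output position.
    cols = np.empty((c * kh * kw, n * out_h * out_w))
    idx = 0
    for ci in range(c):
        for ki in range(kh):
            for kj in range(kw):
                # Strided view of every output position for this kernel offset.
                patch = images[:, ci,
                               ki:ki + stride * out_h:stride,
                               kj:kj + stride * out_w:stride]
                cols[idx] = patch.reshape(-1)
                idx += 1
    return cols

Multiplying a filter tensor reshaped to (K, C*kh*kw) by the returned matrix yields all K output channels for every image and output position in a single GEMM.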
YOKOTA Laboratory at Tokyo Tech's Repositories
rioyokotalab/Megatron-Llama2
2023 ABCI Llama-2 continual learning project
rioyokotalab/hpsc-2024
rioyokotalab/DeepSpeedFugaku
main: microsoft/Megatron-DeepSpeed; cpu: stable branch for running on Fugaku
rioyokotalab/Hatrix
rioyokotalab/PixPro-with-OpticalFlow
Pixel-level Contrastive Learning of Driving Videos with Optical Flow, CVPR 2023 Workshop
rioyokotalab/FRANK
rioyokotalab/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
rioyokotalab/cutlass
CUDA Templates for Linear Algebra Subroutines
rioyokotalab/kenkyu_project
rioyokotalab/bigcodebench
BigCodeBench: Benchmarking Code Generation Towards AGI
rioyokotalab/cosmopedia
rioyokotalab/elses
rioyokotalab/FederatedLearning
An adaptable federated learning framework with a central server, supporting diverse datasets, models, and optimizers. Facilitates collaborative yet private training with customizable aggregation algorithms (see the sketch at the end of this list).
rioyokotalab/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
rioyokotalab/grok-1
Grok open release
rioyokotalab/lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
rioyokotalab/m2
Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"
rioyokotalab/megablocks
rioyokotalab/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
rioyokotalab/Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
rioyokotalab/Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
rioyokotalab/moe-recipes
Ongoing research project for Mixture of Experts models
rioyokotalab/nbd
N-Body generator for Hatrix
rioyokotalab/nbd-cpu-only
rioyokotalab/nccl-reader
Optimized primitives for collective multi-GPU communication
rioyokotalab/STRUMPACK
Structured Matrix Package (LBNL)
rioyokotalab/toast-gpt
rioyokotalab/toast-vit
rioyokotalab/ylab_server_public
How to use the Hinadori cluster (for public)
rioyokotalab/zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism
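The FederatedLearning entry above describes a central server with customizable aggregation algorithms. Below is a minimal sketch of the classic FedAvg aggregation step as one such algorithm; the function name, dictionary-of-arrays weight format, and arguments are illustrative assumptions, not the repository's actual interface.

import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client model weights (FedAvg).

    client_weights: list of dicts mapping parameter name -> np.ndarray
                    (hypothetical format; each client trains locally)
    client_sizes:   number of training samples each client used
    """
    total = float(sum(client_sizes))
    aggregated = {}
    for name in client_weights[0]:
        # Weight each client's parameters by its share of the total data.
        aggregated[name] = sum(
            (size / total) * weights[name]
            for weights, size in zip(client_weights, client_sizes)
        )
    return aggregated

The server would broadcast the aggregated weights back to clients for the next round; swapping this function is what "customizable aggregation" refers to.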