YOKOTA Laboratory at Tokyo Tech
YOKOTA Laboratory, Global Scientific Information and Computing Center, Tokyo Institute of Technology
Tokyo, Japan
Pinned Repositories
caffe
Extended version of Caffe, forked from NVIDIA
DeepSpeedFugaku
main: microsoft/Megatron-DeepSpeed; cpu: stable branch for running on Fugaku
gpu_lecture
hpsc-2022
hpsc-2023
hpsc-2024
im2col
Converts a 4D image/filter tensor into a 2D matrix (see the sketch after this list)
Megatron-Llama2
2023 ABCI Llama-2 continual learning project
PixPro-with-OpticalFlow
Pixel-level Contrastive Learning of Driving Videos with Optical Flow, CVPR 2023 Workshop
polaris
Polaris is a hyperparameter optimization library
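The im2col repository pinned above unrolls a 4D image tensor into a 2D matrix so that convolution reduces to a single matrix multiply (GEMM). Below is a minimal NumPy sketch of that idea; the function name, argument layout, and shapes are illustrative assumptions, not the repository's actual API.

import numpy as np

def im2col(images, kh, kw, stride=1):
    """Unroll (N, C, H, W) images into a 2D matrix so that convolution
    with a (K, C, kh, kw) filter becomes one GEMM (no padding, for brevity)."""
    n, c, h, w = images.shape
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    # One row per (channel, kernel offset), one column per output position.
    cols = np.empty((c * kh * kw, n * out_h * out_w))
    idx = 0
    for ci in range(c):
        for ki in range(kh):
            for kj in range(kw):
                # Strided view of every output position for this kernel offset.
                patch = images[:, ci,
                               ki:ki + stride * out_h:stride,
                               kj:kj + stride * out_w:stride]
                cols[idx] = patch.reshape(-1)
                idx += 1
    return cols

Multiplying a filter tensor reshaped to (K, C*kh*kw) by the returned matrix yields all K output channels for every image and output position in a single GEMM.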
YOKOTA Laboratory at Tokyo Tech's Repositories
rioyokotalab/Megatron-Llama2
2023 ABCI Llama-2 continual learning project
rioyokotalab/hpsc-2024
rioyokotalab/DeepSpeedFugaku
main: microsoft/Megatron-DeepSpeed; cpu: stable branch for running on Fugaku
rioyokotalab/Hatrix
rioyokotalab/PixPro-with-OpticalFlow
Pixel-level Contrastive Learning of Driving Videos with Optical Flow, CVPR 2023 Workshop
rioyokotalab/FRANK
rioyokotalab/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
rioyokotalab/cutlass
CUDA Templates for Linear Algebra Subroutines
rioyokotalab/kenkyu_project
rioyokotalab/bigcodebench
BigCodeBench: Benchmarking Code Generation Towards AGI
rioyokotalab/cosmopedia
rioyokotalab/elses
rioyokotalab/FederatedLearning
An adaptable federated learning framework with a central server, supporting diverse datasets, models, and optimizers. Facilitates collaborative yet private training with customizable aggregation algorithms (see the sketch at the end of this list).
rioyokotalab/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
rioyokotalab/grok-1
Grok open release
rioyokotalab/lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
rioyokotalab/m2
Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"
rioyokotalab/megablocks
rioyokotalab/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
rioyokotalab/Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
rioyokotalab/Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
rioyokotalab/moe-recipes
Ongoing research project for Mixture of Experts models
rioyokotalab/nbd
N-Body generator for Hatrix
rioyokotalab/nbd-cpu-only
rioyokotalab/nccl-reader
Optimized primitives for collective multi-GPU communication
rioyokotalab/STRUMPACK
Structured Matrix Package (LBNL)
rioyokotalab/toast-gpt
rioyokotalab/toast-vit
rioyokotalab/ylab_server_public
How to use the Hinadori cluster (for public)
rioyokotalab/zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism
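The FederatedLearning entry above describes a central server with customizable aggregation algorithms. Below is a minimal sketch of the classic FedAvg aggregation step as one such algorithm; the function name, dictionary-of-arrays weight format, and arguments are illustrative assumptions, not the repository's actual interface.

import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client model weights (FedAvg).

    client_weights: list of dicts mapping parameter name -> np.ndarray
                    (hypothetical format; each client trains locally)
    client_sizes:   number of training samples each client used
    """
    total = float(sum(client_sizes))
    aggregated = {}
    for name in client_weights[0]:
        # Weight each client's parameters by its share of the total data.
        aggregated[name] = sum(
            (size / total) * weights[name]
            for weights, size in zip(client_weights, client_sizes)
        )
    return aggregated

The server would broadcast the aggregated weights back to clients for the next round; swapping this function is what "customizable aggregation" refers to.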