Pinned Repositories
AdvancedSoftwarePractices
Archive for the Advanced Software Practice course
Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
Awesome-Federated-Machine-Learning
Everything about federated learning, including research papers, books, codes, tutorials, videos and beyond
bitcoin-chart
📈Real-time Bitcoin Chart by Upbit
BSA-SpMM_EURO-PAR-2024
Official Artifact for Accelerated Block-Sparsity-Aware Matrix Reordering for Leveraging Tensor Cores in Sparse Matrix-Multivector Multiplication (Euro-Par 2024)
celery-redis-queue
Celery and Redis Queue in FastAPI
chatbot_api
chatbot Swagger
codingTest
Baekjoon problem solutions
kant
Generating philosophical papers in the style of Immanuel Kant using GPT-2
KoGPT2-chatbot
Simple Chit-Chat based on KoGPT2
dleunji's Repositories
dleunji/bitcoin-chart
📈Real-time Bitcoin Chart by Upbit
dleunji/kant
Generating philosophical papers in the style of Immanuel Kant using GPT-2
dleunji/Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
dleunji/Awesome-Federated-Machine-Learning
Everything about federated learning, including research papers, books, codes, tutorials, videos and beyond
dleunji/BSA-SpMM_EURO-PAR-2024
Official Artifact for Accelerated Block-Sparsity-Aware Matrix Reordering for Leveraging Tensor Cores in Sparse Matrix-Multivector Multiplication (Euro-Par 2024)
dleunji/celery-redis-queue
Celery and Redis Queue in FastAPI
dleunji/CUDA-TC
dleunji/cuda_til
dleunji/cuda_hgemm
Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.
dleunji/CUDATeaching
CUDA based GPU Programming
dleunji/curious-ui
UI for the Q&A board 'Curious'
dleunji/dleunji.github.io
dleunji.github.io
dleunji/Federated-Averaging-PyTorch
An unofficial PyTorch implementation of a federated learning algorithm, FedAvg.
dleunji/Federated-Learning-Research
An implementation of federated learning research baseline methods based on FedML-core, which can be deployed on a real distributed cluster and helps researchers explore problems that arise in real FL systems.
dleunji/Learn-CUDA-Programming
Learn CUDA Programming, published by Packt
dleunji/lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
dleunji/lmquant
dleunji/Magicube
Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.
dleunji/Misc-Cheatsheet
Small but precious coding tips picked up during graduate school (Linux commands, etc.)
dleunji/mongoDB-test
A social media platform clone for mastering MongoDB and pymongo
dleunji/Parallel-Sudoku-Solver
🔢 A parallelized Sudoku solver implemented with various solving algorithms in C++.
dleunji/ppopp20_spmm_artifact
dleunji/pytorchviz
A small package to create visualizations of PyTorch execution graphs
dleunji/qserve
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
dleunji/TC-GNN_ATC23
Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs.
dleunji/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
dleunji/test-nethereum
Connecting .NET with Solidity
dleunji/vectorSparse-custom
dleunji/wmma_tensorcore_sample
Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core)
dleunji/WWW23_ODE_custom