Pinned Repositories
2024-icpads-hpc-workload-characterization
Artifact for "Generic and ML Workloads in an HPC Datacenter: Node Energy, Job Failures, and Node-Job Analysis" (ICPADS'24)
FAILS
A Framework for Automated Collection and Analysis of Incidents on LLM Services
llm-service-analysis
Artifact for "An Empirical Characterization of Outages and Incidents in Public Services for Large Language Models" (ICPE'25)
2023-hotcloudperf-ml-failures
Code and data for "How Do ML Jobs Fail in Datacenters? Analysis of a Long-Term Dataset from an HPC Cluster" (HotCloudPerf'23)
AIP
An instrument to combine, unify, and correct (scientific) article meta-data.
atlarge-phd-thesis-template
A LaTeX template for creating beautiful PhD theses, originally created by TU Delft.
awesome-AI-for-time-series-papers
A professional list of Papers, Tutorials, and Surveys on AI for Time Series in top AI conferences and journals.
Awesome-System-for-Machine-Learning
A curated list of research in machine learning systems (MLSys). Paper notes are also provided.
gsoc-analyse
Helping newcomers get involved in open source
chuxiaoyu's Repositories
chuxiaoyu/2023-hotcloudperf-ml-failures
Code and data for "How Do ML Jobs Fail in Datacenters? Analysis of a Long-Term Dataset from an HPC Cluster" (HotCloudPerf'23)
chuxiaoyu/gsoc-analyse
Helping newcomers get involved in open source
chuxiaoyu/2024-icpads-hpc-workload-characterization
chuxiaoyu/AIP
An instrument to combine, unify, and correct (scientific) article meta-data.
chuxiaoyu/atlarge-phd-thesis-template
A LaTeX template for creating beautiful PhD theses, originally created by TU Delft.
chuxiaoyu/awesome-AI-for-time-series-papers
A professional list of Papers, Tutorials, and Surveys on AI for Time Series in top AI conferences and journals.
chuxiaoyu/Awesome-System-for-Machine-Learning
A curated list of research in machine learning systems (MLSys). Paper notes are also provided.
chuxiaoyu/blog
chuxiaoyu/blog_image
chuxiaoyu/chuxiaoyu
chuxiaoyu/chuxiaoyu.github.io
chuxiaoyu/cloud-failure-characterization
chuxiaoyu/datawhale-operational-research
Datawhale Operations Research group
chuxiaoyu/hpc-ontology-modeller
This project aims to create a toolset for defining an HPC ontology, building on an existing HPC ontology (https://hpc-fair.github.io/ontology/).
chuxiaoyu/examon
A highly scalable framework for the performance and energy monitoring of HPC servers
chuxiaoyu/gjy_thesis
chuxiaoyu/honours-programme-project
chuxiaoyu/LeetCode-Py
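⛽️ "Algorithm Mastery Handbook": a highly detailed tutorial on the fundamentals of algorithms and data structures, with detailed solutions to 700+ LeetCode problems. It combines algorithm theory with hands-on programming practice, taking readers from no background to a thorough command of algorithms.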
chuxiaoyu/lwj_hpc_unsupervised_anomaly_detection
chuxiaoyu/mcs-deadlines
Conference deadlines for the MCS group
chuxiaoyu/opendc
Collaborative Datacenter Simulation and Exploration for Everybody
chuxiaoyu/outage_report_characterization
chuxiaoyu/pcp
Performance Co-Pilot
chuxiaoyu/really_old_cfa_parsing_scripts
chuxiaoyu/Recommender
chuxiaoyu/SURFace
Beneath the SURFace: An MRI-like View into the Life of a 21st Century Datacenter
chuxiaoyu/sysstat
Performance monitoring tools for Linux
chuxiaoyu/team-learning-program
Mainly stores materials for the "programming, data structures, and algorithms" track of Datawhale team learning.
chuxiaoyu/tinyflow
Tutorial code on how to build your own Deep Learning System in 2k Lines