kubedl-io/kubedl

[ASoC 2022] Metrics visualization and health scoring model for job


Background

Currently, the KubeDL dashboard displays basic information such as jobs, logs, and events, and users can manipulate objects through built-in buttons. However, the dashboard could help users dig out more insights by visualizing core metrics such as resource utilization and I/O tracing. System metrics are usually collected and exposed in the Prometheus format, which makes Prometheus a good entry point.

Goals to be achieved

  1. Implement data/metrics visualization by leveraging Prometheus.
  2. Based on the job information and metrics data, design a job health model that quantifies how healthy a job is at runtime.
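
To make the two goals concrete, here is a minimal sketch in Python. It is not KubeDL's design: the Prometheus address, the choice of the `pod` label, the input signals, and the weights in `health_score` are all assumptions made for illustration. The only external interface it relies on is the Prometheus HTTP API instant-query endpoint (`/api/v1/query`) and its JSON response shape.

```python
import json
from typing import Dict
from urllib.parse import urlencode

# Hypothetical Prometheus address; replace with the real service endpoint.
PROMETHEUS_URL = "http://prometheus.example:9090"


def instant_query_url(promql: str, base_url: str = PROMETHEUS_URL) -> str:
    """Build a URL for the Prometheus HTTP API instant-query endpoint."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(body: str) -> Dict[str, float]:
    """Map each returned series (keyed by its `pod` label) to its sample value.

    Assumes the query returns an instant vector whose series carry a `pod` label.
    """
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error')}")
    return {
        series["metric"].get("pod", "<unknown>"): float(series["value"][1])
        for series in payload["data"]["result"]
    }


def health_score(cpu_utilization: float, restarts: int,
                 failed_pods: int, total_pods: int) -> float:
    """Combine three signals into a 0-100 score (weights are illustrative only).

    availability: fraction of pods that have not failed
    stability:    decays as the restart count grows
    efficiency:   CPU utilization clamped to [0, 1]
    """
    if total_pods <= 0:
        raise ValueError("total_pods must be positive")
    availability = 1.0 - failed_pods / total_pods
    stability = 1.0 / (1.0 + restarts)
    efficiency = max(0.0, min(cpu_utilization, 1.0))
    return round(100 * (0.5 * availability + 0.3 * stability + 0.2 * efficiency), 1)
```

A dashboard backend could fetch `instant_query_url(...)` for each metric of interest, pass the response body to `parse_instant_vector`, and feed the aggregated values into `health_score`. A real model would likely need more signals (for example GPU utilization or I/O wait) and weights tuned against observed job outcomes.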

Additional context

This issue is part of our #249.

Difficulty: Normal
Mentor: Xuelin Hong (@hoaresky)