legion-platform/legion

Training metric service


Currently, we use the MLflow metrics service to save all training metrics. However, new toolchains, for example Horovod, do not understand the MLflow metric protocol. We should provide a way to collect and show metrics for any toolchain.
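
To make the request more concrete, here is a rough sketch of what a toolchain-agnostic collection layer might look like. Everything in it is hypothetical: `MetricRecord`, `MetricSink`, `InMemorySink`, `MetricCollector`, and `to_json_line` are illustrative names only, not existing Legion, MLflow, or Horovod APIs. The idea is that a toolchain only emits plain records, and the backend that stores or displays them is pluggable.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import List, Protocol


@dataclass
class MetricRecord:
    """One training metric data point (hypothetical schema)."""
    run_id: str
    name: str
    value: float
    step: int
    timestamp: float


class MetricSink(Protocol):
    """Anything that can store a record, e.g. an adapter for a metrics backend."""

    def write(self, record: MetricRecord) -> None: ...


class InMemorySink:
    """Trivial sink that keeps records in a list; stands in for a real backend."""

    def __init__(self) -> None:
        self.records: List[MetricRecord] = []

    def write(self, record: MetricRecord) -> None:
        self.records.append(record)


class MetricCollector:
    """Entry point a toolchain would call; fans each record out to all sinks."""

    def __init__(self, run_id: str, sinks: List[MetricSink]) -> None:
        self.run_id = run_id
        self.sinks = sinks

    def log_metric(self, name: str, value: float, step: int = 0) -> MetricRecord:
        record = MetricRecord(self.run_id, name, float(value), step, time.time())
        for sink in self.sinks:
            sink.write(record)
        return record


def to_json_line(record: MetricRecord) -> str:
    """Serialize a record as one JSON line, a format any toolchain can produce."""
    return json.dumps(asdict(record), sort_keys=True)
```

With this shape, a training script for any toolchain would only need to call `MetricCollector.log_metric("loss", 0.5, step=3)`, or write JSON lines in the same format, and a separate sink could forward the records to whichever service displays them.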