Convex Analysis of Layer-wise Adaptive Rate Scaling (LARS)
Igor Gitman, Deepak Dilipkumar, Ben Parr
Convex Optimization (10-725)
Carnegie Mellon University