How to know if the model has converged?
Royalvice opened this issue · 2 comments
Hello! Thank you very much for the excellent work! I would like to know how to control the number of training epochs, how to train on GPU, and how to tell whether the model has converged.
Hi @Royalvice and sorry for the late reply!
how to control the number of training epochs
When running an SR task using physo.SR, you can use the argument epochs:
expression, logs = physo.SR(X, y,
# Giving names of variables (for display purposes)
X_names = [ "z" , "v" ],
# Giving units of input variables
X_units = [ [1, 0, 0] , [1, -1, 0] ],
# Giving name of root variable (for display purposes)
y_name = "E",
# Giving units of the root variable
y_units = [2, -2, 1],
# Fixed constants
fixed_consts = [ 1. ],
# Units of fixed constants
fixed_consts_units = [ [0,0,0] ],
# Free constants names (for display purposes)
free_consts_names = [ "m" , "g" ],
# Units of free constants
free_consts_units = [ [0, 0, 1] , [1, -2, 0] ],
# -> epochs <-
epochs = 200,
)
If epochs is not passed, physo will use the default number of epochs specified in physo.task.sr.default_config["learning_config"]["n_epochs"].
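For reference, here is a minimal sketch for inspecting that default value before launching a run; it only relies on the default_config dictionary path mentioned above:

import physo

# Inspect the default number of epochs used when `epochs` is not passed to physo.SR
default_epochs = physo.task.sr.default_config["learning_config"]["n_epochs"]
print("Default number of epochs:", default_epochs)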
how to train on GPU
If you are using a GPU-capable version of PyTorch, physo will automatically use the GPU.
However, using the GPU can actually hinder performance, because the bottleneck of physo is the evaluation of candidate expressions, not the training of the neural network.
See issue #7 or the table of expected performances in the README file: https://github.com/WassimTenachi/PhySO/blob/main/README.md#about-performances for details.
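As a quick check, you can verify whether PyTorch detects a GPU in your environment; this is a standard PyTorch snippet, independent of physo:

import torch

# Check whether a CUDA-capable GPU is visible to PyTorch
if torch.cuda.is_available():
    print("GPU available:", torch.cuda.get_device_name(0))
else:
    print("No GPU detected; physo will run on CPU.")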
how to know if the model has converged
Convergence is achieved when the best expressions produced by the model, and on which it is reinforced, end up always being the same ones. This can be seen in the logs or, more conveniently, in the monitoring plot: if the reward distribution converges to a single peak, the model has converged. A stagnating mean reward over the training samples is also a telltale sign.
Here is an example of the monitoring curves of a converged model:
Even after convergence and a long period of stagnation, a lucky random improvement can sometimes help the model escape a local minimum, but this is not guaranteed.
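As an illustration of how one might check for such a plateau programmatically, here is a minimal sketch that plots a per-epoch reward history. The file name "run_curves.csv" and the column names "epoch", "mean_reward" and "best_reward" are hypothetical placeholders for wherever your run stores its reward logs, not physo's actual log format:

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical log file and column names: adapt to your run's actual reward logs.
curves = pd.read_csv("run_curves.csv")

plt.plot(curves["epoch"], curves["mean_reward"], label="mean reward (training samples)")
plt.plot(curves["epoch"], curves["best_reward"], label="best reward so far")
plt.xlabel("epoch")
plt.ylabel("reward")
plt.legend()
plt.show()

# A long flat stretch in both curves (mean reward stagnating, best reward no
# longer improving) is the telltale sign of convergence described above.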
Cheers.
Wassim
@Royalvice please tell me if that answers your question so I can close this issue, and do not hesitate to ask if you have further questions!