Establish a model structure and hyperparameters able to approximate any of the elementary functions
Closed this issue · 7 comments
You can see an example for sine committed, but feel free to experiment. In any case, you should confirm in advance that the same structure and hyperparameters (the ones in the example, or some others) will be able to approximate any of the four elementary functions. These will be the "unknown functions" that your system will learn by observing the outputs of complex expressions of these functions, so we should first confirm that they are learnable.
In my last commit [0c3ea22] (0c3ea22) I modified sineNN.py to learn functions when x is in a certain range, as our initial task is to learn e.g. sin(x) for x in [-10,10].
Of course, keeping x in [0,1] is a good idea if we want to normalize the values of e^x in range [0,1] using the function y = (e^x-1)/(e-1), but is it actually useful for our task?
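For reference, a minimal sketch of that normalization, assuming x in [0,1] so that e^x ranges over [1, e] (the function name is illustrative, not from the repo):

```python
import numpy as np

def normalized_exp(x):
    """Map e^x, for x in [0, 1], onto [0, 1] via y = (e^x - 1) / (e - 1)."""
    return (np.exp(x) - 1.0) / (np.e - 1.0)

x = np.linspace(0.0, 1.0, 5)
y = normalized_exp(x)
# The endpoints map to 0 and 1 exactly, and the map is monotone in between.
```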
Should we continue exploring the learnability of those functions when x is in [a,b], or keep it simple and check whether the simple functions are learnable when x is in [0,1]?
Note that the "real" x is not in 0..1, it is in an arbitrary (but fixed in advance) domain that is squashed in 0..1 for the benefit of the NN and then expanded again for the benefit of plotting.
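A sketch of that squash/expand step (names are illustrative, not from the repo):

```python
import numpy as np

def squash(x, a, b):
    """Map x from the 'real' domain [a, b] into [0, 1] for the NN."""
    return (x - a) / (b - a)

def expand(u, a, b):
    """Map u from [0, 1] back to [a, b] for plotting."""
    return a + u * (b - a)

# Round trip on the initial task's domain [-10, 10]:
x = np.linspace(-10.0, 10.0, 7)
assert np.allclose(expand(squash(x, -10, 10), -10, 10), x)
```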
By all means, feel free to experiment on how the domain affects learnability, although I would recommend to keep it as simple as possible for this exercise. I recommend identifying a common domain for all functions, chosen so that the part of the function that is plotted is characteristic enough to be easily distinguishable.
(UNCHECKED, just wild speculation) [-pi .. pi] might be a good candidate, as sinc() looks good on domains that are symmetric around zero and sine() will most probably be more learnable for exactly one period. That means only half the plot data for ln(), but I think it will still be identifiable. We will see.
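A quick way to eyeball that speculation (function choices per the thread; np.sinc is the normalized sinc, and ln is masked to x > 0, so only half the domain survives, as noted above):

```python
import numpy as np

a, b = -np.pi, np.pi
x = np.linspace(a, b, 401)

curves = {
    "sin": np.sin(x),
    "sinc": np.sinc(x),  # normalized: sin(pi*x)/(pi*x), sinc(0) = 1
    "exp": np.exp(x),
    # ln is only defined for x > 0; mask the rest out with NaN.
    "ln": np.where(x > 0, np.log(np.maximum(x, 1e-12)), np.nan),
}
# sinc is symmetric around zero on this domain; ln covers only (0, pi].
```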
Looks like the current architecture is enough for learning each function separately.
In PyTorch the default sinc is the normalized one:
Also, correct me if I am wrong, in this task we aim to have one NN that is able to learn different functions given different data and not one NN that learns to distinguish all four functions when given all of them as training data.
I think you mean
[1] https://pytorch.org/docs/stable/special.html#torch.special.sinc
[2] https://numpy.org/doc/stable/reference/generated/numpy.sinc.html
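For the record, a sketch checking the normalized convention against its definition (NumPy shown here, per link [2]; torch.special.sinc follows the same convention per [1]):

```python
import numpy as np

x = np.array([0.5, 1.0, 1.5, 2.0])
# Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1.
expected = np.sin(np.pi * x) / (np.pi * x)
assert np.allclose(np.sinc(x), expected)
assert abs(np.sinc(0.0) - 1.0) < 1e-12
# The zeros fall at the nonzero integers, as expected for the normalized variant.
assert np.allclose(np.sinc(np.array([1.0, 2.0])), 0.0)
```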
Also, correct me if I am wrong, in this task we aim to have one NN that is able to learn different functions given different data and not one NN that learns to distinguish all four functions when given all of them as training data.
That is correct. The aim is to have the same network structure fit different functions based on the data it receives. The autodiff "magic" will be that it will appropriately distribute loss between these networks, following how they are connected into a DAG by the semantics of composition, addition, and the poly fn (that we consider as operators of the programming language and not functions).
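A toy illustration of that loss distribution, using the plain chain rule on a composition g(f(x)) (pure Python, no autodiff library; the functions and parameters are illustrative stand-ins, not the actual networks):

```python
# Two "unknown functions" with one scalar parameter each.
def f(x, a):
    return a * x        # inner node of the DAG

def g(y, b):
    return b * y ** 2   # outer node (composition)

x, a, b, target = 2.0, 0.5, 3.0, 1.0
y = f(x, a)
z = g(y, b)
loss = (z - target) ** 2

# Backprop: the single loss gradient is distributed to both parameters
# through the composition edge, which is what autodiff does at scale.
dz = 2.0 * (z - target)   # dL/dz
db = dz * y ** 2          # dL/db
dy = dz * 2.0 * b * y     # dL/dy (flows through the composition)
da = dy * x               # dL/da
```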
I tried a small-scale test with sine only, and it looks promising. I will clean and curate and push it later today, along with another issue describing the next step.
Following up the conversation in #13, I did check whether Snake works well with the four functions, even the ones that are not periodic, and it actually fits like a glove in all cases.
I think it works because of the parameter
I quote the authors: 'We find that for standard tasks such as image classification, setting 0.2 [...]'
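For reference, the Snake activation as I understand it from the paper, snake_a(x) = x + sin²(a·x)/a (a sketch, not the exact code used in the runs):

```python
import numpy as np

def snake(x, a=0.5):
    """Snake activation: x + sin(a*x)**2 / a.

    The periodic term lets it fit periodic functions; for small a it
    approaches the identity, which is why non-periodic functions still fit.
    """
    return x + np.sin(a * x) ** 2 / a

x = np.linspace(-np.pi, np.pi, 101)
y = snake(x)
# snake(0) = 0 for any a, and snake -> identity as a -> 0.
```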
Looks good. Please add these runs into exp00.py, or into a new exp file (whatever you prefer) and close this issue.
If you want this to appear in git history as preceding the work for Issue #13 (conceptually accurate history), then commit into autodiff, and rebase branch 13-... over the new autodiff HEAD.
If you want this to appear in git history as parallel to #13 (accurate timeline, but separating lines of work), then commit into autodiff and merge the new autodiff into 13-...
If you want git history to give a linear timeline without separating lines of work, just commit at 13-...