This repository contains the results of the tests carried out to assess the ability of Kolmogorov-Arnold Networks (KANs) to resist the catastrophic forgetting that heavily affects MLPs.
- KANs Continual Learning [V2] - Morelli Valerio Federica Paganica.pdf
- KANs Continual Learning [Slideshow PPTX] - Morelli Valerio Paganica Federica.pptx
- KANs Continual Learning [Slideshow PDF] - Morelli Valerio Paganica Federica.pdf
- 📈 Testing different learning rate scales on MLPs and KANs
- ⬆️ Sorted MNIST training set (INTRA training set sorting)
- ⬇️ Class-IL Scenario (INTER training set sorting)
- ❗ The Gaussian Peaks Problem
- 👨🏻‍💻 Authors
The first test investigates the impact of a non-shuffled training set on the training of the different architectures. This scenario is common in real-time applications, where the order of the input data cannot be chosen and the network may never see a sample of a particular class again. This is a deliberate attempt to make training less effective because, as is well known in machine learning, the order of the training samples should always be random. Under these unfavourable conditions, the KANs prove to outperform the MLPs.
The training set is sorted as follows:
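The intra-training-set ordering described above can be sketched as follows (random labels stand in for the actual MNIST targets; a real run would sort the dataset indices the same way):

```python
import numpy as np

# Stand-in labels for the MNIST training targets (illustrative only).
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=60_000)

# A stable argsort keeps the original order within each class, so the
# network first sees every 0, then every 1, and so on -- and it may
# never revisit an earlier digit.
order = np.argsort(labels, kind="stable")
sorted_labels = labels[order]
print(sorted_labels[:5], sorted_labels[-5:])  # all 0s first, all 9s last
```

In a PyTorch pipeline the same `order` array would be passed to a `Subset` or a custom sampler, with `shuffle=False` on the `DataLoader`.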
At the same learning rate, MLPs tend to become distorted more quickly when they see a new, previously unseen digit:
The results of this test are:
🎬 The following video highlights the difference between MLPs and KANs in a Domain-IL scenario:
MLP.vs.KAN.in.continual.learning.DOMAIN-IL.mp4
Learning rate = 10^-6:
Here we show how the Gaussian peaks problem can be solved by EfficientKAN with the same performance as PyKAN.
After introducing the sb_trainable and sp_trainable parameters to the EfficientKAN class and setting them to False, just like PyKAN, the same results can be achieved:
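A minimal sketch of how such flags can gate trainability (the flag names mirror PyKAN's `sb_trainable`/`sp_trainable` arguments; the class below is a hypothetical stand-in, not the real EfficientKAN implementation):

```python
# Hypothetical stand-in for a modified EfficientKAN layer: the two flags
# decide whether the base/spline scale factors are exposed to the
# optimizer (in PyTorch terms: requires_grad=True vs. frozen).
class KANLinearSketch:
    def __init__(self, in_features, out_features,
                 sb_trainable=False, sp_trainable=False):
        self.in_features = in_features
        self.out_features = out_features
        self.params = {
            "base_weight": True,           # always trained
            "spline_weight": True,         # always trained
            "scale_base": sb_trainable,    # frozen when False
            "scale_spline": sp_trainable,  # frozen when False
        }

    def trainable_parameters(self):
        """Names of the parameters that would be handed to the optimizer."""
        return [name for name, trainable in self.params.items() if trainable]

layer = KANLinearSketch(64, 10)  # defaults: both scale flags off
print(layer.trainable_parameters())  # -> ['base_weight', 'spline_weight']
```

In the real modification the flags would control `requires_grad` on the corresponding `nn.Parameter` tensors, so the frozen scales are skipped during backpropagation.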
| Name | Email | GitHub |
|---|---|---|
| Valerio Morelli | s1118781@studenti.univpm.it | MrPio |
| Federica Paganica | s1116749@studenti.univpm.it | Federica |