/KAN-Continual_Learning_tests

Collection of tests performed during the study of the new Kolmogorov-Arnold Neural Networks (KAN)


Continual learning in KANs

This repository contains the results of tests carried out to demonstrate the ability of Kolmogorov-Arnold Networks (KANs) to resist the catastrophic forgetting that heavily affects MLPs.


📘 The results of these tests are presented in detail in this paper:

KANs Continual Learning [V2] - Morelli Valerio Federica Paganica.pdf

📙 The slideshow presented on the day of the exam:

KANs Continual Learning [Slideshow PPTX] - Morelli Valerio Paganica Federica.pptx

KANs Continual Learning [Slideshow PDF] - Morelli Valerio Paganica Federica.pdf


📘 Table of Contents

📈 Testing different learning rate scales on MLPs and KANs on MNIST

⬆️ Sorted MNIST training set (INTRA training set sorting)

The first test examines how a non-shuffled training set affects the training of the different architectures. This scenario is common in real-time applications, where the order of the input data cannot be controlled and the network may never see a sample of a particular class again. Sorting deliberately makes training less effective, since it is well known in machine learning that training samples should be presented in random order. Under these unfavourable conditions, KANs outperform MLPs.

The training set is sorted as follows:
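A minimal sketch of such a sort, assuming the samples are ordered by class label so that all examples of one digit are seen before the next digit appears. The helper `sort_by_label` and the toy `(image, label)` pairs are hypothetical stand-ins for illustration and are not taken from the notebooks.

```python
import random


def sort_by_label(samples):
    """Return (image, label) pairs ordered by label.

    Python's sort is stable, so samples of the same class keep
    their original relative order.
    """
    return sorted(samples, key=lambda pair: pair[1])


# Toy stand-in for MNIST: strings instead of images, random digit labels.
rng = random.Random(0)
toy_train_set = [(f"img_{i}", rng.randrange(10)) for i in range(50)]

sorted_train_set = sort_by_label(toy_train_set)
labels_in_order = [label for _, label in sorted_train_set]

# Once sorted, the network would see each digit in one contiguous block.
print(labels_in_order)
```

With this ordering, a model trained in a single pass never revisits an earlier digit, which is what makes the setting hostile to networks prone to forgetting.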

At the same learning rate, MLPs tend to become distorted more quickly than KANs when they encounter a previously unseen digit:

INTRA dataset lr-5 ep1-2

The results of this test are:

MLPs vs KANs

INTRAdataset_NON-CONV

KAN-based and non-KAN-based convolutional nets

INTRAdataset_CONV2

⬇️ Class-IL Scenario (INTER training set sorting)

🎬 The following video highlights the difference between MLPs and KANs in a Domain-IL scenario:

MLP.vs.KAN.in.continual.learning.DOMAIN-IL.mp4
confusion_matrix

MLPs vs KANs

Based on efficient-kan by Blealtan

Learning Rate=10^-6:

INTER lr-6 MLP_KAN

KAN-based and non-KAN-based convolutional nets

Based on Convolutional-KANs by AntonioTepsich and on KANvolver by Subhransu Sekhar Bhattacharjee

INTER lr-6 CONV

❗ The Gaussian Peaks Problem

Here we show how the 7th PyKAN regression example can be solved by EfficientKAN with the same performance as PyKAN.

Read more in "Something different from the official results for KAN".

After adding the sb_trainable and sp_trainable options to the EfficientKAN class and setting them to False, as PyKAN does, the same results can be achieved:

Gaussian Peaks EfficientKAN
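The effect of setting both flags to False can be sketched in PyTorch as follows. This is an illustrative toy layer, not the EfficientKAN code: the class `ScaledSplineLinear`, its parameter names and the `sin` stand-in for the spline branch are assumptions made for the example. Only the flag names sb_trainable and sp_trainable come from the text above.

```python
import torch
from torch import nn


class ScaledSplineLinear(nn.Module):
    """Hypothetical toy layer with a base branch and a spline-like branch,
    each multiplied by its own scale tensor."""

    def __init__(self, in_features, out_features,
                 sb_trainable=True, sp_trainable=True):
        super().__init__()
        shape = (out_features, in_features)
        self.base_weight = nn.Parameter(torch.randn(shape) * 0.1)
        self.spline_weight = nn.Parameter(torch.randn(shape) * 0.1)
        # requires_grad decides whether the optimizer may change the scales.
        self.scale_base = nn.Parameter(torch.ones(shape),
                                       requires_grad=sb_trainable)
        self.scale_spline = nn.Parameter(torch.ones(shape),
                                         requires_grad=sp_trainable)

    def forward(self, x):
        base = nn.functional.linear(
            nn.functional.silu(x), self.scale_base * self.base_weight)
        # Stand-in for the B-spline branch (assumption): a fixed nonlinearity.
        spline = nn.functional.linear(
            torch.sin(x), self.scale_spline * self.spline_weight)
        return base + spline


def trainable_names(module):
    """Names of the parameters that will receive gradients."""
    return sorted(n for n, p in module.named_parameters() if p.requires_grad)


frozen = ScaledSplineLinear(1, 4, sb_trainable=False, sp_trainable=False)
x = torch.linspace(-1, 1, 8).unsqueeze(1)
frozen(x).pow(2).mean().backward()

print(trainable_names(frozen))
print(frozen.scale_base.grad)
```

With both flags off, the scale tensors stay at their initial values during training and only the weights are updated.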

👨🏻‍💻 Authors

| Name | Email | GitHub |
| --- | --- | --- |
| Valerio Morelli | s1118781@studenti.univpm.it | MrPio |
| Federica Paganica | s1116749@studenti.univpm.it | Federica |