TransferLearning_FineTuning

Transfer Learning for Deep Learning


📌 Open In Colab: Google Colab Notebook

📌 Open In Jupyter: Jupyter Notebook


What Are Fine-Tuning and Transfer Learning? 👽


For a moment, let's take a small step back from the nitty-gritty details of EfficientNet. 🕊 Imagine that a bird could pass on to you what it has learned, or that you could pass what you have learned to a fish. Sounds crazy, right? Let's just say that, since birth and through my ancestors, I have learned to recognize a glass: there are simple features involved (edges, corners, shape, material structure, and so on). It turns out something similar happens when machines learn: they can transfer what they know to other machines, skipping the full learning process.

Let's start with a computer vision problem, although I am describing a method that can be applied to many types of data and problems. Suppose there is a dataset containing the objects you want to recognize, but the dataset is very large (which is awesome 😃) and the model is also very successful (again, awesome 🤗); the catch is that training that model on that dataset would take days or weeks. The good news is that it has already been trained! 🧐


You solve a new problem by taking advantage of what a machine learning model has learned before. You do this by transferring all or part of what the model has learned, namely its weights. That is exactly why this is called Transfer Learning. If you then make small adjustments so that the transferred basic features adapt to your own task, it is called Fine-Tuning. There is yet another variant: suppose a dataset contains images of dogs of the Golden and Husky breeds, along with images of men and women. With a single model you can do Dog vs. Human classification, as well as Female vs. Male or Golden vs. Husky classification; this is called Multi-Task Learning.
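As a rough illustration of the idea, here is a minimal Keras sketch (the EfficientNetB0 base, the `num_classes` value, and the commented-out training call are placeholders for your own problem): we load a base model pre-trained on ImageNet, freeze its transferred weights, and attach a new classifier head for our task.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import EfficientNetB0

num_classes = 2  # placeholder: e.g. Dog vs. Human

# Base model pre-trained on ImageNet, without its original classification head
base_model = EfficientNetB0(weights="imagenet", include_top=False, pooling="avg")

# Transfer Learning: freeze the transferred weights; only the new head will train
base_model.trainable = False

# New classifier head for our own task
model = models.Sequential([
    base_model,
    layers.Dense(num_classes, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5)  # train only the new head
```

Freezing the base is the pure Transfer Learning setup; unfreezing some of its layers afterwards turns the same setup into Fine-Tuning.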


🎯 Version #1: We use the pre-trained parameters as they are and do not design a new neural network; the entire pre-trained model is used for inference, as in the ResNet50 sketch further below. This approach is used especially in mobile and real-time prediction systems that do not require on-device learning. The model can be periodically retrained with larger data, and system performance can be improved.

🎯 Version #2: We take part of the trained model and then train it on the data of our own problem, which was not in the original dataset. Doing this reduces the computational cost, that is, it saves time. At the same time, even if the data for our own problem is limited, this method achieves higher performance thanks to the basic features learned from large datasets. But there are strategies we need to pay attention to when applying this method:

  • How similar or different our data is to the dataset of the pre-trained model
  • The size of the data we will use

With the following scheme, we can simply decide which strategy to choose; a code sketch of one branch of that decision follows.
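For example, continuing the Keras sketch above (the layer count and learning rate here are illustrative assumptions, not fixed rules): if our data is fairly large but somewhat different from the pre-training dataset, we might unfreeze only the top layers of the base model and continue training with a small learning rate.

```python
import tensorflow as tf

# Continuing the earlier sketch: unfreeze only the top of the base model, so
# the most task-specific features can adapt while generic early layers stay frozen.
base_model.trainable = True
for layer in base_model.layers[:-20]:  # keep all but the last ~20 layers frozen
    layer.trainable = False

# A small learning rate avoids destroying the transferred weights.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5)  # fine-tune
```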

🕵 So let's look at a simple example of an inference process that we can run end to end!

🔥 For this purpose, ResNet50, a deep neural network, was trained on the ImageNet dataset, and its weight parameters were saved at the end of training.
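A minimal sketch of such a test run with Keras follows (the image file name is a placeholder): the full ResNet50 model is loaded with its saved ImageNet weights and used directly for prediction, with no further training.

```python
import numpy as np
from tensorflow.keras.applications.resnet50 import (
    ResNet50, decode_predictions, preprocess_input)
from tensorflow.keras.preprocessing import image

# The full pre-trained model, including its ImageNet classification head
model = ResNet50(weights="imagenet")

# "example.jpg" is a placeholder; ResNet50 expects 224x224 RGB input
img = image.load_img("example.jpg", target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])  # top-3 (class id, label, score)
```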


⚡️ On Algorithmia, you can make your own model available to everyone as an API. Using a ResNet deep learning model trained on the ImageNet dataset, you can try the image classification algorithm free of charge via the link below. Thanks to Yavuz Kömeçoğlu for this work.


✏️ Use the links below for more resources:

Comparison of Activation Functions for Deep Neural Networks

Step-by-Step Use of Google Colab’s Free TPU