How do pre-trained models work? – DataTrained

 


Introduction

A highly sophisticated supervised machine learning problem can require hundreds of gigabytes of RAM, which you can buy or rent for a relatively small investment. GPUs are a different story. Training at that scale can call for GPUs with something like 100 GB of VRAM, which are hard to get access to and cost a lot of money.

We must therefore use our resources wisely when tackling Deep Learning problems, particularly when we try to solve challenging real-world problems in fields like speech and image recognition. Once your model has a few hidden layers, every additional layer adds substantially to the resources needed for training.

 


 

Fortunately, "Transfer Learning" allows us to reuse pre-trained models created by others with only minor adjustments. In this article, I'll explain how to leverage pre-trained models to speed up your solutions.

 

What is transfer learning?

A neural network is trained on a set of data. What the network learns from that data is stored in its "weights." These weights can be extracted and loaded into another neural network, provided the corresponding layers have compatible shapes. Rather than training the other network from scratch, we "transfer" the learned features.
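To make the idea concrete, here is a minimal sketch of weight transfer in plain Python. It is a toy illustration, not a real training setup: the networks are just dictionaries of weight matrices, the "pre-trained" network is simply randomly initialized as a stand-in, and the helper names (`init_network`, `transfer_weights`) are made up for this example.

```python
import copy
import random


def init_network(layer_sizes, seed):
    """Build a toy network as {layer_name: weight matrix} with random weights."""
    rng = random.Random(seed)
    net = {}
    for i, (n_in, n_out) in enumerate(zip(layer_sizes, layer_sizes[1:])):
        net[f"layer{i}"] = [
            [rng.uniform(-1.0, 1.0) for _ in range(n_out)] for _ in range(n_in)
        ]
    return net


def shape(matrix):
    """Return (rows, columns) of a weight matrix stored as a list of lists."""
    return (len(matrix), len(matrix[0]))


def transfer_weights(source, target, layer_names):
    """Copy the named layers from source into target; shapes must match."""
    for name in layer_names:
        if shape(source[name]) != shape(target[name]):
            raise ValueError(f"shape mismatch in {name}")
        target[name] = copy.deepcopy(source[name])
    return target


# Stand-in for a network that was already trained on a large dataset
# (assumption: in practice these weights would come from real training).
pretrained = init_network([4, 8, 8, 3], seed=0)

# New network for a different task: same early layers, 5 outputs instead of 3.
new_net = init_network([4, 8, 8, 5], seed=1)

# Reuse the first two layers; the output layer keeps its fresh initialization.
transfer_weights(pretrained, new_net, ["layer0", "layer1"])
print(new_net["layer0"] == pretrained["layer0"])  # the early layers now match
```

Only the layers whose shapes match are copied. The output layer of the new network differs in size, so it keeps its own initialization and has to be trained on the new task.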

 

How to Use Pre-Trained Models?

A pre-trained model can be used in many different ways, and the right one depends on how closely your task resembles the task the model was originally trained on.

Recall that convolutional layers near the input of the model learn low-level features such as lines, layers in the middle of the model learn more complex, abstract features that combine the lower-level ones, and layers near the output interpret the extracted features in the context of a classification task.

 


 

With this knowledge, you can choose the level of detail at which to extract features from a pre-trained model. For instance, if the new task is very different from classifying objects in images (for example, very different from ImageNet), the output of the pre-trained model after only a few layers may be appropriate. If the new task is quite similar to classifying objects in images, the output from much deeper layers, or even the output of the fully connected layer just before the output layer, can be used.
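Choosing the extraction depth can be sketched as running the input through only the first few layers of the model. The example below is a toy in plain Python: the three small `dense` layers and their fixed weights are hypothetical stand-ins for a pre-trained model, and `extract_features` is a name invented for this illustration.

```python
def relu(values):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in values]


def dense(weights):
    """Return a layer function that computes relu(x @ weights)."""
    def layer(x):
        n_out = len(weights[0])
        return relu(
            [sum(x[i] * weights[i][j] for i in range(len(x))) for j in range(n_out)]
        )
    return layer


# Hypothetical fixed weights standing in for a pre-trained model.
pretrained_layers = [
    dense([[1.0, -1.0, 0.5], [0.5, 1.0, -0.5]]),  # 2 inputs -> 3 low-level features
    dense([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),  # 3 -> 2 more abstract features
    dense([[1.0], [-1.0]]),                        # 2 -> 1 task-specific output
]


def extract_features(layers, x, depth):
    """Run x through the first `depth` layers and return the activations."""
    if not 1 <= depth <= len(layers):
        raise ValueError("depth must be between 1 and the number of layers")
    for layer in layers[:depth]:
        x = layer(x)
    return x


x = [2.0, 1.0]
shallow = extract_features(pretrained_layers, x, depth=1)  # generic features
deep = extract_features(pretrained_layers, x, depth=2)     # more task-specific
print(shallow)  # [2.5, 0.0, 0.5]
print(deep)     # [3.0, 0.5]
```

A small `depth` yields the generic features that suit a task far from the original one, while a larger `depth` yields features that are closer to the original classification task.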

Another option is to incorporate the pre-trained model, or the desired part of it, directly into a new neural network model. In this case the pre-trained weights can be "frozen" so that they are not updated while the new model is trained. Alternatively, the weights can be updated during training, perhaps with a lower learning rate, so that the pre-trained model acts as a weight initialization scheme.
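The difference between freezing and fine-tuning can be illustrated with a deliberately tiny model. In the sketch below, the whole "network" is two scalar weights, `base_w` (the pre-trained part) and `head_w` (the new part), fitted by plain gradient descent. The data, the starting value 1.5, and the learning rates are all assumptions chosen for illustration only.

```python
def train(base_w, head_w, data, head_lr, base_lr, epochs=200):
    """Fit y ~ head_w * (base_w * x) by gradient descent on squared error.

    Setting base_lr to 0.0 freezes the pre-trained weight, while a small
    positive base_lr fine-tunes it more slowly than the new head.
    """
    for _ in range(epochs):
        for x, y in data:
            feature = base_w * x              # output of the pre-trained part
            error = head_w * feature - y      # prediction error
            grad_head = 2 * error * feature   # d(error^2) / d(head_w)
            grad_base = 2 * error * head_w * x  # d(error^2) / d(base_w)
            head_w -= head_lr * grad_head
            base_w -= base_lr * grad_base
    return base_w, head_w


data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # toy target: y = 3x
pretrained_base = 1.5  # hypothetical weight "learned" on an earlier task

# Frozen: only the new head is trained.
frozen_base, frozen_head = train(
    pretrained_base, 0.0, data, head_lr=0.01, base_lr=0.0
)

# Fine-tuned: the pre-trained weight is also updated, with a 10x lower rate.
tuned_base, tuned_head = train(
    pretrained_base, 0.0, data, head_lr=0.01, base_lr=0.001
)

print(frozen_base, frozen_head)
print(tuned_base, tuned_head)
```

In the frozen run the pre-trained weight stays exactly at its starting value and the head does all the adapting. In the fine-tuned run both weights move, but the pre-trained one moves only slightly because of its lower learning rate.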

 

These usage patterns can be summarized as follows:

 

Classifier:

The pre-trained model is used directly to classify new images.

 

Standalone Feature Extractor:

The pre-trained model, or a subset of it, is used to pre-process images and extract relevant features.

Integrated Feature Extractor:

The pre-trained model, or a subset of it, is integrated into a new model, but the layers of the pre-trained model are frozen during training.

 


 

Weight Initialization:

The pre-trained model, or a portion of it, is integrated into a new model, and the layers of the pre-trained model are trained together with the new model.

· Each approach can be effective and can save a lot of time when developing and training a deep convolutional neural network model.

· Experimentation may be necessary, because it is often unclear which use of the pre-trained model will give the best results on your new computer vision task.

