Concept: Pre-Trained Models

What are pre-trained models?

In many computer vision tasks, you will come across pre-trained models. They are useful for classification tasks, among others. Perhaps you’ve already built your own convolutional neural network (CNN) to classify low-dimensional images, such as small images of letters or digits. For images like these, a simple CNN architecture will suffice.


Now let’s say you are trying to classify more complex images, which are made up of many colors, show objects in different orientations, or vary in other ways. At this point, you might not want to build a model from scratch, as doing so would require a lot of data, substantial computing resources, and a complex architecture.

Instead, you could begin with a pre-trained model (a CNN) that has already been trained on a huge dataset and can predict many classes. You can then re-train this model on your own data by choosing which layers of the neural network to re-train and keeping the remaining layers as they are.
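
As a minimal sketch of this idea, the snippet below uses the Keras API from TensorFlow. This is an assumption made for illustration: the article does not prescribe a framework, and the code is not a description of how any particular tool works internally. It loads ResNet50 without its original output layer, freezes the pre-trained layers, and adds a new output layer to be trained on your own classes. The function name `build_transfer_model` and the 224x224 input size are illustrative choices.

```python
import tensorflow as tf


def build_transfer_model(num_classes, weights="imagenet"):
    """Return a ResNet50-based classifier whose only trainable layer is a new head."""
    # Load the convolutional base without the original output layer.
    # weights="imagenet" downloads pre-trained weights; weights=None gives a
    # randomly initialized network, which is enough for a quick offline check.
    base = tf.keras.applications.ResNet50(
        include_top=False,
        weights=weights,
        input_shape=(224, 224, 3),
        pooling="avg",  # global average pooling: one 2048-value vector per image
    )
    base.trainable = False  # freeze every pre-trained layer

    inputs = tf.keras.Input(shape=(224, 224, 3))
    x = base(inputs, training=False)  # keep batch-norm layers in inference mode
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_transfer_model(num_classes=5)` would give a model that you can fit on your own labeled images, after passing them through `tf.keras.applications.resnet50.preprocess_input`. Only the new output layer is updated during training. Unfreezing some of the last pre-trained layers afterwards is a common next step when you have enough data.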


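Pre-trained networks such as ResNet50, Xception, Inception V3, and VGG16 are all trained on the ImageNet dataset. The full ImageNet dataset consists of more than 14 million images, falling into more than 20,000 categories. The commonly distributed weights for these networks, however, are trained on the ImageNet challenge (ILSVRC) subset, which contains about 1.2 million training images in 1,000 categories. Out of the box, these models therefore recognize those 1,000 object categories. They are ready for use in many classification use cases, and they often serve as the starting point for object detection models.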

You can leverage these pre-trained models for a variety of tasks, such as feature extraction and transfer learning.
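
Feature extraction, for instance, means passing an image through the network and keeping the output of an intermediate layer as a numeric vector. The sketch below shows one way to do this with the Keras API from TensorFlow (an assumption; other frameworks work too). The helper names `build_feature_extractor` and `extract_features` are hypothetical, and the choice of the global-average-pooled output of ResNet50 as the feature layer is only one reasonable option.

```python
import numpy as np
import tensorflow as tf


def build_feature_extractor(weights="imagenet"):
    """Return ResNet50 without its output layer, ending in global average pooling."""
    # weights=None skips the download and uses random weights (for testing only).
    return tf.keras.applications.ResNet50(
        include_top=False,
        weights=weights,
        input_shape=(224, 224, 3),
        pooling="avg",
    )


def extract_features(extractor, images):
    """Turn images of shape (n, 224, 224, 3), pixel values 0-255, into (n, 2048) vectors."""
    # Copy to float32 first, then apply the preprocessing ResNet50 expects.
    batch = np.array(images, dtype="float32")
    batch = tf.keras.applications.resnet50.preprocess_input(batch)
    return extractor.predict(batch, verbose=0)
```

The resulting vectors can be stored in a dataset and used as input to a conventional machine learning model, or compared with each other to find visually similar images.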

Pre-trained models in Dataiku

To leverage pre-trained models in Dataiku, you can use the Deep Learning on Images plugin. To install the plugin, search for it in the plugin store of your Dataiku instance.

Dataiku screenshot of the Deep Learning on Images plugin in the plugin store.


If you are starting a new project, be sure not to use the legacy CPU or GPU versions of this plugin.

During the installation, Dataiku will inform you that the plugin requires a dedicated code environment. Click Build New Environment so that Dataiku can install all the required packages and create the environment. Once installed, you can see that the plugin includes the following recipes:

  • Image classification (v2). Use this recipe to classify images by providing two folders as input: one that contains your images and another that contains your model.

  • Classification model retrain on images (v2). Use this recipe to re-train an existing model on your own images.

  • Images feature extraction (v2). Use this recipe to extract vectorized data from a layer of your neural network.

In addition, the plugin includes macros so that you can easily install pre-trained models. Also included is a webapp template where you can leverage TensorBoard to analyze the performance of your model.

Other articles dive deeper into the plugin and its recipes, and demonstrate how to use a pre-trained model:

  • as an input for image classification, without retraining your model (to use your model as is),

  • as an input to retrain a model, and

  • to perform image feature extraction.