Hands-On Tutorial: Image Classification with the Deep Learning on Images Plugin

Introduction

Deep learning models are powerful tools for image classification, but they are difficult and expensive to build from scratch.

Dataiku provides a plugin, Deep learning on images, that supplies a number of pre-trained deep learning models you can use to classify images. You can also retrain one of these models to specialize it on a particular set of images, a process known as transfer learning.
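To make the idea of transfer learning concrete, here is a minimal, library-free sketch. It is not the plugin's implementation: the frozen weights, the two-number "images", and the function names are all hypothetical and chosen only for illustration. It shows the core pattern, which is to keep the pre-trained layers fixed and train only a small new classification head on the new labeled examples.

```python
import math

# Hypothetical stand-in for the frozen layers of a pre-trained network.
# These weights are never updated during retraining.
FROZEN_WEIGHTS = [[0.9, -0.4], [-0.3, 0.8]]


def extract_features(x):
    """Frozen 'base model': a fixed linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in FROZEN_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=200, lr=0.5):
    """Train only the new classification head (logistic regression)."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            feats = extract_features(x)  # the base stays frozen
            pred = sigmoid(sum(w * f for w, f in zip(weights, feats)) + bias)
            error = pred - y
            # Gradient step on the head's parameters only.
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
            bias -= lr * error
    return weights, bias


def predict(x, weights, bias):
    """Probability that x belongs to class 1 ('tiger' in this toy setup)."""
    feats = extract_features(x)
    return sigmoid(sum(w * f for w, f in zip(weights, feats)) + bias)


# Made-up toy data: label 1 = tiger, label 0 = lion.
samples = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [1, 1, 0, 0]
weights, bias = train_head(samples, labels)
```

In a real workflow the frozen part would be a large convolutional network and the head would be trained on the new labeled images, but the division of labor is the same: reuse the learned features and fit only what is specific to the new images.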

Let’s Get Started

In this tutorial, you will:

  • classify images of lions and tigers using a pre-trained model;

  • retrain the pre-trained model with additional labeled images and use it for image classification (transfer learning);

  • analyze the model’s architecture with a TensorBoard webapp.

When finished, you’ll have built the Flow below.

Dataiku screenshot of the final flow for the image classification tutorial.

Note

You can also visit the Lion and Tiger project to see a completed version of a similar project.

Prerequisites

Note

You can find the instructions for installing plugins in the product documentation. To check whether the plugin is already installed on your instance, go to the Installed tab in the Plugin Store to see a list of all installed plugins.

Note that this plugin is not available for Dataiku Online.

Create the Project

  • From the Dataiku homepage, click +New Project > DSS Tutorials > ML Practitioner > Image Classification - The Visual Way (Tutorial).

Note

You can also download the starter project from this Dataiku download page and import it as a zip file.

Explore the Data

In the Flow, you can see two folders of images, Images to classify and Images for retraining, as well as a Python recipe and its output dataset, Labels, which we’ll discuss later.

Image classification project in Dataiku, showing two folders.

Take a moment to browse the Images to classify folder to get a sense of the images we’ll be classifying.
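As a rough illustration of how a recipe could produce a dataset like Labels from a folder of images, the sketch below derives one label per image from its file name. This is an assumption, not a description of the project's actual recipe: the `<label>_<number>.jpg` naming pattern, the example paths, and the function names are hypothetical.

```python
from pathlib import PurePosixPath


def label_from_path(path):
    """Return the label encoded in a file name such as '/tiger_012.jpg'.

    Assumes a hypothetical '<label>_<number>.<ext>' naming convention.
    """
    stem = PurePosixPath(path).stem      # e.g. 'tiger_012'
    return stem.split("_")[0].lower()    # e.g. 'tiger'


def build_labels(paths):
    """Build one row per image, with its path and its derived label."""
    return [{"path": p, "label": label_from_path(p)} for p in paths]


# Made-up paths standing in for the contents of an image folder.
paths = ["/lion_001.jpg", "/tiger_002.jpg", "/Tiger_003.JPG"]
rows = build_labels(paths)
```

A table of this shape, with one column for the image path and one for the label, is the kind of input that retraining on labeled images typically needs.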