Classify Images with the Pre-Trained Model

Let’s use the pre-trained model we just downloaded to classify the images in the Images to classify folder.

  • With the Images to classify folder selected, click Deep learning on images from the plugin recipe section of the Actions menu.

  • Choose Image classification (v2).

  • Set Images to classify as the “Image folder” and Pre-trained model (imagenet) as the “Model folder”.

  • Create a new output dataset Classification.

  • Click Create Dataset, and then click Create.

Dataiku screenshot of the dialog for an image classification recipe.

Next, adjust the recipe settings.

  • In the Image classification dialog, set the “Max number of class labels” to 1 since we want the model to make a single prediction for each image.

  • Run the recipe.

Dataiku screenshot of the settings for an image classification recipe.

The resulting dataset contains a column with the predictions. Each prediction is a simple JSON object containing the predicted label and the model’s estimated probability that the label is correct.
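To make the shape of a prediction concrete, here is a minimal Python sketch that reads one prediction cell outside of Dataiku. The sample string and the helper name top_prediction are hypothetical, and the sketch assumes the JSON maps each label to its probability; check your own Classification dataset for the exact format.

```python
import json

# Hypothetical prediction cell. The label-to-probability shape is an
# assumption; inspect the Classification dataset for the real format.
raw_prediction = '{"tiger": 0.9731}'


def top_prediction(raw):
    """Return the (label, probability) pair with the highest probability."""
    scores = json.loads(raw)
    label = max(scores, key=scores.get)
    return label, scores[label]


label, probability = top_prediction(raw_prediction)
print(label, probability)
```

Because “Max number of class labels” is set to 1, each cell should hold a single label, so the max lookup only matters if you later raise that setting.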

Dataiku screenshot of the classification dataset.

Prepare the Output from the Pre-trained Model

Manually scanning the predictions to see which are correct is time-consuming and error-prone, so we’ll use a Prepare recipe to find the correct and incorrect classifications.

  • From the Actions menu of the Classification dataset, select the Prepare recipe.

  • In the recipe creation dialog, rename the output dataset Classification_results, and then click Create Recipe.

Extract the labels from the filenames.

  • From the images column dropdown, select More actions > Find and replace….

  • Type labels as the output column name.

  • With “Regular expression” as the matching mode, copy-paste .*_(.*)\..* as the regular expression and $1 as the replacement value.
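If you want to check what this regular expression captures before applying it, the following Python sketch reproduces the replacement. The filenames are hypothetical examples, and Python writes the first capture group as \1 where the Prepare recipe uses $1.

```python
import re

# Same pattern as in the recipe step: keep the text between the last
# underscore and the last dot of the filename.
LABEL_PATTERN = re.compile(r".*_(.*)\..*")


def label_from_filename(filename):
    """Replace the whole filename with its first capture group."""
    return LABEL_PATTERN.sub(r"\1", filename)


# Hypothetical filenames, used only for illustration.
for name in ["img_001_lion.jpg", "photo_tiger.png"]:
    print(name, "->", label_from_filename(name))
```

A filename without an underscore does not match the pattern and is left unchanged, so the approach relies on every image being named with its label just before the extension.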

Extract the prediction from the JSON.

  • From the prediction column dropdown, select More actions > Find and replace….

  • With “Regular expression” as the matching mode, copy-paste .*"(.*)".* as the regular expression and $1 as the replacement value.
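The same check works for the prediction pattern. This is a sketch under the assumption that a prediction cell looks like the hypothetical strings below, with the label as the only quoted text; as before, \1 stands in for the recipe’s $1.

```python
import re

# Same pattern as in the recipe step: capture the text between the
# last pair of double quotes in the cell.
PREDICTION_PATTERN = re.compile(r'.*"(.*)".*')


def label_from_prediction(cell):
    """Replace the whole cell with its first capture group."""
    return PREDICTION_PATTERN.sub(r"\1", cell)


# Hypothetical cell contents; the real JSON shape may differ.
print(label_from_prediction('{"tiger": 0.9731}'))
print(label_from_prediction('{"tiger": 0.7, "lion": 0.2}'))
```

With a single label the pattern returns that label. With several labels in one cell, the greedy leading .* makes it return the last quoted label rather than the most probable one, which is one more reason to keep “Max number of class labels” at 1 for this step.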

Add one more step to flag whether each prediction matches its label.

  • Click Add a New Step and choose Formula from the processors library.

  • Type good_prediction as the name of the output column.

  • Copy-paste if(labels==prediction,1,0) as the expression.

  • Sort the new good_prediction column in ascending order so that the misclassified rows (value 0) appear first.
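The formula and sort steps above can be sketched in plain Python as follows. The rows are made-up examples rather than results from the tutorial dataset, and the column names simply mirror the ones created in the Prepare recipe.

```python
# Hypothetical rows, for illustration only (not the tutorial's data).
rows = [
    {"labels": "lion", "prediction": "lion"},
    {"labels": "tiger", "prediction": "tiger"},
    {"labels": "lion", "prediction": "cheetah"},
]


def good_prediction(row):
    """Mirror if(labels==prediction,1,0): 1 for a match, 0 otherwise."""
    return 1 if row["labels"] == row["prediction"] else 0


for row in rows:
    row["good_prediction"] = good_prediction(row)

# Ascending sort puts the misclassified rows (value 0) at the top.
rows.sort(key=lambda row: row["good_prediction"])

misclassified = [row for row in rows if row["good_prediction"] == 0]
accuracy = sum(row["good_prediction"] for row in rows) / len(rows)
print(len(misclassified), "misclassified; accuracy:", round(accuracy, 2))
```

The comparison is an exact string match, so a prediction such as “Lion” or a more specific class name would count as a miss even if a person would consider it correct.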

Right out of the box, the pre-trained model can classify most of our images of lions and tigers! In this case, only three images were misclassified as other animals.

Dataiku screenshot of a prepared classification dataset, showing three misclassified images.

Finally, click Run to create the output dataset and return to the Flow.


Check the misclassified images to see why the model may have struggled with them!