Tutorial | Object detection without code#
In this tutorial, you will learn how to use Dataiku’s built-in deep learning model to detect and classify objects in a set of images. Specifically, you will:
Use a pre-trained model to find and classify microcontrollers in images.
Learn how to evaluate and fine-tune the model.
Feed test images to the model to test its performance.
When finished, you’ll have built the Flow below:
Prerequisites#
A Dataiku instance (version 11.3 or above). Free edition is enough; Dataiku Online is not compatible.
An object detection code environment, which you can set up in the Applications menu of Dataiku under Administration > Settings > Misc.
Create the project#
To import this project, go to the Dataiku homepage and select +New Project > DSS tutorials > ML Practitioner > Object Detection without Code.
This tutorial uses the Microcontroller Object Detection dataset, which includes images with four models of microcontrollers — ESP8266, Arduino Nano, Heltec ESP32 Lora, and Raspberry Pi 3. We will build a model that detects whether and where these objects are present in the images, and then labels which microcontrollers are present.
Explore the data#
In the Object Detection without Code project, you’ll see one folder, images, which contains 148 images, and a dataset, annotations, which contains object labels for our model. As with image classification in Dataiku, the images must be stored in managed folders.
To create an object detection model, the images also must be labeled with bounding boxes around the objects of interest. These bounding boxes are represented in a tabular data file that will be input into the model.
Let’s open the annotations file. As you can see, it has two columns:
The record_id of each image file.
label, which is an array of the bounding boxes for the objects we’re interested in from each image.
See the reference documentation for more detailed information about the correct formats for images and label inputs for object detection. If your images are annotated in Pascal VOC or COCO format, you can use the plugin Image Annotations to Dataset to reformat the labels.
Note that an image can have multiple bounding boxes, either for several objects of the same class or for objects of different classes.
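To make the label format concrete, here is a minimal Python sketch that counts the bounding boxes per class in one label value. It assumes each label is a JSON array of objects with a bbox key holding [x, y, width, height] in pixels and a category key. Check the reference documentation for the exact schema. The class names and coordinates below are invented for illustration.

```python
import json
from collections import Counter

# Hypothetical label value for one image (assumed schema, invented numbers).
sample_label = json.dumps([
    {"bbox": [120, 80, 200, 150], "category": "Raspberry_Pi_3"},
    {"bbox": [400, 210, 90, 60], "category": "ESP8266"},
    {"bbox": [30, 300, 95, 55], "category": "ESP8266"},
])


def summarize_label(label_json):
    """Count bounding boxes per class in one label value."""
    boxes = json.loads(label_json)
    return Counter(box["category"] for box in boxes)


counts = summarize_label(sample_label)
print(dict(counts))
```

Here a single image carries three boxes across two classes, which is the situation described above.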
It is also helpful to view the images with their bounding boxes, which you can do using the image view toggle in the upper right corner of the dataset. Dataiku uses the record_id in the annotations dataset matched with the images from the image folder to display the images.
To view the dataset as images, you’ll need to turn on image view.
Navigate to the dataset Settings tab, then the Advanced subtab.
Scroll down to Image view and select the checkbox Display image view.
Set the Image folder to images and the Path column to record_id.
To be able to see the bounding boxes, check the box Has annotations, and select label for the Annotation column.
Change the Annotation type to Object detection, then click the Auto detect classes button. You should see all four classes appear.
When you return to the dataset Explore tab, you should now see the image view toggle in the top right.
Click the Image view button to bring up the image view.
Once in image view, you can click on any image to bring up a detailed view showing the classes and bounding boxes. You can also scroll through the images from here.
Prepare data for the model#
Before creating a model, we’ll split the images into training and testing sets, so we can later test how the model performs on images it has never seen. We’ll do this by splitting the annotations file into training and testing datasets, and each of those will point to different sets of images in the image folder.
From the Flow, select the annotations dataset, and click on the Split recipe from Visual recipes in the Actions tab.
In the New Split recipe window, add two output datasets, called annotations_train and annotations_test, then Create the recipe.
In the recipe settings, choose Randomly dispatch data from the splitting methods.
Put 80% of the data into annotations_train, leaving the other 20% in annotations_test.
Save and Run the recipe.
Back in the Flow, you now have two sets of annotations to use in modeling and testing.
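For intuition, the random dispatch can be sketched in plain Python. This is an illustration only, not the Split recipe's implementation: it assumes each row is sent to the training set independently with probability 0.8, and the function name, seed, and record ids are hypothetical.

```python
import random


def random_dispatch(records, train_ratio=0.8, seed=42):
    """Send each record to train with probability train_ratio, else to test."""
    rng = random.Random(seed)
    train, test = [], []
    for record in records:
        (train if rng.random() < train_ratio else test).append(record)
    return train, test


# 148 hypothetical record ids standing in for the rows of annotations.
record_ids = [f"image_{i:03d}.jpg" for i in range(148)]
annotations_train, annotations_test = random_dispatch(record_ids)
print(len(annotations_train), len(annotations_test))
```

Under this assumption the output sizes are close to, but not exactly, 80% and 20% of the rows, so your counts may differ slightly from run to run.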
Fine-tune a pre-trained model#
After exploring and splitting the data, we are ready to use the pre-trained object detection model in Dataiku.
Before building an object detection model, you must set up a specific code environment. Go to Administration > Settings > Misc. In the section DSS internal code environment, create an Object detection code environment by selecting your Python interpreter and clicking on Create the environment. Dataiku will install all the required packages for object detection and add the new code environment to the Code Envs tab.
Create the model#
In this section, we’ll create a model in the Lab, and later we’ll deploy it in the Flow to test it on new images.
In the Flow, highlight the annotations_train dataset and go to Lab, then choose Object Detection.
The next window asks you to define the model’s target, or what objects it will detect and classes it will predict. Choose the label column.
For the Image folder, choose the images folder to tell the model where to find the images for training.
Name your model or leave the default name and select Create.
Dataiku creates the object detection model, adds it to the Lab, and opens the model Design tab where you can preview images and settings. Before training the model, let’s review the input and settings. With a training ratio of 80%, our model will have 98 training images and 22 test images.
Check the model settings#
The Basic > Target panel shows the Target column and Image location we input when creating the model. It also recognizes the Path column so the model can find each image. Double-check that these settings are correct.
Under Target classes, the model automatically recognizes that we have created an object detection task with four classes of potential objects in our images. You can preview the images and filter them by selecting each class in the bar chart or the filter above the images.
As with image classification, Dataiku automatically splits the images into training and validation sets so the model can continually test its performance during the training phase. You can see these settings under Basic > Train/Test Set.
In the Training panel you’ll find the settings for fine-tuning the pretrained model. There is only one potential model to use — the Faster R-CNN, which is a faster-performing iteration of a neural network model that divides images into regions to detect objects more quickly.
Also in the Model section, you can specify how many layers of the model to retrain on your images. The Faster R-CNN model contains multiple layers pre-trained on millions of images, and the more layers you retrain, the more specialized your model will be on your images. But the model will also take much longer to train.
We’ll use the default setting of 0 under Number of fine-tuned layers. Because Dataiku’s models always retrain the final layer, this means one layer will be fine-tuned.
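The relationship between the setting and the number of retrained layers can be sketched as follows. This is a simplified illustration, assuming that a value of N retrains the final layer plus the N layers before it. The layer names are hypothetical and do not reflect the real Faster R-CNN architecture.

```python
def layers_to_retrain(layer_names, n_fine_tuned_layers):
    """Return the layers left trainable: the final layer plus n more before it."""
    count = min(len(layer_names), n_fine_tuned_layers + 1)
    return layer_names[-count:]


# Hypothetical, simplified layer names ordered from input to output.
layers = ["backbone_1", "backbone_2", "backbone_3", "region_proposal", "box_predictor"]

print(layers_to_retrain(layers, 0))  # default setting: only the final layer
print(layers_to_retrain(layers, 2))  # final layer plus the two before it
```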
In the Optimization and Fine-tuning sections, values are set to industry standards, and in most cases you will not change these.
Using Early stopping, or stopping the training after the model shows no improvement after cycling through the images a number of times, is recommended to cut down on processing time. For purposes of this tutorial, if you want the model to finish training more quickly, change the Early stopping patience to 3 epochs, or three cycles through the images.
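As a rough illustration of what patience means, the sketch below stops once the validation score has failed to improve for a given number of consecutive epochs. It is a generic early-stopping loop, not Dataiku's implementation. The function name and the scores are invented.

```python
def train_with_early_stopping(val_scores, patience=3):
    """Return (epochs_run, best_score), stopping after `patience` epochs without improvement."""
    best_score = float("-inf")
    epochs_without_improvement = 0
    epochs_run = 0
    for score in val_scores:
        epochs_run += 1
        if score > best_score:
            best_score = score
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return epochs_run, best_score


# Invented per-epoch validation scores (higher is better).
scores = [0.41, 0.55, 0.62, 0.61, 0.62, 0.60, 0.59, 0.70]
epochs_run, best = train_with_early_stopping(scores, patience=3)
print(epochs_run, best)
```

With a patience of 3, training stops at epoch 6 and never reaches the later improvement at epoch 8, which shows the trade-off: a lower patience saves time but can end training before a late gain.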
Train the model#
When you are finished viewing the settings, select Save, then Train at the top right to begin training the model. In the window, give your model a name or use the default and select Train.
If your Dataiku instance is running on a server with a GPU, you can activate the GPU for training so the model can process much more quickly. Otherwise, the model will run on CPU and the training will take longer, perhaps more than an hour.
Model metrics and explainability#
In the previous Introduction lesson, we created and trained a model to detect microcontrollers in images. In this lesson, we’ll view metrics to understand the model and its performance.
From the Result tab, click on the model name to navigate to the model report.
The Summary shows a report of the model’s training information, including the classes, epochs trained, and other parameters. Our model had an average precision of 0.813 (your results will vary). As we’ll see, the closer this number is to 1, the stronger the model.
We can also view a number of other metrics.
You can upload new images directly into the model to find objects, classify them, and view the probabilities of each class. This information can give you insight into how the algorithm works and potential ways to improve it.
To see how this works, we’ll input a new image our model has not seen before.
Download the file microcontroller_what-if, which is an image that contains two controllers: a Raspberry Pi and an ESP8266.
Navigate to What if? on the left panel.
Either drag and drop the new image onto the screen or click Browse for images and find the image to upload.
The model correctly identifies both controllers in the image, with a relatively high confidence for the Raspberry Pi but lower confidence of about 66% for the ESP. We also do not see any false positive detections on other objects in the image.
You can hover over the class names on the right to highlight the detected objects in the image. You also may upload more images to the What if? panel.
Dataiku calculates the number of true positive and false positive detections in the testing set of images, then inputs those values into a confusion matrix.
Navigate to the Confusion matrix panel under Explainability.
Though your results will vary, we can see in this example that the model correctly predicted six of the ESP8266 chips, but did not detect one of them. It also detected six objects that were not the microchips (in the “Not an object” row) and classified them as an ESP8266.
The true and false positive calculations in the confusion matrix are based on two measures: the Intersection over union, or IOU, and the Confidence score.
The Confidence score is the probability that a predicted box contains the predicted object.
IOU is a measure of the overlap between the model’s predicted bounding box and the ground truth bounding box provided in the training images. By comparing the IOU to an overlap threshold that you set, Dataiku can tell if a detection is correct (a true positive), incorrect (a false positive), or completely missed (a false negative).
The IOU threshold is the lowest acceptable overlap for a detection to be considered a true positive. For example, with a threshold of 50%, the model’s detected object must overlap the true object by 50% or more to be considered a true detection. If a detection overlaps by only 40%, it would be considered a false positive.
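The IOU calculation itself is short. The sketch below assumes boxes are given as (x, y, width, height) tuples in pixels, with invented coordinates. It only checks the overlap and leaves out the class comparison.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


def is_true_positive(predicted, ground_truth, iou_threshold=0.5):
    """A detection counts as correct when its IOU reaches the threshold."""
    return iou(predicted, ground_truth) >= iou_threshold


ground_truth = (100, 100, 200, 100)
good_prediction = (110, 105, 200, 100)  # slightly shifted box
poor_prediction = (200, 150, 200, 100)  # overlaps only one corner

print(round(iou(good_prediction, ground_truth), 3), is_true_positive(good_prediction, ground_truth))
print(round(iou(poor_prediction, ground_truth), 3), is_true_positive(poor_prediction, ground_truth))
```

Note that IOU divides the shared area by the combined area of both boxes, so a predicted box that covers the object but is much too large still scores low.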
You can change the IOU threshold and the confidence score threshold at the top of the confusion matrix. Experiment with different settings and note how the values in the matrix change.
We can see that raising the IOU threshold and the confidence score threshold means the model misses many more objects. In fact, in our example, it missed all of the Arduino objects and nearly all of the others.
You can return to the optimal settings with the button at the top right.
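To see why the counts shift with the two thresholds, here is a simplified sketch for one image and one class. It assumes a greedy matching in which detections are processed from highest to lowest confidence and each ground truth box can be matched once. The tutorial does not describe Dataiku's exact matching rules, and all boxes and scores below are invented.

```python
def iou(a, b):
    """Intersection over union of two (x, y, width, height) boxes."""
    inter_w = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - intersection
    return intersection / union if union > 0 else 0.0


def count_detections(ground_truths, predictions, iou_threshold, confidence_threshold):
    """Return (true_positives, false_positives, false_negatives)."""
    # Drop low-confidence detections, then process the rest from most to least confident.
    kept = sorted(
        (p for p in predictions if p["confidence"] >= confidence_threshold),
        key=lambda p: p["confidence"],
        reverse=True,
    )
    unmatched = list(ground_truths)
    tp = fp = 0
    for pred in kept:
        best = max(unmatched, key=lambda gt: iou(pred["bbox"], gt), default=None)
        if best is not None and iou(pred["bbox"], best) >= iou_threshold:
            unmatched.remove(best)  # each ground truth box is matched at most once
            tp += 1
        else:
            fp += 1
    # Ground truth boxes left unmatched are the missed objects.
    return tp, fp, len(unmatched)


ground_truths = [(100, 100, 200, 100), (400, 300, 100, 100)]
predictions = [
    {"bbox": (110, 105, 200, 100), "confidence": 0.92},
    {"bbox": (420, 310, 100, 100), "confidence": 0.66},
    {"bbox": (600, 50, 80, 80), "confidence": 0.55},
]

print(count_detections(ground_truths, predictions, 0.5, 0.5))
print(count_detections(ground_truths, predictions, 0.75, 0.8))
```

With the stricter settings, the false positive disappears but one object is now missed, which mirrors the behavior you see when moving the sliders.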
Double click on any image on the right to view the ground truth and predicted object bounding boxes, the object class, and confidence level. You can filter the images displayed by clicking on any value in the confusion matrix. For example, click on the cell representing the Raspberry Pi true positive detections.
Navigate to the Metrics panel under Performance, where we can view detailed information about the model’s performance for each type of object.
The evaluation metric is Average precision, which measures how well your model performs. The closer to 1 this score is, the better your model is performing on the test set. The average precision measures the area under the precision-recall curve, which you can view under Precision-Recall.
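As a sketch of how an area under the precision-recall curve can be computed, the function below ranks detections by confidence and sums precision over each increase in recall. It uses a plain rectangle rule with no interpolation. Evaluation tools often use interpolated variants, and the tutorial does not say which one Dataiku applies. The detections are invented.

```python
def average_precision(detections, n_ground_truth):
    """Area under the precision-recall curve for one class.

    detections: list of (confidence, is_true_positive) pairs.
    n_ground_truth: number of labeled objects of that class.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    previous_recall = 0.0
    area = 0.0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / n_ground_truth
        # Only true positives raise recall, so false positives add no area
        # but lower the precision of every later step.
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


# Invented detections for one class with four labeled objects.
detections = [(0.95, True), (0.90, True), (0.80, False), (0.70, True), (0.60, False)]
ap = average_precision(detections, n_ground_truth=4)
print(ap)
```

A model that ranks every true detection above every false one and finds all objects scores 1, while missed objects and high-confidence false positives both pull the score down.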
In this case, we can see that the model performed best when detecting Heltec ESP32 Lora controllers with an IOU of 50%. It makes sense that the model’s performance declines as the IOU increases, because the overlap threshold for counting a true positive detection is higher.
Test the object detection model#
Now that we have a fine-tuned model and understand its metrics from the previous Model metrics and explainability lesson, we can deploy it to the Flow and test it on a new set of labeled images.
Deploy the model to the Flow#
First, we’ll deploy the model to the Flow.
If you close the model before deploying it, it will not appear in the Flow. To find the model, you can click on the annotations_train dataset and go to the Lab, or select the Visual Analyses menu from the top navigation bar.
From the model Report tab where we viewed the performance metrics, click on Deploy in the top right corner.
In the Deploy prediction model info window, select Create.
Your training session and model now appear in the Flow.
Detecting objects on new images#
We can now use the model to make object detections on new images that we held aside in the annotations_test dataset.
Click on the annotations_test dataset and select the Predict recipe under Other recipes in the Actions tab of the right panel.
In the info window, select the model you just trained for the Prediction Model and the images folder under Managed folder.
Click Create to open the scoring recipe settings.
In the settings, note that you can change the confidence score, batch size, and other settings. Leave the defaults, and click Run to start the scoring process.
When the scoring finishes, click on Explore dataset annotations_test_scored to view the model’s predictions.
For each image, the dataset lists the label, or the ground truth object bounding boxes, and the prediction, or the model’s predicted bounding boxes. Some of the predictions might be empty, meaning the model did not detect an object in those images.
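If you export the scored rows, a few lines of Python can list the images with no detections. This sketch assumes the prediction column holds a JSON array shaped like the label column with an added confidence value, and that an image without detections has an empty array or a blank cell. Both are assumptions, and the rows are invented.

```python
import json

# Hypothetical rows from a scored dataset (assumed schema, invented values).
scored_rows = [
    {
        "record_id": "img_001.jpg",
        "prediction": '[{"bbox": [10, 20, 100, 80], "category": "ESP8266", "confidence": 0.91}]',
    },
    {"record_id": "img_002.jpg", "prediction": "[]"},
    {"record_id": "img_003.jpg", "prediction": ""},
]


def has_no_detection(prediction_value):
    """True when the prediction cell is blank or an empty JSON array."""
    if not prediction_value or not prediction_value.strip():
        return True
    return len(json.loads(prediction_value)) == 0


missed = [row["record_id"] for row in scored_rows if has_no_detection(row["prediction"])]
print(missed)
```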
It’s most useful to view the predictions as images, rather than tabular data. Select the Image view icon in the top right.
You can click on any image to view the model’s prediction, which gives you further information about the model’s performance. The images show the predicted bounding box or boxes, along with the record_id and the label, which tells us the correct class.
For this image, we can see that the model correctly detected a Heltec ESP32 Lora chip.
In the following image, we can see that the model correctly detected an object, but couldn’t correctly identify it and applied two different labels.
Congratulations on building your first object detection model in Dataiku! You’re ready to create a new model on your own images.
You also might want to learn how to build image classification models in Dataiku with this tutorial.