Train machine learning models

See a screencast covering this section’s steps

Now that the train and test data are in a separate Flow zone, let’s start creating models on the training data!

Create an AutoML prediction task

The first step is to define the basic parameters of the machine learning task at hand.

  1. Click on the top-right corner of the Machine Learning Flow zone to open it.

  2. Select the train dataset.

  3. Navigate to the Lab tab of the right side panel.

  4. Among the menu of visual ML tasks, select AutoML Prediction.

Dataiku screenshot of the interface for selecting an autoML prediction task.

Next, choose the target variable and the kind of models you want to build.

  1. Choose fraudulent as the target variable on which to create the prediction model.

  2. Click Create, keeping the default setting of Quick Prototypes.

Dataiku screenshot of the dialog for creating an AutoML prediction task.
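For readers who prefer code, the same task can also be created through Dataiku's Python API. The following is a minimal sketch, not the tutorial's official method: it assumes the dataset is named train, and it assumes that the in-memory Python backend ("PY_MEMORY") and the "DEFAULT" guess policy are the closest match to the Quick Prototypes option. The helper name create_fraud_prediction_task is hypothetical.

```python
def create_fraud_prediction_task(project, dataset_name="train", target="fraudulent"):
    """Create an AutoML prediction task, roughly mirroring the visual steps.

    `project` is expected to be a project handle from the Dataiku Python API
    (a dataikuapi DSSProject). The dataset name and target default to the
    values used in this tutorial.
    """
    # Assumption: "PY_MEMORY" and "DEFAULT" are taken here as the closest
    # equivalent of keeping the Quick Prototypes setting in the dialog.
    mltask = project.create_prediction_ml_task(
        input_dataset=dataset_name,
        target_variable=target,
        ml_backend_type="PY_MEMORY",
        guess_policy="DEFAULT",
        wait_guess_complete=True,  # wait until Dataiku has guessed the design
    )
    return mltask


# Inside Dataiku (for example in a notebook), a project handle can be obtained with:
#   import dataiku
#   project = dataiku.api_client().get_default_project()
#   mltask = create_fraud_prediction_task(project)
```

The function takes the project handle as a parameter rather than creating a client itself, so it can be reused from a notebook, a scenario step, or an external script that builds its own client.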

Important

In addition to AutoML Prediction shown here, many other types of models can be built in a similar manner. Among visual options, you could also build time series, clustering, image classification, object detection, or causal prediction models.

You can also mix code for custom preprocessing or custom algorithms into visual models. Alternatively, those wanting to go the full code route should explore the Developer Guide.

Train models with the default design

Based on the characteristics of the input training data, Dataiku has automatically prepared the design of the model. But no models have been trained yet!

  1. Before adjusting the design, click Train to start a model training session.

  2. If prompted, click Train again to confirm.

Dataiku screenshot of the dialog to train an ML model.
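The training step can be sketched in code as well. This example assumes you already hold an ML task handle from the Dataiku Python API (a dataikuapi DSSMLTask) for the task created earlier. The helper name train_default_design and the session name are hypothetical. It trains one session with the unmodified design and collects the performance metrics of each resulting model.

```python
def train_default_design(mltask, session_name="default design"):
    """Train one session with the current design and summarize the models.

    Returns a dict mapping each trained model id to its performance metrics.
    """
    # train() launches a training session, waits for it to complete,
    # and returns the identifiers of the models it produced.
    model_ids = mltask.train(session_name=session_name)

    summary = {}
    for model_id in model_ids:
        details = mltask.get_trained_model_details(model_id)
        # For prediction models, the details object exposes the metrics
        # computed on the test set as a dict of metric name -> value.
        summary[model_id] = details.get_performance_metrics()
    return summary
```

Because no design settings are changed before the call, the result should correspond to the session started by clicking Train in the interface, although the exact set of algorithms depends on what Dataiku selected for the design.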