Hands-On Tutorial: What-If Analysis With Interactive Scoring

The interactive scoring feature of a Dataiku visual model enables anyone to run “what-if” analyses and compare multiple test cases from real or hypothetical situations. In this hands-on exercise, learn how to use an interactive scoring tool that has been embedded in a dashboard.

Getting Started

You will need a Dataiku DSS project with a dataset that you can use to create a predictive model. We will use the Haiku Starter project as an example. There are two ways you can import this project:

  • From the Dataiku DSS homepage, click +New Project > DSS Tutorials > General Topics > Haiku Starter.

  • Download the zip archive for your version of Dataiku DSS, then from the Dataiku DSS homepage, click +New Project > Import project and choose the zip archive you downloaded.

You should now be on the project’s homepage.

  • Go to the project’s list of dashboards, open the Analytic dashboard, and go to the What-If Analysis slide.

This slide contains an interactive scoring tile. The model underlying this tile predicts whether a customer of the Haiku T-Shirt company is a “high value” customer. We may be interested in assessing the likelihood that a new customer will become “high value” for various profiles.

Dashboard slide containing an interactive scoring tile for what-if analysis, with default values shown

Specifying a What-If Scenario

The left-hand side of the tile displays the interactive simulator, where you can configure all the input feature values. The right-hand side displays the result of the prediction, along with explanations of which features contribute most strongly to this prediction.

The default values are based on the training set for the model, and use the medians for numerical features and the most common values for categorical features. To create your own custom scenario, simply change the values:

  • The feature order_date_year_distinct is a count of the number of distinct years in which the customer has made an order. For a new customer, we won’t know this value, so click the Ignore feature button so that no value is passed to the model.

  • Do the same for the features order_day_of_week_distinct and order_date_month_distinct.

  • The default value of ip_address_country is “United States” and that of ip_address_city is “Tokyo”, because each is the most common value for its own feature, but the combination is not realistic. Change the value of ip_address_country to “Japan”.

  • Set ip_address_geopoint to Ignore feature.

We can see that this changes the predicted probability of the customer being “high value” from 24% to 28%.

Dashboard slide containing an interactive scoring tile for what-if analysis, with custom values shown
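
To make the default-value rule concrete, here is a minimal sketch of how such defaults could be derived and then customized. It is not Dataiku's implementation: the rows in train_rows are made-up illustrative data, the helper names default_scenario and customize are hypothetical, and representing an ignored feature as None is an assumption.

```python
from collections import Counter
from statistics import median

# Hypothetical stand-in for the training set; the values are illustrative only.
train_rows = [
    {"order_date_year_distinct": 1, "ip_address_country": "United States", "ip_address_city": "Tokyo"},
    {"order_date_year_distinct": 1, "ip_address_country": "Japan", "ip_address_city": "Tokyo"},
    {"order_date_year_distinct": 2, "ip_address_country": "United States", "ip_address_city": "New York"},
    {"order_date_year_distinct": 3, "ip_address_country": "France", "ip_address_city": "Paris"},
    {"order_date_year_distinct": 1, "ip_address_country": "United States", "ip_address_city": "Tokyo"},
]


def default_scenario(rows):
    """Use the median for numerical features and the most common value otherwise."""
    defaults = {}
    for feature in rows[0]:
        values = [row[feature] for row in rows]
        if all(isinstance(v, (int, float)) for v in values):
            defaults[feature] = median(values)
        else:
            defaults[feature] = Counter(values).most_common(1)[0][0]
    return defaults


def customize(scenario, overrides=None, ignored=()):
    """Return a copy with overrides applied; ignored features become None (an assumption)."""
    custom = {**scenario, **(overrides or {})}
    for feature in ignored:
        custom[feature] = None
    return custom


defaults = default_scenario(train_rows)
custom = customize(
    defaults,
    overrides={"ip_address_country": "Japan"},
    ignored=["order_date_year_distinct"],
)
print(defaults)
print(custom)
```

Because each default is computed one column at a time, the sketch reproduces the mismatch seen in the tile: the most common country and the most common city need not belong together.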

Note

The display for the prediction depends upon whether the model performs binary classification, multi-class classification, or regression. In this hands-on exercise, the model is a binary classifier, so the display shows the model-predicted probability of each category and the classification threshold.

In a multi-class classification problem, the display shows the model-predicted probability of each category, and the predicted category.

Interactive scoring display for a multi-class classification model

In a regression problem, the display shows the model-predicted probability density function for the target, and the predicted value.

Interactive scoring display for a regression model
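
The relationship between the displayed probabilities and the predicted category can be sketched as follows. This is only an illustration: the 0.5 threshold, the class labels, and the choice of "greater than or equal to" at the threshold are assumptions, not values taken from the Haiku model.

```python
def predict_binary(probability, threshold, positive="high value", negative="not high value"):
    """Binary case: the positive class is predicted once its probability reaches the threshold."""
    return positive if probability >= threshold else negative


def predict_multiclass(probabilities):
    """Multi-class case: the predicted category is the one with the highest probability."""
    return max(probabilities, key=probabilities.get)


# 0.28 is the probability from the custom scenario; 0.5 is a hypothetical threshold.
binary_result = predict_binary(0.28, threshold=0.5)

# Hypothetical class probabilities for a three-class model.
multiclass_result = predict_multiclass({"low": 0.2, "medium": 0.5, "high": 0.3})

print(binary_result, multiclass_result)
```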

Comparing What-If Scenarios

A primary benefit of what-if analysis is the ability to compare different scenarios. To do this in Dataiku DSS:

  • Click Add to Comparator.

  • Change ip_address_country and ip_address_city to “United States” and “New York”, respectively, then click Add to Comparator.

  • Change ip_address_country and ip_address_city to “France” and “Paris”, respectively, then click Add to Comparator.

Interactive scoring tile where multiple scenarios have been added to the comparator

  • Click Compare to open the comparator.

You can quickly see from the display how the probability of being “high value” changes from scenario to scenario.

Comparing multiple scenarios

Scrolling down the comparator display, you can quickly see which features are different across the scenarios; the features that are the same are greyed out.

Comparing multiple scenarios
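
The comparator's distinction between differing and identical features can be sketched in a few lines. The scenario dictionaries below mirror the three scenarios added above; the helper name differing_features and the use of None for ignored features are assumptions made for illustration.

```python
def differing_features(scenarios):
    """Return the features whose values are not identical across all scenarios."""
    return [
        feature
        for feature in scenarios[0]
        if len({scenario[feature] for scenario in scenarios}) > 1
    ]


# Three scenarios that differ only in country and city; None marks an ignored feature.
scenarios = [
    {"order_date_year_distinct": None, "ip_address_country": "Japan", "ip_address_city": "Tokyo"},
    {"order_date_year_distinct": None, "ip_address_country": "United States", "ip_address_city": "New York"},
    {"order_date_year_distinct": None, "ip_address_country": "France", "ip_address_city": "Paris"},
]

changed = differing_features(scenarios)
unchanged = [feature for feature in scenarios[0] if feature not in changed]
print("differs:", changed)
print("same (greyed out):", unchanged)
```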

Copying Values into a What-If Scenario

It is time-consuming to update every feature value in a scenario by hand. Instead, you can copy a row from a DSS dataset into the interactive scoring tile to create a scenario.

The What-If Analysis slide has a dataset tile below the interactive scoring tile.

  • Right-click on a row in the dataset and select Copy row as JSON.

Copying a row of a dataset

  • In the interactive scoring tile, click and select Paste features, then follow the instructions in the dialog.

  • A dialog appears to confirm the values you will paste. Click Paste.

Confirmation dialog for pasting a row
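
Conceptually, pasting a copied row amounts to parsing the JSON and overwriting the matching feature values. The sketch below assumes the copied row is a flat JSON object mapping column names to values and that columns the model does not use are skipped; both are assumptions, and the helper name paste_features is hypothetical.

```python
import json


def paste_features(scenario, copied_json):
    """Overwrite scenario values with those from a copied row, skipping unknown columns."""
    row = json.loads(copied_json)
    updated = dict(scenario)
    for name, value in row.items():
        if name in updated:
            updated[name] = value
    return updated


# Current scenario in the tile (illustrative values).
scenario = {"ip_address_country": "Japan", "ip_address_city": "Tokyo"}

# Hypothetical clipboard content; "customer_id" stands for a column the model does not use.
copied = '{"customer_id": "abc123", "ip_address_country": "France", "ip_address_city": "Paris"}'

pasted = paste_features(scenario, copied)
print(pasted)
```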