Solution | Next Best Offer for Banking#
Overview#
Business Case#
Marketing campaigns are most impactful when intelligently targeted. Sending the same material to the whole customer base is not only costly, but also ineffective. Knowing how likely a group of customers is to subscribe to a product is essential to make the right decision.
This solution complements the customer segmentation solution within the marketing suite for banking. Users can plug in the same data used for that solution and build an initial model. The Dataiku app lets users control how target customers are defined for the marketing campaign.
Additionally, the solution integrates with the Advisor plugin. When users hover over a client’s ID, they can view, in real time, the probability of subscription for each product the client does not already own.
Installation#
The process to install this solution differs depending on whether you are using Dataiku Cloud or a self-managed instance.
Dataiku Cloud users should follow the instructions for installing solutions on cloud.
The Cloud Launchpad will automatically meet the technical requirements listed below, and add the Solution to your Dataiku instance.
Once the Solution has been added to your space, move ahead to Data Requirements.
After meeting the technical requirements below, self-managed users can install the Solution in one of two ways:
On your Dataiku instance connected to the internet, click + New Project > Dataiku Solutions > Search for Next Best Offer.
Alternatively, download the Solution’s .zip project file, and import it to your Dataiku instance as a new project.
Technical Requirements#
To leverage this solution, you must meet the following requirements:
Have access to a Dataiku 13.1+* instance.
A Python 3.7 code environment.
If using the Advisor plugin with this solution, you’ll also need to install the plugin on your instance according to its documentation.
Data Requirements#
The project is initially shipped with all datasets using the filesystem connection.
The input data should be separated into five different datasets with the same time frequency:
Dataset | Description |
---|---|
revenues | Includes the revenues generated by each product per customer over time. |
product_holdings | Includes product information and duration period of each product held by customers. |
customers | Includes customers’ static information. Optional columns can be added to this dataset. |
balances | Includes balance amounts of each product per customer over time. |
additional_information | Includes optional additional columns. |
Workflow Overview#
You can follow along with the solution in the Dataiku gallery.
The project has the following high-level steps:
Connect your data as input and select your analysis parameters via the Dataiku Application.
Explore the data and understand the model through the Analytics and ML Model Dashboard.
Analyze and compare your top customers and top campaigns for Cross-Sell in the Marketing Campaign Dashboard.
Visualize specific information about a chosen customer in the Customer Focus Dashboard.
Walkthrough#
Note
In addition to reading this document, it is recommended to read the wiki of the project before beginning to get a deeper technical understanding of how this Solution was created and more detailed explanations of Solution-specific vocabulary.
Plug and play with your own data and parameter choices#
To begin, you will need to create a new instance of the Next Best Offer for Banking application. This can be done by selecting the Dataiku Application from your instance home and clicking Create App Instance. The project is delivered with sample data that should be replaced with our own data, provided it follows the data model described above.
This can be done in three ways:
Data can be uploaded directly from the filesystem in the first section of the Dataiku app.
Data can be connected to your database of choice by selecting an existing connection.
Connection settings and data can be copied from a Next Best Offer for Banking project already built.
With the first two options, users must click the Check button, which loads the data and verifies the schema.
Note
Be sure to refresh the page so that the app can dynamically take your data into account.
With our data selected and loaded into the Flow, we can move to the following App sections:
Section | Description |
---|---|
Subscription Probability Prediction | Allows you to set the horizon over which the probability of subscription is computed and to select which product(s), if any, should be excluded from the prediction. |
Top Prospects for Cross-Sell | Allows you to set parameters about top prospects for cross-selling a selected product. It requires you to specify the product to be analyzed and choose between two methods to configure the output. |
Top Campaigns for Cross-Sell | Outputs top prospects for cross-selling each product present in the data. It requires you to specify the number of top customers (e.g., 100 customers) to see as output for each marketing campaign. |
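As an illustration of the Top Campaigns for Cross-Sell parameter, the sketch below keeps the N highest-scoring customers for each product. It is not the solution's actual recipe: the field names (`customer_id`, `product`, `probability`), the `top_prospects` helper, and the choice to rank by predicted probability alone are all assumptions made for the example.

```python
from collections import defaultdict
import heapq

# Hypothetical scored rows: one predicted subscription probability
# per (customer, product) pair. All names and values are illustrative.
scores = [
    {"customer_id": "C1", "product": "savings", "probability": 0.82},
    {"customer_id": "C2", "product": "savings", "probability": 0.35},
    {"customer_id": "C3", "product": "savings", "probability": 0.64},
    {"customer_id": "C1", "product": "credit_card", "probability": 0.12},
    {"customer_id": "C2", "product": "credit_card", "probability": 0.77},
    {"customer_id": "C3", "product": "credit_card", "probability": 0.51},
]


def top_prospects(scored_rows, n):
    """Group rows by product and keep the n rows with the highest probability."""
    by_product = defaultdict(list)
    for row in scored_rows:
        by_product[row["product"]].append(row)
    return {
        name: heapq.nlargest(n, members, key=lambda r: r["probability"])
        for name, members in by_product.items()
    }


# Assumed parameter: the number of top customers per campaign.
campaigns = top_prospects(scores, n=2)
for name, members in campaigns.items():
    print(name, [m["customer_id"] for m in members])
```

In the app, the equivalent of `n` is the number of top customers entered for each marketing campaign.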
Cleaning and Preparing our Historical Data#
In total, seven Flow zones are involved in data preparation and cleaning for this solution. We won’t go into heavy detail about each Flow zone as this information can be found in the wiki of the project.
These Flow zones help construct the consolidated input dataset. The solution generates a unique row for each client, date, and product combination, enabling it to determine whether the client held the product at a given point in time.
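The client, date, and product grid can be pictured with a minimal sketch. It only illustrates the idea and is not the Flow's implementation: the tuple layout of `holdings`, the `holds` helper, and the sample values are hypothetical.

```python
from datetime import date
from itertools import product

# Hypothetical holdings: (customer_id, product, start_date, end_date).
# An end_date of None means the product is still held.
holdings = [
    ("C1", "savings", date(2023, 1, 1), None),
    ("C1", "credit_card", date(2023, 2, 1), date(2023, 2, 28)),
    ("C2", "savings", date(2023, 3, 1), None),
]
customers = ["C1", "C2"]
products = ["savings", "credit_card"]
observation_dates = [date(2023, 1, 31), date(2023, 2, 28), date(2023, 3, 31)]


def holds(customer_id, product_name, on_date):
    """Return True if the customer held the product on the given date."""
    for cust, prod, start, end in holdings:
        if cust == customer_id and prod == product_name:
            if start <= on_date and (end is None or on_date <= end):
                return True
    return False


# One row per (customer, date, product) combination, with a holding flag.
grid = [
    {"customer_id": c, "date": d, "product": p, "held": holds(c, p, d)}
    for c, d, p in product(customers, observation_dates, products)
]

print(len(grid))  # 2 customers x 3 dates x 2 products = 12 rows
```

Because every combination gets a row, products a client does not hold appear explicitly with `held` set to `False`, which is what lets a model learn from non-subscriptions as well as subscriptions.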
After our data has been used to train, test, and score a classification model, two final Data Prep Flow zones compute the expected gain and impact values. These Flow zones are important to get our data into a format that can be used to generate visualizations and metrics for the dashboard.
Exploring Input Data#
To better understand the input data and verify that it is coherent, it’s important first to explore our historical datasets. Doing so allows us to identify the population distributions and trends of the customer base and product holdings. The Customer analysis and Today’s holdings analysis Flow zones compute all metrics and values needed to generate charts for the first two pages of the Analytics and ML Model Dashboard.
Predicting Probability of Subscription#
The Subscription Prediction zone is where the subscription probability is predicted.
The training dataset is first ordered by the observation date column. The classification model is then trained on the first 80% of the rows in the train dataset and tested on the remaining 20%.
The Score recipe is used to apply the classification model to the dataset containing the values to predict (to_predict_data) and generate the predicted values.
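A time-ordered 80/20 split like the one described above can be sketched as follows. This shows the general technique, not Dataiku's internal code: the `time_ordered_split` helper, the `observation_date` key, and the sample rows are assumptions.

```python
from datetime import date


def time_ordered_split(rows, date_key, train_fraction=0.8):
    """Sort rows by date, then cut once: earliest rows train, latest rows test."""
    ordered = sorted(rows, key=lambda row: row[date_key])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical rows: one observation per month, deliberately out of order.
rows = [
    {"observation_date": date(2023, month, 1), "subscribed": month % 2}
    for month in (5, 1, 9, 3, 7, 2, 10, 4, 8, 6)
]

train, test = time_ordered_split(rows, "observation_date")
print(len(train), len(test))
```

Splitting on time rather than at random means the test rows are always later than the training rows, so the evaluation resembles the real task of predicting future subscriptions from past behavior.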
Note
To address potentially imbalanced values in the subscription variable, the model performs class rebalancing. This involves randomly selecting approximately 100,000 rows so that all modalities (classes) of the column are equally represented. The parameters for this sampling method can be customized by the user in the Flow.
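The idea behind class rebalancing can be sketched with random undersampling. This is a simplified stand-in for Dataiku's sampling method, not a reproduction of it: the `rebalance` helper, its parameters, and the sample data are hypothetical, and a much smaller target than 100,000 rows is used to keep the example short.

```python
import random
from collections import defaultdict


def rebalance(rows, label_key, target_total, seed=0):
    """Randomly undersample so each class gets about target_total / n_classes rows."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)

    per_class = target_total // len(by_class)
    rng = random.Random(seed)
    sample = []
    for members in by_class.values():
        # A class smaller than its quota is kept whole rather than oversampled.
        take = min(per_class, len(members))
        sample.extend(rng.sample(members, take))
    rng.shuffle(sample)
    return sample


# Hypothetical imbalanced data: 50 subscribers among 1,000 rows.
rows = [{"id": i, "subscribed": int(i < 50)} for i in range(1000)]
balanced = rebalance(rows, "subscribed", target_total=100)

counts = defaultdict(int)
for row in balanced:
    counts[row["subscribed"]] += 1
print(dict(counts))
```

In this sketch a rare class caps its own contribution, so the result can fall short of the target total when one class has fewer rows than its quota.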
Visualize Model Performance#
The same dashboard used in the Exploring Input Data section has a third page, named Machine learning model analysis, that visualizes the explainability and performance of the classification model built to predict the subscription probability.
Exploring Cross-Sell Impact#
The last page of the Analytics and ML Model Dashboard allows for understanding the effect of offering additional products to customers on the sales of existing products.
Marketing Campaign Dashboard#
The Marketing Campaign Dashboard covers the following topics across two pages.
Page | Description |
---|---|
Top Prospects for Cross-Sell | Displays information regarding the list of top customers, which were selected based on the parameters configured within the Dataiku application. |
Top Campaigns for Cross-Sell | Presents a comparison of the gain and impact associated with each product analyzed based on the parameters configured within the Dataiku application. |
Customer Focus Dashboard#
The Customer Focus Dashboard offers specific information about a chosen customer. To find a particular customer, users can select a Customer ID with the search engine located at the top of the page, and the visuals update accordingly.
Advisor Plugin#
In this solution, the Advisor for NBO for Banking web app allows for the integration of the Advisor plugin. As a result, when accessing any website through a browser, you will see the Customer IDs underlined and will be able to click on them. A panel will appear on the right, giving you access to a card with statistical information about a specific customer, along with the probability of subscription to each product they do not already own.
Responsible AI Considerations#
The Next Best Offer solution is designed to serve as a powerful marketing campaign targeting tool, but it should not be used for the distribution of offers or promotions to a select group of customers. Misusing this solution in such a manner may lead to unintended consequences, such as the unfair treatment of certain customers.
Reproducing these Processes With Minimal Effort For Your Own Data#
The intent of this project is to enable marketing teams to understand how Dataiku can be used to predict the probability of subscription of customers and to control how the target customers will be defined for the marketing campaign in the Dataiku application.