Solution | Next Best Offer for Banking#
Overview#
Business Case#
Marketing campaigns are most impactful when intelligently targeted. Sending the same material to the whole customer base is not only costly, but also ineffective. Knowing how likely a group of customers is to subscribe to a product is essential to make the right decision.
This solution complements the customer segmentation solution within the marketing suite for banking. Users can plug in the same data used for that solution and build an initial model. The Project Setup lets users control how the target customers are defined for the marketing campaign.
Installation#
The process to install this solution differs depending on whether you are using Dataiku Cloud or a self-managed instance.
Dataiku Cloud users should follow the instructions for installing solutions on cloud.
The Cloud Launchpad will automatically meet the technical requirements listed below, and add the Solution to your Dataiku instance.
Once the Solution has been added to your space, move ahead to Data Requirements.
After meeting the technical requirements below, self-managed users can install the Solution with the following instructions:
From the Design homepage of a Dataiku instance connected to the internet, click + Dataiku Solutions.
Search for and select Next Best Offer.
Click Install, changing the project folder into which the solution will be installed if needed.
From the Design homepage of a Dataiku instance connected to the internet, click + New Project.
Select Dataiku Solutions.
Search for and select Next Best Offer.
Note
Alternatively, download the Solution’s .zip project file, and import it to your Dataiku instance as a new project.
Technical Requirements#
To leverage this solution, you must meet the following requirement:
Have access to a Dataiku 13.2+ instance.
Data Requirements#
The project is initially shipped with all datasets using the filesystem connection.
The input data should be separated into five different datasets with the same time frequency:
| Dataset | Description |
|---|---|
| revenues | Includes the revenues generated by each product per customer over time. |
| product_holdings | Includes product information and duration period of each product held by customers. |
| customers | Includes customers’ static information. Optional columns can be added to this dataset. |
| balances | Includes balance amounts of each product per customer over time. |
| additional_information | Includes optional additional columns. |
Workflow Overview#
You can follow along with the solution in the Dataiku gallery.
The project has the following high-level steps:
Connect your data as input, and select your analysis parameters via the Project Setup.
Explore the data and understand the model through the Analytics and ML Model dashboard.
Analyze and compare your top customers and top campaigns for Cross-Sell in the Marketing Campaign dashboard.
Visualize specific information about a chosen customer in the Customer Focus dashboard.
Walkthrough#
Note
In addition to reading this document, it is recommended to read the wiki of the project before beginning to get a deeper technical understanding of how this Solution was created and more detailed explanations of Solution-specific vocabulary.
Plug and play with your own data and parameter choices#
To begin, you will need to use the Project Setup. The project is delivered with sample data that should be replaced with our data, assuming that it adopts the data model described above.
This can be done in three ways:
Data can be uploaded directly from the filesystem in the first section.
Data can be connected to your database of choice by selecting an existing connection.
Connection settings and data can be copied from a Next Best Offer for Banking project already built.
With either of the first two options (file upload or database connection), users must click the Check button, which loads the data and verifies the schema.
Tip
Be sure to refresh the page so that the interface can dynamically take your data into account.
With our data selected and loaded into the Flow, we can move to the following sections:
| Section | Description |
|---|---|
| Subscription Probability Prediction | Allows you to set the horizon on which you want the probability of subscription to be computed and select which product(s), if any, should be excluded from the prediction. |
| Top Prospects for Cross-Sell | Allows you to select some parameters about top prospects for cross-selling a selected product. It requires you to specify the product to be analyzed and choose between two methods to parameterize the output. |
| Top Campaigns for Cross-Sell | Outputs top prospects for cross-selling each product present in the data. It requires you to specify the number of top customers (e.g., 100 customers) to see as output for each marketing campaign. |
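To make the "number of top customers" parameter more concrete, here is a minimal Python sketch of selecting the top N prospects for each product. It assumes prospects are ranked by predicted subscription probability; the Solution itself may rank on another metric, such as expected gain. The function name `top_prospects`, the field names, and the sample scores are all hypothetical.

```python
import heapq
from collections import defaultdict

# Hypothetical scored rows: one predicted probability per customer and product.
scores = [
    {"customer_id": "C1", "product": "card", "probability": 0.91},
    {"customer_id": "C2", "product": "card", "probability": 0.15},
    {"customer_id": "C3", "product": "card", "probability": 0.64},
    {"customer_id": "C1", "product": "loan", "probability": 0.08},
    {"customer_id": "C2", "product": "loan", "probability": 0.77},
]

def top_prospects(scores, n):
    """Return, for each product, the n rows with the highest probability."""
    by_product = defaultdict(list)
    for row in scores:
        by_product[row["product"]].append(row)
    return {
        product: heapq.nlargest(n, rows, key=lambda r: r["probability"])
        for product, rows in by_product.items()
    }

# One candidate campaign per product, each limited to its two best prospects.
campaigns = top_prospects(scores, n=2)
```

Each key of `campaigns` corresponds to one candidate campaign, which is the unit the Top Campaigns for Cross-Sell page compares.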
Cleaning and Preparing our Historical Data#
In total, seven Flow zones are involved in data preparation and cleaning for this solution. We won’t go into heavy detail about each Flow zone as this information can be found in the wiki of the project.
These Flow zones help construct the consolidated input dataset. The solution generates a unique row for each client, date, and product combination, enabling it to determine whether the client held the product at a given point in time.
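The idea of one row per client, date, and product can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the Solution's actual Flow logic: the tuple layout, the function name `build_holding_grid`, and the sample holdings are hypothetical, and an end date of `None` is assumed to mean the product is still held.

```python
from datetime import date
from itertools import product

# Hypothetical holdings: (customer_id, product, start_date, end_date).
# end_date=None is assumed to mean the product is still held.
holdings = [
    ("C1", "loan", date(2023, 1, 1), date(2023, 2, 28)),
    ("C1", "card", date(2023, 2, 1), None),
    ("C2", "card", date(2023, 3, 1), None),
]

def build_holding_grid(holdings, dates):
    """Return one row per (customer, date, product) with a 0/1 held flag."""
    customers = sorted({h[0] for h in holdings})
    products = sorted({h[1] for h in holdings})
    rows = []
    for customer, day, prod in product(customers, dates, products):
        # The product counts as held if any holding period covers this date.
        held = any(
            c == customer and p == prod and start <= day and (end is None or day <= end)
            for c, p, start, end in holdings
        )
        rows.append(
            {"customer_id": customer, "date": day, "product": prod, "held": int(held)}
        )
    return rows

dates = [date(2023, 1, 1), date(2023, 2, 1), date(2023, 3, 1)]
grid = build_holding_grid(holdings, dates)
```

Because every combination appears exactly once, the absence of a product is recorded explicitly (`held` is 0) rather than implied by a missing row, which is what lets a model learn from customers who did not hold a product at a given date.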
After our data have been used to train, test, and score a classification model, two final Data Prep Flow zones are called upon to compute the expected gain and impact values. These final Flow zones are important to get our data into a format that can be used to generate visualizations and metrics for the dashboard.
Exploring Input Data#
To better understand the input data and verify that it is coherent, it’s important first to explore our historical datasets. Doing so allows us to identify the population distributions and trends of the customer base and product holdings. The Customer analysis and Today’s holdings analysis Flow zones compute all metrics and values needed to generate charts for the first two pages of the Analytics and ML Model Dashboard.
Predicting Probability of Subscription#
The Subscription Prediction zone is where the prediction of the subscription probability is created.
Before training, the training dataset is ordered by the observation date column. The classification model is then trained on the first 80% of the data points in the train dataset and tested on the remaining 20%.
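A time-ordered 80/20 split of this kind can be sketched as below. The column name `observation_date`, the function name `chronological_split`, and the sample rows are assumptions for illustration; in the Solution the split is configured in the model's train/test settings rather than written by hand.

```python
from datetime import date

def chronological_split(rows, date_key="observation_date", train_fraction=0.8):
    """Sort rows by date, then use the earliest share for training
    and the most recent share for testing."""
    ordered = sorted(rows, key=lambda r: r[date_key])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

# Hypothetical rows, deliberately listed in reverse date order.
rows = [
    {"observation_date": date(2023, month, 1), "subscribed": month % 2}
    for month in range(10, 0, -1)
]
train, test = chronological_split(rows)
```

Splitting on time rather than at random keeps every test observation later than every training observation, so the evaluation resembles how the model is used: predicting future subscriptions from past behavior.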
The Score recipe is used to apply the classification model to the dataset containing the values to predict (to_predict_data) and generate the predicted values.
Note
To address potentially imbalanced values in the subscription variable, the model performs class rebalancing. This involves randomly selecting approximately 100,000 rows so that all modalities (distinct values) of the column are equally represented. The parameters for this sampling method can be customized by the user in the Flow.
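As a rough illustration of class rebalancing, the sketch below draws a random sample in which each class value gets an equal share of a target row count. It is a simplified stand-in, not Dataiku's sampling implementation: the function name `rebalance`, the `subscribed` field, and the choice to keep a rare class whole when it has fewer rows than its share are all assumptions.

```python
import random
from collections import defaultdict

def rebalance(rows, label_key, target_rows, seed=0):
    """Randomly sample about target_rows rows, with an equal share per class."""
    if not rows:
        return []
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    quota = target_rows // len(by_class)
    rng = random.Random(seed)
    sample = []
    for members in by_class.values():
        # Assumption: a class with fewer rows than its quota is kept whole,
        # so the result can be smaller than target_rows.
        sample.extend(rng.sample(members, min(quota, len(members))))
    rng.shuffle(sample)
    return sample

# Hypothetical imbalanced data: 5% subscribers, 95% non-subscribers.
rows = [{"subscribed": 1}] * 50 + [{"subscribed": 0}] * 950
balanced = rebalance(rows, "subscribed", target_rows=80)
```

With a target of 80 rows and two classes, each class contributes 40 rows, so the rare subscriber class is no longer swamped by non-subscribers during training.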
Visualize Model Performance#
In the same dashboard as the previously mentioned Exploring Input Data section, we can find a third page, named Machine learning model analysis, to visualize the explainability and performance of the classification model built to compute the subscription probability prediction.
Exploring Cross-Sell Impact#
The last page of the Analytics and ML Model Dashboard allows for understanding the effect of offering additional products to customers on the sales of existing products.
Marketing Campaign Dashboard#
The Marketing Campaign Dashboard covers the following topics across two pages.
| Page | Description |
|---|---|
| Top Prospects for Cross-Sell | Displays information regarding the list of top customers, which were selected based on the parameters configured within the Project Setup. |
| Top Campaigns for Cross-Sell | Presents a comparison of the gain and impact associated with each product analyzed based on the parameters configured within the Project Setup. |
Customer Focus Dashboard#
The Customer Focus Dashboard offers specific information about a chosen customer. To find a particular customer, users can use the search engine located at the top of the page to select a Customer ID; the visuals then update accordingly.
Responsible AI Considerations#
The Next Best Offer solution is designed to serve as a powerful marketing campaign targeting tool, but it should not be used for the distribution of offers or promotions to a select group of customers. Misusing this solution in such a manner may lead to unintended consequences, such as the unfair treatment of certain customers.
Reproducing these Processes With Minimal Effort For Your Own Data#
The intent of this project is to enable marketing teams to understand how Dataiku can be used to predict customers’ probability of subscription and, through the Project Setup, to control how the target customers are defined for the marketing campaign. Ultimately, the “best” approach will depend on your specific needs and your data of interest. If you’re interested in adapting this project to the specific goals and needs of your organization, roll-out and customization services can be offered on demand.