Outline

Churn prediction is one of the best-known applications of machine learning and data science in the Customer Relationship Management (CRM) and marketing fields. Simply put, a churner is a user or customer who stops using a company’s products or services.

Churn applications are common in several sectors:

  • Subscription-based businesses (think internet and telephone service providers): customers who are most likely to churn at the end of their subscription are contacted by a call center and offered a discount.

  • E-commerce companies (think Amazon and the like): automatic e-mails are sent to customers who haven’t bought anything for a long time but may respond to a promotional offer.

In this tutorial, you will learn how to use Dataiku to create your own churn prediction model, based on your customer data. More precisely, you will learn how to:

  • Define churn as a data science problem (i.e. create a variable or “target” to predict).

  • Create basic features that will enable you to detect churn.

  • Train your first model and deploy it to predict future churn.

  • Deal with time-dependent features and modeling, a more advanced concept in machine learning and data science.
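
To make the first point above more concrete, here is a minimal sketch of one possible way to turn churn into a target variable. It is only an illustration in plain Python, not the approach used later in this tutorial (which relies on Dataiku and PostgreSQL). The function name `label_churn`, the 30-day window, and the sample events are all hypothetical assumptions: a customer who was active before a reference date is labeled a churner (1) if they have no activity in the window that follows it.

```python
from datetime import date, timedelta


def label_churn(events, reference_date, window_days=30):
    """Return {customer_id: 1 if churned else 0}.

    Hypothetical definition: only customers with activity before
    reference_date are labeled; they count as churners when they have
    no activity in [reference_date, reference_date + window_days).
    """
    window_end = reference_date + timedelta(days=window_days)
    # Customers seen before the reference date form the population to label.
    active_before = {cid for cid, day in events if day < reference_date}
    # Customers seen inside the observation window are NOT churners.
    active_after = {
        cid for cid, day in events if reference_date <= day < window_end
    }
    return {cid: int(cid not in active_after) for cid in sorted(active_before)}


# Made-up purchase events: (customer_id, purchase_date).
events = [
    ("alice", date(2024, 1, 5)),
    ("alice", date(2024, 2, 10)),
    ("bob", date(2024, 1, 20)),
    ("carol", date(2024, 2, 3)),
]

labels = label_churn(events, reference_date=date(2024, 2, 1), window_days=30)
print(labels)
```

In this sketch, a customer whose first activity falls after the reference date is left out of the labeled population entirely, because there is no prior history from which to build features. Keeping features strictly before the reference date and the target strictly after it is also what the last point in the list, time-dependent modeling, is about.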

Prerequisites

This is an advanced tutorial, so if you get lost at some point, please refer to the documentation or to the Core Designer learning path. The tutorial relies heavily on a PostgreSQL database, so you will need to set one up along with the corresponding Dataiku connection. A basic knowledge of SQL is also required.