Churn prediction is one of the best-known applications of machine learning and data science in the Customer Relationship Management (CRM) and Marketing fields. Simply put, a churner is a user or customer who stops using a company’s products or services.

Churn applications are common in several sectors:

  • Subscription-based businesses (think internet and telephone service providers): customers who are most likely to churn at the end of their subscription are contacted by a call center and offered a discount.

  • E-commerce companies (think Amazon and the like): automatic e-mails are sent to customers who haven’t bought anything in a long time but may respond to a promotional offer.

To read more about how Dataiku’s customers fight churn, feel free to consult our Churn and Lifetime Value success stories.

In this tutorial, you will learn how to use Dataiku DSS to create your own churn prediction model, based on your customer data. More precisely, you will learn how to:

  • Define churn as a data science problem (i.e. create a variable or “target” to predict)

  • Create basic features that will enable you to detect churn

  • Train your first model and deploy it to predict future churn

  • Deal with time-dependent features and modeling, a more advanced topic in machine learning and data science
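To make the first objective concrete, here is a minimal sketch of one possible way to turn churn into a target variable. It is an illustration only, not the definition used later in this tutorial: the function name `label_churners`, the `(customer_id, purchase_date)` input format, and the 30-day window are all assumptions chosen for the example. A customer is in scope if they bought something shortly before a reference date, and is labeled a churner if they bought nothing shortly after it.

```python
from datetime import date, timedelta


def label_churners(purchases, reference_date, window_days=30):
    """Return {customer_id: 1 if churned else 0} (hypothetical helper).

    purchases: iterable of (customer_id, purchase_date) tuples.
    Only customers with a purchase in the `window_days` before
    `reference_date` are labeled. Such a customer is a churner (1) if
    they have no purchase in the `window_days` starting at
    `reference_date`, and a non-churner (0) otherwise.
    """
    start = reference_date - timedelta(days=window_days)
    end = reference_date + timedelta(days=window_days)
    active_before = set()
    active_after = set()
    for customer_id, day in purchases:
        if start <= day < reference_date:
            active_before.add(customer_id)
        elif reference_date <= day < end:
            active_after.add(customer_id)
    # The target is only defined for customers who were active beforehand.
    return {c: int(c not in active_after) for c in active_before}


# Made-up sample data, for illustration only.
purchases = [
    ("alice", date(2024, 1, 10)),
    ("alice", date(2024, 2, 5)),
    ("bob", date(2024, 1, 20)),
    ("carol", date(2024, 2, 12)),
]
labels = label_churners(purchases, reference_date=date(2024, 2, 1))
for customer_id in sorted(labels):
    print(customer_id, labels[customer_id])
```

In this sample, alice buys again after the reference date and is labeled 0, bob does not and is labeled 1, and carol is left out because she was not active before the reference date. Features for the model would be built only from data before the reference date, which is where the time-dependent aspects mentioned above come into play.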


This is an advanced tutorial, so if you get lost at some point, please refer to the documentation or to the Basics courses. We will rely heavily on a PostgreSQL database, so you will need to set one up, along with the corresponding DSS connection. A basic knowledge of SQL is also required.