Business Case

The customer team at Dataiku is interested in using the website logs to perform two kinds of analysis:

  • Referrer Analysis

    • determine how visitors get to our website

    • identify the top referrers, both in terms of volume (number of visitors) and depth (number of pages linked)

  • Visitor Analysis

    • segment visitors according to how they engage with the website

    • map these segments to known customers in order to feed them into the right channels (Marketing, Prospective Sales, and Sales)
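
To make the two referrer metrics concrete, here is a minimal sketch in plain Python, outside of Dataiku DSS. The field names (visitor_id, referrer, page) and the sample rows are hypothetical and do not reflect the actual schema of the web logs. Volume is counted as distinct visitors per referrer and depth as distinct pages reached from that referrer.

```python
from collections import defaultdict

# Hypothetical pageview records; the field names are assumptions,
# not the actual column names of the web logs.
pageviews = [
    {"visitor_id": "v1", "referrer": "google.com", "page": "/home"},
    {"visitor_id": "v1", "referrer": "google.com", "page": "/pricing"},
    {"visitor_id": "v2", "referrer": "google.com", "page": "/home"},
    {"visitor_id": "v3", "referrer": "twitter.com", "page": "/blog"},
    {"visitor_id": "v3", "referrer": "twitter.com", "page": "/blog"},
]


def referrer_stats(rows):
    """Return one dict per referrer with its volume and depth."""
    visitors = defaultdict(set)  # referrer -> distinct visitor ids
    pages = defaultdict(set)     # referrer -> distinct pages linked
    for row in rows:
        visitors[row["referrer"]].add(row["visitor_id"])
        pages[row["referrer"]].add(row["page"])
    stats = [
        {"referrer": ref, "volume": len(visitors[ref]), "depth": len(pages[ref])}
        for ref in visitors
    ]
    # Rank by volume first, then depth; referrer name breaks ties.
    return sorted(stats, key=lambda s: (-s["volume"], -s["depth"], s["referrer"]))


top_referrers = referrer_stats(pageviews)
```

In the project itself, the same aggregation would be expressed with a visual grouping recipe rather than code.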

Supporting Data

This use case is based on two input data sources. The downloadable archives are found below:

  • Web Logs: Dataiku website logs spanning two months, containing information about each individual pageview on the website.

  • CRM: A simulated Customer Relationship Management (CRM) database containing transactional and demographic data about our clients.

Workflow Overview

The final Dataiku DSS workflow should look like the image below. You can also follow along with the completed project in the Dataiku gallery.


You will go through the following high-level steps:

  • Upload the datasets

  • Clean up and enrich the log data

  • Use visual grouping recipes

  • Run a clustering model to build segments

  • Join the CRM and segment data for known visitors

  • Customize and split the dataset by segment
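
The last two steps can be sketched in plain Python to show the intended data flow. This is only an illustration under stated assumptions: the segment labels, the visitor_id join key, and the CRM fields are hypothetical, and the segments are given as input here rather than produced by a clustering model.

```python
from collections import defaultdict

# Hypothetical segment assignments, standing in for the clustering output.
segments = {"v1": "engaged", "v2": "casual", "v3": "engaged"}

# Hypothetical CRM rows; the join key "visitor_id" is an assumption.
crm = [
    {"visitor_id": "v1", "company": "Acme"},
    {"visitor_id": "v2", "company": "Globex"},
    {"visitor_id": "v9", "company": "Initech"},  # no web activity recorded
]


def join_known_visitors(crm_rows, segment_by_visitor):
    """Inner join: keep CRM rows whose visitor has a segment."""
    return [
        dict(row, segment=segment_by_visitor[row["visitor_id"]])
        for row in crm_rows
        if row["visitor_id"] in segment_by_visitor
    ]


def split_by_segment(rows):
    """Group the joined rows into one list per segment."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["segment"]].append(row)
    return dict(groups)


known = join_known_visitors(crm, segments)
by_segment = split_by_segment(known)
```

Each resulting group corresponds to one output dataset that could be routed to a given channel.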


You should be familiar with:

  • The Basics courses

  • Machine learning in Dataiku DSS

Technical Requirements