Overview#

Business case#

The customer team at Dataiku is interested in using the website logs to perform two kinds of analysis:

  • Referrer Analysis

    • determine how visitors get to our website

    • identify the top referrers, both in terms of volume (number of visitors) and depth (number of pages linked)

  • Visitor Analysis

    • segment visitors according to how they engage with the website

    • map these segments to known customers in order to feed them into the right channels (Marketing, Prospective Sales, and Sales)
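As a rough illustration of the two referrer metrics named above, the sketch below computes volume (distinct visitors) and depth (distinct pages linked) per referrer in plain Python. The record layout and the field names `visitor_id`, `referrer`, and `page` are assumptions made for this example and do not reflect the actual log schema.

```python
from collections import defaultdict

# Hypothetical pageview records; field names are assumptions, not the real log schema.
pageviews = [
    {"visitor_id": "v1", "referrer": "google.com", "page": "/home"},
    {"visitor_id": "v2", "referrer": "google.com", "page": "/pricing"},
    {"visitor_id": "v2", "referrer": "google.com", "page": "/home"},
    {"visitor_id": "v3", "referrer": "twitter.com", "page": "/blog"},
]


def referrer_stats(records):
    """Return {referrer: {"volume": distinct visitors, "depth": distinct pages linked}}."""
    visitors = defaultdict(set)
    pages = defaultdict(set)
    for record in records:
        visitors[record["referrer"]].add(record["visitor_id"])
        pages[record["referrer"]].add(record["page"])
    return {
        ref: {"volume": len(visitors[ref]), "depth": len(pages[ref])}
        for ref in visitors
    }


stats = referrer_stats(pageviews)
# Rank referrers by volume, highest first.
top_by_volume = sorted(stats, key=lambda ref: stats[ref]["volume"], reverse=True)
```

Sets are used so that repeat visits by the same visitor, or repeated links to the same page, are counted only once per referrer.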

Supporting data#

This use case is based on two input data sources. The downloadable archives are found below:

  • Web Logs: Two months of Dataiku website logs, containing information about each individual pageview on the website.

  • CRM: A simulated Customer Relationship Management (CRM) database containing transactional and demographic data about our clients.

Workflow overview#

The final Dataiku workflow should look like the image below.

[Image: final Dataiku Flow for this use case (flow32.png)]

You will go through the following high-level steps:

  • Upload the datasets

  • Clean up and enrich the log data

  • Use visual grouping recipes

  • Run a clustering model to build segments

  • Join the CRM and segment data for known visitors

  • Customize and split the dataset by segment
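To make the last two steps more concrete, here is a minimal plain-Python sketch of joining segment labels to CRM records for known visitors and then splitting the result by segment. The tutorial itself builds these steps in the Dataiku Flow; the identifiers, segment labels, and CRM fields below are hypothetical and serve only to show the logic.

```python
from collections import defaultdict

# Hypothetical outputs of earlier steps; identifiers and field names are assumptions.
visitor_segments = {"v1": "cluster_0", "v2": "cluster_1", "v3": "cluster_0"}
crm = {"v1": {"name": "Acme"}, "v2": {"name": "Globex"}}  # v3 is not a known customer


def join_and_split(segments, crm_records):
    """Inner-join segments with CRM records, then group the joined rows by segment."""
    by_segment = defaultdict(list)
    for visitor_id, segment in segments.items():
        customer = crm_records.get(visitor_id)
        if customer is None:
            continue  # unknown visitor: no CRM match, so it is left out of the join
        by_segment[segment].append({"visitor_id": visitor_id, **customer})
    return dict(by_segment)


split = join_and_split(visitor_segments, crm)
```

An inner join is assumed here because the step targets known visitors only; visitors without a CRM record are dropped rather than kept with empty customer fields.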

Prerequisites#

You should be familiar with:

  • Core Designer learning path

  • Machine learning in Dataiku

Technical requirements#