Tutorial | Scenario reporters

Once you have scenarios automating actions in Dataiku, you may want to attach reporters to those scenarios to send alerts through various messaging channels.

Get started

Objectives

In this tutorial, you will:

  • Log scenario results in a Dataiku dataset.

  • Distribute scenario results via a messaging channel, such as email.

  • Include an attachment in a reporter.

Prerequisites

To reproduce the steps in this tutorial, you’ll need:

  • Access to an instance of Dataiku 12+.

  • Basic knowledge of Dataiku (Core Designer level or equivalent).

  • For the second example, you’ll also need an administrator to grant you access to an active messaging channel. This tutorial uses a mail reporter, but other kinds of channels work similarly.

Create the project

  1. From the Dataiku Design homepage, click + New Project > DSS tutorials > Advanced Designer > Scenario Reporters.

  2. From the project homepage, click Go to Flow.

Note

You can also download the starter project from this website and import it as a zip file.

You’ll next want to build the Flow.

  1. Click Flow Actions at the bottom right of the Flow.

  2. Click Build all.

  3. Keep the default settings and click Build.

Use case summary

The project has three data sources:

  • tx: Each row is a unique credit card transaction with information such as the card that was used and the merchant where the transaction was made. It also indicates whether the transaction has been authorized (a score of 1 in the authorized_flag column) or flagged for potential fraud (a score of 0).

  • merchants: Each row is a unique merchant with information such as the merchant’s location and category.

  • cards: Each row is a unique credit card ID with information such as the card’s activation month or the cardholder’s FICO score (a common measure of creditworthiness in the US).

Why reporters?

Scenarios trigger actions that run in the background. Accordingly, you’ll want to monitor the performance of these actions in some way.

  • One option is the Automation Monitoring page found in the Jobs menu of the top navigation bar.

  • You may also include a scenario insight on a dashboard showing the latest results.

However, for more direct alerts about scenario activities, you’ll want to be comfortable customizing reporters for your organization’s preferred messaging channels, such as mail, Slack, Microsoft Teams, webhooks, or Twilio.

Send scenario results to a Dataiku dataset

This project includes a scenario that attempts to verify data quality rules (or, for pre-12.6 users, run checks) on a dataset in the pipeline. If successful, the scenario proceeds to rebuild a downstream dataset.

Before configuring a reporter to send the results of this scenario to a messaging channel, let’s demonstrate sending that kind of information to a Dataiku dataset.

Create a dataset to receive scenario results

First, we need to create a Dataiku dataset to which we can write scenario results.

Important

This dataset must be on a writable connection. That is, Allow write must be enabled in the Usage params section of the connection settings.

  1. From the Flow, select + Dataset > Internal > Managed dataset.

  2. Name it scenario_results.

  3. Store it in a writable connection available on your instance.

  4. Click Create.

Dataiku screenshot of the dialog to create a managed dataset.

Configure the schema of the receiving dataset

Next, we need to define the schema of the dataset that will record the scenario results. The schema has two requirements:

  • It must include one date column to hold the timestamp of the scenario run.

  • Other columns must match the variables that we want to log.

  1. On the scenario_results dataset, navigate to the Schema subtab.

  2. Click + Add Column.

  3. Name the first column timestamp, and change its storage type from string to date.

  4. Click + Add Column again, and name the second column scenario.

  5. Click + Add Column once more, and name the third column status.

  6. Click Save.

Dataiku screenshot of the schema tab of a Dataiku dataset to receive scenario results.

Add a “Send to dataset” reporter

Now that this dataset exists in the project, we can create a reporter to write data to this dataset after each scenario run.

  1. From the Jobs menu of the project, navigate to the Scenarios page.

  2. Open the Data Refresh scenario.

  3. On the Settings tab, click Add Reporter.

  4. Select Send to dataset.

Let’s configure the contents of the reporter next.

  1. Name the reporter Store scenario results.

  2. Turn Off the run condition to report all results, regardless of the scenario’s success or failure.

  3. Provide the project key found in your URL.

  4. Provide the dataset name scenario_results (matching the name of the dataset you just created).

  5. Provide timestamp as the name of the Timestamp column.

  6. Copy-paste the JSON below for the other two columns found in the schema.

    {
      "scenario": "${scenarioName}",
      "status": "${outcome}"
    }
    
  7. Click Save to activate the reporter.

  8. Click Run to launch the scenario manually.

Dataiku screenshot of a Send to dataset reporter.

Tip

Browse the Available variables section for a list of other variables that can be included in reporters. To use them, be sure to first update the schema of the receiving dataset.
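As a rough illustration of how the JSON keys relate to the schema, the Python sketch below expands the two variables and checks that every resulting key exists as a column in scenario_results. It is not Dataiku's implementation: the `build_row` helper and the sample values (`Data Refresh`, `FAILED`) are assumptions made for this example, and `string.Template` is used only because it happens to understand the same `${name}` placeholder syntax.

```python
import json
from datetime import datetime, timezone
from string import Template

# Columns defined in the Schema subtab of scenario_results.
SCHEMA_COLUMNS = ["timestamp", "scenario", "status"]

# The JSON pasted into the reporter, with ${...} placeholders.
COLUMNS_TEMPLATE = """
{
  "scenario": "${scenarioName}",
  "status": "${outcome}"
}
"""


def build_row(variables, timestamp_column="timestamp"):
    """Hypothetical helper: expand placeholders and return one row as a dict."""
    expanded = Template(COLUMNS_TEMPLATE).substitute(variables)
    row = json.loads(expanded)
    # The timestamp column is filled separately, here with the current UTC time.
    row[timestamp_column] = datetime.now(timezone.utc).isoformat()
    # Every key must exist in the schema, otherwise the value has nowhere to go.
    unknown = set(row) - set(SCHEMA_COLUMNS)
    if unknown:
        raise ValueError(f"Columns missing from schema: {sorted(unknown)}")
    return row


# Sample values are made up for illustration.
row = build_row({"scenarioName": "Data Refresh", "outcome": "FAILED"})
print(row)
```

Adding a new key to the JSON without adding the matching column to `SCHEMA_COLUMNS` raises an error in this sketch, which mirrors the advice to update the schema of the receiving dataset first.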

Check the results

Let’s make sure the scenario results were written to the dataset we’ve created.

  1. Navigate to the Last runs tab of the scenario to see the reporter activity listed in the most recent run log.

  2. From the Flow, open the scenario_results dataset to see a record of the most recent scenario run.

Dataiku screenshot of a dataset receiving scenario results.

Note

Depending on your location, you may notice that the timestamp column does not match the local time when you triggered the scenario. This is because timestamps are recorded in UTC. To convert the timestamp to your local time zone, use the Format date processor in a Prepare recipe. See the reference documentation for more on Managing dates.
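For readers who want to see the arithmetic behind the UTC offset, here is a minimal Python sketch of the same kind of conversion done outside Dataiku. The timestamp value and the fixed UTC-5 offset are assumptions chosen for the example; inside a project, a Prepare recipe remains the approach this tutorial describes.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical UTC timestamp, as it might appear in scenario_results.
utc_value = "2024-03-15T14:30:00+00:00"

# Fixed UTC-5 offset used as an example. A real conversion would normally use
# a named time zone (for instance via zoneinfo) so daylight saving is handled.
local_zone = timezone(timedelta(hours=-5))

utc_dt = datetime.fromisoformat(utc_value)
local_dt = utc_dt.astimezone(local_zone)

print(local_dt.strftime("%Y-%m-%d %H:%M"))  # 2024-03-15 09:30
```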

Tip

To learn more about why this scenario is failing, see Tutorial | Data quality.

Send scenario results to a messaging channel

The “Send to dataset” reporter can be helpful for keeping this kind of log inside a Dataiku project. However, reporters are most often used to send alerts outside of Dataiku. For this, we’ll need a messaging channel.

We’ll demonstrate using a mail reporter, but the process will work similarly if you have other channels configured.

Important

See the reference documentation to create a messaging channel. You’ll need admin access to do so.

  1. From the Jobs menu of the project, navigate to the Scenarios page.

  2. Open the Data Refresh scenario.

  3. On the Settings tab, click Add Reporter.

  4. Select your messaging channel. We’ll use Mail in this tutorial.

Configure a reporter with a messaging channel

Let’s first make sure the messaging channel is working.

  1. Name it Mail reporter.

  2. Once again, turn Off the run condition.

  3. For Channel, select your messaging channel.

  4. Provide your email address in the To field.

  5. Click Run to launch the scenario, which now includes a basic reporter using the default template file.

  6. Check your email for the result.

Dataiku screenshot of the settings for a mail reporter.

Use an inline message source

The previous exercise used a default template file to format the message. However, we can also customize the contents of the message.

  1. Return to the Mail reporter of the Data Refresh scenario.

  2. Switch the message source to Inline.

  3. Copy-paste the following text into the message field. You can also customize it with more variables.

    Here is the summary:
    ${allEventsSummary}
    
    See the scenario run for more info:
    ${scenarioRunURL}
    
  4. Click Run to launch the scenario again.

  5. Check your email once more for the result.

Dataiku screenshot of a scenario reporter with an inline message.

Use a run condition

Currently, this reporter delivers an email after every scenario run. However, we can also exert more control with run conditions. For example, we may want to send different messages for different scenario outcomes.

Let’s send an email only if the scenario run is not successful, which is the case for this scenario.

  1. Return to the Mail reporter of the Data Refresh scenario.

  2. Toggle ON the run condition. It should say outcome != 'SUCCESS'.

  3. Click Save.

  4. Click Run.

  5. You should receive another email, since we know the scenario run is not successful.

Dataiku screenshot of the run condition of a scenario mail reporter.

Tip

The run condition uses == for equals and != for not equals. In addition to SUCCESS, other possible states include FAILED and ABORTED.
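To make the effect of the condition concrete, the following Python sketch evaluates outcome != 'SUCCESS' against the states mentioned in this tutorial. The `should_report` function is a hypothetical stand-in written for this example, not part of Dataiku.

```python
def should_report(outcome, condition_enabled=True):
    """Hypothetical stand-in for a reporter's run condition."""
    if not condition_enabled:
        # Run condition turned off: the reporter fires after every run.
        return True
    # Run condition turned on: equivalent to outcome != 'SUCCESS'.
    return outcome != "SUCCESS"


for outcome in ["SUCCESS", "FAILED", "ABORTED"]:
    print(outcome, should_report(outcome))
```

With the condition enabled, only the successful run is skipped; with it disabled, as in the earlier steps, every run produces a message.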

Add an attachment to a reporter

Reporters also allow us to add attachments. This can be particularly useful for various kinds of exports.

  1. Return to the Mail reporter of the Data Refresh scenario.

  2. Click Add Attachment > Dataset data.

  3. For the attached dataset, select scenario_results.

  4. Click Save.

  5. Click Run to launch the scenario.

  6. Check your email to find the CSV file included as an attachment in the latest mail.

Dataiku screenshot of an attachment in a reporter.
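For intuition about what the recipient ends up with, the sketch below uses Python's standard library to build an email carrying a CSV attachment shaped like scenario_results. It does not reflect how Dataiku assembles its mail: the row values, subject, address, and file name are all assumptions made for illustration.

```python
import csv
import io
from email.message import EmailMessage

# Rows invented for illustration, shaped like the scenario_results schema.
rows = [
    {"timestamp": "2024-03-15T14:30:00Z", "scenario": "Data Refresh", "status": "FAILED"},
]

# Serialize the rows to CSV in memory.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["timestamp", "scenario", "status"])
writer.writeheader()
writer.writerows(rows)

# Build a message with a text body and the CSV as an attachment.
message = EmailMessage()
message["Subject"] = "Scenario Data Refresh: FAILED"
message["To"] = "you@example.com"
message.set_content("See the attached scenario results.")
message.add_attachment(
    buffer.getvalue().encode("utf-8"),
    maintype="text",
    subtype="csv",
    filename="scenario_results.csv",
)

# Inspect the attachment the way a mail client would.
attachment = next(message.iter_attachments())
print(attachment.get_filename())  # scenario_results.csv
```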

Tip

If you navigate to the Steps tab of a scenario and browse the available types of steps, you’ll notice many types of export steps, as well as an option for “Send message.” Depending on your specific use case, you have the flexibility to relay scenario information as reporters or as steps in the scenario.

What’s next?

Congratulations on taking your first steps with scenario reporters! Continue experimenting to learn how to send the most relevant alerts to the correct individuals at the best time.

Tip

If your organization uses other messaging channels, such as Jira, see Tutorial | Webhook reporters in scenarios to learn how to send scenario messages to any API.

See also

See the reference documentation to learn more about Reporting on scenario runs.