Concept | Scenarios

Scenarios are key to automating tasks related to your Dataiku project. Let’s learn about the different types of scenarios and their various components.

Use cases

A scenario is a set of actions scheduled to run when certain conditions are satisfied. Scenarios are most useful for automating recurring tasks once a project is in production. For example:

  • If new data arrives on a regular basis, a scenario can rebuild the Flow once per day or each time it detects a dataset change.

  • If a metric for a machine learning model falls outside a specified threshold range, a scenario can be triggered to retrain the model.

  • Scenarios can also automate administrative tasks such as cleaning logs or starting and stopping a cluster.
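The model-retraining use case above comes down to a range check on a metric. A minimal sketch in plain Python, assuming a hypothetical metric value and threshold range (the function name `needs_retraining` and the parameters `lower` and `upper` are illustrative and not part of Dataiku's API):

```python
def needs_retraining(metric_value: float, lower: float, upper: float) -> bool:
    """Return True when the metric falls outside the accepted [lower, upper] range."""
    return not (lower <= metric_value <= upper)


# Hypothetical values: an AUC of 0.71 checked against an accepted range of 0.75 to 1.0.
if needs_retraining(0.71, lower=0.75, upper=1.0):
    print("metric out of range: retrain the model")
```

In a real project, the metric would come from a model evaluation or a check rather than a hard-coded number.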

Scenario types

There are two types of scenarios in Dataiku.

  • Step-based, where scenario steps are configured using the visual interface.

  • Code-based, where the set of actions performed is fully defined by Python code.

Note

Learn more about custom Python script scenarios in our article on custom scenarios.
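To give a feel for a code-based scenario, here is a hedged sketch. It assumes the `Scenario` class from the `dataiku.scenario` module and its `build_dataset` and `train_model` methods; check the reference documentation for the exact signatures in your version. The dataset name, the model ID, the helper `run_daily_refresh`, and the `RecordingScenario` stand-in are all hypothetical and exist only so the example can run outside Dataiku.

```python
def run_daily_refresh(scenario, dataset_name, model_id):
    """Rebuild a dataset, then retrain a model, through a scenario handle.

    `scenario` is expected to behave like dataiku.scenario.Scenario; the
    dataset name and model ID are placeholders supplied by the caller.
    """
    scenario.build_dataset(dataset_name)
    scenario.train_model(model_id)


class RecordingScenario:
    """Stand-in used outside Dataiku: records calls instead of running them."""

    def __init__(self):
        self.calls = []

    def build_dataset(self, dataset_name):
        self.calls.append(("build_dataset", dataset_name))

    def train_model(self, model_id):
        self.calls.append(("train_model", model_id))


# Dry run outside Dataiku with hypothetical identifiers.
dry_run = RecordingScenario()
run_daily_refresh(dry_run, "orders_prepared", "MODEL_ID")
print(dry_run.calls)

# Inside a Dataiku custom Python scenario, the same function would be called as:
#   from dataiku.scenario import Scenario
#   run_daily_refresh(Scenario(), "orders_prepared", "MODEL_ID")
```

Passing the scenario handle as an argument keeps the logic testable without a running Dataiku instance.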

Dataiku screenshot of the dialog for creating a new scenario.

Scenario components

Scenarios consist of three main components.

  1. Steps that define the actions to perform, as configured by the user.

  2. Triggers that define when to execute a scenario.

  3. Reporters that send information or alerts about a scenario via a variety of channels.

Slide depicting the three main components of a scenario.

Scenario steps

Scenario steps let you control what the scenario will do. Common scenario steps include:

  • Building or clearing a dataset.

  • Training a model.

  • Verifying data quality rules or running checks.

  • Sending messages.

  • Refreshing the cache of charts and dashboards.

  • Exporting documentation of the Flow or models.

Dataiku screenshot of the Add Steps options in a scenario.

Scenario steps run sequentially. However, you can control whether a step runs based on the outcome of a data quality rule or a check.
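The sequential, conditional behavior described above can be modeled in a few lines of plain Python. This is a conceptual sketch only, not Dataiku's implementation: the `Step` class, the `run_steps` function, and the policy names `"no_prior_failure"` and `"always"` are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    name: str
    action: Callable[[], bool]        # returns True on success, False on failure
    run_if: str = "no_prior_failure"  # illustrative policies: "no_prior_failure" or "always"


def run_steps(steps: List[Step]) -> List[Tuple[str, str]]:
    """Run steps in order and return (step name, outcome) pairs."""
    results = []
    failed = False
    for step in steps:
        # Skip steps that only run while everything before them succeeded.
        if step.run_if == "no_prior_failure" and failed:
            results.append((step.name, "skipped"))
            continue
        ok = step.action()
        failed = failed or not ok
        results.append((step.name, "success" if ok else "failed"))
    return results


steps = [
    Step("build dataset", lambda: True),
    Step("run checks", lambda: False),                    # simulated failing check
    Step("train model", lambda: True),                    # skipped after the failure
    Step("send message", lambda: True, run_if="always"),  # runs regardless
]
results = run_steps(steps)
for name, outcome in results:
    print(f"{name}: {outcome}")
```

Here the failing check prevents the model from being trained, while the message step still runs so that someone is told about the failure.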

Note

All available scenario steps are defined in the reference documentation.

Scenario triggers

Triggers allow users to define a condition or set of conditions that, if satisfied, start a scenario. Each trigger can be enabled or disabled.

| Trigger | Description |
|---|---|
| Time-based | Launches the scenario at regular intervals. Example: Repeat every 30 minutes. |
| Dataset change | Starts the scenario whenever a change is detected in the dataset. This type of trigger is used for filesystem-based datasets. |
| SQL query change | Runs a query at a specified interval and starts the scenario when the output of the query changes with respect to the last execution of the query. |
| Custom (Python) | Executes a custom Python script that activates the trigger. |

Note

Different types of triggers may be available depending on your license.
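The logic behind a SQL query change trigger can be sketched as a comparison between the current query output and the previous one. The example below is a conceptual illustration using Python's standard `sqlite3` module, not Dataiku's implementation. The `QueryChangeTrigger` class, the `orders` table, and the choice to treat the first poll as a baseline that never fires are all assumptions of this sketch.

```python
import sqlite3


class QueryChangeTrigger:
    """Fires when a query's output differs from the previous evaluation."""

    def __init__(self, connection, query):
        self.connection = connection
        self.query = query
        self.last_result = None
        self.initialized = False

    def poll(self) -> bool:
        """Run the query once; return True if its output changed since the last poll."""
        result = self.connection.execute(self.query).fetchall()
        # The first poll only records a baseline (an assumption of this sketch).
        changed = self.initialized and result != self.last_result
        self.last_result = result
        self.initialized = True
        return changed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER)")
trigger = QueryChangeTrigger(conn, "SELECT COUNT(*) FROM orders")

first = trigger.poll()       # baseline only
unchanged = trigger.poll()   # no new rows since the last poll
conn.execute("INSERT INTO orders VALUES (1)")
changed = trigger.poll()     # the row count is now different
print(first, unchanged, changed)
```

In practice, `poll` would be called at the interval configured on the trigger, and a `True` result would start the scenario.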

Dataiku screenshot of different triggers available in a scenario.

Reporters

Dataiku lets you add reporters to a scenario to inform users about scenario activities through email and other channels. A reporter can send its message when a scenario starts or when it ends, and it can be conditioned on whether the run succeeds or fails.

Reporters operate through several channels, including:

  • Mail

  • Slack

  • Microsoft Teams

  • Webhook

  • Twilio

  • Shell command

  • Send to dataset

Dataiku screenshot of different reporters available in a scenario.
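Whatever the channel, a reporter combines two decisions: when to send (at the start or the end of a run) and under which outcome. A minimal, channel-agnostic sketch follows. The function `build_report`, its parameters, and the JSON payload shape are hypothetical and do not reflect Dataiku's actual message format; the delivery itself (mail, Slack, webhook, and so on) is left out.

```python
import json
from typing import Optional


def build_report(scenario_name: str, event: str, outcome: Optional[str],
                 send_on: str = "end", condition: str = "any") -> Optional[str]:
    """Return a JSON message if the reporter should send for this event, else None."""
    if event != send_on:
        return None
    # Outcome conditions only make sense once the run has ended.
    if event == "end" and condition != "any" and outcome != condition:
        return None
    return json.dumps({"scenario": scenario_name, "event": event, "outcome": outcome})


# A reporter configured to send only when the run ends in failure.
message = build_report("daily_refresh", "end", "failed", send_on="end", condition="failed")
silent = build_report("daily_refresh", "end", "success", send_on="end", condition="failed")
print(message)
print(silent)
```

Keeping the decision separate from the delivery means the same condition logic can feed any of the channels listed above.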

What’s next?

To learn more about scenarios and try hands-on tutorials, please register for the free Academy course on this subject found in the Advanced Designer learning path.

Note

You can also find more information about scenarios in the reference documentation.