Concept | Automation scenarios#

Watch the video

Scenarios are key to automating tasks related to your Dataiku project. Let’s learn about the different types of scenarios and their various components.

Use cases#

Automation scenarios are a set of actions that are scheduled to run when certain conditions are satisfied. They are most useful when automating various kinds of tasks when a project is in production. For example:

  • If new data arrives on a regular basis, a scenario can rebuild the Flow once per day or each time it detects a dataset change.

  • If a metric for a machine learning model falls outside a specified threshold range, a scenario can be triggered to retrain the model.

  • Scenarios can also automate administrative tasks such as cleaning logs or starting and stopping a cluster.

Scenario types#

There are two types of scenarios in Dataiku.

  • Step-based, where scenario steps are configured using the visual interface.

  • Code-based, where the set of actions performed are fully defined by Python code.

Dataiku screenshot of the dialog for creating a new scenario.

See also

Learn more about custom Python script scenarios in Concept | Custom metrics, checks, data quality rules & scenarios.

Scenario components#

Scenarios consist of three main components:

Component

Function

Steps

Define the series of actions to take when the scenario executes.

Triggers

Control when a scenario executes.

Reporters

Send alerts about a scenario’s activity via a variety of information channels.

Slide depicting the three main components of a scenario.

Scenario steps#

Scenario steps let you control what the scenario will do. Common scenario steps include:

  • Building or clearing a dataset.

  • Training a model.

  • Verifying data quality rules or running checks.

  • Sending messages.

  • Refreshing the cache of charts and dashboards.

  • Exporting documentation of the Flow or models.

Dataiku screenshot of the Add Steps options in a scenario.

Scenario steps run sequentially. However, you can control whether a step runs based on the outcome of a data quality rule or a check.

See also

See Scenario steps in the reference documentation for the complete list of steps.

Scenario triggers#

Triggers allow users to define a condition or set of conditions that, if satisfied, start a scenario run. Each trigger can be enabled or disabled.

Trigger

Description

Time-based

Launches the scenario at regular intervals (such as daily).

Dataset change

Starts a scenario whenever a change is detected in the dataset. This type of trigger is used for filesystem-based datasets.

SQL query change

Runs a query at a specified interval and starts the scenario when the output of the query changes with respect to the last execution of the query.

Custom (Python)

Executes a custom Python script that activates a trigger.

Note

Different types of triggers may be available depending on your license and profile.

Dataiku screenshot of different triggers available in a scenario.

Reporters#

Dataiku lets you add reporters to a scenario to inform users about scenario activities through email and other channels. Reporters can be sent when a scenario starts or ends on the condition that it succeeds or fails.

Reporters operate through several channels, including:

  • Mail

  • Slack

  • Microsoft Teams

  • Webhook

  • Twilio

  • Shell command

  • Send to dataset

Dataiku screenshot of different reporters available in a scenario.

What’s next?#

Now that you have a better understanding of step-based scenarios in Dataiku, try this out for yourself in Tutorial | Automation scenarios!

See also

Find more information about Automation scenarios in the reference documentation.