Automate the Flow
Our Flow now has two final outputs: a dataset identifying the highest-risk assets in need of maintenance, and another dataset of assets that are still at risk but less urgently so.
Imagine that new data must travel through this pipeline on a daily basis. We need a way to automate building this Flow every day to consider the new data collected by sensors during the day. Scenarios are the way to achieve this in Dataiku.
From the Jobs menu in the top navigation bar, click on Scenarios.
On the Scenarios page, click + New Scenario or + Create Your First Scenario.
Name it Daily Flow Rebuild, and click Create.
In the Steps tab of the scenario, add a Build / Train step.
Add assets_high_risk and assets_mid_risk as the datasets to build.
The Build mode is already set to “Build required datasets”, so we only need to specify the final desired outputs. Dataiku will determine what needs to be built to produce them.
Save your work.

Now we know what action the scenario will complete, but when should it run?
Navigate to the Settings tab of the scenario.
Add a time-based trigger.
Configure the scenario to repeat every 1 day at midnight.
Save your work.
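To make the trigger setting concrete, here is a minimal sketch of when a "repeat every 1 day at midnight" schedule would fire next. It is plain Python and does not call Dataiku. The function name `upcoming_runs` is hypothetical, and the sketch assumes naive datetimes, so it ignores time zones and daylight saving changes.

```python
from datetime import datetime, timedelta


def upcoming_runs(now, count=3, every_days=1):
    """Return the next `count` midnight run times strictly after `now`."""
    midnight_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    first = midnight_today + timedelta(days=1)  # the next midnight after `now`
    return [first + timedelta(days=i * every_days) for i in range(count)]


# Example: the trigger is saved on the afternoon of March 10.
for run_time in upcoming_runs(datetime(2024, 3, 10, 15, 30)):
    print(run_time.isoformat())
# 2024-03-11T00:00:00
# 2024-03-12T00:00:00
# 2024-03-13T00:00:00
```

In other words, new sensor data collected during March 10 would first be picked up by the run at the start of March 11.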

Now let’s test it instead of waiting for the trigger to activate!
Click the green Run button to manually trigger the scenario.
Navigate to the “Last Runs” tab to view its progress.
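The same manual run can also be triggered from code. The sketch below uses the Dataiku public API client (the `dataiku-api-client` package) and is only an illustration. The host URL, API key, project key, and the scenario ID `DAILYFLOWREBUILD` are placeholders, so check the actual ID of your scenario in its settings.

```python
def run_scenario_and_report(project, scenario_id):
    """Trigger a scenario, wait for it to finish, and return its outcome.

    `project` is expected to behave like a dataikuapi project handle.
    """
    scenario = project.get_scenario(scenario_id)
    # no_fail=True: return the run even if it failed, instead of raising.
    run = scenario.run_and_wait(no_fail=True)
    return run.outcome  # e.g. "SUCCESS" or "FAILED"


def main():
    # Imported here so the helper above can be read and tested
    # without the client installed.
    import dataikuapi

    # Placeholder values: replace with your own instance URL, key, and IDs.
    client = dataikuapi.DSSClient("https://dss.example.com", "YOUR_API_KEY")
    project = client.get_project("MY_PROJECT")
    print(run_scenario_and_report(project, "DAILYFLOWREBUILD"))
```

Calling `main()` against a real instance should produce a new entry in the Last Runs tab, just like clicking Run.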

What happened in this run? If you click on the job, you’ll see that there was actually nothing for Dataiku to do. With the Build mode set to “Build required datasets” and no new data present in the pipeline, Dataiku didn’t need to take any action to give us the requested output.
To see the scenario actually rebuild the pipeline, feel free to change the Build mode of the Build / Train step to “Force rebuild” and run the scenario again.
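The difference between the two build modes can be illustrated with a toy model. This is a simplified sketch and not Dataiku's actual algorithm. It assumes a dataset needs rebuilding only when a source upstream of it has new data. The dataset names `sensor_readings` and `assets_scored`, and the function `plan_build`, are hypothetical.

```python
# Toy Flow: each dataset maps to the datasets it is built from.
# Only the two final outputs are named in the tutorial; the rest is assumed.
FLOW = {
    "sensor_readings": [],
    "assets_scored": ["sensor_readings"],
    "assets_high_risk": ["assets_scored"],
    "assets_mid_risk": ["assets_scored"],
}


def plan_build(targets, changed_sources=(), force=False):
    """Return the datasets to rebuild, in dependency order."""
    changed = set(changed_sources)
    decided = {}  # dataset -> does it carry new data / need rebuilding?
    order = []

    def visit(name):
        if name in decided:
            return decided[name]
        parents = FLOW[name]
        if not parents:
            # Source datasets are never built, only checked for new data.
            decided[name] = name in changed
            return decided[name]
        upstream_dirty = [visit(parent) for parent in parents]
        decided[name] = force or any(upstream_dirty)
        if decided[name]:
            order.append(name)  # parents were appended first
        return decided[name]

    for target in targets:
        visit(target)
    return order


outputs = ["assets_high_risk", "assets_mid_risk"]
print(plan_build(outputs))                                       # []
print(plan_build(outputs, changed_sources={"sensor_readings"}))
print(plan_build(outputs, force=True))
# The last two calls both print:
# ['assets_scored', 'assets_high_risk', 'assets_mid_risk']
```

The first call mirrors the run above: no new data, so nothing to build. Forcing the rebuild, or receiving new sensor data, makes every dataset on the path to the outputs run again.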
Note
The scenario shown here is about as basic as a scenario can be. For a more in-depth look, see the Data Quality & Automation course.