Tutorial | Custom script scenarios#

In addition to inserting code into step-based scenarios, it is also possible to create a scenario entirely with a Python script.

Custom script scenarios can do everything that step-based scenarios can, while giving you full control over their configuration and letting you add more advanced logic.

Get started#

Objectives#

In this tutorial, you will:

  • Create a custom script scenario to automate actions in Dataiku according to your needs.

Prerequisites#

To reproduce the steps in this tutorial, you’ll need:

  • Dataiku 12.0 or later.

  • An Advanced Analytics Designer or Full Designer user profile.

  • Basic knowledge of Dataiku (Core Designer level or equivalent).

  • Basic knowledge of Python code.

Create the project#

  1. From the Dataiku Design homepage, click + New Project.

  2. Select Learning projects.

  3. Search for and select Custom Script Scenarios.

  4. Click Install.

  5. From the project homepage, click Go to Flow (or g + f).

Note

You can also download the starter project from this website and import it as a zip file.

Create a custom script scenario#

Although it is possible to attach a reporter to a step-based scenario, a custom script scenario can be helpful for sending more customized reports, for example reports based on performance or data quality.

Let’s create a scenario that sends a custom report based on a metric.

  1. From the Jobs menu in the top navigation bar, click Scenarios.

  2. Click + New Scenario.

  3. Select the Custom Python script option.

  4. Name it Mail Sender.

  5. Click Create.

Dataiku screenshot of the dialog for creating a new scenario.

Note

Notice that the usual Steps tab has been replaced by a Script tab.

Retrieve the data#

Let’s use the API to build a dataset, compute its metrics, and retrieve the metric value we are interested in.

  1. Click on the Script tab.

    Note

    By default, Dataiku provides a code sample that sends a custom report based on a ROC curve result after a model training. You could build a script on top of it, but in this tutorial we start from a blank script.

  2. Delete the default code sample.

  3. Copy-paste the code below:

    from dataiku.scenario import Scenario
    from dataiku import Dataset
    
    # The Scenario object is the main handle from which you initiate steps
    scenario = Scenario()
    
    # Building a dataset
    scenario.build_dataset("tx_prepared")
    # Computing the metrics
    scenario.compute_dataset_metrics("tx_prepared")
    
    # Getting the metric
    dataset = Dataset("tx_prepared")
    metrics = dataset.get_last_metric_values()
    max_auth_flag_metric = metrics.get_metric_by_id("col_stats:MAX:authorized_flag")['lastValues'][-1]['value']
    

This block builds the dataset, computes its metrics, and retrieves the most recently computed value of the metric. The next step is to build a sender.
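The chained indexing used to read the metric fails with a terse `KeyError` or `IndexError` if no value has been computed yet. The sketch below shows one way to make that lookup more defensive. The helper name `latest_metric_value` and the sample dictionary are hypothetical, and the dictionary shape is only inferred from the indexing in the tutorial code, not taken from the API reference.

```python
def latest_metric_value(metric, default=None):
    """Return the most recent value stored in a metric dictionary.

    Assumes the shape implied by the tutorial code:
    {"lastValues": [{"value": ...}, ...]}
    """
    if not metric:
        return default
    values = metric.get("lastValues") or []
    if not values:
        return default
    # Mirror the tutorial, which reads the last entry of the list
    return values[-1].get("value", default)


# Hypothetical sample standing in for the result of metrics.get_metric_by_id(...)
sample_metric = {"lastValues": [{"value": "1"}]}

print(latest_metric_value(sample_metric))
print(latest_metric_value({"lastValues": []}, default="no value yet"))
```

In the scenario script, the helper would take the result of `metrics.get_metric_by_id("col_stats:MAX:authorized_flag")` in place of `sample_metric`.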

Create the report sender#

  1. Paste this code block after the previous code:

    # Calling the sender
    sender = scenario.get_message_sender("messaging-channel-id")
    sender.set_params(sender="mail-from@your-company.com", recipient="mail-to@your-company.com")
    
  2. Replace the placeholder strings with your own values:

    • messaging-channel-id: the unique ID given to the messaging channel.

    • mail-from@your-company.com: the email address you want to send the message from.

    • mail-to@your-company.com: the email addresses you want to send the message to. Note that you can add several addresses in the array.
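Because a forgotten placeholder only shows up when the email fails to send, it can help to keep these three values in one place and check them before calling the sender. This is a minimal sketch, not part of the Dataiku API: `MAIL_SETTINGS` and `unreplaced_placeholders` are hypothetical names introduced for illustration.

```python
# Hypothetical settings block: replace each value with your own.
MAIL_SETTINGS = {
    "channel_id": "messaging-channel-id",
    "sender": "mail-from@your-company.com",
    "recipient": "mail-to@your-company.com",
}


def unreplaced_placeholders(settings):
    """Return the keys whose values still look like the tutorial placeholders."""
    markers = ("messaging-channel-id", "your-company.com")
    return sorted(
        key for key, value in settings.items()
        if any(marker in value for marker in markers)
    )


missing = unreplaced_placeholders(MAIL_SETTINGS)
if missing:
    print("Still using tutorial placeholders for: " + ", ".join(missing))

# In the scenario script, the settings would then feed the calls shown above:
#   sender = scenario.get_message_sender(MAIL_SETTINGS["channel_id"])
#   sender.set_params(sender=MAIL_SETTINGS["sender"],
#                     recipient=MAIL_SETTINGS["recipient"])
```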

Send the message#

We can now do a check on the retrieved metric and send the message according to the result:

  1. Add the following code block at the end of your script:

    # Checking on the metric
    if int(max_auth_flag_metric) == 1:
        sender.send(subject="The scenario is doing well", message="All is good.")
    else:
        sender.send(subject="Metric is inconsistent", message="There is a problem with the data, you should check it.")
    

    Note

    The condition is simple: we expect the maximum of authorized_flag to be greater than 0, and the script checks that it equals 1. If it does not, the email warns that the data may be inconsistent. The message here is basic, but Dataiku also lets you send it with JSON to add variables and further customize it. See the documentation to learn more.

  2. Click Run to trigger the scenario.

  3. Navigate to the Last Runs tab as you would for a step-based scenario to view the run’s progress.
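The metric check can also be pulled out into a small pure function, which makes the decision logic easy to test outside Dataiku. The sketch below assumes the same rule as the script (a maximum equal to 1 is healthy). The function name `build_report` is hypothetical, and treating a missing or non-numeric value as a problem is an added assumption; the tutorial script would instead raise an error in that case.

```python
def build_report(max_auth_flag):
    """Return a (subject, message) pair for the given metric value."""
    try:
        flag_ok = int(max_auth_flag) == 1
    except (TypeError, ValueError):
        # Assumption: a missing or non-numeric metric counts as a problem
        flag_ok = False

    if flag_ok:
        return "The scenario is doing well", "All is good."
    return (
        "Metric is inconsistent",
        "There is a problem with the data, you should check it.",
    )


# Example with a metric value read as a string
subject, message = build_report("1")
print(subject)
```

In the scenario script, the pair could then be passed on with `sender.send(subject=subject, message=message)`.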

Note

You can’t set a reporting condition based on the outcome of the scenario directly in the script. Follow this article to see how to do it through the UI.

What’s next?#

Congratulations! You created your first custom scenario based entirely on a script. The Python API opens up many more automation possibilities.

See also

You can learn more about scenarios in the reference documentation.