Tutorial | Update a project deployment automatically (MLOps part 3)

As your MLOps setup becomes more sophisticated, you can rely on automation to do more. You can run scenarios that not only monitor model performance or data drift, but also actually retrain models based on this information.

Once a deployment exists, you can go one step further: a scenario can automatically create new bundles and update project deployments when certain conditions are met.

This level of automation may become necessary when deploying very large numbers of models across many projects. To do this successfully, though, you need to have mastered the fundamentals: robust metrics and checks that tell you with confidence that the model you are redeploying is truly better than the existing one.


In this tutorial, you will:

  • Design a scenario that creates a new bundle and updates an existing project deployment when a certain condition is met.

Starting here?

This section builds on the scenario created in Part 1 and the bundle deployment created in Part 2. Complete both parts first in order to reproduce the steps here.

Start with a retrain model scenario

Let’s start by duplicating the scenario that retrains the model if the data drift metric fails. In other words, this scenario retrains the model when our chosen metric in the model evaluation store exceeds the specified threshold.

  • Navigate to the Scenarios page from the top navigation bar.

  • Check the box to the left of the Retrain Model scenario to open the Actions tab, and click Duplicate.

  • Name it Retrain Model & Deploy, and click Duplicate.

Dataiku screenshot for duplicating a scenario.

Add a create bundle step

In the current scenario, the step that retrains the model runs only if a previous step (in our case the MES check) fails. However, the box is ticked to reset the failure state, and so this scenario can continue with other steps.
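To make the role of the failure state concrete, here is a toy model of the step logic described above. It is only a sketch: the `Step` class, its field names, and the check and retrain step names are hypothetical and do not mirror Dataiku's internal implementation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One scenario step in this toy model (all field names are hypothetical)."""
    name: str
    action: Callable[[], bool]      # returns True on success, False on failure
    run_on_failure: bool = False    # True: run only if a prior step failed
    reset_failure: bool = False     # True: clear the failure flag after success


def run_scenario(steps: List[Step]) -> List[str]:
    """Run steps in order and return the names of the steps that executed."""
    failed = False
    executed = []
    for step in steps:
        # A step runs either only after a failure, or only when nothing has failed.
        if step.run_on_failure != failed:
            continue
        succeeded = step.action()
        executed.append(step.name)
        if not succeeded:
            failed = True
        elif step.reset_failure:
            failed = False
    return executed


steps = [
    Step("Check data drift", action=lambda: False),  # simulate a failing check
    Step("Retrain model", action=lambda: True, run_on_failure=True, reset_failure=True),
    Step("Create auto_deploy bundle", action=lambda: True),
    Step("Update auto_deploy", action=lambda: True),
]
print(run_scenario(steps))
```

In this model, removing `reset_failure=True` from the retrain step leaves the failure flag set, so the two Deployer steps added later in this tutorial would be skipped.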

Let’s proceed with creating the bundle in cases where a new model is retrained.

  • In the Retrain Model & Deploy scenario, add a Create bundle step from the Deployer section.

  • Name it Create auto_deploy bundle.

  • Provide the bundle id auto_deploy.

  • Check the box to Make bundle id unique. Instead of the v1, v2, etc. that we previously chose manually, our bundle ids will be “auto_deploy”, “auto_deploy1”, etc.

  • Provide the target variable bundleid.

  • Check the box to Publish on Deployer.

  • Choose the present project from the existing deployments as the Target project (selected by default).
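As an illustration of the naming pattern produced by Make bundle id unique, the sketch below generates ids in the sequence “auto_deploy”, “auto_deploy1”, and so on. The exact rule Dataiku applies is not stated here, so the function name and the "first free numeric suffix" logic are assumptions.

```python
from typing import Iterable


def next_unique_bundle_id(base: str, existing_ids: Iterable[str]) -> str:
    """Return base if unused, otherwise base plus the first free numeric suffix.

    Assumption: mimics the "auto_deploy", "auto_deploy1", ... pattern only.
    """
    taken = set(existing_ids)
    if base not in taken:
        return base
    suffix = 1
    while f"{base}{suffix}" in taken:
        suffix += 1
    return f"{base}{suffix}"


# Simulate three successive scenario runs.
bundles = []
for _ in range(3):
    bundles.append(next_unique_bundle_id("auto_deploy", bundles))
print(bundles)
```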

Dataiku screenshot of the create bundle step.


The help note at the top of this step indicates that the new bundle will include any additional data defined in the Bundles page. To see what data will be included in the automatically created bundles, navigate to the Bundles page and click Configure Content.

Add an update project deployment step

As we have seen in the process for batch deployment, once we have a bundle, we need to deploy it. There’s a scenario step for this too!

  • In the Retrain Model & Deploy scenario, add an Update project deployment step from the Deployer section.

  • Name it Update auto_deploy.

  • Provide the Deployment id, which takes the form of <PROJECTKEY>-on-<infrastructure>. Click on the field or start typing to see available options.

  • Provide the new bundle id as ${bundleid}. Be sure to use the variable syntax here since this references the target variable in the previous step.

  • Click Save.
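The two fields in this step can be pictured with a small sketch. Python's `string.Template` happens to use the same `${name}` syntax, so it serves here as a stand-in for variable expansion; it is not Dataiku's own mechanism. The project key `MYPROJECT` and infrastructure name `my_infra` are placeholders.

```python
from string import Template


def build_deployment_id(project_key: str, infrastructure: str) -> str:
    """Compose an id of the form <PROJECTKEY>-on-<infrastructure>."""
    return f"{project_key}-on-{infrastructure}"


def expand(value: str, variables: dict) -> str:
    """Replace ${name} references; raises KeyError if a variable is undefined."""
    return Template(value).substitute(variables)


# The Create bundle step stored the generated id in the target variable "bundleid".
scenario_variables = {"bundleid": "auto_deploy1"}

print(build_deployment_id("MYPROJECT", "my_infra"))
print(expand("${bundleid}", scenario_variables))
```

Typing the literal text `bundleid` instead of `${bundleid}` would pass that string through unchanged, which is why the variable syntax matters in this field.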

Dataiku screenshot of the update deployment step.

Run the scenario & observe the outcome

Let’s imagine that some specified unit of time has passed, triggering the scenario to run.

  • From the project on the Design node, click to manually Run the Retrain Model & Deploy scenario.

  • Observe its progress in the Last Runs tab of the scenario.

Dataiku screenshot of the last run tab having automatically deployed a new bundle.

With no new data in this situation, we already know the check on data drift in the model evaluation store will fail, and so we can anticipate the outcome.

First, there should be a new auto_deploy bundle on the Design node, and it should be the active bundle on the Deployer and Automation nodes.

Dataiku screenshot of the Project Deployer showing the new bundle deployed.

Second, the project on both the Design and Automation nodes should have a new active version of the saved model found in the Flow (the version number may differ depending on how many times you’ve run the scenario).

Dataiku screenshot of the saved model on the Automation node.


Run the scenario again to see how the bundle ID increments to auto_deploy1, and so on.

Plan for a more robust setup

To be sure, this scenario is not ready for a live MLOps setup. It’s intended only to demonstrate how you can use Dataiku to achieve your MLOps goals. That being said, let’s discuss a few ways you could make this setup more robust to handle the challenges of live production.

Add more metrics & checks

This scenario triggered the model retraining when a single check, on one model evaluation store metric, failed.

Depending on our Flow, it’s likely that we also want to create metrics and checks on other upstream objects, such as datasets or managed folders. If upstream checks fail, we can skip the model retraining cycle.

We might also want to implement metrics and checks on the saved model itself to determine whether it is better than a previous version.
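One way to picture such a check is a simple comparison between the candidate model and the currently deployed one. The sketch below is a generic illustration, not a Dataiku feature: the function name, the choice of metric, and the `min_gain` margin are all assumptions that you would tune for your own use case.

```python
def should_promote(current_metric: float,
                   candidate_metric: float,
                   min_gain: float = 0.01,
                   higher_is_better: bool = True) -> bool:
    """Return True only if the candidate beats the current model by min_gain."""
    if higher_is_better:
        gain = candidate_metric - current_metric
    else:
        # For error-style metrics, a lower value is an improvement.
        gain = current_metric - candidate_metric
    return gain >= min_gain


# Example with an AUC-like metric (higher is better).
print(should_promote(current_metric=0.80, candidate_metric=0.82))
print(should_promote(current_metric=0.80, candidate_metric=0.805))
```

Requiring a minimum gain, rather than any improvement at all, helps avoid redeploying on differences that are within noise.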

Keep a human in the loop

Even after adding a sufficient level of metrics and checks, we might never want to automatically deploy a bundle. Our scenario might stop at creating the new bundle, alerting a team member with a reporter, but leaving the job of updating the deployment to a human.
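A team member who takes over that last step could perform it from the Deployer, or script it. The sketch below assumes the public `dataikuapi` Python client exposes `get_deployment`, `get_settings`, `get_raw`, `save`, `start_update`, and `wait_for_result` as used here; verify these names against the API reference for your Dataiku version before relying on them. The host, API key, and ids in the usage comment are placeholders.

```python
def update_deployment_bundle(deployer, deployment_id: str, bundle_id: str):
    """Point an existing project deployment at a bundle and apply the update.

    `deployer` is expected to behave like the object returned by
    `DSSClient.get_projectdeployer()` (an assumption to check for your version).
    """
    deployment = deployer.get_deployment(deployment_id)
    settings = deployment.get_settings()
    settings.get_raw()["bundleId"] = bundle_id  # select the bundle to deploy
    settings.save()
    update = deployment.start_update()          # launch the deployment update
    return update.wait_for_result()             # block until it completes


# Hypothetical usage (placeholder host, key, and ids):
# import dataikuapi
# client = dataikuapi.DSSClient("https://dss.example.com", "YOUR_API_KEY")
# update_deployment_bundle(client.get_projectdeployer(),
#                          "MYPROJECT-on-my_infra", "auto_deploy1")
```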

Add more stages of deployment infrastructure

In this example, we had only one lifecycle stage of deployment infrastructure. However, in a real setup, it would be common to have multiple stages, such as the default “Dev”, “Test”, and “Prod”, as shown in the tutorial on deploying a real-time API.

Our scenario might automatically update a deployment in the “Dev” stage, but require a human to push the deployment to the “Test” or “Prod” stages.
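Such a policy can be summarized as a small decision rule. This is a sketch only: the function, the returned action labels, and the choice of “Dev” as the sole automatic stage are illustrative assumptions rather than a Dataiku setting.

```python
# Stages where the scenario is allowed to update the deployment on its own.
AUTO_DEPLOY_STAGES = {"Dev"}


def next_action(stage: str, checks_passed: bool) -> str:
    """Decide what happens to a freshly created bundle for a given stage."""
    if not checks_passed:
        return "stop"                 # never deploy a bundle that failed checks
    if stage in AUTO_DEPLOY_STAGES:
        return "update_deployment"    # safe to automate
    return "notify_human"             # a person decides on Test and Prod


for stage in ("Dev", "Test", "Prod"):
    print(stage, "->", next_action(stage, checks_passed=True))
```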

Next steps

Congratulations! You’ve created a scenario that can automatically update a batch deployment.

While this level of automation may not always be desirable (or advisable), it hints at what’s possible using only very simple building blocks.

Now that you have seen the batch deployment framework, move on to the methods for real-time API scoring.


For more information, please refer to the reference documentation on MLOps or production deployments.