Version a deployed project#

As you monitor the health of deployments, you’ll need to deploy updated versions of projects over time, especially since this one is already failing!

Where to make changes to a project#

When it is necessary to make a change to a deployed project, it’s critical to make all such changes in the development environment (the Design node), and then push a new bundle to the production environment (the Automation node).

It may be tempting just to make a quick change to the project on the Automation node, but you should avoid this temptation, as the project in the production environment would no longer be synced with its counterpart in the development environment.

Consider a situation where you want to revert to an earlier version of the project. If you’ve made changes on the Automation node, these changes will be lost. Accordingly, actual development should always happen on the Design node, and new versions of bundles should be pushed from there.
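To make the risk concrete, here is a toy Python model. It is not Dataiku code, and the class and field names are hypothetical. It assumes that activating a bundle replaces the production project's contents with that bundle's snapshot, so an edit made directly in production belongs to no bundle and disappears once a different bundle is activated.

```python
import copy


class AutomationProject:
    """Toy stand-in for a project on the Automation node (hypothetical)."""

    def __init__(self):
        self.active_bundle = None
        self.content = {}

    def activate(self, bundle_id, bundles):
        # Assumption: activation replaces the content with a copy of the snapshot.
        self.content = copy.deepcopy(bundles[bundle_id])
        self.active_bundle = bundle_id


# Bundles are snapshots created on the Design node.
bundles = {
    "v1": {"rule": "min 50000"},
    "v2": {"rule": "soft min 50000"},
}

prod = AutomationProject()
prod.activate("v2", bundles)

# A "quick fix" made directly in production lives only in the active copy.
prod.content["hotfix"] = "edited on the Automation node"

# Reverting to v1 and later returning to v2 restores snapshots, not the hotfix.
prod.activate("v1", bundles)
prod.activate("v2", bundles)
hotfix_survived = "hotfix" in prod.content
print(hotfix_survived)
```

Had the same fix been made on the Design node and shipped as a new bundle, it would be part of a snapshot and survive any later switch between versions.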

Fix the failing scenario#

Taking this advice, let’s return to the Design node project. There are actually three changes that should be made.

Edit the data quality rule#

We first need to fix the failing data quality rule. Instead of an error, let’s reduce it to a warning.

  1. On the Design project, navigate to the Data Quality tab of the tx_prepared dataset.

  2. Click the pencil to edit the “Record count is above 50000” rule.

  3. Check the box to Auto-compute metric.

  4. Turn off the Min setting, and turn on a Soft min of 50000 to produce a warning instead of an error.

  5. Click Run Test to confirm the warning.

  6. Click Save.

Dataiku screenshot of a data quality rule.
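The effect of swapping Min for Soft min can be sketched in a few lines of Python. This is an illustration of the idea only, not Dataiku's implementation: the function name, the strict "less than" comparison, and the record count of 40,000 are all assumptions.

```python
from typing import Optional


def rule_status(record_count: int,
                min_value: Optional[int] = None,
                soft_min: Optional[int] = None) -> str:
    """Return "ERROR", "WARNING", or "OK" for a record-count rule."""
    # A hard minimum takes precedence and fails the rule outright.
    if min_value is not None and record_count < min_value:
        return "ERROR"
    # A soft minimum only raises a warning.
    if soft_min is not None and record_count < soft_min:
        return "WARNING"
    return "OK"


# Hypothetical record count below the 50000 threshold.
count = 40_000
before = rule_status(count, min_value=50_000)  # original rule: Min enabled
after = rule_status(count, soft_min=50_000)    # edited rule: Soft min only
print(before, after)
```

With the same data, the rule moves from an error to a warning, which is the outcome the Run Test step is meant to confirm.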

Add a Compute metrics step to the scenario#

We can also add a step in the scenario to explicitly compute metrics.

  1. Navigate to the Steps tab of the Data Refresh scenario.

  2. Click Add Step.

  3. Select Compute metrics from the list of steps.

  4. Drag the metrics step to the first position.

  5. Click Add Dataset to Compute > tx_prepared > Add.

  6. Click Save.

  7. As a matter of good practice, click Run to make sure it returns the expected warning result.

Dataiku screenshot of a compute metrics step in a scenario.

Enable the scenario trigger#

Finally, let’s turn on the time-based trigger, but leave the scenario’s auto-triggers off. This way, the scenario will begin running only once we activate its auto-triggers in the production environment.

  1. Navigate to the Settings tab of the Data Refresh scenario.

  2. Turn On the Time-based trigger.

  3. Verify the Auto-triggers remain Off.

  4. Click Save.

Dataiku screenshot of scenario trigger settings.
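The two switches in this section act as a two-level gate. The sketch below is a simplification inferred from the description above, with hypothetical names. It assumes a scenario runs automatically only when both the individual trigger and the scenario's auto-triggers are on.

```python
def scenario_will_run(trigger_enabled: bool, auto_triggers_on: bool) -> bool:
    """Assumed gating: both the trigger and the scenario's auto-triggers must be on."""
    return trigger_enabled and auto_triggers_on


# State bundled from the Design node: trigger on, auto-triggers off.
on_design_node = scenario_will_run(trigger_enabled=True, auto_triggers_on=False)

# State in production once the auto-triggers are activated.
in_production = scenario_will_run(trigger_enabled=True, auto_triggers_on=True)

print(on_design_node, in_production)
```

Shipping the trigger already enabled means the only switch left to flip in production is the scenario-level one.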

Create a second bundle#

Now let’s demonstrate the process for updating an existing deployment with a new bundle.

  1. From the Bundles page on the Design node project, click + New Bundle.

  2. Name it v2.

  3. In the release notes, add fixed scenario.

  4. Click Create.

Dataiku screenshot of the second version of the project bundle.

Note

When you create the second bundle, it inherits the configuration of the previous one. In this case, the uploaded dataset and the managed folder are already included.

Deploy the new bundle#

The process for deploying the new bundle is the same as for the first one.

  1. Click on the newly created v2 bundle, and click Publish on Deployer.

  2. Confirm that you indeed want to Publish on Deployer.

  3. Click Open in Deployer to view the bundle details on the Deployer.

  4. Once on the Deployer, click Deploy on the v2 bundle.

    Dataiku gives the option to create a new deployment or update the existing one.

  5. Since this is a new version of an existing deployment, verify Update is selected, and click OK.

  6. Click OK again to confirm the deployment you want to edit.

Dataiku screenshot for updating a deployed bundle.

We’re not done yet!

  1. Navigate to the Status tab of the deployment, and note how Dataiku warns that the active bundle on the Automation node does not match the configured bundle.

  2. Click the green Update button to deploy the new bundle. Then Confirm.

  3. Navigate to the Deployments tab of the Project Deployer to see the new bundle as the currently deployed version of this project.

Dataiku screenshot of the Deployer showing a second version of the deployment.
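The bundle creation and deployment steps above can also be scripted. The following is a minimal sketch, assuming the `dataikuapi` public Python client and its `export_bundle`, `publish_bundle`, `get_projectdeployer`, `get_deployment`, `get_settings`, and `start_update` methods. Check the names and parameters against the client version you have installed. The helper function names, URL, API key, project key, and deployment ID are placeholders, not values from this tutorial.

```python
def create_and_publish_bundle(project, bundle_id, release_notes=None):
    """Create a bundle on the Design node and publish it to the Project Deployer.

    `project` is expected to behave like a dataikuapi DSSProject.
    """
    project.export_bundle(bundle_id, release_notes=release_notes)
    project.publish_bundle(bundle_id)


def switch_deployment_bundle(deployer, deployment_id, bundle_id):
    """Point an existing deployment at `bundle_id`, then apply the update.

    `deployer` is expected to behave like the object returned by
    DSSClient.get_projectdeployer().
    """
    deployment = deployer.get_deployment(deployment_id)
    settings = deployment.get_settings()
    settings.get_raw()["bundleId"] = bundle_id  # change the configured bundle
    settings.save()
    update = deployment.start_update()          # sync the Automation node
    return update.wait_for_result()


# Example wiring (needs the dataikuapi package and a reachable Design node;
# every literal below is a placeholder):
#
#   import dataikuapi
#   client = dataikuapi.DSSClient("https://design-node.example.com", "API_KEY")
#   create_and_publish_bundle(client.get_project("MY_PROJECT"), "v2", "fixed scenario")
#   switch_deployment_bundle(client.get_projectdeployer(), "my-deployment", "v2")
```

The two helpers mirror the two stages in the interface: publishing makes the bundle available on the Deployer, while saving the settings and starting an update corresponds to selecting Update and then clicking the green Update button.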

Activate a scenario from the Deployer#

Previously we activated the scenario directly from the Automation node project. Now let’s control it from the Project Deployer.

  1. Navigate to the Settings tab of the deployment.

  2. Uncheck the box for Disable automatic triggers.

  3. Click Activate All to enable the auto-triggers for all scenarios.

  4. Click Save and Update and then Confirm.

Dataiku screenshot of the scenarios panel of a deployment.

Tip

Once you’ve done this, verify the v2 scenario runs and produces a warning. You can check this on the Automation project, the Project Deployer, or on the Unified Monitoring page (depending on the synchronization interval).

Revert to a previous bundle#

It’s also important to be able to revert to an earlier version, should a newer bundle not work as expected. Let’s demonstrate that now.

  1. From the Deployments tab of the Deployer, find the project in the left-hand panel.

  2. Click Deploy next to the v1 bundle.

  3. With Update selected, click OK, and confirm this is correct.

  4. Now on the Settings tab with v1 as the source bundle, click the green Update button, and Confirm the change.

Dataiku screenshot of the dialog to revert a bundle.
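Both the upgrade and the revert follow the same two-stage pattern: first choose the bundle for the deployment, then update the Automation node to match. The toy model below uses hypothetical names and is not Dataiku code. It tracks the configured and active bundles separately to show why a mismatch appears between the two stages.

```python
class Deployment:
    """Toy model of a Deployer deployment (hypothetical, for illustration)."""

    def __init__(self, bundle_id):
        self.configured_bundle = bundle_id  # what the Deployer is set to deploy
        self.active_bundle = bundle_id      # what the Automation node is running

    def select_bundle(self, bundle_id):
        # "Deploy" with Update selected: only the configuration changes.
        self.configured_bundle = bundle_id

    def in_sync(self):
        return self.configured_bundle == self.active_bundle

    def update(self):
        # The green Update button: push the configured bundle to production.
        self.active_bundle = self.configured_bundle


deployment = Deployment("v1")
history = []
for bundle_id in ("v2", "v1"):  # upgrade, then revert
    deployment.select_bundle(bundle_id)
    history.append((bundle_id, deployment.in_sync()))  # mismatch at this point
    deployment.update()
    history.append((bundle_id, deployment.in_sync()))  # back in sync
print(history)
```

In this model a revert is not a special operation: it is an ordinary update whose source bundle happens to be an older version.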

Important

If you return to the Status tab of this deployment, or open the project homepage on the Automation node, you’ll see that v1 is once again the active bundle running in production.

Before signing off, be sure to disable automatic triggers for this deployment either from the Project Deployer or the Automation project!

See also

See the reference documentation to learn more about reverting bundles.

What’s next?#

Congratulations! To recap, in this tutorial, you:

  • Created a project bundle on the Design node.

  • Published a bundle to the Automation node via the Deployer.

  • Activated (and disabled) a scenario to run on the Automation node.

  • Saw where to monitor the health of deployments.

  • Switched bundle versions within a deployment.

Now that you have seen the batch processing framework for production, your next step may be to examine the real-time API scoring framework presented in the API Deployment course.

See also

For more information on batch processing, please refer to the reference documentation on Production deployments and bundles.