Concept | Automation node preparation#

The Preparing for Production course discussed many important steps to ensure a project is ready for deployment to a production environment. However, much of this discussion was limited to actions taken within the project on the Design node.

If you are deploying a project to production within a batch framework, you’ll also need to take a few steps to ensure that the Automation node is ready to receive the project bundle.

Setting up the Automation node itself is typically the job of an instance administrator, and the details for doing so are included in the reference documentation. Dataiku Cloud users need only review the documentation about the Automation node on Dataiku Cloud. Beyond this setup, there are a few additional criteria to check that are specific to the project being deployed.

The following are some of the Automation node components that may require attention before a batch deployment can succeed.

Plugins#

A well-documented project includes a list of all plugins used in the project (likely in a wiki). These plugins must also be installed on the Automation node for the bundle to run successfully.

When deploying the bundle from the Deployer to the Automation node, you’ll see warnings during the bundle activation check about plugin differences between the Design and Automation nodes.

In the image below, many plugins found on the bundle’s Design node are not present on the Automation node. This, by itself, is not a problem. However, if the project uses any of these missing plugins, the bundle will not run successfully on the Automation node.

Dataiku screenshot of missing plugins on the Automation node.
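
The check described above amounts to comparing two lists: the plugins the project uses and the plugins installed on the Automation node. Here is a minimal sketch of that comparison, assuming you already have both lists at hand. The plugin ids and the helper name `missing_plugins` are hypothetical. The installed list could come from the administration UI or, assuming your version of the public Python client exposes it, from a call such as `list_plugins()` on a `dataikuapi.DSSClient`; that call is not exercised here.

```python
def missing_plugins(used_in_project, installed_on_automation):
    """Return the plugin ids the project needs that the Automation node lacks."""
    return sorted(set(used_in_project) - set(installed_on_automation))


# Hypothetical plugin ids, for illustration only.
used_in_project = ["geocoder", "time-series-preparation"]
installed_on_automation = ["geocoder", "sendmail"]

# Only plugins the project actually uses matter; extra plugins that exist
# on one node but not the other are ignored.
print(missing_plugins(used_in_project, installed_on_automation))
```

An empty result means that every plugin listed in the project documentation is available on the Automation node.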

Connections#

Many organizations maintain separate data sources for the development and production stages of a project. Accordingly, it is often necessary to remap connections in a project from the development sources on the Design node to the production sources on the Automation node.

However, remapping is only possible if the target production connections already exist on the Automation node. You may need to check with your instance administrators that the correct production connections are available.

Dataiku screenshot of the connection on an Automation node.
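
To make the remapping requirement concrete, the sketch below checks a planned remapping against the connections available on the Automation node. It is an illustration under stated assumptions, not Dataiku's own validation logic: the connection names, the plain-dictionary representation of the remapping, and the helper name `unresolved_connections` are all hypothetical.

```python
def unresolved_connections(project_connections, remapping, automation_connections):
    """Return project connections that would not resolve on the Automation node.

    A connection resolves if its remapped target (or its own name, when no
    remapping is given for it) exists on the Automation node.
    The result maps each problematic source connection to its missing target.
    """
    available = set(automation_connections)
    problems = {}
    for name in project_connections:
        target = remapping.get(name, name)
        if target not in available:
            problems[name] = target
    return problems


# Hypothetical connection names, for illustration only.
project_connections = ["pg_dev", "s3_shared"]
remapping = {"pg_dev": "pg_prod"}  # development source -> production source
automation_connections = ["s3_shared", "filesystem_managed"]

print(unresolved_connections(project_connections, remapping, automation_connections))
```

In this example, `pg_dev` is remapped to `pg_prod`, but `pg_prod` does not exist on the Automation node, so it is the connection to raise with an instance administrator.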

Code environments#

Any code environment used by a project should exist on the Automation node. In most cases, the code environment import setting on the Deployer is configured to create a new code environment whenever the project uses one that does not yet exist on the Automation node. As long as this is the case, no further action is required from the user.

Dataiku screenshot of the code environment settings on the Deployer.
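
The effect of that import setting can be modeled as a simple decision per code environment. The following sketch is a simplified illustration only: the environment names, the `create_missing` flag, and the action labels are hypothetical stand-ins for the Deployer setting, not its actual option names.

```python
def plan_code_envs(project_envs, automation_envs, create_missing=True):
    """Decide what happens to each code environment the project uses.

    "reuse": the environment already exists on the Automation node.
    "create": it is missing, and the import setting allows creating it.
    "needs attention": it is missing, and nothing will create it automatically.
    """
    existing = set(automation_envs)
    plan = {}
    for env in project_envs:
        if env in existing:
            plan[env] = "reuse"
        elif create_missing:
            plan[env] = "create"
        else:
            plan[env] = "needs attention"
    return plan


# Hypothetical environment names, for illustration only.
project_envs = ["py39_forecast", "py39_nlp"]
automation_envs = ["py39_forecast"]

print(plan_code_envs(project_envs, automation_envs))
print(plan_code_envs(project_envs, automation_envs, create_missing=False))
```

Only the "needs attention" case calls for action before deployment, which matches the point above: with the create behavior enabled, missing environments are handled for you.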

Note

You can learn more about code environments on Automation nodes in the reference documentation.

Dataiku Applications-as-Recipes#

Plugin recipes won’t run on the Automation node unless the plugin is installed there. Similarly, if a project that you are pushing to the Automation node uses a Dataiku Application-as-recipe, then the parent project used to create that Application-as-recipe must also be on the Automation node.

Note

You can find a tutorial for creating a Dataiku application (including an application-as-recipe) in the Knowledge Base.

Git#

If you are importing code from Git into project libraries on the Design node, you need to ensure that the bundle on the Automation node will have access to the same code repositories.

Note

The reference documentation provides more details about importing code from Git in project libraries.

What’s next?#

Now that you’ve learned how to prepare an Automation node to receive a project bundle, you might want to read this article about the process for creating, deploying, and versioning project bundles using a batch method of deployment.