Introduction
Once you have designed a Flow and automated updates to it, you can deploy the project to a production environment.
Note
Development and Production Environments
A development (or sandbox) environment is where you test new analyses in your project. Failures in this environment are an expected part of its experimental nature.
A production environment is where serious operational jobs are run. This environment should be available whenever necessary and may serve external consumers for their day-to-day decisions, whether those consumers are humans or software. Failure is not an option in production, and the ability to roll back to a previous version is critical.
Dataiku provides two dedicated nodes to handle development and production:
The Design node is used for the development of data projects.
It provides capabilities for creating data pipelines and models, and for defining how they are meant to be rebuilt. Projects developed in the Design node are packaged and handed off to the Automation node.
The Automation node is used to import packaged projects defined in the Design node and run them in the production environment.
When you make updates to the project in the Design node, you can create an updated version of the project package, import the new package into the Automation node, and control which version of the project runs in production.
Development work flows from the Design node to the Automation node. While it is technically possible to make changes to a project in the Automation node, those changes don’t flow back to the Design node, so it’s best practice to do all development in the Design node.
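The version workflow described above can be pictured with a small toy model. This is only an illustrative sketch in plain Python, not Dataiku's API: the `AutomationNode` class and its `import_package`, `activate`, and `roll_back` methods are hypothetical names invented here to show the idea of importing several package versions, choosing which one runs, and rolling back.

```python
from dataclasses import dataclass, field


@dataclass
class AutomationNode:
    """Toy stand-in (hypothetical, not Dataiku's API) for an Automation node."""
    packages: dict = field(default_factory=dict)  # version id -> package contents
    history: list = field(default_factory=list)   # activation order, newest last

    def import_package(self, version: str, contents: str) -> None:
        # Importing makes a version available; it does not make it run.
        if version in self.packages:
            raise ValueError(f"version {version!r} already imported")
        self.packages[version] = contents

    def activate(self, version: str) -> None:
        # Only an imported version can become the one running in production.
        if version not in self.packages:
            raise KeyError(f"version {version!r} has not been imported")
        self.history.append(version)

    @property
    def active(self):
        # The most recently activated version, or None if nothing is active.
        return self.history[-1] if self.history else None

    def roll_back(self) -> str:
        # Drop the current activation and return to the one before it.
        if len(self.history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self.history.pop()
        return self.history[-1]


node = AutomationNode()
node.import_package("v1", "flow as of the first release")
node.activate("v1")
node.import_package("v2", "flow with an updated recipe")
node.activate("v2")
previous = node.roll_back()  # v2 misbehaves, so return to v1
print(node.active)
```

Both versions stay imported after the rollback; only the active one changes. That separation between importing a package and activating it is what lets you control which version runs in production.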
Let’s Get Started!
In this tutorial, you will learn how the Design and Automation nodes work together by:
Packaging flows for deployment
Versioning flows
Deploying packages in a production environment
Note
We will work with data from the fictional retailer Haiku T-Shirt. You can follow along with the instructions and screenshots. There are also short videos recapping the steps at the end of each section.
Prerequisites
This tutorial assumes that:
You have completed the Automation Quick Start tutorial (or at least have knowledge of metrics, checks, and scenarios).
You have access to:
A Dataiku Design node (version 9.0 or above). Dataiku Online is not compatible.
A Dataiku Deployer, set up either locally on the Design node or as a separate node.
An infrastructure, defined on the Deployer, that connects the Design node to the Automation node.
Create Your Project
You can use the completed project from the Automation Quick Start tutorial.
Alternatively, you can create a new project from the same point:
From the homepage of your Design node, click +New Project > DSS Tutorials > Automation > Deployment (Tutorial).
Note
You can also download the starter project from this website and import it as a zip file.