Variables in Flows, Webapps, and Dataiku Applications
If you have ever used a webapp that asked you to input a value, or encountered a deceptively simple project Flow with the power to swap out input column values effortlessly, you were probably witnessing variables at work!
You can use variables in your project Flows to turn high-maintenance, hard-coded information into efficient variables that can be used to develop simpler Flows and make automation tasks more robust.
In programming, variables store pieces of information, such as the name of a company, or a category ID. This piece of information can then be used many times in your code. When you use variables, you can avoid the cumbersome work of “hard coding” information.
Let’s compare two code snippets that output the same text. Although the output is the same, one is much more robust than the other.
The code snippet on the left is an example of “hard coding”. The company name, Dataiku, is hard coded: it is typed out three times. Hard-coded information is maintained manually and is error-prone. If the company name contained a typo or two, you might not find every occurrence, which would cause errors and inconsistencies in your output.
The code snippet on the right includes a variable arbitrarily named cie that stores the string value Dataiku. Having a variable for the company name not only means more consistent output, it also means you can use this same code snippet for different values of the variable.
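As a minimal sketch of that comparison, here is what the two approaches could look like in Python. The sentences are invented for illustration; only the variable name cie and its value come from the description above.

```python
# Hard-coded version: the company name is typed out three times.
hard_coded = (
    "Welcome to Dataiku. "
    "Dataiku is the name of our company. "
    "Thank you for choosing Dataiku."
)

# Variable version: the name is stored once and reused.
cie = "Dataiku"
with_variable = (
    f"Welcome to {cie}. "
    f"{cie} is the name of our company. "
    f"Thank you for choosing {cie}."
)

print(hard_coded)
print(with_variable)
```

Both print statements produce the same text. To reuse the second version for another company, you would only change the value assigned to cie.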
Defining a Variable
You can use variables in all Dataiku DSS visual recipes, code recipes, scenarios, and other objects in the Flow. Project variables are only available to a specific project, whereas Global variables are available to the entire Dataiku DSS instance.
In this example, merch_state and merch_category are project variables. The names are intended to be short and descriptive.
To define a variable, wrap both the name and the value in quotes and separate them with a colon. To call the variable in a recipe, use a dollar sign followed by the variable name wrapped in curly braces, for example ${merch_state}.
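The following Python sketch imitates both halves of that syntax outside of Dataiku DSS. It is not how DSS expands variables internally. It relies on Python's string.Template, which happens to use the same ${name} placeholder form. The value "gas_transport" and the column names state and category are assumptions made up for the example.

```python
import json
from string import Template

# Definitions: each name and value is in quotes, separated by a colon.
# "gas_transport" is an illustrative value, not one taken from the project.
definitions = '{"merch_state": "Nevada", "merch_category": "gas_transport"}'
variables = json.loads(definitions)

# An expression that calls the variables with the ${...} syntax.
# The column names (state, category) are hypothetical.
expression = Template(
    "state == '${merch_state}' && category == '${merch_category}'"
)

# Replace each ${name} placeholder with the matching value.
expanded = expression.substitute(variables)
print(expanded)
```

The printed expression contains the literal values in place of the placeholders, which is the effect you can expect when a recipe references a variable.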
You can easily create variables for information that is used and updated frequently, such as company sector, company year of creation, and logo description.
Variables in a Recipe
Once you’ve defined a variable, you can edit its value in one place, and Dataiku DSS updates it everywhere in your Flow.
For example, this simple Flow contains a Prepare recipe and three Filter recipes. The purpose of the Flow is to create prepared output datasets that correspond to a specific geographical territory, such as a state, which can then be exported as CSV files.
There are two variables, one for geographical territory, and one for merchant category. This Flow is designed to output a CSV file for a given territory and merchant category. A user simply edits the values of the project variables, then runs the Flow, all without ever having to edit a recipe!
This is because the variable merch_state is referenced in a formula step within the Prepare and Filter recipes. So when the user sets the project variable to “Nevada”, Dataiku DSS updates it everywhere in the Flow.
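To see why editing the variables is enough, here is a rough Python stand-in for the Flow. It is only a sketch: the sample rows, column names, and the helper functions filter_rows and to_csv are all hypothetical, and a plain dictionary plays the role of the project variables.

```python
import csv
import io

# Hypothetical stand-in for the project variables.
project_variables = {"merch_state": "Nevada", "merch_category": "gas_transport"}

# Made-up sample rows; the real Flow reads these from a dataset.
transactions = [
    {"merchant": "M1", "state": "Nevada", "category": "gas_transport", "amount": 12.5},
    {"merchant": "M2", "state": "Nevada", "category": "grocery", "amount": 40.0},
    {"merchant": "M3", "state": "Texas", "category": "gas_transport", "amount": 7.25},
]

FIELDS = ["merchant", "state", "category", "amount"]


def filter_rows(rows, variables):
    """Keep the rows that match both variables, as a filter step would."""
    return [
        row
        for row in rows
        if row["state"] == variables["merch_state"]
        and row["category"] == variables["merch_category"]
    ]


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# The filtering logic never changes; only the variable values do.
print(to_csv(filter_rows(transactions, project_variables)))

project_variables["merch_state"] = "Texas"
print(to_csv(filter_rows(transactions, project_variables)))
```

The second export differs from the first only because merch_state changed. Neither filter_rows nor to_csv was edited, which mirrors running the Flow again after updating a project variable.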
Variables in Complex Flows
You can use variables in webapps and Dataiku applications to make them more robust.
To find out the difference between webapps and Dataiku applications, visit this Knowledge Base article.
A Dataiku Application is an application created using the Application Designer. Starting with Dataiku DSS 8.0, you can convert projects into reusable applications. For example, you can turn your project into a Dataiku Application that makes use of your variables by providing users with a simple interface for selecting which values they want. In this way, users do not have to understand all of the behind-the-scenes details.
For example, this application allows the user to identify which state territory they want to use to build the Flow.
To create an application like this, first define the variable in the Filter recipe of the project before saving it as a Dataiku application:
Then create a scenario in the project to build the output dataset filtered by state territory.
In the Application Designer, use the Edit project variables tile to reference the existing variable, “merch_state”.
Finally, add the Run scenario tile. This runs the scenario that was defined in the project.
Similarly, you can create a simple webapp that allows users to select the value of a variable. A webapp can be a visual webapp, or a code webapp that you develop by writing code through the Code menu in Dataiku DSS.
This webapp re-creates the functionality of the Dataiku application, where the user identifies the name of a state territory. When the user clicks Run, the webapp updates the merch_state variable, and runs the Filter recipe behind the scenes. This particular webapp has been designed to output the data in the webapp itself.
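One way a webapp backend might handle the click is sketched below. It assumes the project object comes from the Dataiku Python API client and exposes get_variables, set_variables, and get_scenario, and that the returned scenario offers run_and_wait. Note that this sketch runs a scenario, like the Dataiku application example, rather than running the Filter recipe directly. The function name set_state_and_run and the scenario ID passed to it are hypothetical.

```python
def set_state_and_run(project, state, scenario_id):
    """Update merch_state, then run the scenario that rebuilds the output.

    `project` is assumed to behave like a project handle from the Dataiku
    API client; `scenario_id` is whatever ID your own scenario has.
    """
    # Project variables are assumed to be returned as a dict whose
    # "standard" entry holds the regular project variables.
    variables = project.get_variables()
    variables["standard"]["merch_state"] = state
    project.set_variables(variables)

    # Run the scenario and block until it finishes.
    scenario = project.get_scenario(scenario_id)
    return scenario.run_and_wait()


# Inside Dataiku DSS, the call might look like this (not run here):
#   import dataiku
#   project = dataiku.api_client().get_default_project()
#   set_state_and_run(project, "Nevada", "YOUR_SCENARIO_ID")
```

Passing the project in as an argument keeps the function easy to try outside of DSS with a stand-in object.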