Concept Summary: Variables 101

We can use variables in our project Flows to replace high-maintenance, hard-coded information with values that are defined once and reused. This lets us develop simpler Flows and makes automation tasks more robust.

../../../../_images/variables101.png

If you have ever used a webapp that asked you to input a value, or encountered a deceptively simple project Flow with the power to swap out input column values effortlessly, you were probably witnessing variables at work!

../../../../_images/variables-at-work.png

In programming, a variable stores a piece of information, such as the name of a company or a category ID. That information can then be used many times in our code. When we use variables, we avoid the cumbersome work of “hard coding” information.

Let’s compare two code snippets that output the same text. Although the output is the same, one is much more robust than the other.

../../../../_images/hardcoded-vs-variable.png

The code snippet on the left is an example of “hard coding”: the company name, Dataiku, is typed out in each of the three places it appears. Hard-coded information is maintained manually and is error-prone. Imagine we had a typo or two in the company name. We might not find every occurrence, which would cause errors and inconsistencies in our output.

The code snippet on the right includes a variable arbitrarily named cie that stores the string value Dataiku. Having a variable for the company name not only means more consistent output, it also means we can use this same code snippet for different values of the variable.
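The comparison can be sketched in a few lines of Python. This is only an illustration: the choice of Python and the three sentences are assumptions, and only the variable name cie and the value Dataiku come from the text.

```python
# Hard-coded version: the company name is typed out three times,
# so a rename or a typo fix must be repeated in three places.
def describe_hard_coded():
    return [
        "Welcome to Dataiku!",
        "Dataiku is the name of our company.",
        "Thank you for choosing Dataiku.",
    ]


# Variable version: the name is stored once in `cie` and reused.
def describe_with_variable(cie="Dataiku"):
    return [
        f"Welcome to {cie}!",
        f"{cie} is the name of our company.",
        f"Thank you for choosing {cie}.",
    ]


if __name__ == "__main__":
    for line in describe_with_variable():
        print(line)
```

Both functions return the same text for Dataiku, but only the second one can be reused for another company by passing a different value for cie.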

Defining a Variable

You can use variables in all Dataiku DSS visual recipes, code recipes, scenarios, and other objects in the Flow. Project variables are available only within a specific project, whereas global variables are available across the entire Dataiku DSS instance.

Here, we’ve defined two project variables, “merch_state” and “merch_category”. When creating variable names, it’s a good idea to use short, descriptive nouns.

../../../../_images/variable-definition.png

To define a variable, wrap both the name and the value in quotes and separate them with a colon. To reference the variable in a recipe, use a dollar sign followed by the variable name wrapped in curly braces, for example ${merch_state}.

../../../../_images/variables-syntax.png
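As a rough sketch of the two syntaxes, the Python below parses a quoted name-and-value definition and then resolves a dollar-sign reference. It does not run inside Dataiku DSS: Python's string.Template is only a stand-in for the substitution DSS performs, and the column name merchant_state and the value given to merch_category are hypothetical.

```python
import json
from string import Template

# Definition: name and value both in quotes, separated by a colon.
# "Nevada" comes from the example in this article; the merch_category
# value is a hypothetical placeholder.
definition = '{"merch_state": "Nevada", "merch_category": "gas_stations"}'
variables = json.loads(definition)

# Reference: a dollar sign followed by the variable name in curly braces.
# string.Template happens to use the same ${name} form, so it serves as a
# stand-in for the substitution that Dataiku DSS performs.
expression = Template('merchant_state == "${merch_state}"')
resolved = expression.substitute(variables)

print(resolved)
```

Changing the value in the definition is enough to change the resolved expression. The text of the expression itself never needs to be edited.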

We can easily create variables for frequently used and frequently updated information, such as company sector, company year of creation, and logo description.

Variables in a Recipe

Once we’ve defined a variable, we can edit its value in one place, and Dataiku DSS applies the new value everywhere the variable is referenced in our Flow.

Let’s use an example. In this simple Flow, we have a Prepare recipe and three Filter recipes. The purpose of the Flow is to create prepared output datasets that correspond to a specific geographical territory, such as a state, which can then be exported as CSV files.

../../../../_images/without-variables.png

Upon inspecting this Flow, we can see that the territory is hard-coded. This means that, to export a CSV file for a single territory, such as Nevada, we must first edit the name of the territory in the Prepare recipe, and then edit the name of the territory in the Filter recipe.

../../../../_images/hard-coded-nevada.png

We can replace our hard-coded information using the variables we set up for the project. This will make our Flow much simpler and less error-prone. In addition, when we want to output a CSV file by territory and merchant category, all we have to do is edit the values of the project variables, then run the Flow, all without ever having to edit a recipe!

For example, we can reference our variable “merch_state” in a formula step within the Prepare recipe to keep only the records that match the value we set for this variable, which is “Nevada”. Similarly, we can reference the same variable in the Filter recipe.

../../../../_images/variables-nevada.png
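The idea of filtering on a variable instead of a hard-coded value can be sketched in plain Python. This is a simulation under stated assumptions, not a Dataiku recipe: the records, the column names, and the filter_by_state helper are all hypothetical, and only the variable name merch_state and the value Nevada come from the example.

```python
# Project variables, edited in one place (only "Nevada" comes from the text).
project_variables = {"merch_state": "Nevada"}

# Hypothetical merchant records; the column names are assumptions.
records = [
    {"merchant_id": 1, "state": "Nevada"},
    {"merchant_id": 2, "state": "Ohio"},
    {"merchant_id": 3, "state": "Nevada"},
]


def filter_by_state(rows, variables):
    """Keep only the rows whose state matches the merch_state variable."""
    target = variables["merch_state"]
    return [row for row in rows if row["state"] == target]


nevada_rows = filter_by_state(records, project_variables)

# Changing the variable, not the filter logic, retargets the same step.
project_variables["merch_state"] = "Ohio"
ohio_rows = filter_by_state(records, project_variables)
```

The filter logic is written once. Producing the output for another territory only requires a new value for merch_state.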

Variables in Complex Flows

The benefit of variables becomes even more apparent when they are used globally and in more complex settings, such as Dataiku Applications and Webapps.

Dataiku Applications

A Dataiku Application is an application created using the Application Designer. Starting with Dataiku DSS 8.0, we can convert projects into reusable applications. For example, we can turn our project into a Dataiku Application that makes use of our variables by providing users with a simple interface for selecting which values they want. In this way, users do not have to understand all of the behind-the-scenes details.

Here, the application allows the user to identify which state territory they want to use to build the Flow.

../../../../_images/dataiku-app-merchant-info.png

In the project that was used to create the application, we can see that the variable is referenced in the Filter recipe.

../../../../_images/filter-merchant-info.png

In addition, a scenario has been created to build the output dataset filtered by state territory.

../../../../_images/scenario-merchant-info.png

The application design uses a tile called Edit project variables where the code references the existing variable, “merch_state”.

../../../../_images/tile-edit-project-variables.png

Finally, the application design uses a tile called Run scenario. This runs the scenario that was defined in the project.

../../../../_images/run-scenario-merchant-info.png

Webapps

Similarly, we can create a simple Webapp that allows users to select the value of a variable. A Webapp can be a Visual Webapp or a Code Webapp, which you develop by writing code through the code menu in Dataiku DSS.

In this example, the Webapp re-creates the functionality of the Dataiku Application, where the user identifies the name of a state territory. When the user clicks Run, the Webapp updates the “merch_state” variable and runs the Filter recipe behind the scenes. This particular Webapp has been designed to display the output data in the Webapp itself.

../../../../_images/webapp-variables.png
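The sequence the Webapp follows (update the variable, run the filter, show the result) can be sketched as below. This is a plain-Python simulation and deliberately does not call the Dataiku API: the ProjectStandIn class, its methods, the on_run_clicked handler, and the sample rows are all hypothetical names introduced for illustration.

```python
class ProjectStandIn:
    """Hypothetical stand-in for a project; this is not the Dataiku API."""

    def __init__(self, variables, rows):
        self.variables = dict(variables)
        self.rows = rows

    def set_variable(self, name, value):
        # Step 1: the Webapp stores the value the user selected.
        self.variables[name] = value

    def run_filter(self):
        # Step 2: the filter reads the variable instead of a fixed value.
        state = self.variables["merch_state"]
        return [row for row in self.rows if row["state"] == state]


def on_run_clicked(project, selected_state):
    """Simulates the Run button: update the variable, then run the filter."""
    project.set_variable("merch_state", selected_state)
    # Step 3: the returned rows are what the Webapp would display.
    return project.run_filter()


if __name__ == "__main__":
    sample_rows = [
        {"merchant_id": 1, "state": "Nevada"},
        {"merchant_id": 2, "state": "Ohio"},
    ]
    project = ProjectStandIn({"merch_state": "Nevada"}, sample_rows)
    for row in on_run_clicked(project, "Ohio"):
        print(row)
```

The user only ever supplies a value. The variable update and the recipe run stay hidden behind the Run action.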

What’s next

Now you can try setting up variables in your project and using them in your visual recipes, Dataiku Applications, and Webapps.