Concept: Variables 101

In this lesson, we’ll introduce the concept of variables in Dataiku, and focus on how project variables can be used in visual components, such as visual recipes, in order to make tasks more efficient and robust.

We’ll discuss:

  • the benefits of variables in general,

  • different types of variables in Dataiku,

  • different components where they can be used,

  • and, most importantly, the syntax for defining and calling them.

Why Variables?

Programmers will already be familiar with the benefits of variables. They are a way to avoid hard-coding values that may change, and they can be reused in many places.

Spreadsheet users will understand the same principle. A formula points to a particular cell, whatever its actual value may be, rather than hard-coding the cell’s present value.

The motivation for using variables in Dataiku is the same.

Slide depicting the concept of variables for programmers and spreadsheet users.

Types of Variables in Dataiku

Our focus here is project variables, but note that there are a few different kinds of variables in Dataiku:

  • Instance-level global variables accessible to administrators. These might be used to store an API key needed in several different projects on the instance.

  • Project variables for use anywhere in the project. We’ll focus on the “global” variety, but there are also project variables that remain “local” to the instance. In other words, they are not exported when bundling the project. (Think of API credentials that you don’t want to share.)

  • Scenario-level variables that are not persisted after the scenario ends.

Slide depicting types of variables in Dataiku.

Where Variables Can Be Used in Dataiku

Variables can be used in many different components throughout Dataiku:

  • You can use variables in visual components, such as visual recipes, scenarios, and Dataiku applications.

  • You can use variables in components where you can write your own code, such as code recipes, notebooks, and webapps.

Slide depicting components in Dataiku where variables can be used.
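
For the code-based components, here is a minimal sketch of reading project variables from Python. It assumes the `dataiku` package's `get_custom_variables()` helper, which is only importable inside a Dataiku environment. The fallback dictionary and the variable name `my_threshold` are hypothetical stand-ins so the snippet also runs elsewhere, and it assumes both variables are defined in the project.

```python
def load_variables():
    """Return the resolved variables as a dict, with a stand-in outside Dataiku."""
    try:
        import dataiku  # only importable inside a Dataiku environment
    except ImportError:
        # Hypothetical stand-in values, for illustration only.
        return {"my_state": "New York", "my_threshold": "100"}
    return dataiku.get_custom_variables()


variables = load_variables()

state = variables["my_state"]
# Values may come back as strings, so cast explicitly before numeric use.
threshold = int(variables["my_threshold"])

print(state, threshold)
```

Casting with `int()` works whether the value arrives as a string or as a number, which keeps the sketch independent of how the values are typed.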

Variable Syntax in Dataiku

Let’s now look at the syntax for defining variables in Dataiku.

You define project variables on the Variables page, found in the “More options” menu of the top navigation bar.

Variables should be defined as a JSON object. That means each project variable is a key-value pair, the pairs are separated by commas, and the whole set is wrapped in curly braces.

Here are two examples of project variables, one defined as a string and the other as an integer.

Slide depicting variable syntax in Dataiku.
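
As a rough illustration of that JSON structure, the sketch below parses a small variables object in Python and prints the type of each value. The variable `my_state` and its value come from the demonstration later in this lesson; the name `my_threshold` and its value are hypothetical.

```python
import json

# A JSON object shaped like the content of the Variables page.
# "my_threshold" is a hypothetical name used only for illustration.
project_variables_json = """
{
  "my_state": "New York",
  "my_threshold": 100
}
"""

variables = json.loads(project_variables_json)

# Quoted values are parsed as strings, bare numbers as integers.
for name, value in variables.items():
    print(f"{name} = {value!r} ({type(value).__name__})")
```

Note that the string value is quoted inside the JSON, while the integer is not. That difference matters when the variable is later substituted into a Formula.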

Variable Demonstration

Once a variable is defined, you can call its value where it is needed. Let’s start with the example of a Formula in a Prepare recipe.

When defining a formula, just type $ to pull up a list of accessible variables.

Dataiku screenshot of an example showing how to find available variables.

After making a selection, the editor includes the necessary curly braces around the variable name. That’s it! You now have replaced a hard-coded value with a variable.

Dataiku screenshot of an example showing how to use an integer variable in a Prepare recipe processor.

It’s helpful, though, to have a better understanding of how Dataiku actually evaluates variables.

The whole reference, meaning the dollar sign and the variable name within the curly braces, gets replaced with the value stored in the variable.

Let’s demonstrate by calling a variable stored as a string. When Dataiku evaluates ${my_state}, it replaces it with the value of the variable, in this case New York.

Dataiku screenshot of an example showing how a string variable in a Prepare recipe processor needs to be wrapped in quotes.

But now we have an unquoted string value in our Formula. We need to add a set of quotation marks around the whole reference, including the dollar sign and curly braces, so that "${my_state}" evaluates to a quoted string.

Dataiku screenshot of an example showing how to use a string variable in a Prepare recipe processor.

Now our Formula is fixed and flags the intended rows!
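
The substitution behavior described above can be mimicked with a short Python sketch. This is not Dataiku's implementation, only a plain textual replacement that reproduces the quoting issue. The helper `expand`, the column names `amount` and `state`, and the variable `my_threshold` are all hypothetical.

```python
import re

# Hypothetical variable values; "my_state" matches the demonstration above.
variables = {"my_state": "New York", "my_threshold": 100}


def expand(expression, variables):
    """Replace each ${name} reference with the text of that variable's value."""
    return re.sub(
        r"\$\{(\w+)\}",
        lambda match: str(variables[match.group(1)]),
        expression,
    )


# An integer variable needs no extra quoting.
integer_formula = expand("amount > ${my_threshold}", variables)

# A string variable substituted as-is leaves a bare, unquoted value.
unquoted_formula = expand("state == ${my_state}", variables)

# Wrapping the reference in quotes restores a proper string literal.
quoted_formula = expand('state == "${my_state}"', variables)

print(integer_formula)
print(unquoted_formula)
print(quoted_formula)
```

Because the replacement is purely textual, any quoting has to be written in the expression itself, around the reference.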

What’s Next?

Now that you know the basics, you can start using project variables in places such as Formulas, the pre- and post-filters of visual recipes, scenarios, and Dataiku applications.