Concept: Recipes in Dataiku
This content is also included in the free Dataiku Academy course, Basics 102, which is part of the Core Designer learning path. Register for the course there if you’d like to track and validate your progress alongside concept videos, summaries, hands-on tutorials, and quizzes.
Recipes in Dataiku contain the transformation steps, or processing logic, that act upon datasets.
In the Flow, blue squares represent datasets. The yellow, orange, and red circles that connect datasets to one another represent recipes.
Keeping processing logic separate from datasets has a number of benefits:
One is that data storage technologies change rapidly. When they do, the underlying storage infrastructure of a dataset can change (for example, by switching cloud providers) without affecting the processing logic found in the recipes of a Flow.
Another is a clear sense of data lineage in a project. Because every action applied to the data is recorded in a recipe, the Flow shows each step from the raw imported data to the final output dataset.
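To make these two benefits concrete, here is a minimal sketch in plain Python. It is not Dataiku code: InMemoryStore, keep_paid_orders, run_recipe, and the dataset names are all hypothetical names invented for illustration. The same processing function runs unchanged against two different storage backends, and each run appends a lineage entry.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Recipe = Callable[[List[Row]], List[Row]]


class InMemoryStore:
    """Hypothetical storage backend; stands in for a cloud bucket or a SQL database."""

    def __init__(self, label: str) -> None:
        self.label = label
        self._datasets: Dict[str, List[Row]] = {}

    def read(self, name: str) -> List[Row]:
        return [dict(row) for row in self._datasets[name]]

    def write(self, name: str, rows: List[Row]) -> None:
        self._datasets[name] = [dict(row) for row in rows]


def keep_paid_orders(rows: List[Row]) -> List[Row]:
    """Processing logic only: it has no knowledge of where the rows are stored."""
    return [row for row in rows if row["status"] == "paid"]


def run_recipe(
    store: InMemoryStore,
    recipe: Recipe,
    source: str,
    target: str,
    lineage: List[Tuple[str, str, str]],
) -> None:
    """Read the source dataset, apply the recipe, write the target, record the step."""
    store.write(target, recipe(store.read(source)))
    lineage.append((source, recipe.__name__, target))


raw_orders: List[Row] = [
    {"id": 1, "status": "paid"},
    {"id": 2, "status": "cancelled"},
]

lineage: List[Tuple[str, str, str]] = []
results: Dict[str, List[Row]] = {}

# Swap the storage backend; the recipe itself does not change.
for store in (InMemoryStore("provider_a"), InMemoryStore("provider_b")):
    store.write("orders_raw", raw_orders)
    run_recipe(store, keep_paid_orders, "orders_raw", "orders_paid", lineage)
    results[store.label] = store.read("orders_paid")
```

In this sketch, the lineage list plays the role that the Flow plays in a project: a record of which logic turned which input into which output.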
A circle in the Flow represents a recipe, and its color indicates the category of recipe. Dataiku recipes fall into three categories: visual, code, and plugin.
Visual recipes (in yellow) accomplish the most common data transformation operations, such as cleaning, grouping, and filtering, through a pre-defined graphical user interface.
Instead of a pre-defined visual recipe, you are free to define your own processing logic in a code recipe (in orange), using a language such as Python, R, or SQL.
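As an illustration, a Python code recipe might look roughly like the sketch below. The dataset names (orders, orders_with_totals), the column names, and the add_total_price helper are assumptions made up for this example. The dataiku package is only available inside Dataiku, so the input and output calls are wrapped in a function here to keep the snippet importable elsewhere; in an actual recipe, that code would normally sit at the top level of the script.

```python
def add_total_price(df):
    """Custom processing logic: return a copy of df with a total_price column.

    df is expected to be a pandas DataFrame with quantity and unit_price
    columns (hypothetical column names).
    """
    out = df.copy()
    out["total_price"] = out["quantity"] * out["unit_price"]
    return out


def run_recipe():
    """Read the input dataset, transform it, and write the output dataset."""
    import dataiku  # provided by Dataiku; not installable as a plain package

    orders = dataiku.Dataset("orders")  # hypothetical input dataset name
    df = orders.get_dataframe()  # load the dataset as a pandas DataFrame

    result = add_total_price(df)

    output = dataiku.Dataset("orders_with_totals")  # hypothetical output name
    output.write_with_schema(result)  # write the rows and set the schema
```

The recipe only names its input and output datasets; where those datasets are physically stored is configured on the datasets themselves, not in the code.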
The third category of recipe is the plugin recipe (typically in red). A full discussion of plugins within Dataiku is outside the scope of this section, but know that they are a way for coders to extend the native capabilities of Dataiku.
If code recipes give you complete freedom to perform any data processing task, and visual recipes can be used and understood by everyone in your team, a plugin recipe combines these benefits by wrapping a visual interface on top of a code recipe.