Concept | Recipes in Dataiku
Recipes in Dataiku contain the transformation steps, or processing logic, that act upon datasets.
In the Flow, blue squares represent datasets. The yellow, orange, and red circles that connect datasets to one another represent recipes.
Keeping processing logic separate from datasets has a number of benefits:
One is that data storage technologies change rapidly. When this happens, you can change the underlying storage infrastructure of a dataset (for example, switching cloud providers) without impacting the processing logic found in the recipes of a Flow.
Another is a clear sense of data lineage in a project. By looking at the Flow, you can see every action that has been applied to the data, each one recorded in a recipe, from the raw imported data to the final output dataset.
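Both benefits can be illustrated outside of Dataiku with a small, plain-Python sketch. Everything here is hypothetical and is not Dataiku's API: the `Recipe` class, the two storage classes, the dataset names, and the `lineage` helper exist only to show the idea that the same recipes run unchanged on different storage backends, and that the chain of recipes behind a dataset can be read off the flow.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List

Rows = List[dict]


class InMemoryStorage:
    """Hypothetical backend: keeps rows as Python objects."""
    def __init__(self):
        self._tables: Dict[str, Rows] = {}

    def read(self, name: str) -> Rows:
        return list(self._tables[name])

    def write(self, name: str, rows: Rows) -> None:
        self._tables[name] = list(rows)


class JsonStorage:
    """Hypothetical second backend: keeps rows as JSON text."""
    def __init__(self):
        self._blobs: Dict[str, str] = {}

    def read(self, name: str) -> Rows:
        return json.loads(self._blobs[name])

    def write(self, name: str, rows: Rows) -> None:
        self._blobs[name] = json.dumps(rows)


@dataclass
class Recipe:
    """Processing logic plus the names of the datasets it connects."""
    name: str
    inputs: List[str]
    output: str
    logic: Callable[..., Rows]

    def run(self, storage) -> None:
        tables = [storage.read(n) for n in self.inputs]
        storage.write(self.output, self.logic(*tables))


def drop_missing_amount(rows: Rows) -> Rows:
    return [r for r in rows if r.get("amount") is not None]


def total_by_customer(rows: Rows) -> Rows:
    totals: Dict[str, float] = {}
    for r in rows:
        totals[r["customer"]] = totals.get(r["customer"], 0) + r["amount"]
    return [{"customer": c, "total": t} for c, t in sorted(totals.items())]


FLOW = [
    Recipe("prepare", ["orders_raw"], "orders_clean", drop_missing_amount),
    Recipe("group", ["orders_clean"], "orders_by_customer", total_by_customer),
]


def run_flow(flow: List[Recipe], storage, raw: Rows) -> Rows:
    """Load the raw data into `storage`, run each recipe in order, return the final dataset."""
    storage.write("orders_raw", raw)
    for recipe in flow:
        recipe.run(storage)
    return storage.read("orders_by_customer")


def lineage(flow: List[Recipe], dataset: str) -> List[str]:
    """Names of the recipes that lead to `dataset`, upstream first."""
    producers = {r.output: r for r in flow}
    steps: List[str] = []

    def visit(name: str) -> None:
        recipe = producers.get(name)
        if recipe is None:  # a raw input dataset: nothing upstream
            return
        for parent in recipe.inputs:
            visit(parent)
        if recipe.name not in steps:
            steps.append(recipe.name)

    visit(dataset)
    return steps


RAW = [
    {"customer": "a", "amount": 10},
    {"customer": "b", "amount": None},
    {"customer": "a", "amount": 5},
]
```

Calling `run_flow(FLOW, InMemoryStorage(), RAW)` and `run_flow(FLOW, JsonStorage(), RAW)` gives the same result because the recipes never refer to how the data is stored, and `lineage(FLOW, "orders_by_customer")` lists the recipes in the order they were applied.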
A circle in the Flow represents a recipe, and its color indicates the category of recipe. Dataiku recipes fall into three categories: visual, code, and plugin.
Visual recipes (in yellow) accomplish the most common data transformation operations, such as cleaning, grouping, and filtering, through a pre-defined graphical user interface.
Instead of a pre-defined visual recipe, you are free to define your own processing logic in a code recipe (in orange), using a language such as Python, R, or SQL.
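As a rough illustration of what the body of a Python code recipe might contain, the sketch below defines custom processing logic as an ordinary function. The function name, column names, and sample rows are assumptions made for this example. The comments indicate where reading and writing through the `dataiku` package would typically happen; those lines are left as comments because they only work inside Dataiku, and the dataset names in them are placeholders.

```python
from collections import defaultdict


def average_amount_by_country(rows):
    """Custom processing logic: mean of `amount` for each `country`."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row["country"]] += row["amount"]
        counts[row["country"]] += 1
    return [
        {"country": c, "avg_amount": sums[c] / counts[c]}
        for c in sorted(sums)
    ]


# Inside a Dataiku Python recipe, the input and output would be Flow datasets
# rather than literal rows, along these lines (placeholder dataset names):
#   import dataiku
#   df = dataiku.Dataset("orders").get_dataframe()
#   ... apply the processing logic to df ...
#   dataiku.Dataset("orders_by_country").write_with_schema(result_df)

# Hypothetical sample data standing in for the input dataset.
sample_rows = [
    {"country": "FR", "amount": 10},
    {"country": "FR", "amount": 20},
    {"country": "DE", "amount": 20},
]
result = average_amount_by_country(sample_rows)
```

The point of the sketch is that the logic itself is unrestricted code; only the read and write steps at the edges tie it to datasets in the Flow.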
The third category of recipe is the plugin recipe (typically in red). A full discussion of plugins within Dataiku is outside the scope of this section, but know that they are a way for coders to extend the native capabilities of Dataiku.
Code recipes give you complete freedom to perform any data processing task, while visual recipes can be used and understood by everyone on your team. A plugin recipe combines these benefits by wrapping a visual interface on top of a code recipe.
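The idea of a visual interface wrapped around code can be sketched in plain Python. This does not reproduce Dataiku's actual plugin format: `PARAMS_SPEC`, `read_form`, and `filter_rows` are hypothetical names chosen for illustration. A coder declares the parameters once, a form built from that declaration collects values from any user, and the coder's function runs with those values.

```python
# Declared by the coder: which fields the form should show (hypothetical spec).
PARAMS_SPEC = [
    {"name": "column", "type": "str", "label": "Column to filter on"},
    {"name": "min_value", "type": "float", "label": "Minimum value to keep"},
]


def read_form(spec, form_values):
    """Check and convert the values a user entered against the declared parameters."""
    casts = {"str": str, "float": float}
    config = {}
    for param in spec:
        if param["name"] not in form_values:
            raise ValueError(f"Missing parameter: {param['label']}")
        config[param["name"]] = casts[param["type"]](form_values[param["name"]])
    return config


def filter_rows(rows, config):
    """The code behind the form: keep rows whose column is at least min_value."""
    return [r for r in rows if r[config["column"]] >= config["min_value"]]


# A non-coder only fills in the form; the values arrive as text.
config = read_form(PARAMS_SPEC, {"column": "amount", "min_value": "10"})
kept = filter_rows([{"amount": 5}, {"amount": 12}], config)
```

The user of such a recipe never sees `filter_rows`; they see only the two labeled fields.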
In this lesson, you learned about recipes and how they can be used to keep processing logic separate from datasets used in a Dataiku project. Continue getting to know the basics of Dataiku by learning about the Prepare recipe.