Concept: Recipes in DSS

Recipes in DSS contain the transformation steps, or processing logic, that act upon datasets.

In the Flow, blue squares represent datasets. The yellow, orange, and red circles that connect datasets to one another represent recipes.

../../../_images/flow-recipes.png

Keeping processing logic separate from datasets has a number of benefits:

  • One is that data storage technologies change rapidly. When they do, the underlying storage infrastructure of a dataset can change (for example, by switching cloud providers) without impacting the processing logic found in the recipes of a Flow.

  • Another is a clear sense of data lineage in a project. Because every action applied to the data is recorded in a recipe, you can look at the Flow and follow each step from the raw imported data to the final output dataset.
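Both benefits can be illustrated with a minimal sketch in plain Python. It is a toy model, not DSS code: the `Dataset` and `Recipe` classes, the `lineage` helper, and all dataset and recipe names are hypothetical and exist only for illustration. Recipes refer to datasets by name only, so changing where a dataset is stored leaves the recipes untouched, and the chain of recipes can be walked backwards to recover lineage.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    storage: str  # hypothetical label such as "filesystem" or "s3"


@dataclass(frozen=True)
class Recipe:
    name: str
    inputs: tuple  # names of input datasets
    output: str    # name of the output dataset


def lineage(target, recipes):
    """Return the names of the recipes that produce `target`, upstream first."""
    by_output = {r.output: r for r in recipes}
    steps = []

    def visit(dataset_name):
        recipe = by_output.get(dataset_name)
        if recipe is None:
            return  # a raw dataset: no recipe produces it
        for input_name in recipe.inputs:
            visit(input_name)
        if recipe.name not in steps:
            steps.append(recipe.name)

    visit(target)
    return steps


datasets = {
    "orders_raw": Dataset("orders_raw", "filesystem"),
    "customers_raw": Dataset("customers_raw", "filesystem"),
}
recipes = [
    Recipe("prepare_orders", ("orders_raw",), "orders_prepared"),
    Recipe("join_customers", ("orders_prepared", "customers_raw"), "orders_joined"),
    Recipe("group_by_country", ("orders_joined",), "revenue_by_country"),
]

print(lineage("revenue_by_country", recipes))
# ['prepare_orders', 'join_customers', 'group_by_country']

# Moving a dataset to different storage does not touch any recipe.
recipes_before = list(recipes)
datasets["orders_raw"].storage = "s3"
```

In this model the storage change edits only the `Dataset` entry; the list of recipes, and therefore the lineage, is exactly what it was before.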

Every circle in the Flow represents a recipe, and its color indicates the category of the recipe. DSS recipes can be divided into visual, code, and plugin recipes.

Visual recipes (in yellow) accomplish the most common data transformation operations, such as cleaning, grouping, and filtering, through a pre-defined graphical user interface.

Instead of a pre-defined visual recipe, you are free to define your own processing logic in a code recipe (in orange), using a language such as Python, R, or SQL.
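As a rough illustration of what a Python code recipe can look like, the sketch below keeps the transformation in a plain function and does the dataset reading and writing in a separate function. The dataset names `orders` and `orders_with_total`, the column names, and the `add_total` helper are assumptions made up for this example. The `dataiku` package is only importable inside DSS, so `run_in_dss` is defined here but not called.

```python
def add_total(rows):
    """Processing logic: add a 'total' field to each row (a list of dicts)."""
    return [dict(row, total=row["quantity"] * row["unit_price"]) for row in rows]


def run_in_dss():
    """Read the input dataset, apply the logic, write the output dataset.

    Assumes it runs inside DSS, where the `dataiku` package and pandas
    are available; the dataset names are hypothetical.
    """
    import dataiku
    import pandas as pd

    input_df = dataiku.Dataset("orders").get_dataframe()
    output_rows = add_total(input_df.to_dict("records"))
    dataiku.Dataset("orders_with_total").write_with_schema(pd.DataFrame(output_rows))


# The processing logic can be tried on its own, outside DSS:
print(add_total([{"quantity": 2, "unit_price": 3.0}]))
# [{'quantity': 2, 'unit_price': 3.0, 'total': 6.0}]
```

Keeping the logic in its own function is a design choice for this sketch, not a DSS requirement; it makes the transformation easy to check without a running instance.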

The third category of recipe is the plugin recipe (typically in red). A full discussion of plugins within DSS is outside the scope of this section, but know that they are a way for coders to extend the native capabilities of DSS.

Code recipes give you complete freedom to perform any data processing task, while visual recipes can be used and understood by everyone on your team. A plugin recipe combines these benefits by wrapping a visual interface on top of a code recipe.
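The idea of wrapping an interface around code can be sketched in plain Python. This is a conceptual model only and does not use the DSS plugin API: `PARAMS`, `clip_column`, and `run_with_form_values` are hypothetical names. The parameter list stands in for the form a user would fill in, and the function stands in for the code the plugin author wrote.

```python
# Hypothetical description of the fields a form would show to the user.
PARAMS = [
    {"name": "column", "label": "Column to clip", "type": "str"},
    {"name": "max_value", "label": "Maximum value", "type": "float"},
]


def clip_column(rows, column, max_value):
    """The code behind the form: cap the values of one column at max_value."""
    return [dict(row, **{column: min(row[column], max_value)}) for row in rows]


def run_with_form_values(rows, form_values):
    """Convert the raw form values to the declared types, then run the code."""
    casts = {"str": str, "float": float}
    kwargs = {p["name"]: casts[p["type"]](form_values[p["name"]]) for p in PARAMS}
    return clip_column(rows, **kwargs)


rows = [{"amount": 5.0}, {"amount": 50.0}]
print(run_with_form_values(rows, {"column": "amount", "max_value": "10"}))
# [{'amount': 5.0}, {'amount': 10.0}]
```

The person filling in the form only chooses a column and a maximum value; the code itself stays out of sight.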

../../../_images/recipes-slide.png