Visual Recipes

Learn more about many of the natively-available visual recipes for data preparation.

How-to | Copy a recipe in the Flow

Do you have recipes that you want to re-use elsewhere in a project? You can copy recipes from the Flow for use in the same project.

From the Flow, click the recipe you want to copy, and a Copy action will appear in the Actions sidebar on the right.

Before the recipe is copied, you are asked to choose the dataset to which it should be applied.

Note that, in many cases, you can avoid keeping multiple identical recipes up to date by stacking your data with the Stack recipe, applying the shared steps once, and then splitting the data again with the Split recipe. This is particularly helpful when managing training and testing datasets for machine learning. More information can be found on the Dataiku Academy in Visual Recipes 101.
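
The stack-then-split pattern can be sketched outside Dataiku in a few lines of plain Python. This is only an illustration of the idea, not how the Stack or Split recipes are implemented: the `train` and `test` rows, the `origin` column name, and the `stack`, `prepare`, and `split` helpers are all hypothetical.

```python
# Hypothetical rows standing in for two datasets that need identical preparation.
train = [{"id": 1, "score": " 10 "}, {"id": 2, "score": "20"}]
test = [{"id": 3, "score": "30 "}]


def stack(datasets):
    """Concatenate datasets, tagging each row with the dataset it came from."""
    return [
        {**row, "origin": name}
        for name, rows in datasets.items()
        for row in rows
    ]


def prepare(rows):
    """The single shared preparation step: trim and cast the score."""
    return [{**row, "score": int(row["score"].strip())} for row in rows]


def split(rows, column):
    """Send each row back to its own dataset, dropping the origin tag."""
    outputs = {}
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        outputs.setdefault(row.pop(column), []).append(row)
    return outputs


parts = split(prepare(stack({"train": train, "test": test})), "origin")
```

Because `prepare` runs once on the stacked rows, there is only one set of steps to maintain, and `parts["train"]` and `parts["test"]` receive identical treatment.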

If you are looking to copy the steps of a Prepare recipe, you can use the method described here, but you also have the option of copying the steps themselves. Once you have the steps copied to your clipboard, you can paste them into Prepare recipes in other projects. That process is described in detail in the reference documentation.

What’s next?

There are other methods of duplicating recipe steps or even entire projects:

How-to | Segment your data using statistical quantiles

You can create statistical quantiles without code in Dataiku in two ways:

  • The Split recipe allows you to send each quantile to a separate dataset, so it is useful if you plan to handle a small number of quantiles separately, such as quartiles or deciles.

  • The Window recipe allows you to create a new column containing the quantile value, which can be easier to set up for a large number of quantiles, such as centiles.

In statistics and probability, quantiles are cut points dividing the range of a probability distribution into continuous intervals with equal probabilities, or dividing the observations in a sample in the same way. In the two examples below, let’s assume that you want to create quantiles based on a numerical column called “score”.
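
To make the definition concrete, here is a minimal sketch that computes quartile cut points with Python's standard `statistics.quantiles` function. The `scores` values are made up for illustration, and the interpolation method used by the standard library may differ from what Dataiku computes.

```python
from statistics import quantiles

# Hypothetical values for the "score" column.
scores = [12, 35, 47, 51, 58, 63, 70, 76, 84, 91, 95, 99]

# n=4 asks for quartiles: 3 cut points dividing the sample into 4 groups.
cut_points = quantiles(scores, n=4)

# Count how many observations fall at or below each cut point.
counts_below = [sum(s <= cut for s in scores) for cut in cut_points]
```

Dividing a sample into n quantiles always yields n - 1 cut points, so deciles need 9 and centiles need 99.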

Using a Window recipe

Configure the Window recipe to order the rows by the score column, enable the window frame with no limits set, and, on the aggregations screen, set the number of quantiles you want while retrieving all the existing columns.
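
As a rough equivalent of what this configuration produces, the sketch below sorts rows by score and adds a column holding each row's quantile number. The `add_quantile_column` helper, the `quantile` column name, and the sample rows are hypothetical, and the rank-based bucketing used here is an assumption: Dataiku's handling of ties and uneven group sizes may differ.

```python
def add_quantile_column(rows, column, n, out_column="quantile"):
    """Return rows sorted by `column` with a 1-based quantile number added."""
    ordered = sorted(rows, key=lambda row: row[column])
    total = len(ordered)
    # The row at 0-based rank r lands in bucket r * n // total + 1.
    return [
        {**row, out_column: rank * n // total + 1}
        for rank, row in enumerate(ordered)
    ]


# Hypothetical input rows.
rows = [
    {"customer": c, "score": s}
    for c, s in [("a", 70), ("b", 12), ("c", 95), ("d", 51),
                 ("e", 35), ("f", 84), ("g", 63), ("h", 47)]
]

with_quartiles = add_quantile_column(rows, "score", n=4)
```

All original columns are kept and only one column is added, which is why this approach scales more comfortably to many quantiles than creating one output dataset per quantile.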

Using a Split recipe

Configure the Split recipe with the “Dispatch percentiles of sorted data” mode, order the rows by the score column, and assign each portion of the rows to a separate dataset.
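
The dispatching logic can be sketched as follows. The `dispatch_percentiles` helper, the sample rows, and the 30/40/30 shares are hypothetical, and the way boundaries are rounded here is an assumption rather than Dataiku's documented behavior.

```python
def dispatch_percentiles(rows, column, shares):
    """Sort rows by `column`, then cut them into consecutive portions.

    `shares` lists the percentage of rows for each output and must sum to 100.
    """
    if sum(shares) != 100:
        raise ValueError("shares must sum to 100")
    ordered = sorted(rows, key=lambda row: row[column])
    total = len(ordered)
    outputs, start, cumulative = [], 0, 0
    for share in shares:
        cumulative += share
        end = total * cumulative // 100  # boundary index for this portion
        outputs.append(ordered[start:end])
        start = end
    return outputs


# Hypothetical input rows.
scores = [{"score": s} for s in [70, 12, 95, 51, 35, 84, 63, 47, 58, 99]]

low, middle, high = dispatch_percentiles(scores, "score", [30, 40, 30])
```

Each returned list plays the role of one output dataset, so later steps can treat the low, middle, and high portions independently.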

Next steps

To compute statistical quantiles interactively, you can also use the quantiles table of the Interactive Statistics worksheets.

For more details about interactive statistics, please refer to this course.

You can read more about different Dataiku recipes:

You can also watch this presentation on Customer Predictive Analytics to learn how Dataiku was used to perform data preparation. The prepared data was then used with a machine learning algorithm to assess the probability of a customer returning to the website a certain number of days after their visit.