Concept | Sort recipe


The Sort recipe allows you to sort the rows of an input dataset by the values of one or more columns in the dataset. In this example, our dataset provides customer information and includes a revenue prediction column.


Our goal is to output a dataset sorted by country and, within each country, by revenue prediction.

By default, the Sort recipe sorts rows in ascending order of each sort column’s values. To meet our business goal, we’ll change the sort option so that revenue predictions sort in descending order.

Next, let’s look at the order of the sort columns. Left unchanged, our output rows would be sorted first by prediction and then by country. To sort by country first, and by prediction within each country, we’ll rearrange the columns by dragging and dropping them in the configuration window.
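To see why the column order matters, here is a minimal sketch in plain Python. It is not how Dataiku implements the recipe; the sample rows, the column names, and the `sort_rows` helper are all assumptions made up for illustration.

```python
# Hypothetical sample data; column names are assumptions for illustration.
rows = [
    {"customer": "A", "country": "France", "prediction": 120.0},
    {"customer": "B", "country": "Canada", "prediction": 80.0},
    {"customer": "C", "country": "France", "prediction": 200.0},
    {"customer": "D", "country": "Canada", "prediction": 150.0},
]


def sort_rows(rows, sort_keys):
    """Sort rows by several columns, each with its own direction.

    sort_keys is a list of (column, descending) pairs, most significant first.
    Python's sort is stable, so sorting by the least significant key first
    and the most significant key last yields the combined ordering.
    """
    result = list(rows)
    for column, descending in reversed(sort_keys):
        result.sort(key=lambda r: r[column], reverse=descending)
    return result


# Prediction first, then country: countries end up interleaved.
prediction_first = sort_rows(rows, [("prediction", True), ("country", False)])

# Country first, then prediction (descending) within each country.
country_first = sort_rows(rows, [("country", False), ("prediction", True)])

print([r["customer"] for r in prediction_first])
print([r["customer"] for r in country_first])
```

With prediction as the first sort column, rows from different countries are mixed together. Moving country to the top keeps each country’s rows together, with the highest predictions listed first inside each country.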

The Sort recipe also provides options for computing a value for each row. You can ask Dataiku to compute the row number, the rank, or the dense rank of each row. Selecting any of these options creates an additional column in the output dataset. Let’s choose to compute the dense rank.


If you selected all three computations in this step, the output would contain three additional columns:

  • The first column would contain a row’s respective row number.

  • The second column would contain a row’s rank based on its value in the sorting column. Tied rows share the same rank, and the ranks that follow skip ahead by the number of tied rows.

  • The third column would contain the dense rank of each row. This works like the rank, except that the values stay consecutive: no ranks are skipped after a tie.
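The difference between the three computations is easiest to see on values that contain a tie. The sketch below follows the usual SQL-style definitions of row number, rank, and dense rank over a single, already-sorted list. The `add_ranks` helper and the sample values are hypothetical, and the sketch makes no claim about how Dataiku computes these columns internally.

```python
def add_ranks(sorted_values):
    """Return (value, row_number, rank, dense_rank) for each sorted value.

    Assumes sorted_values is already in the desired sort order.
    """
    out = []
    rank = 0
    dense_rank = 0
    previous = object()  # sentinel that never equals a real value
    for row_number, value in enumerate(sorted_values, start=1):
        if value != previous:
            rank = row_number   # after a tie, rank jumps to the row number
            dense_rank += 1     # dense rank only ever advances by one
            previous = value
        out.append((value, row_number, rank, dense_rank))
    return out


# Hypothetical predictions, sorted in descending order, with one tie.
for value, row_number, rank, dense_rank in add_ranks([200, 150, 150, 100]):
    print(value, row_number, rank, dense_rank)
# 200 1 1 1
# 150 2 2 2
# 150 3 2 2
# 100 4 4 3
```

The two rows tied at 150 share rank 2. The next row gets rank 4 because rank 3 is skipped, while its dense rank is simply 3.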


Finally, after running the recipe, our output dataset contains rows sorted by the customer’s country of origin and then by the revenue prediction. In addition, Dataiku has computed the dense rank for each row so that each row is ranked within its ordered group.