Concept: Distinct Recipe

The Distinct Recipe filters a dataset to remove its duplicate rows. In the following example, we have a simple dataset that lists products along with their categories and prices. To output a dataset that contains only unique rows, we apply the default configuration of the Distinct Recipe.
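
As a rough analogy only, the default behavior can be sketched in Python with pandas. This is not how the recipe is implemented. The sample data, the column names (`product`, `category`, `price`) and the variable names are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: the first two rows are exact duplicates.
products = pd.DataFrame({
    "product": ["pen", "pen", "pen", "notebook"],
    "category": ["stationery", "stationery", "stationery", "stationery"],
    "price": [1.5, 1.5, 2.0, 3.0],
})

# Compare rows on all columns and keep one row per group of identical rows.
distinct_rows = products.drop_duplicates().reset_index(drop=True)

print(distinct_rows)
```

In this sketch the two identical `pen` rows priced at 1.5 collapse into one, while the `pen` row priced at 2.0 is kept because it differs in the `price` column.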


The Distinct Recipe has two configuration options. The default option identifies all rows that have exactly the same values in every column and keeps only one of them. The second option lets you choose a subset of the available columns and computes the distinct combinations of their values. For example, if you select two columns, the resulting dataset will have exactly two columns, with one row for each distinct combination of values in those columns.
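
The second option can be approximated in Python with pandas, again as an analogy under stated assumptions and not as the recipe's actual implementation. The data, the column names and the choice of `selected_columns` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with three columns.
products = pd.DataFrame({
    "product": ["pen", "pen", "pen", "notebook"],
    "category": ["stationery", "stationery", "stationery", "stationery"],
    "price": [1.5, 1.5, 2.0, 3.0],
})

# Assumed selection: only these two columns are used to compute distinct rows.
selected_columns = ["product", "category"]

# Keep only the selected columns, then keep one row per distinct combination.
distinct_combinations = (
    products[selected_columns].drop_duplicates().reset_index(drop=True)
)

print(distinct_combinations)
```

Because `price` is not among the selected columns, it does not appear in the output, and all three `pen` rows reduce to a single `pen` / `stationery` combination.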