Reference | Removing duplicates

Excel can remove duplicate rows, using either all columns or a chosen subset of columns to decide whether a row is unique. The duplicates are deleted in place, and Excel keeps no record of which rows were removed or how many there were.

[Screenshot: removing duplicates in Excel]

Dataiku’s Distinct recipe identifies and removes duplicate rows within a dataset. Additionally, it can track which rows had duplicates, and how many, in the original dataset.
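To make the idea concrete, here is a minimal Python sketch of deduplication that also records how many times each distinct row appeared. It illustrates the concept only and is not how the Distinct recipe is implemented; the function name `distinct_with_counts`, the `count` output column, and the sample data are all assumptions made for this example.

```python
from collections import Counter


def distinct_with_counts(rows, key_columns):
    """Keep one row per distinct combination of key_columns.

    Hypothetical helper for illustration: each output row holds the key
    columns plus a "count" column with the number of matching input rows.
    """
    # Counter preserves first-seen order, so output follows input order.
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return [
        {**dict(zip(key_columns, key)), "count": n}
        for key, n in counts.items()
    ]


# Made-up sample data with one exact duplicate row.
rows = [
    {"customer": "Ana", "city": "Lyon"},
    {"customer": "Ben", "city": "Paris"},
    {"customer": "Ana", "city": "Lyon"},
    {"customer": "Ana", "city": "Nice"},
]

# Uniqueness decided on all columns.
distinct_all = distinct_with_counts(rows, ["customer", "city"])

# Uniqueness decided on a subset of columns.
distinct_by_customer = distinct_with_counts(rows, ["customer"])
```

With all columns as the key, the two identical Ana/Lyon rows collapse into one row with a count of 2. With only `customer` as the key, all three Ana rows collapse into one row with a count of 3. Unlike a plain delete, the count preserves the information that duplicates existed.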

See also

More information on the Distinct recipe can be found in Concept | Distinct recipe.