Removing Duplicates

Excel can remove duplicate rows, using either all columns or a selected subset of columns to determine whether a row is unique. The duplicates are simply deleted, with no way to recover them later.

[Image: removing duplicates in Excel]

Dataiku’s Distinct recipe identifies and removes duplicate rows within a dataset. It can also record which rows had duplicates in the original dataset, and how many.
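To make the difference concrete, the logic of "keep one copy of each row and count how many originals it stands for" can be sketched in plain Python. This is only an illustration of the idea, not how the Distinct recipe is implemented. The sample rows, the column names, and the `distinct_with_counts` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical sample rows; the column names are illustrative only.
rows = [
    {"customer": "Ann", "city": "Paris"},
    {"customer": "Bob", "city": "Lyon"},
    {"customer": "Ann", "city": "Paris"},
    {"customer": "Ann", "city": "Nice"},
    {"customer": "Cy", "city": "Lyon"},
]


def distinct_with_counts(rows, columns=None):
    """Return one row per distinct combination of `columns`.

    Each output row carries a `count` field holding the number of
    original rows that shared those values. If `columns` is None,
    all columns of the first row are used to decide uniqueness.
    """
    if columns is None:
        columns = list(rows[0].keys()) if rows else []
    # Counter keeps keys in order of first appearance (Python 3.7+).
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    return [dict(zip(columns, key), count=n) for key, n in counts.items()]


# Uniqueness decided on all columns.
all_columns = distinct_with_counts(rows)

# Uniqueness decided on a subset of columns.
by_customer = distinct_with_counts(rows, columns=["customer"])
```

Unlike a plain deletion, the `count` field preserves the information that, for example, the row for Ann in Paris appeared more than once in the original data.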

See also

Distinct recipe lessons in the Visual Recipes Overview course

Distinct recipe in the reference documentation