How-to | Reshape data from wide to long format

You can use the Pivot recipe to reshape data from long to wide format. To go the other way and “unpivot” data from wide to long format, use the Prepare recipe processor Fold multiple columns (or Fold multiple columns by pattern).

Consider a dataset with the following structure:

Dataiku screenshot of a dataset in wide format.

To reshape this dataset, so that the *_total_sum columns are folded into one total_sum column with one row per year:

  1. In a Prepare recipe, click + Add a New Step.

  2. Choose Fold multiple columns by pattern.

  3. For the Columns to fold pattern field, supply a regular expression that matches the names of the columns to fold.

  4. For the Column for fold name field, provide a name for the new column holding the labels taken from the folded column names (in this case year).

  5. For the Column for fold value field, provide a name for the new column holding the cell values (in this case total_sum).

  6. Check the box Remove folded columns to delete the folded columns from the schema of the output dataset.

Dataiku screenshot of a dataset in wide format.
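To make the effect of the steps above concrete, here is a minimal plain-Python sketch of the fold logic. It is not Dataiku's implementation. The `fold_by_pattern` helper, the `country` column, the year-prefixed column names, and the sample values are all hypothetical. The sketch also assumes that a capture group in the pattern supplies the label written to the fold name column; check the processor's reference documentation for the exact behavior.

```python
import re


def fold_by_pattern(rows, pattern, name_col, value_col, remove_folded=True):
    """Fold every column whose name fully matches `pattern` into (name, value) rows.

    Hypothetical helper that mimics the processor's options; not a Dataiku API.
    """
    regex = re.compile(pattern)
    out = []
    for row in rows:
        # Keep only the columns whose names match the pattern, with their match objects.
        folded = {}
        for col in row:
            match = regex.fullmatch(col)
            if match:
                folded[col] = match
        # Columns repeated on every output row (folded ones are dropped if requested).
        base = {c: v for c, v in row.items() if not (remove_folded and c in folded)}
        for col, match in folded.items():
            # Assumption: the first capture group, if any, becomes the label;
            # otherwise the full column name is used.
            label = match.group(1) if regex.groups else col
            out.append({**base, name_col: label, value_col: row[col]})
    return out


# Hypothetical wide data: one row per country, one *_total_sum column per year.
wide = [
    {"country": "FR", "2017_total_sum": 10, "2018_total_sum": 12},
    {"country": "US", "2017_total_sum": 20, "2018_total_sum": 25},
]

long_rows = fold_by_pattern(wide, r"(\d{4})_total_sum", "year", "total_sum")
for r in long_rows:
    print(r)
```

Each input row yields one output row per folded column, so the two hypothetical wide rows become four long rows with `country`, `year`, and `total_sum` columns. Passing `remove_folded=False` corresponds to leaving the Remove folded columns box unchecked: the original wide columns stay on every output row.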

Note

You can find another example of this processor being used in the reference documentation.