How-to | Reshape data from wide to long format
You can use the Pivot recipe to reshape data from long to wide format. However, if initially presented with data in a wide format, you can “unpivot” the data from wide to long format using the Prepare recipe processor Fold multiple columns (or Fold multiple columns by pattern).
Consider a dataset with the following structure:
[Image: Dataiku screenshot of a dataset in wide format.]
To reshape this dataset so that the *_total_sum columns are folded into one total_sum column with one row per year:
In a Prepare recipe, click + Add a New Step.
Choose Fold multiple columns by pattern.
For the Columns to fold pattern field, supply a regular expression that matches the names of the columns to fold.
For the Column for fold name field, provide a name for the new column holding the labels taken from the folded column names (in this case year).
For the Column for fold value field, provide a name for the new column holding the cell values (in this case total_sum).
Check the box Remove folded columns to delete the folded columns from the schema of the output dataset.
[Image: Dataiku screenshot of a dataset in wide format.]
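To make the wide-to-long reshaping concrete, here is a minimal plain-Python sketch of the same idea, outside of Dataiku. It is not the processor's implementation. The helper `fold_by_pattern`, the sample column names (`country`, `2016_total_sum`, `2017_total_sum`), and the values are hypothetical. The sketch also assumes that the pattern must match the whole column name and that a capture group, when present, supplies the label; check the reference documentation for how the processor itself treats the pattern.

```python
import re


def fold_by_pattern(rows, pattern, name_col, value_col, remove_folded=True):
    """Fold every column whose name matches `pattern` into (name, value) pairs.

    `rows` is a list of dicts, one dict per row of the wide dataset.
    """
    regex = re.compile(pattern)
    folded = []
    for row in rows:
        # Assumption of this sketch: the pattern must match the full column name.
        matches = {col: regex.fullmatch(col) for col in row}
        # Columns that are not folded are repeated on every output row.
        kept = {
            col: val
            for col, val in row.items()
            if not (remove_folded and matches[col])
        }
        for col, match in matches.items():
            if match is None:
                continue
            # Assumption: use the first capture group as the label when the
            # pattern has one, otherwise the full column name.
            label = match.group(1) if regex.groups else col
            folded.append({**kept, name_col: label, value_col: row[col]})
    return folded


# Hypothetical wide data: one *_total_sum column per year.
wide = [
    {"country": "FR", "2016_total_sum": 10, "2017_total_sum": 12},
    {"country": "US", "2016_total_sum": 20, "2017_total_sum": 25},
]

long_rows = fold_by_pattern(wide, r"(\d{4})_total_sum", "year", "total_sum")
for row in long_rows:
    print(row)
```

Each input row produces one output row per folded column, so the two wide rows become four long rows. Passing `remove_folded=False` keeps the original `*_total_sum` columns alongside the new `year` and `total_sum` columns, which mirrors leaving the Remove folded columns box unchecked.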
Note
You can find another example of this processor being used in the reference documentation.