Tutorial | Flow zones#

Let’s organize a Flow and make it more navigable. Using Flow zones to subdivide a complex Flow into more manageable pieces lets you view it at a higher level of abstraction and quickly grasp its overall purpose.

Get started#

Objectives#

In this tutorial, you will:

  • Move Flow items into new Flow zones.

  • Manage the display of Flow zones.

  • Delete Flow zones and undo that change.

Prerequisites#

To reproduce the steps in this tutorial, you’ll need:

  • Dataiku 12.6 or later.

  • Basic knowledge of Dataiku (Core Designer level or equivalent).

Create the project#

  1. From the Dataiku Design homepage, click + New Project.

  2. Select Learning projects.

  3. Search for and select Flow Zones.

  4. Click Install.

  5. From the project homepage, click Go to Flow (or g + f).

Note

You can also download the starter project from this website and import it as a zip file.

You’ll next want to build the Flow.

  1. Click Flow Actions at the bottom right of the Flow.

  2. Click Build all.

  3. Keep the default settings and click Build.

Use case summary#

The project has three data sources:

| Dataset | Description |
| --- | --- |
| tx | Each row is a unique credit card transaction with information such as the card that was used and the merchant where the transaction was made. The authorized_flag column indicates whether the transaction was authorized (a score of 1) or flagged for potential fraud (a score of 0). |
| merchants | Each row is a unique merchant with information such as the merchant’s location and category. |
| cards | Each row is a unique credit card ID with information such as the card’s activation month or the cardholder’s FICO score (a common measure of creditworthiness in the US). |

Create and move items into Flow zones#

Looking at the Flow, you can abstract its purpose to two steps: data ingestion and data preparation. We can use this grouping to build our Flow zones.

Move items to a new zone#

  1. Select these datasets:

    • tx_prepared

    • tx_distinct

    • tx_pivot

    • tx_topn

    • tx_windows

  2. In the Actions tab, under Flow Zones, select Move (or right-click on the selection and select Move to a flow zone).

  3. Name the zone Data preparation and review which recipes will be moved as well.

  4. Click Confirm.

Dataiku screenshot of the dialog for moving datasets to a new Flow zone.

Tip

Here, we moved items to a new Flow zone. You also have the option in this step to move items to existing Flow zones if present.
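The same move can be sketched in Python for readers who prefer to script it. This is a minimal sketch, not the tutorial's method: it assumes `project` is a project handle from the Dataiku public API client, and that `get_flow()`, `create_zone()`, `get_dataset()`, and `add_item()` behave as described in the Developer Guide. The helper name `move_datasets_to_new_zone` and the constant `PREPARATION_DATASETS` are hypothetical.

```python
# Datasets moved in this step of the tutorial.
PREPARATION_DATASETS = [
    "tx_prepared",
    "tx_distinct",
    "tx_pivot",
    "tx_topn",
    "tx_windows",
]


def move_datasets_to_new_zone(project, zone_name, dataset_names):
    """Create a Flow zone and move the named datasets into it.

    Assumption: `project` is a project handle from the Dataiku Python
    API client (for example, the result of client.get_project(...)).
    """
    flow = project.get_flow()
    # Assumption: create_zone() returns a handle on the new zone.
    zone = flow.create_zone(zone_name)
    for name in dataset_names:
        # Assumption: add_item() moves the dataset into this zone.
        zone.add_item(project.get_dataset(name))
    return zone
```

From a notebook inside Dataiku, a call might look like `move_datasets_to_new_zone(project, "Data preparation", PREPARATION_DATASETS)`. The dialog also lists the recipes that move with the datasets. Whether the API moves them in the same way is worth checking on a test project first.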

Rename the default zone#

  1. Select the Default Flow zone and click Edit in the right panel.

  2. Rename the Flow zone Data ingestion and Confirm.

Dataiku screenshot of the dialog for renaming a Flow zone.
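Renaming the default zone can also be sketched with the Python API. The sketch below assumes that `get_default_zone()` returns the default zone, that its settings object exposes a writable `name`, and that `save()` persists the change. The helper name `rename_default_zone` is hypothetical.

```python
def rename_default_zone(project, new_name):
    """Rename the project's default Flow zone.

    Assumption: `project` is a project handle from the Dataiku Python
    API client.
    """
    flow = project.get_flow()
    zone = flow.get_default_zone()
    # Assumption: zone settings expose a writable `name` and a save() method.
    settings = zone.get_settings()
    settings.name = new_name
    settings.save()
    return zone
```

For this tutorial, the call would be `rename_default_zone(project, "Data ingestion")`.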

Open the Flow zone view#

Let’s look at the Flow zone view.

  1. From the View menu in the lower left, select Flow Zones. In this view, Flow items are colored according to their Flow zone.

  2. Click on Hide Zones. Items are still color-coded according to their Flow zone, but the zone boundaries no longer appear.

  3. Exit the Flow zone view to prepare for next steps.

Dataiku screenshot of the Flow zone view filters.
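For a quick textual overview of the same grouping, the zones and the size of each can be listed from Python. This is a rough sketch that assumes `list_zones()` returns zone handles with a `name` and an `items` collection. The helper name `summarize_zones` is hypothetical.

```python
def summarize_zones(project):
    """Return a mapping of Flow zone name to the number of items it holds.

    Assumption: `project` is a project handle from the Dataiku Python
    API client, and each zone exposes `name` and `items`.
    """
    flow = project.get_flow()
    return {zone.name: len(zone.items) for zone in flow.list_zones()}
```

Printing the result after the previous steps should show one entry for Data ingestion and one for Data preparation.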

Expand or collapse a Flow zone#

Let’s hide the content of the Data ingestion Flow zone. This could be useful if you were only focused on making changes to the Data preparation items.

  1. Right-click on the Data ingestion Flow zone, then select Collapse. You can also click the collapse icon on the Flow zone itself.

  2. Click the expand icon to see the Flow details again.

Note

If collapsing a zone isn’t enough, you can:

  • Double-click on the Data preparation Flow zone to only see this zone on the screen.

  • Implement Flow Folding as described in the reference documentation.

Delete a Flow zone#

What happens to Flow items when you delete a Flow zone? We’ll delete our Data preparation zone and see.

  1. Select the Data preparation Flow zone.

  2. In the right panel, click Delete and Confirm.

Dataiku screenshot of the option to delete a Flow zone.

You’ll see that both Flow zones disappear. This is because items in the deleted Flow zone move to the Default Flow zone (which we renamed Data ingestion). A Flow cannot contain just one Flow zone, so once only the default zone remained, both zones were removed from the display.

Note

If we had tried to delete the Data ingestion Flow zone, we would have seen that it was not possible. This is because the “Default” Flow zone cannot be deleted.
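The deletion rules described above can be captured in a small toy model. This is plain Python written for illustration only. It does not call Dataiku, and the function name `delete_zone` and the item lists are hypothetical.

```python
def delete_zone(zones, zone_name, default_name):
    """Toy model of deleting a Flow zone.

    `zones` maps a zone name to its list of item names. Returns the new
    mapping and the list of zone names that would still be displayed.
    """
    if zone_name == default_name:
        # The default zone cannot be deleted.
        raise ValueError("the default Flow zone cannot be deleted")

    # Copy every zone except the deleted one, leaving the input untouched.
    remaining = {
        name: list(items) for name, items in zones.items() if name != zone_name
    }
    # Items of the deleted zone move to the default zone.
    remaining[default_name] = remaining[default_name] + list(zones[zone_name])

    # A Flow is never shown with a single zone, so one leftover zone
    # means no zones are displayed at all.
    displayed = list(remaining) if len(remaining) > 1 else []
    return remaining, displayed
```

With two zones, deleting Data preparation leaves every item in Data ingestion and an empty list of displayed zones, which matches what the Flow shows.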

Imagine you had deleted the Flow zones by accident. Do you know how to undo this change? Let’s practice.

  1. From the More Options (...) menu in the top navigation bar, select Version Control.

  2. Click on the change you made before deleting the Flow zone. (Assuming you made no other changes, this will be the second item in the list from the top.)

  3. Select Revert To This Revision and Confirm to undo any changes you made after this commit.

Dataiku screenshot of the dialog for reverting to a previous version.

Note

To learn more about version control in Dataiku, visit our page on project version control.

What’s next?#

Now you know how to organize your Flow into multiple Flow zones. Once you have zones, you can also rearrange them within your Flow. See How-to | Rearrange Flow zones for instructions.

To learn about how to use Flow zones to your advantage when building datasets, check out Tutorial | Build modes!

See also

For more information, see the Working with Flow zones article in the Developer Guide.