Tutorial | In-database operations with visual recipes (SQL part 4)

While SQL recipes allow you to completely customize a query, many visual recipes (Group, Join, Stack, etc.) can also perform in-database operations.

Objectives

In this tutorial, you will:

  • create a visual recipe that runs in-database.

Starting here?

This tutorial requires having completed parts 1, 2, and 3 in order to reproduce the steps here.

Initiate a Prepare recipe running in-database

We are now ready to enrich the customers_stacked_prepared dataset with information about the orders that customers have made.

  • Click on customers_stacked_prepared, and from the Actions menu choose Join with… from the list of visual recipes.

  • Select orders_by_customer as the second input dataset.

  • Change the name of the output dataset to customers_enriched.

  • Note that the output dataset will be stored in the same database.

  • Click Create Recipe.

Inspect a Prepare recipe running in-database

You will be familiar with the mechanics of the Join recipe, but this time we can execute it in-database.

  • In the Join step, change the join type to an Inner join in order to return only customers who have made an order. Note that Dataiku has automatically determined the join keys customerID and customer_id, even though they have different names in the two datasets.

  • In the Selected columns step, we can deselect customer_id from the orders_by_customer dataset because it is redundant with customerID in the customers_stacked_prepared dataset.

  • Before running the recipe, check the Output step. Click View Query to preview the SQL code.

  • Before running the recipe, click the gear icon underneath the Run button to confirm that the recipe will be executed using the in-database engine.

  • We’re satisfied with this, so click Run and update the schema.


At this point, your Flow should resemble the image below.

../../_images/sql-dss-flow-after-join.png

What’s next?

In addition to running recipes in-database, you may also want to build charts in-database.

See the tutorial on in-database charts to learn how!

Note

Many, but not all visual recipes can run in-database. For more information, see the reference documentation.