Tutorial | In-database operations with visual recipes (SQL part 4)¶
While SQL recipes allow you to completely customize a query, many visual recipes (Group, Join, Stack, etc.) can also perform in-database operations.
In this tutorial, you will:
create a visual recipe that runs in-database.
We are now ready to enrich the customers_stacked_prepared dataset with information about the orders that customers have made.
Click on customers_stacked_prepared, and from the Actions menu choose Join with… from the list of visual recipes.
Select orders_by_customer as the second input dataset.
Change the name of the output dataset to
Note that the output dataset will be stored in the same database.
Click Create Recipe.
You will be familiar with the mechanics of the Join recipe, but this time we can execute it in-database.
In the Join step, change the join type to an Inner join in order to return only customers who have made an order. Note that Dataiku has automatically determined the join keys customerID and customer_id, even though they have different names in the two datasets.
In the Selected columns step, we can deselect customer_id from the orders_by_customer dataset because it is redundant with customerID in the customers_stacked_prepared dataset.
Before running the recipe, check the Output step. Click View Query to preview the SQL code.
Before running the recipe, click the gear icon underneath the Run button to confirm that the recipe will be executed using the in-database engine.
We’re satisfied with this, so click Run and update the schema.
At this point, your Flow should resemble the image below.