Hands-On Tutorial: SQL Notebooks

SQL notebooks are useful for prototyping code and querying and analyzing data.

  • From the customers_enriched dataset, select Lab > New Code Notebook from the Actions sidebar.

  • Select SQL and create.

  • Click +Query to initiate the first query of the notebook.

  • In the Tables tab of the left pane, click on the share icon to the right of the table name to paste a starter query for this table.

The query form is now populated with the following code, which selects all columns from the DKU_TUTORIAL_SQL_customers_enriched table in the dku_shirt schema.

SELECT * FROM "dku_tshirt"."DKU_TUTORIAL_SQL_customers_enriched"

Tip

As before, your own query may not have the “dku_tshirt” schema depending on the configuration of your database.

If you click Run, the query executes and shows you a sample of the query results.

../../../_images/tshirt-sql-query-02.png

We are interested in how Haiku T-Shirt customers respond to the campaign, and in particular across different age groups. We can quickly run some SQL queries to analyze this.

  • Click +Query to start a new query.

  • Copy the following code to the new query cell, and click Run.

SELECT campaign, AVG(total), SUM(total), COUNT(campaign)
    FROM "dku_tshirt"."DKU_TUTORIAL_SQL_customers_enriched"
    WHERE birthdate >= '1980-01-01' AND birthdate < '1990-01-01'
    GROUP BY campaign

The query pulls customers born in the 1980’s and groups them by whether they are part of the campaign. The columns report:

  • campaign, since this is the GROUP BY variable, the two rows represent the two values indicating whether a customer is part of the campaign

  • avg, representing the average order placed by a customer. Customers who were part of the campaign placed orders averaging 97.40 in value, while those who were not placed orders averaging 34.49

  • sum, representing the total value of orders placed by customers in this group

  • count, representing the number of customers in this group

../../../_images/tshirt-sql-query-03.png

Let’s try another.

  • Click +Query.

  • Copy the following code to the new query cell, and click Run.

SELECT campaign, AVG(total), SUM(total), COUNT(campaign)
    FROM "dku_tshirt"."DKU_TUTORIAL_SQL_customers_enriched"
    WHERE birthdate >= '1990-01-01' AND birthdate < '2000-01-01'
    GROUP BY campaign

The query results show summaries for customers born in the 1990’s. What is of immediate interest here is that there is a much smaller gap between customers who are and are not part of the campaign, versus customers born in the 80’s. This suggests that there might exist some interaction between age and the effect of the campaign.

../../../_images/tshirt-sql-query-04.png

Note

All of these queries are part of a notebook that exists only in the Lab. To use a query from a notebook in the Flow, you can turn the query into a SQL code recipe by clicking Create Recipe.