Concept | SQL notebooks#
Each SQL notebook is attached to a single SQL connection configured in Dataiku and allows you to perform queries in an interactive environment.
Note
To use SQL notebooks in Dataiku, you need to have an available SQL connection.
SQL notebooks use SQL queries to interact with the tables that are associated with the Dataiku SQL datasets using this connection or directly on tables that are made available via the connection.
Creating SQL notebooks#
There are two main ways to create an SQL notebook from:
The Notebooks menu, by clicking the + New Notebook button.
The Lab menu of an SQL dataset in the Flow, by clicking New in the Code Notebooks section.
Using SQL notebooks#
SQL notebooks consist of one or more query cells. In each cell, you modify and execute a single query, and then view its results.
Try using a single cell for each query that you might want to rerun later.
Each cell has its own version history so you can work on tuning and debugging your queries, as you’ll always be able to revert to previously executed states.
When to use an SQL notebook#
An SQL notebook allows you to interact with the tables associated with your Dataiku SQL datasets through the use of SQL queries.
With SQL notebooks, you can:
Perform queries on an SQL dataset for quick prototyping and data analysis.
Leverage an SQL engine for in-database computations.
Return query outputs without having to write them as new datasets in your SQL database.
Execute many kinds of SQL statements, from the simplest queries to advanced DDL statements and stored procedures.
When creating an SQL notebook, you must link it to an SQL connection already set up by an administrator in your Dataiku instance.
The SQL notebook interface#
The main area of the SQL notebook interface is a window for writing queries.
In the left panel, the Queries tab lists all the queries available. When you first create your notebook, this tab is empty. You must click the + Query button to create your first query.
The left panel also contains a Tables tab that lists:
The tables available in your connection.
Additional information like the list of columns in each table.
The column data types.
When you have several tables, you can sort them by table name or dataset name.
By clicking the plus () icon next to a table name, a
SELECT statement displays within the query window. This is where you can quickly build your query by selecting a column name to add to the query window.
When you run the query, the results display below the query window for you to explore and download.
The SQL notebook also lets you switch views between full screen and stacked query modes and add a new query or markdown cell.
You can even add comments, either in plain text or markdown within a markdown cell.
Dataiku stores all queries and comments you create in the Queries tab as distinct cells. Simply click to view them.
As you can see below, each query cell is where you execute and modify a single query. It’s recommended to keep a single cell for each query as each cell has its own history and you can rerun any query at a later time.
For better clarity, note that you can name your query.
Once back in the Flow, you’ll notice the SQL notebook isn’t represented by a Flow icon. This is because the purpose of SQL notebooks is to write SQL code for experimentation only. It’s a Lab object outside of the Flow.
See also
For more information, see SQL notebook in the reference documentation.
Or, learn how to use a query from a notebook in the Flow by turning the query into an SQL code recipe.
