Tutorial | Configure a connection between Dataiku and a SQL database#

Get started#

Data projects must begin with connecting to data sources.

Objectives#

In this tutorial, you will:

  • Create a local SQL connection if you do not already have one available.

  • Create a connection to this source in Dataiku.

Prerequisites#

  • You will need admin permission on a Dataiku instance.

  • You may also want to review this tutorial’s associated concept article.

Create a SQL database#

To get started, you need to have a SQL database available. If you already have one, please skip to the next section.

Note

For this example, we’ll use PostgreSQL, but you can follow a similar process for any supported SQL database.

Install PostgreSQL#

If it isn’t already available, install PostgreSQL version 9 or above. Take note of the host on which PostgreSQL is installed.

Tip

On macOS, we recommend using the Postgres App.

Configure your PostgreSQL database#

At a minimum you need a user and a database, as explained in the PostgreSQL guide. As a best practice, we recommend using schemas in order to administer multiple projects within a database.

For example, the sample code below creates the user matthieu, with password Password, and grants this user all privileges (can create and delete tables) in the dku_churn schema in the dku database. Similarly, user dku_tshirt_admin has been granted all privileges on the dku_tshirt schema in the dku database.

psql -h localhost
CREATE DATABASE dku;
\c dku
CREATE SCHEMA dku_churn;
CREATE USER matthieu WITH PASSWORD 'Password';
GRANT ALL PRIVILEGES ON SCHEMA dku_churn TO matthieu;
CREATE SCHEMA dku_tshirt;
CREATE USER dku_tshirt_admin WITH PASSWORD 'Password';
GRANT ALL PRIVILEGES ON SCHEMA dku_tshirt TO dku_tshirt_admin;
\q

Configure a SQL connection in Dataiku#

Now that we have credentials for a SQL database, we need to establish a connection between Dataiku and this database.

Important

This walkthrough is for a self-managed Dataiku instance. If using Dataiku Cloud, see the how-to for adding a new connection.

Create a connection#

  1. Log in as the Dataiku Administrator.

  2. From the Applications menu in the top navigation bar, choose Administration.

    The Administration menu is located in the Admin Tools menu.
  3. Navigate to the Connections tab, and select New Connection > PostgreSQL.

    Many different SQL database connections are available in the New Connection menu.
  4. Fill in the information required for the connection. Don’t forget to give a name to your connection!

    Warning

    You cannot change the connection name afterward.

    Create a SQL connection on the new PostgreSQL connection screen.
  5. Test and create your PostgreSQL connection.

Import datasets into Dataiku#

After creating your connection, if you are connecting to a pre-existing SQL database, you can create a Dataiku dataset for every table in the database.

  1. Click Import tables to datasets.

  2. Select the tables you want to import.

  3. Click Import tables.

  4. Select the project in which the datasets associated with these tables should appear.

You can mass import tables after creating the SQL connection.

What’s next?#

Congratulations! You now have a SQL connection to use underneath your Dataiku Flows.

Learn more about how to take advantage of this type of connection in Tutorial | Data transfer with visual recipes.

Tip

You can find this content (and more) by registering for the Dataiku Academy course, Integration with SQL Databases. When ready, challenge yourself to earn a certification!