Hands-On Tutorial: Remapping Connections in a Dataiku Instance¶
Often a project is initially created on a Dataiku instance that uses a connection available only on that instance. If you later want to import the same project into a second Dataiku instance, you may have to remap the connection if an identical connection name is not found on the second Dataiku instance.
This article will show how to remap a connection in Dataiku. For illustration, we’ll remap a PostgreSQL connection defined in the import archive of a project to an existing connection on a different Dataiku instance.
You must be a Dataiku user with the “Create projects” global permission.
Your instance of Dataiku should have an existing database connection.
You can follow the steps for Defining a Connection if you do not have a connection already.
You may also want to review the lesson on Configure the Connection Between Dataiku DSS and PostgreSQL.
While we’ll be using a PostgreSQL connection, the process described here will be very similar for other database connections.
Import an SQL-based Project¶
To import a project that uses an SQL connection, follow the general steps for importing a project into Dataiku.
From the Dataiku homepage, click +New Project > Import project.
If you do not have a Dataiku project that already uses an SQL connection, you can import any tutorial that is set up for remapping, such as:
From the Dataiku homepage, click +New Project > DSS Tutorials > Advanced Designer > Visual Recipes & Plugins (Tutorial).
Then in the Flow, select any or all of the filesystem datasets downstream of the Sync recipes, and change their connection in the right Actions panel to an available database connection.
You can also download the starter project from this website and import it as a zip file.
No Identical Connection Name in the Second Instance¶
Suppose the project to be imported was created on a first instance having a connection to a PostgreSQL database called postgresql. When importing into a second Dataiku instance that doesn’t have a connection with the same name (postgresql), Dataiku will display errors alerting you to a missing connection.
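Conceptually, the missing-connection error is a set difference between the connection names the archive references and the names defined on the target instance. A minimal sketch in plain Python illustrates this; the function `find_missing_connections` and the example names other than postgresql and PostgreSQL_tshirt are hypothetical and are not part of any Dataiku API:

```python
def find_missing_connections(required, available):
    """Return the connection names the project needs but the instance lacks.

    Both arguments are iterables of connection names. Names are compared
    exactly (case-sensitive), and the result is sorted for stable output.
    """
    return sorted(set(required) - set(available))


# Connections referenced by the exported project (illustrative values).
project_connections = ["postgresql", "filesystem_managed"]
# Connections defined on the second instance (illustrative values).
instance_connections = ["PostgreSQL_tshirt", "filesystem_managed"]

missing = find_missing_connections(project_connections, instance_connections)
print(missing)  # ['postgresql']
```

Every name in the result needs a remapping before the import can complete.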
To resolve these errors, follow these steps to remap the connection name postgresql to an existing connection in the second Dataiku instance.
Click Add Remapping in the “Connection remapping” section.
Remap postgresql to an existing PostgreSQL connection on your instance. In this example, that connection is named “PostgreSQL_tshirt.”
Once you’ve remapped the connection and addressed any other errors or warnings, Dataiku will finish importing the project.
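The same remapping could, in principle, be scripted instead of clicked. The sketch below only builds the settings dictionary. The `remapping`, `connections`, `source`, `target`, and `targetProjectKey` keys reflect my understanding of the import settings accepted by the `dataikuapi` Python client, so check them against the API reference for your version. The helper name `build_import_settings` is hypothetical.

```python
def build_import_settings(connection_remapping, target_project_key=None):
    """Build an import-settings dict that remaps connection names.

    connection_remapping maps each connection name found in the archive
    to the name of an existing connection on the target instance.
    The key names used below are an assumption about the import API.
    """
    settings = {
        "remapping": {
            "connections": [
                {"source": source, "target": target}
                for source, target in connection_remapping.items()
            ]
        }
    }
    # Optionally import under a different project key than the original.
    if target_project_key is not None:
        settings["targetProjectKey"] = target_project_key
    return settings


settings = build_import_settings({"postgresql": "PostgreSQL_tshirt"})
print(settings)
```

With `dataikuapi`, a dictionary like this would typically be passed along the lines of `client.prepare_project_import(f).execute(settings=settings)`, where `client` is a `dataikuapi.DSSClient` and `f` is the project archive opened in binary mode. Treat those call names as an assumption to confirm before relying on them.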
Identical Connection Name in the Second Instance¶
Suppose the second Dataiku instance has a PostgreSQL connection with the same name postgresql as the connection on the first instance. For example, this could happen if your Dataiku Design node has a connection to a development database, and the Automation node has a connection, with the same name, to a production database. In this example, you would not have to remap connection names.
However, if the second Dataiku instance must be connected to the same database as the first instance (where the project was created), and the connection names are identical, be sure to remap your connection in the second instance to a different one.
In general, you should avoid connecting the second Dataiku instance to the same database used in the first instance. Otherwise, both the original project (in the first Dataiku instance) and the imported project (in the second Dataiku instance) will write to the same SQL tables. Computing a dataset in one instance would then overwrite the identically named dataset in the other instance, because both datasets are stored in the same table.
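One way to spot this risk before importing is to compare where the two connections actually point. The following is a minimal sketch, assuming you have written down each connection's host, port, database, and schema by hand. The dictionary keys and the function `points_to_same_tables` are hypothetical and do not correspond to a Dataiku data structure.

```python
def points_to_same_tables(conn_a, conn_b):
    """Return True if two connection descriptions target the same schema.

    Hostnames are compared case-insensitively. A missing schema is treated
    as "public", the PostgreSQL default; both choices are assumptions.
    """
    def location(conn):
        return (
            conn["host"].lower(),
            int(conn["port"]),
            conn["database"],
            conn.get("schema", "public"),
        )

    return location(conn_a) == location(conn_b)


# Illustrative connection details for the two instances.
design_node = {"host": "db.example.com", "port": 5432,
               "database": "sales", "schema": "dev"}
automation_node = {"host": "DB.example.com", "port": "5432",
                   "database": "sales", "schema": "prod"}

print(points_to_same_tables(design_node, automation_node))  # False
print(points_to_same_tables(design_node, design_node))      # True
```

If the check returns True for two connections that share a name, importing without remapping would leave both projects writing to the same tables.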
For more information, see Export/Import Project Options in the product documentation.