Concept: APIs Outside Dataiku

At this point, we have demonstrated using the Dataiku APIs in many ways from inside the platform. In this lesson, you will learn how the same APIs can be used outside Dataiku.

Using APIs outside Dataiku can be useful in a number of situations. For example:

  • to analyze a Dataiku dataset from your local computer;

  • to deploy a project from the Design node to the Automation node;

  • to manage Dataiku instance settings from your computer;

  • … and much more!

Let’s focus again on Python. Using pip, you can install the two Dataiku API packages, dataiku and dataikuapi, on your computer.

Follow the Documentation

The product documentation guides you through the steps for using the Python clients of the Dataiku APIs outside DSS. The additional notes provided here will talk you through some of the steps described in the documentation.

Note

Be careful to use a version of Python that is compatible with your Dataiku infrastructure. Code environments in Dataiku can help you to manage different versions of Python on your Dataiku instance. You can find more information about Python code environments in the product documentation.

Installing the dataiku Package Outside Dataiku

This section of the product documentation explains how to use the dataiku package externally.

You can pip install the package, but since it is not publicly available, you have to download it from your Dataiku instance in one of three ways:

  • directly through pip,

  • in a requirements.txt file, or

  • manually, by downloading it.

These three methods are equivalent, so choose whichever is most convenient for you. The first is the most direct, since it takes only one step.
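As a rough illustration of the one-step method, the sketch below builds the download URL and the matching pip command. The archive path (/public/packages/dataiku-internal-client.tar.gz) is the one given in the product documentation at the time of writing, and the helper names are hypothetical; check the path against the documentation for your version before relying on it.

```python
import sys


def dataiku_package_url(host, port, scheme="http"):
    """Build the URL of the dataiku package served by a DSS instance.

    Assumption: the archive path matches the product documentation
    for your DSS version.
    """
    return f"{scheme}://{host}:{port}/public/packages/dataiku-internal-client.tar.gz"


def pip_install_command(host, port, scheme="http"):
    """Return the one-step pip install command as an argument list."""
    return [sys.executable, "-m", "pip", "install",
            dataiku_package_url(host, port, scheme)]


# Print the command for an instance running at http://localhost:10000.
print(" ".join(pip_install_command("localhost", 10000)))
```

The same URL can be placed on its own line in a requirements.txt file, or opened in a browser to download the archive manually before installing it with pip.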

When using the APIs inside Dataiku, you do not need to provide any authentication details, since you are already logged in to DSS. Outside Dataiku, however, you have to set up the connection with your DSS account.

As covered in the documentation, you have various options for setting up the connection with your Dataiku account: by using code, environment variables, or a configuration file.

For all of these options:

  • You will need your DSS_HOST and DSS_PORT. These can be found in your DSS URL. For instance, if your home URL is http://localhost:10000/home/, then your DSS_HOST is “localhost” and your DSS_PORT is “10000”.

  • You will also need an API key. More details about this are available at the end of the tutorial.
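The three options can be sketched as follows. The environment variable names (DATAIKU_DSS_URL, DATAIKU_API_KEY), the config file location (~/.dataiku/config.json) and its layout are taken from the product documentation as best recalled here, so treat them as assumptions to verify. The helper names and the placeholder key are hypothetical.

```python
import json
import os
from urllib.parse import urlsplit


def host_and_port(home_url):
    """Extract DSS_HOST and DSS_PORT from a DSS home URL."""
    parts = urlsplit(home_url)
    return parts.hostname, parts.port


def base_url(home_url):
    """Reduce a DSS home URL to scheme://host:port/."""
    parts = urlsplit(home_url)
    return f"{parts.scheme}://{parts.netloc}/"


def env_settings(url, api_key):
    """Option 2: environment variables read by the dataiku package (assumed names)."""
    return {"DATAIKU_DSS_URL": url, "DATAIKU_API_KEY": api_key}


def config_file_contents(url, api_key):
    """Option 3: contents of the configuration file (assumed layout)."""
    return {
        "dss_instances": {"default": {"url": url, "api_key": api_key}},
        "default_instance": "default",
    }


# Assumed location of the configuration file.
CONFIG_PATH = os.path.join(os.path.expanduser("~"), ".dataiku", "config.json")

url = base_url("http://localhost:10000/home/")
print(host_and_port("http://localhost:10000/home/"))
print(env_settings(url, "YOUR_API_KEY_SECRET"))
print(CONFIG_PATH)
print(json.dumps(config_file_contents(url, "YOUR_API_KEY_SECRET"), indent=2))

# Option 1 (in code), once the dataiku package is installed:
#   import dataiku
#   dataiku.set_remote_dss(url, "YOUR_API_KEY_SECRET")
```

The sketch only prints the settings. It does not write the file or modify your environment.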

Once you have completed the installation, you can use functions from the dataiku package as you normally would from within your DSS instance. For example, the code below accesses a specific dataset and looks at the data.

Terminal output using the API to view a dataset.
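The terminal session shown in the original figure is not reproduced here. As a minimal sketch of the same kind of call, the function below connects in code and returns the first rows of a dataset. The URL, API key, project key, and dataset name are placeholders, and preview_dataset is a hypothetical helper rather than part of the dataiku package.

```python
def preview_dataset(dss_url, api_key, project_key, dataset_name, rows=5):
    """Return the first rows of a DSS dataset as a pandas DataFrame."""
    # Imported lazily: the package is only available after installing it
    # from your Dataiku instance.
    import dataiku

    # Outside DSS, the connection must be set up explicitly.
    dataiku.set_remote_dss(dss_url, api_key)

    # Outside DSS there is no current project, so pass the project key.
    dataset = dataiku.Dataset(dataset_name, project_key=project_key)
    return dataset.get_dataframe(limit=rows)


# Example call (placeholder values):
# df = preview_dataset("http://localhost:10000/", "YOUR_API_KEY_SECRET",
#                      "MYPROJECT", "mydataset")
# print(df)
```

If the connection is already configured through environment variables or a configuration file, the set_remote_dss call can be dropped.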

Note

Once you have installed the dataiku package, you can create a DSSClient from it, and leverage the public API functions as you would do inside DSS. You do not need to also install the dataikuapi package. This is also noted in the product documentation.
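To make the note concrete, here is a small sketch that obtains a DSSClient from the dataiku package through dataiku.api_client() and lists the project keys visible to the API key. The URL and key are placeholders, and list_projects_via_dataiku is a hypothetical helper name.

```python
def list_projects_via_dataiku(dss_url, api_key):
    """List the project keys visible to the API key, using only the dataiku package."""
    # Imported lazily: requires the dataiku package installed from your instance.
    import dataiku

    dataiku.set_remote_dss(dss_url, api_key)

    # api_client() returns a public API client (a DSSClient) that reuses
    # the connection settings configured above.
    client = dataiku.api_client()
    return client.list_project_keys()


# Example call (placeholder values):
# print(list_projects_via_dataiku("http://localhost:10000/", "YOUR_API_KEY_SECRET"))
```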

Follow the instructions in the next section if you want to directly access the public API (HTTP REST method) instead of using the recommended Python client.

Creating a DSS Client Externally

You can also directly use the public API outside Dataiku by following the product documentation. From there, you can very easily create a DSS client externally to perform all the administrative tasks needed in your project.

Let’s take a look at the steps mentioned in the product documentation above:

  • You need to pip install the package dataiku-api-client. It is publicly available to anyone, even if they are not users of Dataiku DSS.

  • After a successful installation, you can use the dataikuapi package like any other package in your installation of Python.

The example below creates a DSS client and creates a new user on the platform.

Terminal output creating a DSS client.
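The terminal session shown in the original figure is not reproduced here. The sketch below illustrates the same idea with the dataikuapi package: it creates a DSSClient and then a local user. All values are placeholders and create_local_user is a hypothetical helper. Creating users normally requires an API key with administrator rights, so expect an error if your key lacks them.

```python
def create_local_user(host_url, api_key, login, password, display_name):
    """Create a local DSS user through the public API client."""
    # Imported lazily: install with `pip install dataiku-api-client`.
    import dataikuapi

    # With the public API client, the API key is passed every time
    # a DSSClient is created.
    client = dataikuapi.DSSClient(host_url, api_key)

    # Returns a handle on the newly created user.
    return client.create_user(login, password,
                              display_name=display_name,
                              source_type="LOCAL")


# Example call (placeholder values):
# user = create_local_user("http://localhost:10000", "YOUR_API_KEY_SECRET",
#                          "jdoe", "a-strong-password", "Jane Doe")
```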

You may have noticed that both packages require an API key to be used externally. However, the key’s usage differs:

  • Using the dataiku package, you just need to set up the connection to your instance once.

  • Using the public API, you have to authenticate yourself with an API key every time you create a DSSClient.

Types of API Keys

As you just saw, the need for an API key is one difference when using the APIs outside Dataiku:

  • Inside Dataiku, you do not need to provide an API key. As you’re already logged in, Dataiku inherits the necessary credentials.

  • When using the APIs from outside Dataiku, you have to identify yourself using an API key.

API keys can be personal, project-level, or global.

We encourage you to create your own personal API key for use on all projects. API keys can be created in Profile & Settings > API keys. This key will give you the rights to perform externally what you can do inside Dataiku, but not more. In other words, you have the same rights and limitations whether you’re acting from inside or outside Dataiku.

Dataiku screenshot showing where to find API keys.

The other types of keys are more specific:

  • Project-level keys give you rights for the project only.

  • Global keys, which only an administrator can create, give rights on all projects.
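To check which identity and groups a given key maps to, one option is the client's get_auth_info() method. The sketch below is hedged: the returned field names (authIdentifier, groups) follow the public API client documentation as best recalled here, and describe_api_key is a hypothetical helper.

```python
def describe_api_key(host_url, api_key):
    """Summarize the identity behind an API key."""
    # Imported lazily: install with `pip install dataiku-api-client`.
    import dataikuapi

    client = dataikuapi.DSSClient(host_url, api_key)
    info = client.get_auth_info()

    # Assumed fields: "authIdentifier" (a user login for a personal key,
    # a key identifier otherwise) and "groups".
    return {
        "identifier": info.get("authIdentifier"),
        "groups": info.get("groups", []),
    }


# Example call (placeholder values):
# print(describe_api_key("http://localhost:10000", "YOUR_API_KEY_SECRET"))
```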