API Node on Dataiku Cloud

Dataiku’s architecture includes an API Deployer and API nodes for deploying real-time API services.

Dataiku Cloud manages the configuration and the infrastructure of the API Deployer for you. The resources below cover using API nodes on Dataiku Cloud.

Reference | Specificities of using the API Deployer on Dataiku Cloud

On Dataiku Cloud, you cannot create an API service directly in the API node. Instead, you must use the API Designer or, for a prediction endpoint, create it directly from the Flow.

You can expose the following types of endpoints on Dataiku Cloud:

  • a visual prediction model

  • a Python prediction model

  • an R prediction model

  • a Python function

  • an R function

  • an SQL query

  • a lookup in a dataset

The clustering model endpoint is not yet available on Dataiku Cloud.
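Whichever of the supported endpoint types above you expose, clients call the deployed service over HTTP. The sketch below only builds the URL and JSON body for a query and sends nothing. It assumes the usual API node REST layout (`/public/api/v1/<service>/<endpoint>/<action>`) and body wrappers; the host, service ID, endpoint ID, and feature names are hypothetical placeholders, so check the sample code shown for your deployed service before relying on it.

```python
import json

# Action suffix and body wrapper per endpoint type (an assumption based on the
# usual API node REST layout; verify against your deployed service).
ENDPOINT_ACTIONS = {
    "prediction": ("predict", "features"),
    "lookup": ("lookup", "data"),
    "sql_query": ("query", "parameters"),
    "function": ("run", None),  # function arguments are sent unwrapped
}


def build_request(base_url, service_id, endpoint_id, endpoint_type, values):
    """Return (url, json_body) for a POST to an API endpoint."""
    action, wrapper = ENDPOINT_ACTIONS[endpoint_type]
    url = "{}/public/api/v1/{}/{}/{}".format(
        base_url.rstrip("/"), service_id, endpoint_id, action
    )
    payload = values if wrapper is None else {wrapper: values}
    return url, json.dumps(payload, sort_keys=True)


# Hypothetical host, service, endpoint, and feature names.
url, body = build_request(
    "https://api-node.example.com", "my_service", "churn_model",
    "prediction", {"age": 42, "plan": "premium"},
)
print(url)
print(body)
```

The returned URL and body can be passed to any HTTP client. Keeping request construction separate from sending makes it easy to inspect what would be posted.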


For anything not covered here, refer to the general reference documentation for the API node or the real-time APIs section of the Knowledge Base.

How-to | Install the API node

  1. Navigate to the Extensions panel of the Launchpad.

  2. Click Add an Extension and then API node.

  3. Click Confirm to install the API node on your instance.

Dataiku screenshot of the Extensions tab of the Launchpad.

How-to | Access API query logs

You can access the logs of all queries made on your API endpoints through an S3 connection called customer-audit-log, which is automatically included in the S3 managed storage of Dataiku Cloud.

To access these query logs:

  1. Within a project on the instance, go to + Dataset > Cloud Storage & Social > Amazon S3.

  2. Select the connection called customer-audit-log.

  3. Click Browse to navigate the file directory and List Files to preview the contents.

  4. Click Create to import the logs as a dataset.
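Once the logs are imported as a dataset, you can explore them like any other dataset. As a minimal sketch, the helper below counts log rows by the value of one column. It accepts any iterable of dict-like rows, such as those yielded by `dataiku.Dataset(...).iter_rows()` in a Dataiku Python recipe or notebook. The dataset name and column name are hypothetical placeholders, because the log schema is not described here; inspect the imported dataset to pick a real column.

```python
from collections import Counter


def count_by_column(rows, column):
    """Count rows per distinct value of `column`; missing values count as None."""
    return Counter(row.get(column) for row in rows)


# Inside Dataiku (names below are hypothetical placeholders):
#   import dataiku
#   rows = dataiku.Dataset("api_query_logs").iter_rows()
#   counts = count_by_column(rows, "some_column")

# Stand-in rows so the sketch also runs outside Dataiku.
sample_rows = [
    {"some_column": "a"},
    {"some_column": "b"},
    {"some_column": "a"},
]
counts = count_by_column(sample_rows, "some_column")
print(counts.most_common())
```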

Dataiku screenshot of creating a dataset from query logs stored in S3.

How-to | Use the referenced data deployment mode on Dataiku Cloud

Bundled and referenced are the two possible deployment modes for the data used by data enrichments and dataset lookup endpoints.

The referenced data mode is only available for SQL datasets. When the API node is activated, Dataiku automatically adds a connection named api-node-referenced-data that is dedicated to API usage.

To use this connection for enriching prediction queries or for the SQL query endpoints:

  1. Sync the relevant dataset in your Flow to the api-node-referenced-data connection (for example, with a Sync recipe, as in the screenshot below).

  2. Choose that synced dataset as the dataset to use for enrichment in your API endpoint.

  3. Choose the option Referenced (SQL only) as the Deployment policy.

Dataiku screenshot of a dataset synced to the api-node-referenced-data connection.
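The first step above can also be scripted. The following is a sketch, not the documented procedure: it assumes you have a project handle from the Dataiku Python API client (`dataikuapi`) and uses its recipe builder to create a Sync recipe whose output lives on the api-node-referenced-data connection. The dataset names are hypothetical, and the recipe still has to be run (the output dataset built) before the endpoint can use it.

```python
def sync_to_referenced_connection(project, source_dataset, target_dataset):
    """Create a Sync recipe writing `target_dataset` to the API node connection.

    `project` is assumed to behave like a dataikuapi DSSProject.
    """
    builder = project.new_recipe("sync")
    builder.with_input(source_dataset)
    # The connection name comes from this article; the rest is illustrative.
    builder.with_new_output(target_dataset, "api-node-referenced-data")
    return builder.create()


# Usage from a Dataiku notebook (hypothetical dataset names):
#   import dataiku
#   project = dataiku.api_client().get_default_project()
#   sync_to_referenced_connection(project, "customers", "customers_api")
```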

How-to | Deploy an API service from the Automation node on Dataiku Cloud

You can also deploy an API service from the Automation node. This can be useful if, for example, you want an API endpoint to look up values in a dataset that is updated by a Flow running on the Automation node.

  1. Create your API service in the Design node project.

  2. Publish your project to the Automation node. (This tutorial walks through the steps.)

  3. Once on the Automation node, build the Flow so as to populate all datasets with data.


    To make sure the Flows on the Design node and the Automation node do not collide and cause data inconsistencies, your datasets must be relocatable, or you must remap the connection on the Automation node.

    If you are using the referenced mode with a dataset in the api-node-referenced-data connection, that dataset will be automatically relocatable; there is nothing for you to do.

  4. Still in the project on the Automation node, go to the API Designer. Find the API service and publish it to the Deployer (click the service > Publish on Deployer). The endpoints in the service will use the data created by the Automation node Flow.


To keep your service names clear, open the advanced options when publishing the service to the Deployer and change the default name of the target service so that it clearly comes from the Automation node. Otherwise, the name is simply suffixed with 2.

Dataiku screenshot of overriding name of target service

If you have also deployed the API service from the project on the Design node, both the Design and Automation API services appear as deployed, like this:

Dataiku screenshot of 2 deployed API services from Design and Automation node