Use the API Node on Dataiku Online

Installation

Dataiku Online manages the configuration and the infrastructure of the API Deployer for you.

  • To start using the API node, from the Extensions tab of your Launchpad, click Add an extension > API node.

Dataiku screenshot of the Extensions tab of the launchpad.

Once the API node is activated, you can deploy your first API with the API Deployer. For more information, see the product documentation on how to deploy your first API.

Specificities of using the API Deployer on Dataiku Online

On Dataiku Online, you cannot create an API service directly on the API node. Instead, create it in the API Designer or, for a prediction endpoint, directly from the Flow.

The following types of endpoints are available on Dataiku Online:

  • Exposing a visual prediction model

  • Exposing a Python prediction model

  • Exposing a Python function

  • Exposing an SQL query

  • Exposing a lookup in a dataset

The following types of endpoints are not yet available on Dataiku Online:

  • Exposing a clustering model

  • Exposing an R prediction model

  • Exposing an R function
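The two lists above can be turned into a small pre-flight check. The sketch below is illustrative only: the endpoint-type labels are made up for this example (they are not DSS identifiers), and the URL layout assumes the standard API node REST paths (`predict`, `run`, `query`, `lookup`) as used on self-managed instances. Copy the real base URL from your deployment rather than relying on the placeholder shown in the usage comment.

```python
# Illustrative labels for the endpoint types available on Dataiku Online,
# mapped to the REST path suffix assumed for each type.
AVAILABLE_ENDPOINT_PATHS = {
    "visual_prediction": "predict",
    "python_prediction": "predict",
    "python_function": "run",
    "sql_query": "query",
    "dataset_lookup": "lookup",
}

# Illustrative labels for the endpoint types not yet available on Dataiku Online.
UNAVAILABLE_ENDPOINT_TYPES = {"clustering", "r_prediction", "r_function"}


def endpoint_url(base_url, service_id, endpoint_id, endpoint_type):
    """Return the query URL for an endpoint, or raise if the type is unsupported."""
    if endpoint_type in UNAVAILABLE_ENDPOINT_TYPES:
        raise ValueError(
            f"{endpoint_type!r} endpoints are not yet available on Dataiku Online"
        )
    if endpoint_type not in AVAILABLE_ENDPOINT_PATHS:
        raise ValueError(f"Unknown endpoint type: {endpoint_type!r}")
    path = AVAILABLE_ENDPOINT_PATHS[endpoint_type]
    return f"{base_url.rstrip('/')}/public/api/v1/{service_id}/{endpoint_id}/{path}"


# Example with placeholder names:
# endpoint_url("https://api.example.com", "my_service", "churn", "visual_prediction")
```

Failing early on an unsupported type avoids discovering the limitation only at deployment time.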

API Query Logs

In Dataiku DSS, you can access the logs of all queries made to your API endpoints through an S3 connection named “customer-audit-log”, which is automatically included in the S3 managed storage of Dataiku Online.

  • To access these query logs, go to Create Dataset > Cloud Storage & Social > Amazon S3.

  • Select the connection called customer-audit-log, and click List to browse the files.

Dataiku screenshot of creating a dataset from query logs stored in S3.

Using the Referenced data deployment mode on Dataiku Online

Two deployment modes are available for endpoints that enrich prediction queries or look up records in a dataset (see the enriching prediction queries article):

  • the “bundled data” mode

  • the “referenced data” mode

The referenced data mode is only available for SQL datasets. During the activation of the API node, Dataiku automatically adds a connection api-node-referenced-data that is dedicated to API usage. To use this connection for enriching prediction queries or for the SQL query endpoints:

  • Sync the relevant dataset in your Flow to the api-node-referenced-data connection (for example with a Sync recipe, see below).

  • Choose that synced dataset as the dataset to use for enrichment in your API endpoint.

  • Choose the option Referenced (SQL only) as the Deployment policy.

Dataiku screenshot of a dataset synced to the api-node-referenced-data connection.
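The first step above (syncing a dataset to the dedicated connection) can also be scripted. The following is a minimal sketch, assuming the `project` argument is a project handle from Dataiku's public Python API and that its recipe builder (`new_recipe`, `with_input`, `with_new_output`, `create`) behaves on Dataiku Online as it does on a self-managed instance. The dataset names are placeholders.

```python
# Connection name taken from the article; Dataiku adds it when the API node is activated.
REFERENCED_CONNECTION = "api-node-referenced-data"


def sync_to_referenced_connection(project, source_dataset, target_dataset):
    """Create a Sync recipe copying `source_dataset` into a new dataset
    stored on the api-node-referenced-data connection.

    `project` is assumed to be a project handle from the public API.
    The recipe is only created here; it still has to be run (built)
    before the endpoint can use the data.
    """
    builder = project.new_recipe("sync")
    builder.with_input(source_dataset)
    builder.with_new_output(target_dataset, REFERENCED_CONNECTION)
    return builder.create()


# Example usage from a notebook (placeholder dataset names):
# import dataiku
# project = dataiku.api_client().get_default_project()
# sync_to_referenced_connection(project, "customers", "customers_referenced")
```

The remaining two steps (selecting the synced dataset and choosing the Referenced (SQL only) deployment policy) are done in the endpoint settings.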

How to deploy an API service from the Automation node on Dataiku Online

You can also deploy an API service from the Automation node. This is useful if, for example, you want an API endpoint to look up records in a dataset that is updated by a Flow running on the Automation node.

First, create your API service in the project on the Design node. Then publish your project to the Automation node. Refer to this hands-on tutorial to learn how to package a Flow into a bundle and deploy the bundle to the Automation node.

Once on the Automation node, remember to build the Flow so that all datasets are populated with data. To ensure that the Flows on the Design node and the Automation node do not collide and cause data inconsistencies, your datasets must be relocatable, or you must remap the connection on the Automation node. If you are using the referenced mode with a dataset in the api-node-referenced-data connection, that dataset is automatically relocatable, so no further action is needed.

From there (still in the project on the Automation node), go to the API Designer. You will find the API service there and can publish it to the Deployer (click on the service > Publish on Deployer). The endpoints in the service will use the data created by the Automation node Flow.

To keep your service names clear, when publishing the service to the Deployer, click the advanced option and change the default name of the target service so that it clearly indicates it comes from the Automation node (otherwise, the name is simply suffixed with 2).

Dataiku screenshot of overriding name of target service
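The same publish step, including the renamed target service, can be sketched with Dataiku's public Python API. This is a sketch under assumptions, not a confirmed Dataiku Online workflow: it assumes `project` is a project handle on the Automation node, that `get_api_service`, `create_package`, and `publish_package` behave as on a self-managed instance, and that a `_automation` suffix is an acceptable service ID. The suffix is a hypothetical naming convention chosen for this example.

```python
def publish_from_automation(project, service_id, package_id, suffix="_automation"):
    """Package an API service and publish it to the Deployer under a name
    that makes its Automation node origin explicit.

    `project` is assumed to be a project handle from the public API.
    `suffix` is a hypothetical naming convention, not a Dataiku default.
    Returns the target service ID used on the Deployer.
    """
    target_service_id = f"{service_id}{suffix}"
    service = project.get_api_service(service_id)
    # Create a new version (package) of the service, then publish it
    # to the Deployer under the overridden target service name.
    service.create_package(package_id)
    service.publish_package(package_id, published_service_id=target_service_id)
    return target_service_id


# Example usage (placeholder identifiers):
# publish_from_automation(project, "churn_api", "v1")
```

Passing an explicit target name plays the same role as the advanced option in the interface: it avoids ending up with a service whose only distinguishing mark is a numeric suffix.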

Then you can deploy your API service from the Deployer. If you have also deployed the API service from the project on the Design node, both your Design and Automation API services will be deployed, looking like this:

Dataiku screenshot of 2 deployed API services from Design and Automation node

Resources

Note

Please note the differences listed in this article when navigating through Dataiku product documentation.