An API endpoint on a Dataiku API node#

Many data science workloads call for a real-time API framework, where queries sent to an API endpoint receive an immediate response.

As a point of comparison with other deployment contexts, this section presents how to monitor a model under a real-time API framework while staying entirely within Dataiku.

Additional prerequisites#

In addition to the shared prerequisites, you'll also need:

  - An API node available as a deployment infrastructure.

Deploy the model as an API endpoint#

The starter project already contains the API endpoint that we want to monitor, so the next step is to push a version of the API service that includes this endpoint to the API Deployer.

  1. From the top navigation bar, navigate to More Options (…) > API Designer.

  2. Open the pokemon API service.

  3. Note how it includes one prediction endpoint called guess, which uses the saved model found in the Flow.

  4. Click Publish on Deployer, and then Publish to confirm publishing v1 of the service to the API Deployer.

Dataiku screenshot of an API service with a prediction endpoint.
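If you prefer to script this step, the same publication can be done with Dataiku's Python API. Below is a minimal sketch from a notebook on the Design node; the version ID is illustrative, and the method names come from the dataikuapi package, so verify them against the API reference for your Dataiku version.

```python
# A sketch of publishing the service programmatically, assuming the Design
# node is connected to an API Deployer. "v1" is an illustrative version ID.
import dataiku

project = dataiku.api_client().get_default_project()
service = project.get_api_service("pokemon")

service.create_package("v1")    # snapshot the service as version v1
service.publish_package("v1")   # push the package to the API Deployer
```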

Once the API service exists on the API Deployer, we can deploy the service to an infrastructure.

  1. From the waffle menu in the top navigation bar, click Local (or Remote) Deployer.

  2. Click Deploying API Services.

  3. In the Deployments tab of the API Deployer, find the version of the API service that you just published, and click Deploy.

  4. If not already selected, choose an infrastructure.

  5. Click Deploy and Deploy again to confirm.

Dataiku screenshot of an API deployment.
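The deployment itself can also be scripted. The sketch below assumes v1 of the service is already on a local API Deployer; the deployment and infrastructure IDs are illustrative and should match your own setup.

```python
# A minimal sketch of deploying v1 of the pokemon service via the Python API.
import dataiku

client = dataiku.api_client()
deployer = client.get_apideployer()  # handle on the (local) API Deployer

deployment = deployer.create_deployment(
    "pokemon-on-prod",  # illustrative deployment ID
    "pokemon",          # API service ID
    "api-infra",        # illustrative infrastructure ID
    "v1",               # published version to deploy
)
deployment.start_update().wait_for_result()  # push to the infrastructure
```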

Note

To review the mechanics of real-time API deployment in greater detail, please see Tutorial | Real-time API deployment.

Generate activity on the API endpoint#

Before we set up the monitoring portion of this project, we need to generate some activity on the API endpoint so that we have actual data on the API node to retrieve in the feedback loop.

  1. Within the Status tab of the deployment, navigate to the Run and test subtab for the guess endpoint.

  2. Click Run All to send several test queries to the API node.

Dataiku screenshot of test queries of an API deployment.
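Test queries from the Run and test subtab are convenient, but you can also send real queries from any HTTP client. Here is a minimal sketch using the dataikuapi package; the API node URL and the feature names are illustrative (the exact endpoint URL and a sample query are shown on the deployment's Status tab).

```python
# A sketch of querying the deployed endpoint. URL and features are examples.
from dataikuapi import APINodeClient

client = APINodeClient("https://apinode.example.com:12000", "pokemon")

record = {"attack": 49, "defense": 49, "speed": 45}  # illustrative features
response = client.predict_record("guess", record)
print(response["result"])  # the prediction returned by the model
```

Equivalently, you can POST a JSON body of the form {"features": {...}} to the endpoint URL with any HTTP client.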

Create a feedback loop on the API endpoint#

Now direct your attention to the Dataiku Monitoring (API) Flow zone. Just as in the batch Flow zone, we have an Evaluate recipe that takes two inputs (a dataset of predictions and a saved model) and outputs a model evaluation store. However, there are two subtle differences.

Dataiku screenshot of a Flow zone for monitoring API node log data.

API node log data#

The input data in this context comes directly from the API node. As explained in Tutorial | API endpoint monitoring, the storage location of this data differs for Dataiku Cloud and self-managed users.

  1. Follow the steps in Audit trail on Dataiku Cloud to access API node queries.

  2. Once you’ve imported this dataset, replace pokemon_on_static_api_logs with the apinode_audit_logs dataset as the input to the Evaluate recipe in the Dataiku Monitoring (API) Flow zone.

After pointing the Evaluate recipe to the correct prediction logs, we can explore the dataset. Each row is an actual prediction request answered by our model. For each request, you can find the features that were sent, the resulting prediction with its details, and other technical data.

Dataiku screenshot of the Explore tab of API node log data fetched from the Event server.
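For a quick look at this data from a notebook or Python recipe inside the project, the usual dataset API applies:

```python
# A minimal sketch of inspecting the fetched API node logs with pandas.
import dataiku

logs = dataiku.Dataset("apinode_audit_logs").get_dataframe()
print(logs.shape)          # one row per answered prediction request
print(list(logs.columns))  # request features, prediction, technical columns
```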

Warning

Although we are using local filesystem storage for the API node logs here to make the project import easier, in a real situation, file-based cloud storage is highly recommended. This data can grow quickly, and it will not shrink unless explicitly truncated.

It would also be common to activate partitioning for this dataset.

The Evaluate recipe with API node logs as input#

The other subtle difference between the Evaluate recipe in the API Flow zone and its counterpart in the batch Flow zone is the option to automatically handle the input data as API node logs.

With this option activated (it is detected by default), you do not need to worry about the additional columns or their naming.

  1. Open the Evaluate recipe in the Dataiku Monitoring (API) Flow Zone.

  2. Confirm that the input dataset will be handled as API node logs.

  3. Click Run to produce a model evaluation of the API node logs.

Dataiku screenshot of an Evaluate recipe with API node log input data.

Note

If using a version of Dataiku prior to 11.2, you will need to add a Prepare recipe to keep only the features and prediction columns, and rename them to match the initial training dataset convention.
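For reference, the equivalent preparation in Python (instead of a visual Prepare recipe) could look like the following pandas sketch; all column names are purely illustrative, as the actual names in your logs depend on your endpoint.

```python
# A sketch of the pre-11.2 preparation step: keep only the feature and
# prediction columns and rename them to match the training dataset.
import pandas as pd

logs = pd.read_csv("apinode_audit_logs.csv")  # illustrative source

renamed = logs.rename(columns={
    "features.attack": "attack",              # illustrative feature columns
    "features.defense": "defense",
    "prediction.prediction": "prediction",
})[["attack", "defense", "prediction"]]
```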

Create a one-click monitoring loop#

Now that you understand these details, note that since version 12, you can simplify this process by building the entire feedback loop directly from the API endpoint in the API Designer.

  1. From the top navigation bar of the Design node, navigate to More Options (…) > API Designer.

  2. Open the pokemon API service.

  3. Navigate to the Monitoring panel for the guess endpoint.

  4. Click Configure to create a monitoring loop for this endpoint.

  5. Click OK, and then return to the Flow to see the new zone, which, in this case, duplicates the work of the existing Dataiku Monitoring (API) Flow zone.

Dataiku screenshot of the Monitoring panel within the API endpoint of the API Designer.