Tutorial | Add an enrichment to a prediction endpoint (MLOps part 6)

The real-time deployment tutorial demonstrated the mechanics of deploying an API service from the Design node to production infrastructure. In many cases, though, you may need to enrich incoming queries before sending them to the model for scoring.


In this tutorial, you will:

  • Add an enrichment to a prediction endpoint in an API service.

  • Redeploy a new version of an API service.

Starting here?

This section requires the API service created in Part 5 (including the API endpoint created in Part 4). Complete those sections first in order to reproduce the steps here.

Add a query enrichment

In this section, we’ll add a query enrichment to the prediction endpoint. This enrichment will fill in features missing from an incoming query by looking them up in an additional table in our database.


For greater context on the concepts at work here, see our resources on API Query Enrichment or the reference documentation on enriching prediction queries.

The prediction model in the Flow was trained on six features from the training dataset. (You can confirm this by opening the active version of the model, and navigating to the Features panel in the Model Information section.)

Ideally, an incoming transaction to the API endpoint would have values for all six features. However, suppose at the time a transaction occurs, a merchant point of sale system sends values for only a subset of these features:

  • signature_provided

  • merchant_subsector_description

  • purchase_amount

  • merchant_state

We first need to retrieve the missing values for the features card_fico_score and card_age from our internal database, and then use these values to enrich the API queries.
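As a quick illustration of the gap described above, here is a minimal Python sketch that compares the six training features with the four the merchant system sends. The feature names come from this tutorial; the function name missing_features is ours and is only for illustration.

```python
# The six features the model was trained on (names taken from this tutorial).
TRAINED_FEATURES = {
    "signature_provided",
    "merchant_subsector_description",
    "purchase_amount",
    "merchant_state",
    "card_fico_score",
    "card_age",
}

# The subset sent by the merchant point of sale system.
SENT_BY_MERCHANT = {
    "signature_provided",
    "merchant_subsector_description",
    "purchase_amount",
    "merchant_state",
}


def missing_features(trained, sent):
    """Return the trained features absent from the incoming query, sorted."""
    return sorted(trained - sent)


print(missing_features(TRAINED_FEATURES, SENT_BY_MERCHANT))
# ['card_age', 'card_fico_score']
```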

Using the cardholder_info dataset in the project, we’ll use the card_id value of each real-time transaction to look up the corresponding values for fico_score and age, and then pass the complete feature set to the prediction model.

Dataiku screenshot of a dataset and lookup key to use for enrichment.

  • From the API Designer page, open the fraud_detection API service.

  • Navigate to the Enrichments panel, and click + Add Enrichment.

  • Select cardholder_info as the dataset to use for enrichment.

  • Leave the default Bundled deployment policy.


If you want to try the referenced deployment policy, you’ll need an SQL connection. See our resources on remapping connections.

Now let’s provide the lookup key and retrieve the two missing columns, keeping in mind the names of the columns used to train the model.

  • Next to Lookup keys definition, click + Add Key, and select the internal_card_mapping column.

  • Provide card_id as the name in the query for the lookup key.

  • In Columns to retrieve, specify the two missing features to retrieve from the dataset: fico_score and age.

  • Remap these columns to the names card_fico_score and card_age.

Configure settings for data enrichment.
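To make the effect of these settings concrete, here is a minimal Python sketch of what a lookup enrichment does conceptually. It is not how the API node implements enrichments. The rows in CARDHOLDER_INFO are made-up placeholder values (not data from the project), and the function name enrich_query is hypothetical.

```python
# Made-up placeholder rows standing in for the cardholder_info dataset.
CARDHOLDER_INFO = [
    {"internal_card_mapping": "C_ID_example_1", "fico_score": 700, "age": 40},
    {"internal_card_mapping": "C_ID_example_2", "fico_score": 650, "age": 29},
]

LOOKUP_COLUMN = "internal_card_mapping"  # key column in the dataset
QUERY_KEY = "card_id"                    # name of the key in the query
# Columns to retrieve, remapped to the names used to train the model.
COLUMN_REMAP = {"fico_score": "card_fico_score", "age": "card_age"}


def enrich_query(features, table=CARDHOLDER_INFO):
    """Return a copy of the query features with the looked-up columns added."""
    index = {row[LOOKUP_COLUMN]: row for row in table}
    match = index.get(features.get(QUERY_KEY))
    enriched = dict(features)
    if match is not None:
        for source_name, model_name in COLUMN_REMAP.items():
            enriched[model_name] = match[source_name]
    return enriched


print(enrich_query({"card_id": "C_ID_example_1", "purchase_amount": 10.0}))
```

In this sketch, a query whose card_id has no match in the table is returned unchanged. How the real endpoint handles unmatched keys depends on the enrichment settings.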

Let’s change one more setting before we test the enrichment.

  • Navigate to the Advanced panel of the API endpoint.

  • Check the box Return post-enrichment to return a more verbose response to each API query.

Settings to return post-enrichment records in response.

Test the query enrichment

To test the enrichment, we’ll use a query that includes only four of the six features that were used to train the prediction model.

  • Navigate to the Test queries panel of the API endpoint.

  • Click + Add Queries, and then Add to add 1 new empty query.

  • For the new test Query 6, paste the following JSON code sample into the query window.

      {
          "features": {
              "card_id": "C_ID_23626074d5",
              "purchase_amount": 3704.17,
              "signature_provided": 0,
              "merchant_subsector_description": "luxury goods",
              "merchant_state": "Wyoming"
          }
      }

  • Click Run Test Queries.

  • Click Details in the API response for Query 6, and observe the values for card_fico_score and card_age, even though they were not present in the query.

Dataiku screenshot of a test data enrichment for a prediction query.

To summarize, the enrichment uses the card_id to retrieve the missing features (card_fico_score and card_age) so that the model has all features needed to determine a prediction.


You can also test the enrichment by modifying the JSON code for any of the previous test queries. To do this, delete all the features except for the card_id lookup key and the four model features used in the JSON code sample above. When you run the test queries, you’ll notice that the endpoint returns the same prediction as before for the modified test query, even with the missing features.

Redeploy the API service

Now that we’ve added an enrichment to the prediction endpoint, we need to redeploy a new version of the API service, just as if we had a new project bundle.

  • From the fraud_detection API service, click the green Publish on Deployer button.

  • Accept the default version ID (v2), and click OK.

  • Open the API service on the API Deployer, and click Deploy on v2.

  • In the Deploy version dialog, click OK to update the version used in the service.

  • Click OK again to confirm on which infrastructure you want to deploy it.

  • Now on the API service page, click the green Update button.

  • Choose the default Light Update.

  • Navigate back to the Deployments tab of the API Deployer to confirm v2 is the new version.

Dataiku screenshot of the second version of an API service.
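Once v2 is active, a client can send the same reduced query to the deployed endpoint. The sketch below only builds the HTTP request using the Python standard library and does not send it. The host, port, and endpoint ID are hypothetical placeholders, and the URL path is an assumption about the API node's URL pattern. Copy the real URL from the sample code shown for your deployment rather than relying on this one.

```python
import json
import urllib.request

BASE_URL = "http://localhost:12000"  # hypothetical API node address
SERVICE_ID = "fraud_detection"       # API service name from this tutorial
ENDPOINT_ID = "my_endpoint_id"       # hypothetical: use your endpoint's ID


def build_predict_request(features):
    """Build (but do not send) a POST request carrying the query features."""
    # Assumed URL pattern; verify it against your deployment's sample code.
    url = f"{BASE_URL}/public/api/v1/{SERVICE_ID}/{ENDPOINT_ID}/predict"
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request(
    {
        "card_id": "C_ID_23626074d5",
        "purchase_amount": 3704.17,
        "signature_provided": 0,
        "merchant_subsector_description": "luxury goods",
        "merchant_state": "Wyoming",
    }
)

# To send it against a live deployment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```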

Next steps

Congratulations! You added an enrichment to a prediction endpoint, and redeployed the API service to the production environment.

A prediction endpoint, though, is just one of many kinds of supported endpoints. Next, let’s look at another kind of endpoint: a dataset lookup.