Add an enrichment to a prediction endpoint

You have successfully published an API service from the Design node to the API Deployer and then to a deployment infrastructure (an API node). This allowed you to receive a response from a query to a live prediction endpoint.

In many cases, though, an incoming request may be missing information that the model needs to score it. Accordingly, let’s return to the Design node and add an enrichment to the prediction endpoint within the existing API service. Then, we’ll redeploy a new version of the API service.

Add a query enrichment

The prediction model in the Flow was trained on six features from the training dataset. You can confirm this by opening the active version of the model, and navigating to the Features panel in the Model Information section.

Ideally, an incoming transaction to the API endpoint would have values for all six features. However, suppose at the time a transaction occurs, a merchant point of sale system sends values for only a subset of these features:

  • signature_provided

  • merchant_subsector_description

  • purchase_amount

  • merchant_state

We first need to retrieve the missing values for the features card_fico_score and card_age from our internal database, and then use these values to enrich the API queries.

Using the cardholder_info dataset, we’ll use the card_id value of each real-time transaction to look up the corresponding values for fico_score and age, and then pass the complete feature set to the prediction model for scoring.

Dataiku screenshot of a dataset and lookup key to use for enrichment.
  1. From the API Designer page in the Design node project, open the fraud_detection API service, and navigate to the Enrichments panel.

  2. Click + Add Enrichment.

  3. Select cardholder_info as the dataset to use for enrichment.

  4. Leave the default Bundled deployment policy.

    Tip

    A referenced deployment policy requires an SQL connection. If you want to use one, follow our resources on remapping connections.

  5. Next to Lookup keys definition, click + Add Key, and select the internal_card_mapping column.

  6. Provide card_id as the name in the query for the lookup key.

  7. In Columns to retrieve, specify the two missing features to retrieve from the dataset: fico_score and age.

  8. Remap these columns to the names card_fico_score and card_age.

Configure settings for data enrichment.
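Conceptually, the settings above amount to a keyed lookup followed by a column rename. The following is a minimal sketch of that idea in plain Python, not a reproduction of how Dataiku implements enrichments. The in-memory `CARDHOLDER_INFO` list, its placeholder row, the `enrich` function, and the choice to keep values already present in the query are all assumptions made for illustration.

```python
# Illustrative stand-in for the cardholder_info dataset.
# The row below is a placeholder, not real data.
CARDHOLDER_INFO = [
    {"internal_card_mapping": "C_ID_example", "fico_score": 700, "age": 40},
]

LOOKUP_COLUMN = "internal_card_mapping"  # lookup key column in the dataset
QUERY_KEY = "card_id"                    # name of the lookup key in the query
COLUMN_REMAP = {                         # dataset column -> model feature name
    "fico_score": "card_fico_score",
    "age": "card_age",
}


def enrich(features, rows):
    """Return a copy of `features` with the remapped lookup columns added."""
    index = {row[LOOKUP_COLUMN]: row for row in rows}
    match = index.get(features.get(QUERY_KEY))
    enriched = dict(features)
    if match is None:
        # No matching row: leave the query unchanged.
        return enriched
    for source_name, feature_name in COLUMN_REMAP.items():
        # Illustrative choice: values already in the query are kept.
        enriched.setdefault(feature_name, match[source_name])
    return enriched


if __name__ == "__main__":
    query_features = {"card_id": "C_ID_example", "purchase_amount": 10.0}
    print(enrich(query_features, CARDHOLDER_INFO))
```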

Let’s change one more setting before we test the enrichment.

  1. Navigate to the Advanced panel of the API endpoint.

  2. Check the box Return post-enrichment to return a more verbose response to each API query.

Test the query enrichment

To test the enrichment, we’ll use a query that includes only four of the six features that were used to train the prediction model.

  1. Navigate to the Test queries panel of the API endpoint.

  2. Click + Add Queries, and then Add 1 new empty query.

  3. For the new empty query, paste the following JSON code sample in the query window.

    {
      "features": {
        "card_id": "C_ID_23626074d5",
        "purchase_amount": 3704.17,
        "signature_provided": 0,
        "merchant_subsector_description": "luxury goods",
        "merchant_state": "Wyoming"
      }
    }
    
  4. Click Run Test Queries.

  5. Click Details in the API response for the new test query, and observe that the response includes values for card_fico_score and card_age even though they were not present in the query.

Dataiku screenshot of a test data enrichment for a prediction query.

To summarize, the enrichment uses the card_id to retrieve the missing features (card_fico_score and card_age) so that the model has all features needed to generate a prediction.
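As a quick sanity check outside Dataiku, the gap that the enrichment fills can be computed as a set difference between the six model features and the keys in the test query. This is only a sketch: the `MODEL_FEATURES` set and the `missing_features` helper are names introduced here for illustration.

```python
# The six features the model was trained on, as listed in this tutorial.
MODEL_FEATURES = {
    "signature_provided",
    "merchant_subsector_description",
    "purchase_amount",
    "merchant_state",
    "card_fico_score",
    "card_age",
}


def missing_features(features):
    """Return the model features that a query does not supply, sorted by name."""
    return sorted(MODEL_FEATURES - set(features))


# The test query from the steps above.
query = {
    "features": {
        "card_id": "C_ID_23626074d5",
        "purchase_amount": 3704.17,
        "signature_provided": 0,
        "merchant_subsector_description": "luxury goods",
        "merchant_state": "Wyoming",
    }
}

print(missing_features(query["features"]))
# → ['card_age', 'card_fico_score']
```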

Tip

You can also test the enrichment by modifying the JSON code for any of the previous test queries. To do this, delete every feature except card_id and the four features used in the JSON code sample above. When you run the test queries, you’ll notice that the endpoint returns the same prediction as before for the modified test query, even with the missing features.

Redeploy the API service

Now that we’ve added an enrichment to the prediction endpoint, we need to redeploy a new version of the API service.

Tip

If this were a batch processing use case, the equivalent would be deploying a new project bundle.

  1. From the fraud_detection API service in the Design node project, click the green Publish on Deployer button.

  2. Accept the default version ID (v2), and click OK.

  3. Open the API service on the API Deployer, and click Deploy on v2.

  4. In the Deploy version dialog, click OK to update the version used in the service.

  5. Click OK again to confirm which deployment you want to edit.

  6. Now on the API service page, click the green Update button.

  7. Select the default Light Update.

  8. Navigate back to the Deployments tab of the API Deployer to confirm v2 is the new version.

Dataiku screenshot of the second version of an API service.
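Once v2 is live, the same four-feature query can be sent to the deployed endpoint. The sketch below only builds the HTTP request with the Python standard library, assuming the API node's `/public/api/v1/<service_id>/<endpoint_id>/predict` URL pattern. The base URL `http://localhost:12000` and the endpoint ID `my_endpoint` are placeholders to replace with your own values, and `build_predict_request` is a helper name introduced here.

```python
import json
from urllib import request


def build_predict_request(base_url, service_id, endpoint_id, features):
    """Build (but do not send) a POST request for a prediction endpoint."""
    url = f"{base_url.rstrip('/')}/public/api/v1/{service_id}/{endpoint_id}/predict"
    body = json.dumps({"features": features}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder base URL and endpoint ID: replace with your API node's values.
req = build_predict_request(
    "http://localhost:12000",
    "fraud_detection",
    "my_endpoint",
    {
        "card_id": "C_ID_23626074d5",
        "purchase_amount": 3704.17,
        "signature_provided": 0,
        "merchant_subsector_description": "luxury goods",
        "merchant_state": "Wyoming",
    },
)

# To send it against a running API node, uncomment:
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```

If the deployment requires an API key, the request also needs the corresponding authentication, which this sketch omits.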