Tutorial | Manage multiple versions of an API service (MLOps part 9)
Once you’ve deployed an API service to production, you need to monitor the service and update it as needed. You’ll also want to keep track of the different versions of your API service so that you can roll back to a previous version when needed, or run several versions side by side for A/B testing.
Objectives
In this tutorial, you will:
Deploy multiple generations of a prediction endpoint to use for A/B testing.
Starting here?
This section requires at least two deployed versions of an API service that includes a prediction endpoint. To reproduce the steps here, you must have completed Part 4, Part 5, and at least one of Part 6 or Part 8.
Deploy multiple versions of the endpoint for A/B testing
Once you’ve deployed multiple versions of your prediction endpoint, you can run several generations of the endpoint at once. Each scoring request to the API service is then routed at random to one of the versions of your prediction model.
On the API Deployer, return to the Deployments tab.
Click to open the most recent deployment of the fraud_detection API service.
Navigate to the Settings tab of the deployment.
Within the General panel, change the Active version mode from Single generation to Multiple generations.
In the Entries field, we need to define a mapping to enable multiple versions of the endpoint. The mapping is in JSON format and specifies one or more generations and their corresponding probabilities. Each probability indicates the likelihood that a call to the API node is served by that generation. Therefore, the probabilities of all generations must sum to one.
Currently, the probability is set to 1, so all calls are sent to the active version, auto_deploy_api. Let’s direct some calls to another generation of the API service.
Copy-paste the JSON below into the Entries field:
[
{
"generation": "auto_deploy_api",
"proba": 0.7
},
{
"generation": "v1",
"proba": 0.3
}
]
This mapping specifies that calls to the API endpoint will be served by the auto_deploy_api version 70% of the time and by the v1 version the remaining 30% of the time.

Note
Setting up multiple generations must be done manually and cannot be done through the automated deployment of API services.
Click the Save and Update button to update the API service with the new settings.
Choose either update option.
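Before pasting a mapping into the Entries field, it can help to check it locally. The sketch below is not part of the product: `validate_entries` and `simulate_routing` are hypothetical helper names, and the simulation assumes each call is routed independently according to its `proba` weight. It only illustrates the rule that the probabilities must sum to one and what a 70/30 split looks like over many calls.

```python
import json
import math
import random
from collections import Counter

# The mapping from this tutorial, as it would be pasted into the Entries field.
ENTRIES_JSON = """
[
  {"generation": "auto_deploy_api", "proba": 0.7},
  {"generation": "v1", "proba": 0.3}
]
"""


def validate_entries(entries):
    """Raise ValueError if an entry is malformed or the probabilities do not sum to 1."""
    if not entries:
        raise ValueError("at least one generation is required")
    for entry in entries:
        if not isinstance(entry.get("generation"), str):
            raise ValueError(f"missing generation name in {entry!r}")
        proba = entry.get("proba")
        if not isinstance(proba, (int, float)) or not 0 <= proba <= 1:
            raise ValueError(f"proba must be a number between 0 and 1 in {entry!r}")
    total = sum(entry["proba"] for entry in entries)
    # Compare with a tolerance because of floating-point rounding.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"probabilities sum to {total}, expected 1")


def simulate_routing(entries, n_calls, seed=0):
    """Assumption: each call picks a generation independently, weighted by proba."""
    rng = random.Random(seed)
    names = [entry["generation"] for entry in entries]
    weights = [entry["proba"] for entry in entries]
    return Counter(rng.choices(names, weights=weights, k=n_calls))


entries = json.loads(ENTRIES_JSON)
validate_entries(entries)

n_calls = 10_000
counts = simulate_routing(entries, n_calls)
for name, count in counts.items():
    print(f"{name}: {count / n_calls:.1%} of simulated calls")
```

Over a large number of simulated calls, the shares should land close to the configured probabilities. Over a handful of calls, they can differ considerably.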
Test multiple generations of the endpoint
Let’s confirm that some queries are now sent to a different generation of the API endpoint.
Navigate to the Status tab of the deployment, and then the Run and test tab of the predict_fraud endpoint.
Click Run All to run a few test queries.
Click the Details of each response to see which version of the API endpoint generated the response.
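With only a few test queries, it is possible that every response comes from the same generation. The short calculation below estimates how likely that is, assuming each query is routed independently with the probabilities from the mapping (0.3 for v1). The function name `chance_generation_unseen` is hypothetical and used only for illustration.

```python
def chance_generation_unseen(proba, n_queries):
    """Probability that a generation with share `proba` serves none of
    `n_queries` queries, assuming each query is routed independently."""
    return (1 - proba) ** n_queries


# How often would v1 (proba 0.3) not appear at all in a small test run?
for n in (3, 5, 10, 20):
    print(f"{n} queries: v1 unseen with probability {chance_generation_unseen(0.3, n):.1%}")
```

Under this assumption, five queries leave roughly a 17% chance of never seeing v1, so if all responses report auto_deploy_api, running the test queries again is a reasonable first check before suspecting the configuration.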

What’s next?
Congratulations! You successfully deployed multiple versions of a prediction endpoint in an API service to implement A/B testing.
To continue learning, see the tutorial on monitoring the output of API endpoints to learn how to set up a monitoring system that centralizes the logs from the API node and monitors the responses of endpoints.
Note
The reference documentation provides more information on managing versions of your endpoint.