Concept | Labeling recipe#

The Labeling recipe provides a structured, collaborative workflow to label tabular, text, or image data for use in training a machine learning model.

You can also use the recipe to keep humans in the loop and review predictions from an ML model or responses from a GenAI model.

Use case#

A machine learning model needs a ground truth to learn from, so it can accurately classify images, extract meaning from text, or predict outcomes from tabular data. The Labeling recipe allows teams to add this ground truth.

For example, a healthcare provider might want to train an object detection model that checks images to confirm whether employees are wearing proper personal protective equipment (PPE), such as masks, gloves, or suits.

First, the team must label these items in input images so the model can learn which items are PPE. In the Labeling recipe, the team can create bounding boxes around PPE items and identify their classes.
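To make the idea of a bounding box with a class concrete, here is a minimal Python sketch of how one labeled image could be represented. The `BoundingBox` type, its field names, and the file name are illustrative assumptions, not the recipe's actual storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """One labeled object: a class name plus a pixel rectangle (hypothetical layout)."""
    category: str
    x: float       # left edge, in pixels
    y: float       # top edge, in pixels
    width: float
    height: float

    def area(self) -> float:
        """Area of the rectangle in square pixels."""
        return self.width * self.height


# Hypothetical annotation for one image containing two PPE items.
annotation = {
    "image": "ward_entrance_001.jpg",  # made-up file name
    "boxes": [
        BoundingBox("mask", x=120, y=80, width=60, height=40),
        BoundingBox("gloves", x=200, y=310, width=90, height=70),
    ],
}

# The distinct classes present in this image.
classes = sorted({box.category for box in annotation["boxes"]})
```

Each box pairs a location with a class, which is the ground truth an object detection model learns from.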

Another potential use case is annotating text responses from an LLM to rate their accuracy.

The recipe supports six types of labeling tasks:

| Task | Data type | Description |
|---|---|---|
| Record classification | Tabular | Assign labels to rows of a dataset. |
| Free-text | Tabular | Assign free-text labels to rows of a dataset. |
| Object detection | Image | Identify objects with bounding boxes. |
| Image classification | Image | Assign labels to images. |
| Text annotation | Text | Tag elements of text records. |
| Text classification | Text | Assign labels to text records. |

Choosing the type of task to start a new Labeling recipe.

Labeling recipe configuration#

A labeling task proceeds through the following steps:

  • A project manager creates the recipe, selects the type of labeling task, and configures settings.

  • Annotators (also called labelers) create labels for the images, texts, or records.

  • Reviewers resolve conflicts and validate labels.

  • Project managers track status and performance.

  • The recipe outputs an annotated dataset for use in model training or review.

Each of these steps takes place in its own tab of the recipe.

Settings tab#

In the Settings tab, the manager sets up the task. This includes configuring the data to label, the available labels, instructions for labelers, and user permissions.

The Settings tab to set up an object detection labeling task.

Annotate tab#

The assigned labelers can then annotate the image files, text, or records in the Annotate tab of the recipe. The Annotate interface differs depending on the type of data.

The Annotate interface includes categories and instructions set up by the manager.

If the project manager has enabled comments, labelers can also add free-text comments to each record. Comments let labelers communicate uncertainty or other important information to the reviewers. Reviewers can view comments in the Review tab and in the output dataset.

Review tab#

After the labelers are finished, reviewers can check annotations in the Review tab.

The reviewer can see conflicts, which occur when labelers do not assign the same class, bounding box, or annotation (depending on the type of task) to the same item. The reviewer can also:

  • Resolve conflicts.

  • Validate labels.

  • Create their own annotations if needed.

  • Send a document back to be labeled again.

The Review screen allowing a manager to validate, change, or reject labels.
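As an illustration of what a conflict means, the following Python sketch flags records whose labelers disagree on the class, and compares two bounding boxes by intersection over union (IoU). The data layout, the function names, and the 0.5 threshold are assumptions made for this example; the recipe's own conflict logic is not described here and may differ.

```python
def find_conflicts(labels_by_record):
    """Return ids of records whose labelers did not all assign the same class."""
    return sorted(
        record_id
        for record_id, labels in labels_by_record.items()
        if len(set(labels.values())) > 1
    )


def iou(a, b):
    """Intersection over union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def boxes_conflict(a, b, threshold=0.5):
    """Treat two boxes as conflicting when they overlap less than the threshold (assumed value)."""
    return iou(a, b) < threshold


# Hypothetical labels: record id -> {labeler name: assigned class}.
labels_by_record = {
    "img_1": {"alice": "mask", "bob": "mask"},
    "img_2": {"alice": "gloves", "bob": "suit"},
    "img_3": {"alice": "mask", "bob": "mask", "carol": "suit"},
}
conflicts = find_conflicts(labels_by_record)
```

In this sketch, `img_2` and `img_3` would be surfaced to the reviewer, while `img_1` would only need validation.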

Overview tab#

The Overview tab allows the project manager to follow the project through indicators showing how many documents have been labeled and reviewed, the performance of each labeler, and how often each label has been used.

The Overview tab, showing progress of the project.
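The kind of indicators described above can be sketched in a few lines of Python. The record layout, the status values (`to_label`, `labeled`, `reviewed`), and the `summarize` function are hypothetical and only illustrate the counts involved; they do not reflect how the recipe computes them.

```python
from collections import Counter

# Hypothetical per-document records; field names and status values are assumptions.
records = [
    {"id": 1, "labeler": "alice", "label": "mask", "status": "reviewed"},
    {"id": 2, "labeler": "alice", "label": "gloves", "status": "labeled"},
    {"id": 3, "labeler": "bob", "label": "mask", "status": "reviewed"},
    {"id": 4, "labeler": "bob", "label": None, "status": "to_label"},
]


def summarize(records):
    """Count labeled and reviewed documents, label usage, and labels per labeler."""
    status_counts = Counter(r["status"] for r in records)
    labeled = [r for r in records if r["label"] is not None]
    return {
        "total": len(records),
        # A reviewed document has necessarily been labeled first.
        "labeled": status_counts["labeled"] + status_counts["reviewed"],
        "reviewed": status_counts["reviewed"],
        "label_counts": dict(Counter(r["label"] for r in labeled)),
        "per_labeler": dict(Counter(r["labeler"] for r in labeled)),
    }


summary = summarize(records)
```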

Input/Output tab#

The project manager can configure input files and the output dataset in the Input/Output tab.

The Input/Output tab of the Labeling recipe.

The output dataset is continually updated with all reviewed documents, including their labels and, if enabled, comments.

An example output dataset for a labeling task.
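As a rough sketch of how a downstream training step might consume such a dataset, the Python below keeps the rows that carry a label and decodes it. The column names (`path`, `label`, `comment`) and the JSON layout of the label are assumptions for illustration; check the actual output dataset for its real schema.

```python
import json

# Hypothetical rows of an output dataset; column names and label encoding are assumptions.
rows = [
    {
        "path": "img_1.jpg",
        "label": json.dumps([{"bbox": [120, 80, 60, 40], "category": "mask"}]),
        "comment": "",
    },
    {"path": "img_2.jpg", "label": "", "comment": "Image too blurry"},
]


def training_examples(rows):
    """Keep rows that carry a label and decode the JSON-encoded annotations."""
    examples = []
    for row in rows:
        if not row["label"]:
            continue  # nothing to learn from an unlabeled row
        examples.append((row["path"], json.loads(row["label"])))
    return examples


examples = training_examples(rows)
```

Keeping the comment column alongside the labels lets a later review step see why a document was left unlabeled.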

See also

For more information, see Labeling in the reference documentation.