Concept | Labeling recipe

The Labeling recipe provides a structured, collaborative workflow to label tabular, text, or image data for use in training a machine learning model. For generative AI models, the recipe can also be used to keep humans in the loop and review the model’s responses for accuracy.

The Labeling recipe supports five types of labeling tasks:

  • Object detection: Identify objects with bounding boxes.

  • Image classification: Assign labels to images.

  • Text annotation: Tag elements of text records.

  • Text classification: Assign labels to text records.

  • Record classification: Assign labels to rows of a dataset.

Choosing the type of task to start a new Labeling recipe.
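To make these task types concrete, here is a minimal sketch of what a single labeled item might carry for a few of them. The field names and structures below are purely illustrative; they are not the recipe's actual output schema.

```python
# Illustrative label payloads (hypothetical schema, for intuition only).

# Object detection: one entry per bounding box found in the image.
object_detection_label = [
    {"category": "mask", "bbox": {"x": 104, "y": 58, "width": 60, "height": 45}},
    {"category": "gloves", "bbox": {"x": 210, "y": 300, "width": 90, "height": 70}},
]

# Image, text, or record classification: a single category per item.
classification_label = "mask"

# Text annotation: labeled spans within a text record.
text_annotation_label = [
    {"category": "PPE_ITEM", "start": 34, "end": 39, "text": "glove"},
]
```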

Though the annotation interface differs depending on the type of task, the general process is the same:

  • A project manager sets up the labeling task.

  • Annotators create labels for the images, texts, or records.

  • Reviewers resolve conflicts and validate labels.

  • The project manager tracks status and performance.

  • The labeled output dataset can then be used, for example, to train a machine learning model.

For example, a healthcare provider might want to train a model that detects, in images, whether employees are wearing the proper personal protective equipment (PPE) of various kinds, such as masks, gloves, or suits. In this case, the recipe input is a dataset pointing to a managed folder that contains the unlabeled images.
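Before setting up the task, one might sanity-check that folder from a Python notebook. This is a minimal sketch assuming the dataiku package (available inside DSS) and a hypothetical managed folder named ppe_images:

```python
import dataiku

# Hypothetical name: replace with the managed folder that holds
# the unlabeled PPE images in your project.
folder = dataiku.Folder("ppe_images")

# List the files that the labeling task will draw its images from.
image_paths = folder.list_paths_in_partition()
print(f"{len(image_paths)} unlabeled images, for example: {image_paths[:3]}")
```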

Setting up a labeling task

When creating a Labeling recipe, the project manager first selects the type of labeling task (in this case, object detection). Then, in the Settings tab, the manager configures the task and writes instructions for labelers. In the Data panel, the manager verifies that the recipe input is a dataset pointing to the managed folder that contains the unlabeled images.

The Settings tab to set up an object detection labeling task.

In the Categories panel, the manager creates the desired labels and optional instructions to help the labelers choose the correct categories.

The Categories panel to set up labels and instructions for labelers.

In the Task Setup panel, the manager can write overall guidelines for the annotators, choose the number of labelers required for each image, enable comments, and change other settings.

The Task setup page for an object detection labeling task.

In Permissions, the manager can grant individual users or groups of users access to the task. Annotate and Review permissions are granted per task, and they require users to have at least the Read dashboards permission on the parent project.

Note

See the reference documentation for more information about permissions on labeling tasks.

The Permissions page to grant permission to users or groups.

Annotating data

The assigned labelers can then annotate the image files, texts, or records in the Annotate tab of the recipe. The Annotate interface includes the overall instructions written by the manager, along with the category definitions set up in the Categories panel.

The Annotate interface includes categories and instructions set up by the manager.

If the project manager has enabled comments, labelers can also add free-text comments to each document. This lets labelers communicate uncertainty or other important information to the reviewers, who can view the comments in the Review tab and in the output dataset.

Reviewing labels

After the images, texts, or records have been labeled, they are available for review in the Review tab.

The reviewer can see conflicts when annotators do not assign the same class, bounding box, or annotation (depending on the type of task). The reviewer can also:

  • Resolve conflicts

  • Validate labels

  • Create their own annotations if needed

  • Send a document back to be labeled again

The Review screen allowing a manager to validate, change, or reject labels.
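To build intuition for what counts as a conflict in a classification task, here is a generic sketch in pandas. It is not the recipe's internal logic; the table, column names, and annotator names are made up for the example.

```python
import pandas as pd

# Toy annotations: three labelers per record, as configured in Task Setup.
annotations = pd.DataFrame(
    {
        "record_id": [1, 1, 1, 2, 2, 2],
        "annotator": ["alice", "bob", "carol", "alice", "bob", "carol"],
        "label": ["mask", "mask", "no_mask", "gloves", "gloves", "gloves"],
    }
)

# A record is in conflict when its annotators disagree on the class.
conflicts = annotations.groupby("record_id")["label"].nunique().gt(1)
print(conflicts)  # record 1 -> True (needs a reviewer), record 2 -> False
```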

Progress and performance overview

The Overview screen is available for all types of labeling tasks. It lets the project manager track the project through indicators showing how many documents have been labeled and reviewed, how each labeler is performing, and how often each label has been used.

The Overview tab, showing progress of the project.
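As a rough illustration of these indicators, similar figures can be computed by hand on the toy annotations table from the previous sketch (the column names remain illustrative):

```python
# Progress: how many distinct records have at least one label.
n_labeled = annotations["record_id"].nunique()

# Labeler activity: number of annotations contributed by each person.
per_labeler = annotations.groupby("annotator").size()

# Label popularity: how often each category has been assigned.
label_counts = annotations["label"].value_counts()

print(n_labeled, per_labeler.to_dict(), label_counts.to_dict())
```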

Output

The output dataset is continually updated with all reviewed documents. Labels are written to the column that the manager configured in the Task Setup panel.
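As a final hedged sketch, assuming a text classification task whose reviewed output lands in a dataset named reviews_labeled with hypothetical text and label columns, the labeled data could be pulled into a notebook and used to train a simple scikit-learn model (Dataiku's visual machine learning is an equally valid route):

```python
import dataiku
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical dataset and column names from the labeling task's output.
df = dataiku.Dataset("reviews_labeled").get_dataframe()

X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=0
)

# A small baseline classifier trained on the human-reviewed labels.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print(f"Held-out accuracy: {model.score(X_test, y_test):.2f}")
```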
