Concept | Metrics#

Metrics are measurements on datasets, folders, models, or model evaluation stores. They allow for monitoring the current status and evolution of these Dataiku items. For example, we could compute:

  • The number of records in a dataset

  • The size of a folder

  • The accuracy of a model

Examples of metrics and checks used on managed folders or saved models.

Metrics on datasets#

The Metrics tab within datasets includes several default metrics, such as the column count and record count. You can also edit the metrics to include:

  • Column statistics such as sum, average, minimum, maximum, etc.

  • Most frequent values of columns, such as the mode or top N values

  • Column percentiles

  • Data validity, or checking that the records match with a column’s Meaning. (Note that the dataset’s meaning must be locked or manually selected for this metric to be applied.)

  • Custom metrics calculated using a formula or code

Because there can be a lot of available metrics on an item, you can select the metrics you want to add to the displayed screen tiles on the Metrics page.

Examples of metrics and checks used on managed folders or saved models.

Note

You can also create custom metrics using a Python probe or SQL probe. For more information and examples, see the Knowledge Base concept or the Developer Guide.

Metrics on other Flow objects#

In addition to datasets, you can also set up metrics on other Flow objects:

  • Folders

  • Models

  • Model evaluation stores

For example, you can compute metrics in the Status tab of a managed folder to see the count of included files and size of the folder.

Examples of metrics on a managed folder.

Monitoring metrics#

You can use metrics to monitor the value of a metric in combination with checks or data quality rules.

Object

Feature for monitoring

Datasets

  • For Dataiku 12.6 or above, use data quality rules. (Note that data quality rules don’t need to be built on top of metrics, but they can be.)

  • For Dataiku 12.5 and below, use checks.

Folders, models, or model evaluation stores

  • For all versions of Dataiku, use checks.