Where can I see how many records are in my entire dataset?
By default, the Explore tab of a dataset previews a sample made up of the first 10,000 records, so your whole dataset may contain many more records than you see there. DSS offers three built-in ways to check the full record count (each is detailed below):
1. From the Flow, select the dataset and compute dataset metrics directly from the Info tab in the far right panel, under the Status header.
2. With the dataset open, visit the Status tab to compute or review dataset metrics.
3. If record count is part of a recurring quality check (after a scenario run, for example), embed this metric in a Dataiku dashboard and set it to update automatically each time the dataset is rebuilt.
Method 1: From the Flow
With the dataset selected in the Flow, navigate to the Info tab in the far right panel and click Compute under the Status header.
The configured metrics then appear in place in this panel, and you can recompute them whenever needed.
Methods 2 and 3: With the dataset open
From the Explore dataset view, navigate to the Status tab and click Compute.
The default metrics are column count and record count, but you can add more dataset metrics in the Edit subtab if desired. Metrics are often used together with scenarios, but they do not depend on scenarios. For example, tracking the number of records might show you how many new customer records are added to the database each day.
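If you prefer to trigger the same computation from code rather than from the Status tab, the sketch below shows one possible approach with the `dataikuapi` Python client. It assumes that the built-in record count metric has the id `records:COUNT_RECORDS` and that the dataset handle exposes `compute_metrics` and `get_last_metric_values`, as in recent versions of the public API; check both against your DSS version. The helper name `full_record_count`, the host URL, the API key, the project key, and the dataset name are all hypothetical.

```python
# Assumed id of the built-in "record count" metric in DSS.
RECORD_COUNT_METRIC = "records:COUNT_RECORDS"


def full_record_count(dataset):
    """Recompute the record-count metric and return its latest value.

    `dataset` is expected to be a dataikuapi dataset handle (or any object
    offering the same two methods used below).
    """
    # Ask DSS to recompute only the record-count metric.
    dataset.compute_metrics(metric_ids=[RECORD_COUNT_METRIC])
    # Read back the most recent value stored for that metric.
    last_values = dataset.get_last_metric_values()
    return int(last_values.get_global_value(RECORD_COUNT_METRIC))


# Example usage (hypothetical host, key, project, and dataset names):
#
#   import dataikuapi
#   client = dataikuapi.DSSClient("https://dss.example.com:11200", "YOUR_API_KEY")
#   dataset = client.get_project("MYPROJECT").get_dataset("customers")
#   print(full_record_count(dataset))
```

Passing the dataset handle in, instead of creating the client inside the function, keeps the helper usable both from an external script and from a notebook that already holds a handle.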
Metrics can be published to a Dataiku dashboard. If you would like them to update automatically each time the dataset is rebuilt (as might be the case in a recurring automation scenario), set the Auto compute after build option to Yes.
Note that metrics are automatically historized, which is very useful for tracking how a dataset’s status evolves. To review the history of a dataset metric, select History instead of Last value in the Display dropdown menu of the main Metrics page.
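As an illustration of what the historized values make possible, the sketch below turns a series of record counts into the change between consecutive computations, in the spirit of the "new customer records per day" example. The function names are hypothetical. The conversion helper assumes that the history payload returned by the API contains a `values` list whose entries carry a `time` field (epoch milliseconds) and a `value` field; that shape is an assumption and should be verified against your DSS version.

```python
from datetime import datetime, timezone


def growth_between_runs(points):
    """Return (ISO date, change in record count) for each consecutive pair.

    `points` is an iterable of (epoch_millis, record_count) pairs in any order.
    Each result is labeled with the UTC date of the later measurement.
    """
    ordered = sorted(points)
    growth = []
    for (_, prev_count), (ts, count) in zip(ordered, ordered[1:]):
        day = datetime.fromtimestamp(ts / 1000, tz=timezone.utc).date().isoformat()
        growth.append((day, count - prev_count))
    return growth


def points_from_history(history):
    """Extract (epoch_millis, record_count) pairs from a metric history payload.

    ASSUMPTION: the payload is a dict with a "values" list of
    {"time": <epoch millis>, "value": <count>} entries.
    """
    return [(int(v["time"]), int(v["value"])) for v in history.get("values", [])]


# Example usage with a dataikuapi dataset handle (names are hypothetical):
#
#   history = dataset.get_metric_history("records:COUNT_RECORDS")
#   for day, delta in growth_between_runs(points_from_history(history)):
#       print(day, delta)
```

The differences are taken between consecutive metric computations, so they correspond to daily growth only if the metric is computed once per day.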
You can find more information about metrics in our documentation here.
If the metrics you’ve configured are part of a scenario, you may be interested in receiving updates about scenario activities such as model training or changes in data quality.
Dataiku DSS lets you add reporters, which send updates and actionable messages about scenario activities to users. To learn more about setting up reporters, visit this hands-on tutorial.