Concept: Categorical and Numerical Variables

Recognizing categorical and numerical variables in a worksheet can help you to identify the appropriate statistics to compute on them, as well as to understand certain behaviors of Dataiku DSS.

In DSS, the letter A next to a variable name indicates that it is categorical (nominal or ordinal), while the # symbol indicates that it is numerical.

Also, DSS enforces the use of the correct variable type when computing statistics. For instance, when creating a PCA card or a correlation matrix card, DSS disables the selection of categorical variables so that these cards do not return meaningless results.
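To illustrate why this restriction matters, here is a minimal sketch in plain Python, outside of DSS, that keeps only the numerical variables before building a correlation matrix. The `columns` data and the `is_numerical` and `pearson` helpers are hypothetical names used for illustration; they are not part of any DSS API, and DSS may determine variable types differently.

```python
from math import sqrt

# Hypothetical worksheet data: variable name -> list of values.
columns = {
    "age": [23, 35, 47, 51, 62],
    "income": [28000, 41000, 56000, 60000, 75000],
    "city": ["Paris", "Lyon", "Paris", "Nice", "Lyon"],
}


def is_numerical(values):
    """Assumption: a variable is numerical if every value is an int or float."""
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Categorical variables such as "city" are excluded: a correlation
# coefficient is not defined for unordered labels.
numerical = [name for name, values in columns.items() if is_numerical(values)]

matrix = {
    a: {b: pearson(columns[a], columns[b]) for b in numerical}
    for a in numerical
}
```

In this sketch, `matrix` contains entries only for `age` and `income`; `city` never reaches the correlation step.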

Likewise, when you choose variables for cards such as the univariate and bivariate analysis cards, DSS automatically selects the appropriate statistics options to compute, while disabling the selection of options that would be meaningless for the types of variables that you selected.
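The idea of matching statistics to variable type can be sketched in plain Python as follows. This is an illustration only: the `summarize` function, its `kind` argument, and the particular statistics chosen are assumptions made for this example and do not reflect the exact options that DSS offers.

```python
from collections import Counter
from statistics import mean, median, stdev


def summarize(values, kind):
    """Return summary statistics that are meaningful for the given variable type.

    `kind` is a hypothetical label: "numerical" or "categorical".
    """
    if kind == "numerical":
        # Location and spread only make sense for numbers.
        return {
            "mean": mean(values),
            "median": median(values),
            "std": stdev(values),
        }
    if kind == "categorical":
        # For labels, count distinct values and find the most frequent one.
        counts = Counter(values)
        return {
            "distinct": len(counts),
            "mode": counts.most_common(1)[0][0],
        }
    raise ValueError(f"Unknown variable type: {kind!r}")


numeric_summary = summarize([1, 2, 3, 4], "numerical")
category_summary = summarize(["red", "blue", "red"], "categorical")
```

Requesting a mean for a column of labels is simply not possible in this sketch, which mirrors the way meaningless options are disabled for a given variable type.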

Note

After creating a univariate or bivariate card, you can change the way that DSS treats the variables used in the card. However, this change does not affect the schema of the dataset or the default way that DSS will treat the same variables in a different card.

[Image: stats_change_variable_behavior.png]