Concept: Filter Recipe

In Dataiku, the Filter recipe lets you build simple or advanced row filters that reduce a dataset to a subset of rows for further analysis. A filter can test a single condition or combine several conditions.

Let’s filter our orders on a date range where the order date is between July 1st and August 1st.

../../_images/filter-before.png

The result is a dataset containing only those orders between July 1st and August 1st.

../../_images/filter-by-date-range.png
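For readers who think in code, the date-range filter can be approximated outside Dataiku with pandas. This is only a sketch of the same idea, not how the Filter recipe is implemented. The table contents, the `order_id` and `order_date` column names, and the year 2023 are all hypothetical, and the bounds are assumed to be inclusive.

```python
import pandas as pd

# Hypothetical orders table; names, values, and the year are made up for illustration.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "order_date": pd.to_datetime(
        ["2023-06-15", "2023-07-01", "2023-07-20", "2023-08-02"]
    ),
    "Type": ["W_tshirt", "B_tshirt", "W_tshirt", "W_tshirt"],
})

start = pd.Timestamp("2023-07-01")
end = pd.Timestamp("2023-08-01")

# Series.between is inclusive on both bounds by default.
in_range = orders[orders["order_date"].between(start, end)]
print(in_range["order_id"].tolist())  # [2, 3]
```

Orders 1 and 4 fall outside the range, so only orders 2 and 3 remain.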

We can also filter by value. Let’s filter our orders on a column value, keeping only orders whose type is a white t-shirt. The resulting output dataset contains only those rows where the Type column has the value W_tshirt.

../../_images/filter-by-value.png
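A filter by value corresponds to an equality test on one column. The pandas sketch below illustrates the idea on a small hypothetical table; only the `Type` column and the `W_tshirt` value come from the example above, and exact equality (rather than a substring match) is an assumption.

```python
import pandas as pd

# Hypothetical orders table; only "Type" and "W_tshirt" come from the example above.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "Type": ["W_tshirt", "B_tshirt", "W_tshirt", "Hoodie"],
})

# Keep rows whose Type is exactly "W_tshirt".
white_tshirts = orders[orders["Type"] == "W_tshirt"]
print(white_tshirts["order_id"].tolist())  # [1, 3]
```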

Now, let’s filter on multiple conditions using the “and” operator, which keeps only rows that satisfy every condition. In this example, only one order was both placed in July and of type white t-shirt.

../../_images/filter-condition-and.png
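Combining conditions with "and" means a row must pass every test to be kept. As a rough pandas analogy (hypothetical data and column names, with 2023 assumed as the year), each condition becomes a boolean mask and the masks are joined with `&`.

```python
import pandas as pd

# Hypothetical orders table; values are made up for illustration.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "order_date": pd.to_datetime(
        ["2023-06-15", "2023-07-01", "2023-07-20", "2023-08-02"]
    ),
    "Type": ["W_tshirt", "B_tshirt", "W_tshirt", "W_tshirt"],
})

# Condition 1: the order was placed in July (year assumed to be 2023).
in_july = (orders["order_date"] >= pd.Timestamp("2023-07-01")) & (
    orders["order_date"] < pd.Timestamp("2023-08-01")
)
# Condition 2: the order is a white t-shirt.
is_white = orders["Type"] == "W_tshirt"

# "and": a row is kept only if both conditions are true.
both = orders[in_july & is_white]
print(both["order_id"].tolist())  # [3]
```

Order 2 is in July but is not a white t-shirt, and orders 1 and 4 are white t-shirts placed outside July, so only order 3 passes.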

Finally, let’s filter on multiple conditions using the “or” operator, which keeps rows that satisfy at least one condition. Almost all of the orders in this table either were placed in July or were of type “white t-shirt”.

../../_images/filter-condition-or.png
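With "or", a row is kept when at least one condition holds, which is why the result is usually larger than with "and". The pandas sketch below shows this on hypothetical data (column names and the year 2023 are assumptions), joining the masks with `|`.

```python
import pandas as pd

# Hypothetical orders table; values are made up for illustration.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "order_date": pd.to_datetime(
        ["2023-06-15", "2023-07-01", "2023-07-20", "2023-08-02", "2023-08-10"]
    ),
    "Type": ["W_tshirt", "B_tshirt", "W_tshirt", "W_tshirt", "B_tshirt"],
})

# Condition 1: the order was placed in July (year assumed to be 2023).
in_july = (orders["order_date"] >= pd.Timestamp("2023-07-01")) & (
    orders["order_date"] < pd.Timestamp("2023-08-01")
)
# Condition 2: the order is a white t-shirt.
is_white = orders["Type"] == "W_tshirt"

# "or": a row is kept if either condition is true.
either = orders[in_july | is_white]
print(either["order_id"].tolist())  # [1, 2, 3, 4]
```

Only order 5 is dropped, because it is neither a July order nor a white t-shirt.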

By default, Dataiku uses a sample of the dataset. If you want the output to contain every row that passes the filter rather than a sample of those rows, open the Sampling menu and choose No sampling (whole data).

../../_images/filter-whole-data.png