Concept: Filter Recipe
In Dataiku, the Filter recipe lets you build simple or advanced row filters that reduce a dataset to a subset of rows for further analysis. A filter can test a single condition or combine several conditions.
Let’s filter our orders on a date range where the order date is between July 1st and August 1st.
The result is a dataset containing only those orders between July 1st and August 1st.
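To make the date-range logic concrete, here is a minimal sketch in plain Python. It is not the Filter recipe's own implementation or API; it only mirrors the row-selection logic. The sample rows, the `order_date` column name, the year, and the `B_tshirt` value are hypothetical, and treating both bounds as inclusive is an assumption.

```python
from datetime import date

# Hypothetical sample rows; column names and values are illustrative only.
orders = [
    {"order_id": 1, "order_date": date(2017, 6, 15), "Type": "B_tshirt"},
    {"order_id": 2, "order_date": date(2017, 7, 4), "Type": "W_tshirt"},
    {"order_id": 3, "order_date": date(2017, 7, 20), "Type": "B_tshirt"},
    {"order_id": 4, "order_date": date(2017, 8, 9), "Type": "W_tshirt"},
]


def filter_by_date_range(rows, start, end):
    """Keep rows whose order_date lies in [start, end] (inclusive bounds are an assumption)."""
    return [row for row in rows if start <= row["order_date"] <= end]


july_orders = filter_by_date_range(orders, date(2017, 7, 1), date(2017, 8, 1))
print([row["order_id"] for row in july_orders])  # prints [2, 3]
```

Rows outside the range are dropped; the rows that remain keep all of their columns.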
We can also filter by value. Let’s try filtering our orders on a column value where the order type is a white t-shirt. The resulting output dataset contains only those rows where the Type column contains a value of W_tshirt.
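Filtering by value amounts to an equality test on one column. The sketch below illustrates that idea in plain Python; it is not Dataiku code. The `Type` column and the `W_tshirt` value come from the example above, while the sample rows, the `B_tshirt` value, and the `filter_by_value` helper are hypothetical.

```python
# Hypothetical sample rows; only "Type" and "W_tshirt" come from the example above.
orders = [
    {"order_id": 1, "Type": "B_tshirt"},
    {"order_id": 2, "Type": "W_tshirt"},
    {"order_id": 3, "Type": "B_tshirt"},
    {"order_id": 4, "Type": "W_tshirt"},
]


def filter_by_value(rows, column, value):
    """Keep rows where the given column equals the given value exactly."""
    return [row for row in rows if row[column] == value]


white_tshirts = filter_by_value(orders, "Type", "W_tshirt")
print([row["order_id"] for row in white_tshirts])  # prints [2, 4]
```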
Now, let’s filter on multiple conditions using the “and” operator, which keeps a row only when every condition is true. In this example, only one order was both placed in July and of type “white t-shirt”.
Finally, let’s filter on multiple conditions using the “or” operator, which keeps a row when at least one condition is true. Almost all of the orders in this table were either placed in July or of type “white t-shirt”.
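The difference between the two operators can be sketched in plain Python as follows. This is an illustration of the logic only, not the recipe's implementation. The sample rows, the `order_date` column name, the year, and the helper functions are hypothetical, and "in July" is assumed to mean from July 1st up to, but not including, August 1st.

```python
from datetime import date

# Hypothetical sample rows; column names and values are illustrative only.
orders = [
    {"order_id": 1, "order_date": date(2017, 6, 15), "Type": "B_tshirt"},
    {"order_id": 2, "order_date": date(2017, 7, 4), "Type": "W_tshirt"},
    {"order_id": 3, "order_date": date(2017, 7, 20), "Type": "B_tshirt"},
    {"order_id": 4, "order_date": date(2017, 8, 9), "Type": "W_tshirt"},
]


def placed_in_july(row):
    """Condition 1: the order date falls in July (assumed year: 2017)."""
    return date(2017, 7, 1) <= row["order_date"] < date(2017, 8, 1)


def is_white_tshirt(row):
    """Condition 2: the order type is a white t-shirt."""
    return row["Type"] == "W_tshirt"


# "and": a row is kept only if both conditions hold.
and_result = [r for r in orders if placed_in_july(r) and is_white_tshirt(r)]

# "or": a row is kept if at least one condition holds.
or_result = [r for r in orders if placed_in_july(r) or is_white_tshirt(r)]

print([r["order_id"] for r in and_result])  # prints [2]
print([r["order_id"] for r in or_result])   # prints [2, 3, 4]
```

With this sample data, "and" narrows the output to a single row, while "or" keeps every row except the one that satisfies neither condition.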
By default, DSS uses a sample of the dataset. If you don’t want your rows sampled after filtering, open the Sampling menu, then choose No sampling (whole data).