Concept Summary: Time Series Interval Extraction Pt 2
It’s important to have an intuitive understanding of the Interval Extraction recipe, as described in Part 1, but you should also know how it actually works so that you can avoid unexpected results.
Let’s look at the mechanics of this recipe in Part 2.
The Mechanics of the Interval Extraction Recipe
We’ll continue with the same example time series and the same threshold range used in the first part of the lesson.
This flowchart captures how the Interval Extraction recipe functions.
The Interval Extraction recipe Part 2 video walks through this flowchart in detail. The summary here only explains the variables and final results.
Recipe Parameters and Variables
Let’s start with the key recipe parameters. The threshold range set by the user is 25 to 35, and, for this example, we’ll set the acceptable deviation and the minimal segment duration to 1 day.
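As a minimal sketch, these parameters can be written down in Python. The constant names and the `in_threshold_range` helper are hypothetical, chosen only for illustration, and treating both ends of the threshold range as inclusive is an assumption rather than something stated in this lesson.

```python
from datetime import timedelta

# Parameters from the example (names are illustrative, not Dataiku identifiers).
THRESHOLD_LOW = 25
THRESHOLD_HIGH = 35
ACCEPTABLE_DEVIATION = timedelta(days=1)
MINIMAL_SEGMENT_DURATION = timedelta(days=1)


def in_threshold_range(value, low=THRESHOLD_LOW, high=THRESHOLD_HIGH):
    """Return True if value lies in the threshold range.

    Assumption: both bounds are inclusive.
    """
    return low <= value <= high
```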
As we walk along the time series row by row, we’ll need to keep track of a few different variables.
Let i represent an index starting from 1.
As we increment i row by row, T_i represents the current timestamp. So, if i is 1, then T_i, or T_1, is the first timestamp.
In the same row as the current timestamp is the current value, or val_i.
Next, we’ll have to keep track of the timestamps marking the beginning and end of a valid interval.
We’ll call these T_a and T_b.
These values are both initially NULL because we do not yet have a candidate for a valid interval.
We’ll also need to know the total number of timestamps in the series, to use in the stopping criterion for the process described in the flowchart.
We’ll call it N.
We also have to maintain a counter for the current deviation from the threshold range.
We’ll call it dev and initialize it to 0.
Whenever a run of timestamps meets the conditions set by the segment parameters, the recipe assigns it an interval ID using the id variable.
The recipe starts assigning IDs from 0, so we’ll initialize the variable id to 0.
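The bookkeeping described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the recipe's actual implementation: it assumes the threshold bounds are inclusive, that the deviation is measured as the time elapsed since the last in-range timestamp, and that an interval's duration is the time between its first and last in-range timestamps. The function name `extract_intervals`, the helper `day`, and the sample values are hypothetical and are not the example series from this lesson.

```python
from datetime import datetime, timedelta


def extract_intervals(rows, low, high, acceptable_deviation, min_duration):
    """Scan (timestamp, value) rows, sorted by timestamp, and return
    a list of (id, start, end) tuples for the valid intervals found."""
    n = len(rows)          # N: total number of timestamps
    ta = tb = None         # start and end of the candidate interval (NULL at first)
    dev = timedelta(0)     # current deviation from the threshold range
    interval_id = 0        # IDs are assigned starting from 0
    intervals = []

    def close_candidate():
        # Keep the candidate only if it is long enough, then reset the state.
        nonlocal ta, tb, dev, interval_id
        if ta is not None and tb - ta >= min_duration:
            intervals.append((interval_id, ta, tb))
            interval_id += 1
        ta = tb = None
        dev = timedelta(0)

    for i in range(1, n + 1):              # i is an index starting from 1
        t_i, val_i = rows[i - 1]
        if low <= val_i <= high:           # assumption: inclusive bounds
            if ta is None:
                ta = t_i                   # open a new candidate interval
            tb = t_i                       # extend the candidate
            dev = timedelta(0)
        elif ta is not None:
            dev = t_i - tb                 # assumption: time since last in-range row
            if dev > acceptable_deviation:
                close_candidate()
    close_candidate()                      # stopping criterion: all N rows visited
    return intervals


def day(k):
    """Hypothetical helper: the k-th day of a made-up daily series."""
    return datetime(2020, 1, 1) + timedelta(days=k - 1)


# Made-up daily values; only days 2, 3, 5 and 8 fall within 25 to 35.
values = [20, 27, 30, 40, 28, 50, 60, 33, 10]
rows = [(day(k), v) for k, v in enumerate(values, start=1)]

for deviation, duration in [(0, 0), (0, 1), (1, 1)]:
    found = extract_intervals(
        rows, 25, 35, timedelta(days=deviation), timedelta(days=duration)
    )
    print(deviation, duration, found)
```

On this made-up series, a deviation and duration of 0 days give three intervals, two of which are single timestamps. Raising the minimal segment duration to 1 day leaves only the two-day interval. Also allowing 1 day of deviation bridges the single out-of-range day and gives one longer interval.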
Mastering this recipe takes some practice. Here are the results for the example time series above using three different sets of segment parameters.
Acceptable Deviation: 0 Days; Minimal Segment Duration: 0 Days
Acceptable Deviation: 0 Days; Minimal Segment Duration: 1 Day
Acceptable Deviation: 1 Day; Minimal Segment Duration: 1 Day
Be sure to try out a few examples on your own with different segment parameters to make sure you have got the hang of it!
Now we are ready for Part 3, where we test out this recipe in Dataiku DSS!