Concept: Date Handling in DSS
Working with dates poses a number of data cleaning challenges.
There are many date formats, different time zones, and components like “day of the week” which can be difficult to extract. A human might be able to recognize that “1/5/19”, “2019-01-05”, and “5 January, 2019” can all refer to the same date. However, to a computer, these are just three different strings.
Strings representing dates need to be parsed so that the computer can recognize the true, unambiguous meaning of the date. The DSS answer to this problem can be found in the Prepare recipe.
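To make the idea of parsing concrete, here is a minimal sketch in plain Python using only the standard library. It is not how the Prepare recipe works internally and it does not use any DSS API; the sample strings and format patterns are illustrative assumptions. It shows that each string only becomes a date once a format is supplied, and that the same string can yield a different date under a different format.

```python
from datetime import date, datetime

# Hypothetical sample strings, each paired with the format that describes it.
samples = [
    ("1/5/19", "%m/%d/%y"),            # month/day/two-digit year
    ("2019-01-05", "%Y-%m-%d"),        # ISO 8601 style
    ("5 January, 2019", "%d %B, %Y"),  # day, English month name, year
]

# Parsing turns three different strings into three equal date objects.
parsed = [datetime.strptime(text, fmt).date() for text, fmt in samples]
all_same = len(set(parsed)) == 1

# Read day-first instead, "1/5/19" is a different date (1 May 2019),
# which is why the format has to be stated explicitly.
day_first = datetime.strptime("1/5/19", "%d/%m/%y").date()

print(parsed[0], all_same, day_first)
```

The month-name pattern `%B` depends on the active locale; the sketch assumes an English or default C locale.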
When a column appears to contain dates, DSS can detect this and assign it the corresponding meaning. In the example below, the meaning of the first column is an unparsed date.
One option is to open the processor library, filter for Dates, and search for a step that fits your situation. Here, we find the Parse date processor.
You could also take advantage of how DSS suggests transformation steps based on a column’s meaning. Because DSS has identified this column as an unparsed date, it suggests adding the Parse date processor to the script. Both methods achieve the same result.
Once you have chosen the correct processor, it takes just a few more clicks to select the right settings, in this case the format of the date and the time zone.
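The two settings mentioned here, a date format and a time zone, can be illustrated outside DSS with a short Python sketch. The helper `parse_with_zone`, the sample timestamp, and the fixed UTC-5 offset are hypothetical choices for illustration, not DSS settings or functions. A fixed offset keeps the example self-contained; a named zone from Python's `zoneinfo` module would also account for daylight saving time.

```python
from datetime import datetime, timedelta, timezone

def parse_with_zone(text, fmt, utc_offset_hours):
    """Parse text with an explicit format, then attach a fixed UTC offset.

    Hypothetical helper for illustration only.
    """
    naive = datetime.strptime(text, fmt)  # no time zone information yet
    tz = timezone(timedelta(hours=utc_offset_hours))
    return naive.replace(tzinfo=tz)       # interpret the wall-clock time in tz

# Assumed input: a local timestamp recorded five hours behind UTC.
local = parse_with_zone("2019-01-05 09:30", "%Y-%m-%d %H:%M", -5)

# With format and zone known, the value maps to one unambiguous instant.
as_utc = local.astimezone(timezone.utc)
print(as_utc.isoformat())  # 2019-01-05T14:30:00+00:00
```

Without the time zone, the same text could refer to many different instants, which is why both settings matter when parsing.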
Once you have a properly parsed date, you’re on your way! DSS will suggest new steps, such as “Compute time since”, “Extract date components”, and “Filter date range”.
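As a rough illustration of what these follow-up steps compute, the sketch below performs similar operations in plain Python. It is an analogy under assumed sample values, not the implementation of the DSS processors, and the variable names are hypothetical.

```python
from datetime import date

parsed = date(2019, 1, 5)      # an already parsed date
reference = date(2019, 3, 1)   # assumed reference date for the comparison

# "Compute time since": elapsed days between the two dates.
days_since = (reference - parsed).days

# "Extract date components": year, month, day, and ISO day of the week
# (1 = Monday ... 7 = Sunday).
components = {
    "year": parsed.year,
    "month": parsed.month,
    "day": parsed.day,
    "iso_weekday": parsed.isoweekday(),
}

# "Filter date range": keep only dates inside an inclusive range.
rows = [date(2018, 12, 31), date(2019, 1, 5), date(2019, 2, 10)]
start, end = date(2019, 1, 1), date(2019, 1, 31)
in_range = [d for d in rows if start <= d <= end]

print(days_since, components, in_range)
```

None of these operations is reliable on raw strings, which is why parsing comes first.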