Concept: How Dataiku Handles and Displays Date & Time

Dataiku and dates

In Dataiku, a “date” means an absolute point in time: a value that can be expressed as a date, a time, and a time zone.

For example, 2001-01-20T14:00:00.000Z and 2001-01-20T16:00:00.000+0200 refer to the same point in time: 14:00Z is 2pm UTC, and 16:00+0200 is 4pm in UTC+2, which is also 2pm UTC.
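To make the equivalence concrete, here is a minimal sketch in plain Python (standard library only, not Dataiku code). It parses both strings and checks that they describe the same instant. The variable names are illustrative.

```python
from datetime import datetime, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"

# Two ISO 8601 strings written with different UTC offsets.
a = datetime.strptime("2001-01-20T14:00:00.000Z", ISO_FORMAT)
b = datetime.strptime("2001-01-20T16:00:00.000+0200", ISO_FORMAT)

# Time-zone-aware datetimes compare by the instant they denote,
# not by their wall-clock fields.
same_instant = (a == b)

# Normalizing the second value to UTC gives 14:00, like the first.
b_in_utc = b.astimezone(timezone.utc)

print(same_instant)          # True
print(b_in_utc.isoformat())  # 2001-01-20T14:00:00+00:00
```

The two values differ only in how they are written. Once normalized to UTC they are identical, which is the representation Dataiku displays.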

Dataiku only displays dates in UTC

If you use the Format Date processor with a proper ISO 8601 format, the value is temporarily shown in a different time zone, but as soon as you write it out or read it in a chart, it is in UTC again.

If you use a formatter to format the value as 16:00+0200 and set the output to be a string, the string value is preserved, but it is no longer a date.
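The difference between a date and its formatted string can be sketched in plain Python (an analogy using the standard library, not the Format Date processor itself). The fixed +02:00 offset and the variable names are assumptions chosen to match the example above.

```python
from datetime import datetime, timedelta, timezone

# The absolute point in time, expressed in UTC.
instant = datetime(2001, 1, 20, 14, 0, tzinfo=timezone.utc)

# Assumed target offset for display purposes: UTC+2.
plus_two = timezone(timedelta(hours=2))

# Formatting produces text. The text keeps the +0200 rendering,
# but it is a string, not a point in time anymore.
as_text = instant.astimezone(plus_two).strftime("%H:%M%z")

print(as_text)        # 16:00+0200
print(type(as_text))  # <class 'str'>
```

The string keeps the local rendering for as long as it stays a string. Anything that needs a real date has to parse it again, at which point it is once more an instant that can be shown in UTC.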

Dates in SQL

A date column in SQL can be read in Dataiku as a “DSS date” (i.e. an “absolute point in time”), which corresponds to a “timestamp with time zone” in SQL parlance. So when Dataiku reads “2020-02-14” from the SQL table, it has to map it to a point in time. For that, it assumes that the value corresponds to midnight, but midnight in which time zone?

On a SQL dataset, there is an “assumed time zone” setting for this. If you select “Local” as the “assumed time zone” in the settings of a recipe’s input SQL dataset, then Dataiku considers that it is reading “2020-02-14 at midnight in the Netherlands” (for example, if the local time zone of your server is Europe/Amsterdam). Dataiku then displays this in UTC. Because Amsterdam is one hour ahead of UTC in February, the result is “2020-02-13T23:00:00Z”. If you want it to show “2020-02-14T00:00:00Z”, you must set the assumed time zone to UTC.
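The effect of the assumed time zone can be reproduced with a short Python sketch. This is not Dataiku's implementation: `midnight_as_utc` is a hypothetical helper, and the fixed +01:00 offset stands in for Europe/Amsterdam in February (in real code, `zoneinfo.ZoneInfo("Europe/Amsterdam")` would also handle daylight saving time).

```python
from datetime import datetime, timedelta, timezone

def midnight_as_utc(day: str, assumed_tz: timezone) -> str:
    """Interpret a bare date as midnight in assumed_tz, then render it in UTC."""
    local_midnight = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=assumed_tz)
    return local_midnight.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

# Assumption: Amsterdam is on CET (UTC+1) on 2020-02-14.
amsterdam_winter = timezone(timedelta(hours=1))

with_local = midnight_as_utc("2020-02-14", amsterdam_winter)
with_utc = midnight_as_utc("2020-02-14", timezone.utc)

print(with_local)  # 2020-02-13T23:00:00Z
print(with_utc)    # 2020-02-14T00:00:00Z
```

The same SQL value yields two different instants depending on the assumption, which is why the displayed date can appear to shift to the previous day.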

Where to learn more?

For more information on managing dates with Dataiku, please see our reference documentation.

For a more hands-on approach, you might also want to check out this brief tutorial on parsing dates with Dataiku.