The NY Taxi Project through the AI Lifecycle

The previous course in this series, Dataiku DSS: The Value Proposition, introduced some of the key features of DSS that enable enterprises to build their own path to AI. With a broad understanding of these capabilities, a closer look at a sample DSS project should provide a more concrete illustration of how this path can take shape.

When walking through a project, it helps to speak in terms of an analytics framework or methodology, such as CRISP-DM or ASUM-DM. A Dataiku DSS project can be envisioned through the AI Lifecycle diagram below.

As the diagram illustrates, this framework comprises a series of iterative cycles spanning five stages: Question, Discover, Experiment, Deploy, and Operationalize. The DSS Design, Automation, and API nodes work together to execute these stages.


In our example, the question is simple: how can we predict taxi fares in New York City, given a starting location, a destination, and a particular time of day?
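To make the question concrete, the three inputs it names can be sketched as a small data structure with a few derived features. This is only an illustrative sketch in plain Python, not how the DSS project itself is built: the names `TripQuery`, `haversine_km`, and `to_features`, the use of latitude and longitude for locations, and the choice of derived features are all assumptions made for this example.

```python
import math
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TripQuery:
    """Hypothetical container for the inputs named in the question."""
    pickup_lat: float
    pickup_lon: float
    dropoff_lat: float
    dropoff_lon: float
    pickup_time: datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def to_features(query):
    """Turn a trip query into features a fare model might use (assumed set)."""
    return {
        "distance_km": haversine_km(
            query.pickup_lat, query.pickup_lon,
            query.dropoff_lat, query.dropoff_lon,
        ),
        "hour_of_day": query.pickup_time.hour,
        "day_of_week": query.pickup_time.weekday(),  # Monday is 0
    }


# Example: a weekday morning trip from midtown Manhattan towards JFK
# (coordinates are approximate and purely illustrative).
example = TripQuery(40.7580, -73.9855, 40.6413, -73.7781,
                    datetime(2019, 1, 7, 8, 30))
features = to_features(example)
```

A fare model would then take a dictionary like `features` as its input record. Which features actually matter is what the later stages of the lifecycle are meant to establish.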

With this question in mind, the AI lifecycle moves to the Discovery stage. Users begin in the Design node of DSS, where they explore and transform the necessary data to gain insights into the question at hand.

Informed by these insights, users then build machine learning models and assess their performance. This Experiment stage also takes place in the Design node of DSS.

Once a model has been chosen in the Experiment stage, users define the API service that deploys the model to the API node of DSS. Finally, users can operationalize the model in the Automation node.

Of course, DSS supports many different paths to achieving Enterprise AI. Although the AI lifecycle in DSS always begins in the Design node, an enterprise’s particular use case, project objectives, and infrastructure choices may determine how the Automation and API nodes are used.

With a clear business question to investigate, let’s take a closer look at the following stages, beginning with Discovery.