Product Pillar: Dataiku DSS Architecture

Dataiku DSS builds on an enterprise’s existing investments in data infrastructure and employee skills, and can grow with the enterprise as those investments change:

  • Administrators install DSS on a Linux server, either on-premises in their own data center or in the cloud.

  • Data engineers and cloud architects connect DSS to existing data sources.

  • Data and business analysts log in to DSS through a browser and perform their tasks in a visual interface.

  • Data scientists can also connect to DSS from their preferred IDE (such as PyCharm, VS Code, RStudio, or Sublime Text).

  • Analytics leaders and other stakeholders receive reports generated by colleagues working in DSS, or can log in and view analytic dashboards.

  • Developers can use the API to query deployed models and build AI-powered applications.
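As a rough illustration of the last point, the sketch below builds (without sending) an HTTP request that asks a deployed model to score one record. The URL layout and the `features` payload follow the API node's REST conventions as best recalled and should be checked against the official documentation; the host, the service and endpoint identifiers, and the feature names are all hypothetical.

```python
import json
from urllib.request import Request


def build_predict_request(base_url, service_id, endpoint_id, features):
    """Build, but do not send, a prediction request for a single record.

    Assumption: the API node exposes prediction endpoints at
    /public/api/v1/<service_id>/<endpoint_id>/predict and expects a JSON
    body of the form {"features": {...}}.
    """
    url = f"{base_url.rstrip('/')}/public/api/v1/{service_id}/{endpoint_id}/predict"
    body = json.dumps({"features": features}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, service, endpoint, and feature names.
request = build_predict_request(
    "http://apinode.example.com:12443",
    "taxi_service",
    "fare_model",
    {"passenger_count": 2, "trip_distance": 3.4},
)
```

Against a live API node, the request could then be sent with `urllib.request.urlopen(request)` and the JSON response parsed by the calling application.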

../../../_images/Architecture_Slides_diagram.png

DSS Instances

An enterprise data stack typically includes development, production, and deployment environments. To work across these environments, the enterprise installs a separate instance of Dataiku DSS in each one.

A Dataiku DSS instance is an installation of the product that serves the needs of a particular environment:

  • The Design node instance, in the development environment, is used to create the pipelines that turn data into outputs, such as transformed data, models, dashboards, and reports.

  • The Automation node instance, in the production environment, puts pipelines from the Design node into production to turn enterprise data into the final outputs.

  • The API node instance, in the deployment environment, makes model outputs from the Automation or Design node available for use in real-time scoring.
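The relationships between the three node types can be summarized in a small data model. This is only an illustrative sketch of the description above, not part of any Dataiku API: the `Node` enum, the `RECEIVES_FROM` table, and the `can_promote` helper are hypothetical names introduced here.

```python
from enum import Enum


class Node(Enum):
    """Each node type, valued by the environment it serves."""
    DESIGN = "development"
    AUTOMATION = "production"
    API = "deployment"


# Which nodes each node can receive work from, per the list above:
# the Automation node runs pipelines built on the Design node, and the
# API node serves models coming from the Automation or Design node.
RECEIVES_FROM = {
    Node.DESIGN: set(),
    Node.AUTOMATION: {Node.DESIGN},
    Node.API: {Node.DESIGN, Node.AUTOMATION},
}


def can_promote(source, target):
    """Return True if work on `source` can be handed over to `target`."""
    return source in RECEIVES_FROM[target]
```

For example, `can_promote(Node.DESIGN, Node.API)` is true, while `can_promote(Node.API, Node.DESIGN)` is false, reflecting that work flows from development toward deployment.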

../../../_images/Architecture_Slides_nodes.png

DSS Projects

Pipelines in the Design and Automation nodes are organized into projects, which can be accessed from the main page after logging in to the Dataiku DSS instance.

A Dataiku DSS project is a container for all work on a particular activity. The project home acts as the command center from which you can see the overall status of a project, view recent activity, and collaborate through comments, tags, and a project to-do list.

../../../_images/intro-project-home.png

From the homepage of the NY Taxi Fares project, users can check the timeline for recent activity and changes from collaborators; access datasets, recipes, and models; and view documentation and dashboards.