Product Pillar: Dataiku DSS Architecture

Dataiku DSS builds on an enterprise’s existing investments in data infrastructure and employee skills, and it can grow with the enterprise as those investments change:

  • Administrators install DSS on a Linux server, either on-premises in their own data center or in the cloud.

  • Data engineers and cloud architects connect DSS to existing data sources.

  • Data and business analysts log in to DSS via a browser and can perform their tasks through a visual interface.

  • Data scientists have the further option of connecting to DSS through their preferred IDE (such as PyCharm, VS Code, RStudio, or Sublime Text).

  • Analytics leaders and other stakeholders receive reports generated by colleagues working in DSS, or can log in and view analytic dashboards.

  • Developers can use the API to query deployed models and build AI-powered applications.
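
As a rough illustration of the last point, the sketch below builds (but does not send) an HTTP request that asks a deployed model to score a single record. It is a minimal sketch under stated assumptions, not a definitive client: the URL layout `/public/api/v1/<service>/<endpoint>/predict` and the `{"features": ...}` body are assumed here, and the host, service name, endpoint name, and feature names are all hypothetical. Authentication is left out; check the API documentation of your own instance before relying on any of it.

```python
import json
from urllib.request import Request


def build_predict_request(base_url, service_id, endpoint_id, features):
    """Build (but do not send) a POST request that scores one record.

    The URL layout and JSON body shape are assumptions made for this sketch.
    """
    url = f"{base_url.rstrip('/')}/public/api/v1/{service_id}/{endpoint_id}/predict"
    body = json.dumps({"features": features}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, headers=headers, method="POST")


# Hypothetical host, service, endpoint, and feature names.
request = build_predict_request(
    "http://apinode.example.com:12000/",
    "taxi_service",
    "fare_prediction",
    {"passenger_count": 2, "trip_distance": 3.5},
)
print(request.get_method(), request.full_url)
```

To actually send the request, pass it to `urllib.request.urlopen` and decode the JSON response. Building the request separately keeps the example runnable without a live server.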


DSS Instances

An enterprise data stack typically includes development, production, and deployment environments. To work across these environments, an enterprise installs a separate instance of Dataiku DSS in each one.

A Dataiku DSS instance is an installation of the product that serves the needs of a particular environment:

  • The Design node instance, in the development environment, is used to create the pipelines that turn data into outputs, such as transformed data, models, dashboards, and reports.

  • The Automation node instance, in the production environment, puts pipelines from the Design node into production to turn enterprise data into the final outputs.

  • The API node instance, in the deployment environment, makes model outputs from the Automation or Design node available for use in real-time scoring.
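
The relationship between the three node types can be summarized as a small data structure. The sketch below is only an illustration of the description above: the class, dictionary, and function names are hypothetical and are not part of any Dataiku API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    environment: str
    role: str


# One DSS instance per environment, as listed above.
NODES = {
    "design": Node("Design node", "development", "create pipelines"),
    "automation": Node("Automation node", "production", "run pipelines on enterprise data"),
    "api": Node("API node", "deployment", "serve models for real-time scoring"),
}

# Which nodes hand work to which others: pipelines move from Design to
# Automation, and models from either node can be exposed through the API node.
HANDOFFS = {
    "design": ["automation", "api"],
    "automation": ["api"],
    "api": [],
}


def targets(source):
    """Return the names of the nodes that receive work from `source`."""
    return [NODES[key].name for key in HANDOFFS[source]]


for key, node in NODES.items():
    print(f"{node.name} ({node.environment}): {node.role} -> {targets(key)}")
```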


DSS Projects

Pipelines in the Design and Automation nodes are organized into projects, which can be accessed from the main page after logging in to the Dataiku DSS instance.

A Dataiku DSS project is a container for all work on a particular activity. The project home page acts as the command center from which you can see the overall status of a project, view recent activity, and collaborate through comments, tags, and a project to-do list.


From the homepage of the NY Taxi Fares project, users can check the timeline for recent activity and changes from collaborators; access datasets, recipes, and models; and view documentation and dashboards.