Concept | Collaboration

When people with different technical and business skillsets contribute to data initiatives, they typically use a wide variety of tools and approaches, which can make it hard for them to work together efficiently.

A finished Dataiku project often reflects participation by different roles across its lifecycle. Perhaps engineers or architects established the connections to the data sources and systems. Analysts or citizen data scientists cleansed and prepared the data and built initial insights. Data scientists performed advanced feature engineering and modeling tasks, and ML engineers and IT operators kept the automated jobs and models running well in production.

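This article covers some of the ways cross-functional teams can collaborate in Dataiku, including the project Flow and homepage, workspaces, shared code libraries, wikis, the data catalog, and the feature store.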

Project Flow and homepage

The Flow serves as a central space where people using visual tools to prepare and analyze data can collaborate seamlessly with those who prefer to code in languages like Python or SQL. Newcomers to the project, regardless of their technical expertise, can easily come up to speed on what’s already been done because the entire pipeline is transparent and visually documented in Dataiku’s Flow.

Dataiku screenshot of a project Flow.

For instance, newcomers can view details about the input data, such as where it originated and how fresh it is, especially with the help of any custom tags or descriptions the team has added.

Searchable tags can be applied to all elements of a project. Once elements are tagged, users can view the tags to identify parts of the Flow or filter the Flow using Flow views. Flow views also make it easy to see who contributed to a project and when.
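To make the idea of tag-based filtering concrete, here is a minimal sketch in plain Python. It does not use Dataiku's API: the `FlowItem` class, the item names, and the tags are all hypothetical and only illustrate how tagging elements lets a team select a subset of a Flow.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FlowItem:
    """Hypothetical stand-in for a tagged Flow element (dataset, recipe, model)."""
    name: str
    kind: str
    tags: set[str] = field(default_factory=set)


def filter_by_tag(items, tag):
    """Return the items carrying the given tag, preserving their original order."""
    return [item for item in items if tag in item.tags]


# Hypothetical Flow contents, invented for illustration only.
flow = [
    FlowItem("customers_raw", "dataset", {"source:crm", "owner:data-eng"}),
    FlowItem("prepare_customers", "recipe", {"owner:analytics"}),
    FlowItem("churn_model", "model", {"owner:data-science", "production"}),
    FlowItem("score_customers", "recipe", {"owner:data-science", "production"}),
]

production_items = filter_by_tag(flow, "production")
print([item.name for item in production_items])
```

A consistent tag vocabulary (for example, an `owner:` prefix as assumed here) is what makes this kind of filtering useful across a whole team.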

On the project homepage, all the contributors are listed along with a running log of recent items.

Dataiku screenshot of a project homepage.

Workspaces

While the Flow enables transparency and collaboration by visually documenting project pipelines, workspaces take this collaboration a step further by providing a dedicated environment for team members to organize and centralize shared resources.

A workspace acts as a single point of access where stakeholders can easily find shared objects such as applications, dashboards, webapps, datasets, and wiki articles from different projects on a Dataiku instance. You can also add external links to a workspace.

Libraries for code sharing

Coders can collaborate by sharing code snippets, code recipes, and notebooks, and by using shared code libraries and Git.

Dataiku screenshot of shared code snippets.
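As a sketch of what a shared code library might contain, the following plain-Python helper could live in a module that several recipes and notebooks import, so that each contributor reuses the same logic instead of copying it. The function name `normalize_column_names` and its behavior are hypothetical and are not part of Dataiku.

```python
import re


def normalize_column_names(columns):
    """Convert column names to lowercase snake_case so datasets share one convention."""
    normalized = []
    for name in columns:
        # Replace any run of non-alphanumeric characters with a single underscore.
        cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
        # Drop leading or trailing underscores left by the substitution.
        normalized.append(cleaned.strip("_").lower())
    return normalized


print(normalize_column_names(["Customer ID", " Order-Date ", "total$amount"]))
# prints ['customer_id', 'order_date', 'total_amount']
```

Keeping a helper like this in one shared place, tracked with Git, means a fix made by one coder reaches every recipe that imports it.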

Wikis

The project wiki is a central place where teams document their motivations and methods to preserve critical knowledge for others. For example, Dataiku uses wikis in all of its off-the-shelf solutions to communicate how each solution works, its technical requirements, and how to modify it for your specific needs.

Dataiku screenshot of a project wiki.

Data catalog

Even better, these wikis, along with each dataset, recipe, and every other artifact you create in Dataiku, are automatically indexed in the catalog. The catalog is a good starting point when you're looking for existing work to reuse, or for colleagues who are subject matter experts on the topic you're interested in.

Dataiku screenshot of the data catalog.

Search by keyword or column name, and browse or connect to data sources, to find and explore the assets and datasets relevant to your work.

Feature store

For data scientists and engineers building models or working on other advanced analytics tasks, the feature store is an efficient way to discover and reuse high-quality features and curated datasets that others have created, which helps accelerate their own projects.

Dataiku screenshot of a feature store.

Other collaboration features

Project managers appreciate other collaboration features, such as:

  • Project descriptions and to-do lists.

  • Embedded discussions (chat), which keep team discussion and decisions contained within the project rather than in external emails and message threads.

  • Activity views showing contributions over time and by team member.

  • Dashboards, which are a great way to share results with your team, especially with read-only users.

What’s next?

You've now seen how collaboration among and across different user profiles is woven into every part of Dataiku, and these features are just the tip of the iceberg. To go further, check out all the collaboration articles.