Tutorial | Git for projects

Dataiku has three primary integrations with Git:

  • Code libraries

  • Plugin development

  • Projects

The first two integrations enable coders and developers to more effectively share their work across Dataiku projects and instances.

Git integration for projects enables even the non-coders on the team to take advantage of version control.

Each change that you make in a Dataiku project is automatically committed to a local Git repository. Thus, any normal contribution to a project passively uses the Git integration for projects.

This tutorial will walk through the active use of the Git integration to:

  • Connect a local project to a remote Git repository.

  • Branch the project in order to do some “experimental” work without affecting the Flow for other members of the data team.

  • Push project changes from the local branch to the remote Git repository.

  • Merge the branch into master.

  • Pull the changes to master from the remote Git repository to the local project.


It is strongly recommended to have a good understanding of the Git model and terminology before using this feature.

Technical requirements

  • Access to a remote Git repository where you can push changes. Ideally it should be an empty repository.

  • A Dataiku instance that has been set up to work with remote Git repositories. See Working with Git in the reference documentation.

  • A project to practice with. This tutorial will use the Haiku Starter project, which can be found by selecting +New Project > DSS tutorials > Core Designer > Haiku Starter from the homepage of a Dataiku instance.


You can also download the starter project from this website and import it as a zip file.

Connect to a remote Git repository

  • From the More Options (…) menu in the top navigation bar, select Version Control.

This shows that we are on the master branch of the project.

Dataiku screenshot of the version control page of a project.

  • Click on the change tracking indicator and select Add remote.

  • Enter the URL of the remote and click OK.

  • From the change tracking indicator, select Push.

Dataiku screenshot of the version control page showing the push option.

In your remote Git repository, you can see that the master branch has been successfully pushed.

GitHub screenshot of the project pushed.


Each project must have its own repository.
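
For readers who think in Git commands, the Add remote and Push actions above correspond roughly to the commands built in the sketch below. This is an illustration, not a description of what Dataiku runs internally: the remote name origin, the example URL, and the helper names are all assumptions. By default the script only prints the commands.

```python
import subprocess


def connect_and_push_commands(remote_url, branch="master", remote="origin"):
    """Build git commands roughly equivalent to 'Add remote' followed by 'Push'.

    The remote name 'origin' is an assumption made for this sketch.
    """
    return [
        ["git", "remote", "add", remote, remote_url],
        ["git", "push", "--set-upstream", remote, branch],
    ]


def run_all(commands, repo_dir, dry_run=True):
    """Print each command; execute it in repo_dir only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, cwd=repo_dir, check=True)


# Hypothetical URL, for illustration only.
cmds = connect_and_push_commands("git@example.com:team/haiku-starter.git")
run_all(cmds, repo_dir=".", dry_run=True)
```

Pushing into an empty repository, as the technical requirements suggest, avoids the first push being rejected because the remote already has unrelated history.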

Branch the project

  • From the branch indicator, click Create new branch.

  • Name the new branch prune-flow and click Next.

  • Click Duplicate and Create Branch.

This creates a duplicate project working on the prune-flow branch.



Key concept: Duplicated projects for branching

A given Dataiku project can only be on one branch at any given time. Switching the branch of the current project affects all collaborators, and nobody can work on multiple branches of it at once. Duplicating the project onto the new branch avoids this.

Now we can make our changes to the duplicate project on the prune-flow branch without disturbing the rest of the data team’s use of the master branch of the project. Go to the Flow of the project and see that the Flow forks three ways from the Orders_enriched_prepared dataset.


We will prune the Flow by deleting the Orders_by_Country_Category and Orders_filtered datasets.


Push branch changes to the remote repository

  • From the project menu in the top navigation bar, select Version Control.

  • From the change tracking indicator, select Push.
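
In plain Git terms, creating the prune-flow branch and pushing it resembles the two commands built below. This is a hedged sketch: Dataiku also duplicates the project, which has no Git equivalent, and the remote name origin and the helper function are assumptions.

```python
import shlex


def branch_and_push_commands(branch, remote="origin"):
    """Build git commands that create a branch and publish it to the remote.

    The remote name 'origin' is an assumption made for this sketch.
    """
    return [
        ["git", "checkout", "-b", branch],
        ["git", "push", "--set-upstream", remote, branch],
    ]


# Print the commands as they would be typed in a shell.
for cmd in branch_and_push_commands("prune-flow"):
    print(shlex.join(cmd))
```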


Merge branch changes to master

Your remote Git repository now shows that the prune-flow branch has been pushed. Merge it into the master branch outside of Dataiku, using your usual Git workflow.
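
One way to do this merge is on the command line, in a separate clone of the remote repository. The sketch below only builds and prints the commands. The remote name origin, the --no-ff option, and the helper function are assumptions, and many teams would open a pull request on their Git hosting service instead.

```python
import shlex


def merge_commands(feature, target="master", remote="origin"):
    """Build git commands that merge a pushed feature branch into the target branch.

    Assumes the commands run inside a clone of the remote repository.
    """
    return [
        ["git", "fetch", remote],                          # get the latest remote branches
        ["git", "checkout", target],                       # switch to the branch receiving the merge
        ["git", "pull", "--ff-only", remote, target],      # make sure the local target is up to date
        ["git", "merge", "--no-ff", f"{remote}/{feature}"],  # merge the feature branch
        ["git", "push", remote, target],                   # publish the merged result
    ]


for cmd in merge_commands("prune-flow"):
    print(shlex.join(cmd))
```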



Warning: Branching and merge conflicts

This tutorial describes an extremely simple branch and merge. If multiple collaborators each create a separate branch off of master, and then try to merge their separate branches back to master, they are likely to encounter Git merge conflicts. These can be difficult to resolve, and we may not be able to solve them for you. Your data team should agree on a plan for how to collaborate on projects using Git in order to avoid merge conflicts.

Pull master changes to local

Finally, to see the merges reflected in Dataiku, return to the original project.

  • From the change tracking indicator, select Fetch to retrieve the changes from the remote Git repository.

  • Then select Pull to bring those changes into your local project.
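
The fetch and pull steps map onto standard Git commands. The sketch below builds them, plus an optional check of how many commits the local branch is behind the remote. The remote name origin and the helper function are assumptions, and nothing here describes Dataiku's internal behavior.

```python
import shlex


def fetch_and_pull_commands(branch="master", remote="origin"):
    """Build git commands that fetch from the remote and then pull one branch."""
    return [
        ["git", "fetch", remote],
        # Optional check: count commits on the remote branch missing locally.
        ["git", "rev-list", "--count", f"{branch}..{remote}/{branch}"],
        ["git", "pull", remote, branch],
    ]


for cmd in fetch_and_pull_commands():
    print(shlex.join(cmd))
```

Fetching first updates only the remote-tracking information, so the incoming changes can be reviewed before the pull applies them to the project.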