Concept | Version control for Dataiku projects#
What is version control?#
Version control is the practice of recording revisions to the files of a project, most often its code. In other words, version control lets you manage and access the history of your projects.
Version control is most commonly implemented with Git, an open source version control system that provides commands for tasks such as staging, reverting, and merging changes. Importantly, Git also enables collaborative, concurrent work through branching and merging. This way, colleagues can work on different parts of a project simultaneously without interfering with each other’s progress.
It’s no surprise, then, that Dataiku also uses Git to track changes in Dataiku projects! You’ll have to be familiar with common Git commands and workflows to leverage version control in Dataiku.
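As a refresher on those common commands, the sketch below drives plain Git from Python in a throwaway directory, entirely outside Dataiku. It assumes the `git` executable is on your PATH. The helper `git()`, the function `stage_commit_revert()`, the file name, and the commit messages are all made up for illustration; none of this is a Dataiku API.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run one Git command inside `repo` and return its trimmed output."""
    cmd = [
        "git",
        "-c", "user.name=Demo User",          # throwaway identity for the demo
        "-c", "user.email=demo@example.com",
        "-c", "commit.gpgsign=false",         # ignore any local signing setup
        *args,
    ]
    done = subprocess.run(cmd, cwd=repo, check=True,
                          capture_output=True, text=True)
    return done.stdout.strip()


def stage_commit_revert():
    """Record two revisions of a file, then revert the second one."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        notes = repo / "notes.txt"
        git(repo, "init")

        notes.write_text("first draft\n")
        git(repo, "add", "notes.txt")             # stage the change
        git(repo, "commit", "-m", "Add notes")    # record it in the history

        notes.write_text("second draft\n")
        git(repo, "add", "notes.txt")
        git(repo, "commit", "-m", "Rewrite notes")

        # Undo the latest commit by adding a new commit that reverses it.
        git(repo, "revert", "--no-edit", "HEAD")

        history = git(repo, "log", "--format=%s").splitlines()
        return history, notes.read_text()


if __name__ == "__main__":
    history, content = stage_commit_revert()
    print(history)   # newest commit first
    print(content)
```

Note that `git revert` does not erase history: it adds a third commit on top of the two existing ones, and the file returns to its first version.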
Local Git functionality#
Each Dataiku project has its own local Git repository, and Dataiku automatically commits any change to the project to it. This repository also enables you to:
Revert to previous commits. See How-to | Undo actions in Dataiku for details.
Create branches so that multiple changes can be developed in parallel.
Merge branches to integrate changes from one branch into another.
View the change history of the project.
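For readers who want to see what branching and merging mean at the Git level, here is a minimal sketch using plain Git from Python in a temporary directory. It does not use Dataiku or show how Dataiku performs these operations internally. It assumes `git` is on your PATH. The branch names, file names, and the `git()` and `branch_and_merge()` helpers are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run one Git command inside `repo` and return its trimmed output."""
    cmd = [
        "git",
        "-c", "user.name=Demo User",          # throwaway identity for the demo
        "-c", "user.email=demo@example.com",
        "-c", "commit.gpgsign=false",
        *args,
    ]
    done = subprocess.run(cmd, cwd=repo, check=True,
                          capture_output=True, text=True)
    return done.stdout.strip()


def commit_all(repo, message):
    """Stage every change in the working tree and commit it."""
    git(repo, "add", ".")
    git(repo, "commit", "-m", message)


def branch_and_merge():
    """Develop two changes in parallel, then merge them."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init")
        # Name the initial branch "main" regardless of the local default.
        git(repo, "symbolic-ref", "HEAD", "refs/heads/main")

        (repo / "recipe.txt").write_text("load data\n")
        commit_all(repo, "Add recipe")

        # One change happens on a separate branch...
        git(repo, "checkout", "-b", "experiment")
        (repo / "model.txt").write_text("train model\n")
        commit_all(repo, "Add model")

        # ...while another happens on main at the same time.
        git(repo, "checkout", "main")
        (repo / "recipe.txt").write_text("load data\nclean data\n")
        commit_all(repo, "Clean data")

        # Integrate the experiment branch into main.
        git(repo, "merge", "-m", "Merge experiment into main", "experiment")

        files = sorted(p.name for p in repo.iterdir() if p.name != ".git")
        history = git(repo, "log", "--format=%s").splitlines()
        recipe = (repo / "recipe.txt").read_text()
        return files, history, recipe


if __name__ == "__main__":
    print(branch_and_merge())
```

Because the two branches touch different files, the merge completes without conflicts. Changes to the same lines on both branches would need to be resolved by hand.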
Remote capabilities#
While version control can be managed natively in Dataiku, your organization might want to integrate Dataiku with its established external Git hosting service.
This is why you can connect your Dataiku projects to remote repositories hosted on services like GitHub, GitLab, and Bitbucket!
See also
Find more information on external connections in Working with Git in the reference documentation.
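To make the idea of a remote concrete, the following sketch pushes a local repository to a second repository using plain Git from Python. A bare repository in a temporary folder stands in for a hosted service such as GitHub. This is an assumption made so the example runs offline, and it is not how you configure a remote in Dataiku. It assumes `git` is on your PATH; the `git()` and `push_to_remote()` helpers and all paths are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run one Git command inside `repo` and return its trimmed output."""
    cmd = [
        "git",
        "-c", "user.name=Demo User",          # throwaway identity for the demo
        "-c", "user.email=demo@example.com",
        "-c", "commit.gpgsign=false",
        *args,
    ]
    done = subprocess.run(cmd, cwd=repo, check=True,
                          capture_output=True, text=True)
    return done.stdout.strip()


def push_to_remote():
    """Push one commit from a local repository to a stand-in remote."""
    with tempfile.TemporaryDirectory() as tmp:
        remote = Path(tmp) / "remote.git"   # stand-in for a hosted repository
        local = Path(tmp) / "local"
        remote.mkdir()
        local.mkdir()

        git(remote, "init", "--bare")       # a bare repository has no working tree

        git(local, "init")
        git(local, "symbolic-ref", "HEAD", "refs/heads/main")
        (local / "README.txt").write_text("demo project\n")
        git(local, "add", ".")
        git(local, "commit", "-m", "Initial commit")

        # Register the remote under the conventional name "origin" and push.
        git(local, "remote", "add", "origin", str(remote))
        git(local, "push", "origin", "main")

        local_head = git(local, "rev-parse", "HEAD")
        remote_head = git(remote, "rev-parse", "refs/heads/main")
        return local_head, remote_head


if __name__ == "__main__":
    local_head, remote_head = push_to_remote()
    print(local_head == remote_head)
```

After the push, both repositories point to the same commit, which is what lets colleagues fetch the same history from a shared location.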
Project exports#
When you export a project, there is an advanced option to export the local Git repository with the project. This means that the commit history, branches, and other information will appear when you import and open the project.
What’s next?#
Practice using Git functionality in Tutorial | Git for projects.