Cloning a Library from a Remote Git Repository

An important end goal of writing code is to be able to reuse it, whether within a Dataiku DSS project, across projects within a DSS instance, or for projects external to DSS.

To this end, you can define code libraries within DSS that contain reusable code, and you can connect these libraries to remote Git repositories.

Prerequisites

  • Familiarity with code in Dataiku DSS

  • Familiarity with the basics of Git

Technical Requirements

Connect to a Remote Git Repository

Within any DSS project, navigate to Code > Libraries to open the Library Editor.

  • Click Git > Import from Git.

  • Enter https://github.com/dataiku/dss-plugin-sample-correlations as the Repository.

  • Leave master as the branch to check out.

  • Enter python-lib as the Path in repository. This repository contains a plugin, and for this project library, we only want to retrieve the library that is part of the plugin. To retrieve the entire plugin instead, we could clone it from the remote Git repository into the Plugin editor.

  • Enter python/compute-corr as the Target path. This determines where in the project library the remote code will be stored.

  • Click Save and Retrieve.

You should now see the contents of the remote library in the Library Editor.
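
To make the roles of Path in repository and Target path concrete, the following Python sketch models how a file in the remote repository would map to a location in the project library. It only illustrates the behavior described in the steps above and is not how DSS implements the import. The function name library_destination and the file names example.py and plugin.json are hypothetical, chosen purely for illustration.

```python
from pathlib import PurePosixPath


def library_destination(repo_file, path_in_repo, target_path):
    """Return where a repository file would land in the project library.

    Returns None when the file lies outside the imported subdirectory,
    since only the contents of path_in_repo are retrieved.
    """
    try:
        # Keep only the part of the path below the imported subdirectory.
        relative = PurePosixPath(repo_file).relative_to(path_in_repo)
    except ValueError:
        # The file is not under path_in_repo, so it is not imported.
        return None
    return str(PurePosixPath(target_path) / relative)


# Hypothetical file inside the imported subdirectory.
print(library_destination("python-lib/example.py", "python-lib", "python/compute-corr"))
# Hypothetical file at the repository root, outside the imported subdirectory.
print(library_destination("plugin.json", "python-lib", "python/compute-corr"))
```

Under these assumptions, only files below python-lib are mapped into python/compute-corr, and the rest of the plugin repository is left out of the project library.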


The library functions can now be used in code in the DSS project by including an import statement such as:

from compute_corr import *
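
Because a wildcard import hides which names are brought into scope, it can help to list what a module exposes before relying on it. The helper below is a minimal sketch using only the Python standard library. The name public_names is hypothetical, and the sketch is demonstrated on the standard json module so that it runs outside DSS as well.

```python
import importlib


def public_names(module_name):
    """Return the sorted names of a module that do not start with an underscore."""
    module = importlib.import_module(module_name)
    return sorted(name for name in dir(module) if not name.startswith("_"))


# Demonstrated on a standard-library module so the sketch runs anywhere.
print(public_names("json"))
```

In a DSS notebook or recipe, calling public_names("compute_corr") should list the names available from the cloned library, assuming the module resolves as in the import statement above. Those names could then be imported explicitly instead of with *. Note that a wildcard import honors a module's __all__ list when one is defined, so this listing may include more names than * would actually import.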

Pulling Updates from the Remote Repository

When code on the remote repository is updated, you can pull those updates to your local project library. From within the Library Editor:

  • Click Git > Manage references.

  • Click Update on each individual remote Git repository from which you want to pull updates.

  • Alternatively, click Update All References to pull updates from every remote Git repo.


Note

Changes made to your local Dataiku DSS project library cannot be pushed back to the remote Git repository.