Code Environment Administration

This section describes how to create and manage code environments for use in recipes, plugins, and other objects. Actions described in this section require administrator rights.

Note

Find out more about the use and administration of code environments in the reference documentation or the user guide of the Knowledge Base.

How-to | Grant permissions to create or manage code environments

To create a code environment, you’ll first need permissions. An administrator can provide a non-administrator with the permissions to create or manage code environments by configuring the group the user belongs to. To do this:

  1. Navigate to Administration > Security > Groups.

  2. Choose a group and then navigate to the code envs section.

  3. Apply permissions.

Permissions include:

  • Create code envs. This lets the user create code environments and modify the code environments they create.

  • Manage all code envs. This lets the user modify all code environments.

Assigning code environment permissions to a group.

How-to | Create a new code environment

To create a new code environment:

  1. Go to Administration > Code Envs.

  2. Choose New Python Env or New R Env.

You’ll notice the default deployment type is Managed by DSS (recommended). This deployment type ensures a smooth deployment and optimal usage.

In addition, DSS will install the mandatory sets of Dataiku packages by default, as well as Jupyter notebook support packages. Without the mandatory packages, users will not be able to use Dataiku APIs or Jupyter notebooks.

Choosing a DSS managed deployment type for a new Python code environment.

As a best practice, code environments should:

  • Be managed by DSS.

  • Include all mandatory packages.

  • Have support installed for Jupyter notebooks.
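For administrators who prefer scripting, the same best practices can be sketched against the Dataiku public API client. This is a minimal sketch, not a definitive implementation: it assumes `client` is a `dataikuapi.DSSClient` exposing `create_code_env(env_lang, env_name, deployment_mode, params)`, and the parameter keys and the `PYTHON39` interpreter value are assumptions to check against the API reference for your DSS version. The helper names are hypothetical.

```python
from typing import Any, Dict


def managed_env_params(python_interpreter: str = "PYTHON39") -> Dict[str, Any]:
    """Build creation parameters that mirror the best practices above.

    The key names and the interpreter value are assumptions, not confirmed
    by this article.
    """
    return {
        "pythonInterpreter": python_interpreter,
        "installCorePackages": True,    # mandatory Dataiku packages
        "installJupyterSupport": True,  # Jupyter notebook support
    }


def create_managed_python_env(client: Any, env_name: str) -> Any:
    """Create a Python code environment managed by DSS.

    `client` is expected to behave like dataikuapi.DSSClient, for example:
        client = dataikuapi.DSSClient("https://dss.example.com", "API_KEY")
    (the host and API key above are placeholders).
    """
    return client.create_code_env(
        "PYTHON", env_name, "DESIGN_MANAGED", managed_env_params()
    )
```

Passing the client in as an argument keeps the sketch independent of how you authenticate against your instance.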

How-to | Manage code environment properties

To manage the properties of a code environment:

  1. Go to Administration > Code Envs.

  2. Select an environment.

Properties page of a code environment.

Adding Packages

Managed code environments are created with a set of base packages which correspond to the mandatory and recommended packages that you selected to install when creating the environment. Your current settings require these packages. Therefore, you cannot remove them or modify their version constraints. However, you can add additional packages.

To add packages to a code environment:

  1. Open the Packages to install panel.

  2. Select sets of packages to add.

For more details, visit Add Packages.

Defining Permissions

By defining permissions, you can limit which groups have access to use and update a code environment.

To configure permissions for a code environment:

  1. Open the Permissions panel.

  2. Select a group to which you want to grant access.

  3. Click + Grant Access to Group and then apply permissions.

Configurable permissions include:

  • Use. This property defines which groups are allowed to use a code environment.

  • Update settings & packages. This property defines which groups can update settings and change included packages.

  • Admin. This property gives a group full administrative control over the code environment.

How-to | Configure default code environments

As an administrator, you can configure global default code environments for R and Python at both the instance and the project level.

Instance Level

To define global default R and Python code environments:

  1. Go to Administration > Settings > Misc.

  2. Under Global defaults, type the name of the environment.

Projects will automatically inherit the default code environment settings unless you select a code environment at the project level.

Project Level

To define a code environment at the project level:

  1. From the top navigation bar of a project, open the More options menu.

  2. Select Settings then Code env selection.

For more details, visit Setting a Code Environment.
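The inheritance rule between the two levels can be illustrated with a small sketch. The function name and arguments are hypothetical and only model the behavior described above; this is not how DSS implements it.

```python
from typing import Optional


def effective_code_env(
    project_env: Optional[str], instance_default: Optional[str]
) -> Optional[str]:
    """Return the code environment a project would use.

    A project-level selection wins; otherwise the project inherits the
    instance-level global default (which may itself be unset).
    """
    return project_env if project_env else instance_default
```

For example, `effective_code_env(None, "py39_default")` yields the instance default, while `effective_code_env("py39_nlp", "py39_default")` yields the project-level choice. Both environment names are made up for illustration.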

How-to | Install system-level package dependencies

Sometimes users need access to a library that has prerequisites for system packages that are not natively installed on the underlying operating system. These system packages must be installed before the given Python library can be used. Typically, an administrator with access to the OS installs these system packages.

When Dataiku is deployed to Cloud Stacks (such as on AWS or Azure), you can install system packages through Fleet Manager. To do this:

  1. Launch Fleet Manager.

  2. Under Settings > Instance templates, choose the instance template you want to modify.

  3. Under Setup actions > + New Action, choose Install system packages.

  4. In Packages to install, specify the packages you want to install on the instance, making sure to input only one package name per line.

  5. Save your changes.

Specifying system packages to install on an instance of Dataiku deployed to Cloud Stacks.
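Because the Packages to install field expects one package name per line, it can help to normalize a list before pasting it in. The helper below is a minimal sketch with a hypothetical name; the package names in the example are placeholders, not recommendations.

```python
import re
from typing import List


def one_package_per_line(raw: str) -> str:
    """Split on commas and whitespace, drop duplicates, keep first-seen order."""
    names: List[str] = []
    for token in re.split(r"[\s,]+", raw.strip()):
        if token and token not in names:
            names.append(token)
    return "\n".join(names)


if __name__ == "__main__":
    # Placeholder package names, mixed separators, one duplicate.
    print(one_package_per_line("libxml2-devel, unixODBC\nunixODBC  gcc"))
```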

You’ll need to replay the setup actions before the changes can take effect.

To replay the setup actions:

  1. From Instances, choose All and locate the running instance.

  2. Right-click the configuration menu (three vertical dots) and choose >_ Replay setup actions.

  3. Select Confirm.

Repeat these steps for each running instance.

Note

Installing system packages on an instance will only install the system packages on the DSS instance(s) defined by the template. More steps are needed to update the images used for containerized execution with the additional system packages. For details about how an administrator with command line access to the DSS server can perform this step, visit Customization of base images, adding system packages.

Alternatively, for DCS deployments, you can run an Ansible task to customize the base images that are built when a node is provisioned. Note that the Ansible task is set to occur at the stage “After DSS install”.

Warning

While the system is building the custom base images, you’ll notice an increase in provisioning time.

Running an Ansible task to customize the base image when the node is provisioned.

FAQ | Does Dataiku support custom package repositories?

Dataiku supports custom package repositories. Configuring access to custom package repositories requires administrator rights.

You can configure a custom repository for Python packages, a CRAN mirror for R packages, and access to either one through an internet proxy.

How-to | Point DSS to a custom Python package repository

You can configure DSS to point to a specific package repository. One example is an offline system where access to internet-hosted package repositories is forbidden.

Note

Configuring access to custom package repositories requires administrator rights.

To instruct pip to point to a specific package repository:

  1. Navigate to Administration > Settings > Misc.

  2. In Extra options for ‘pip install’, specify the --index-url option and provide the address to the repository.

In our example, we’ve included the --trusted-host option to allow the install of a package over a connection without a verified SSL certificate.

You do not need to restart DSS for these changes to take effect.

Extra options for 'pip install' in the Misc. section of the administrator settings.

When Python code environments attempt to install new packages via pip, Dataiku points to the repository you specified.
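As an illustration, the option strings for this field could be assembled as follows. This is a sketch under stated assumptions: `--index-url` and `--trusted-host` are standard pip options, but the helper name and the repository URL are hypothetical, and how the entries are laid out in the settings field should follow what the UI expects.

```python
from typing import List
from urllib.parse import urlsplit


def pip_index_options(index_url: str, trust_host: bool = False) -> List[str]:
    """Build pip options that point at a custom package index.

    Set trust_host=True only when the repository is served without a
    verified SSL certificate, as in the example described above.
    """
    parts = urlsplit(index_url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not a usable repository URL: {index_url!r}")
    options = [f"--index-url={index_url}"]
    if trust_host:
        options.append(f"--trusted-host={parts.hostname}")
    return options


if __name__ == "__main__":
    # Hypothetical internal repository address.
    for option in pip_index_options(
        "http://pypi.internal.example:8080/simple", trust_host=True
    ):
        print(option)
```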

How-to | Point DSS to a CRAN mirror

You can configure DSS to point to a CRAN mirror. One example is accessing a local repository when working offline.

Note

Configuring access to custom package repositories requires administrator rights.

To point DSS to a CRAN mirror:

  1. Navigate to Administration > Settings > Misc.

  2. In Extra options > CRAN mirror URL, specify the URL.

In our example, we’ve used a public mirror; however, you could also point to a locally hosted repository.

You do not need to restart DSS for these changes to take effect.

CRAN mirror URL in the Misc. section of the administrator settings.

When R code environments attempt to install new packages, DSS points to the CRAN mirror you specified.

How-to | Provide access to custom package repositories via an internet proxy

DSS supports accessing custom package repositories through a proxy. There are a couple of ways to achieve this.

Typically, a global proxy is configured at the OS level (e.g., using environment variables http_proxy and https_proxy) to route all traffic. For more details, visit HTTP proxies.
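To check which proxy settings a process actually sees, a short script run in the same environment can help. The sketch below relies on the standard library's `urllib.request.getproxies()`, which reads the `http_proxy` and `https_proxy` environment variables (and may fall back to system settings on some platforms). The function name is hypothetical.

```python
import urllib.request
from typing import Dict


def proxy_summary() -> Dict[str, str]:
    """Return the HTTP and HTTPS proxies visible to this process, if any."""
    proxies = urllib.request.getproxies()
    return {
        scheme: proxies[scheme]
        for scheme in ("http", "https")
        if scheme in proxies
    }


if __name__ == "__main__":
    summary = proxy_summary()
    if not summary:
        print("No HTTP or HTTPS proxy is configured for this process.")
    for scheme, address in summary.items():
        print(f"{scheme} traffic is routed through {address}")
```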

Alternatively, you can configure pip from the DSS administration settings so that it directs traffic to a specific proxy whenever Python code environments install new packages. This may be particularly relevant for DCS deployments or where access to the backend OS is restricted.

To instruct pip to direct traffic to a specific proxy:

  1. Navigate to Administration > Settings > Misc.

  2. In Extra options for ‘pip install’, specify the proxy to apply to pip installations.

You do not need to restart Dataiku for these changes to take effect.

Extra options for pip proxy in the Misc. section of the administrator settings.

When Python code environments attempt to install new packages via pip, Dataiku directs the code environment to the proxy.
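The value entered in Extra options for ‘pip install’ could look like the output of the sketch below. It assumes pip's standard `--proxy` option, which takes the form `scheme://[user:passwd@]proxy.server:port`. The helper name, host, port, and credentials are hypothetical, and percent-encoding the credentials is an assumption to verify against your proxy.

```python
from typing import Optional
from urllib.parse import quote


def pip_proxy_option(
    host: str,
    port: int,
    user: Optional[str] = None,
    password: Optional[str] = None,
    scheme: str = "http",
) -> str:
    """Build a pip --proxy option, percent-encoding any credentials."""
    credentials = ""
    if user is not None:
        credentials = quote(user, safe="")
        if password is not None:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    return f"--proxy={scheme}://{credentials}{host}:{port}"


if __name__ == "__main__":
    # Hypothetical proxy host and port.
    print(pip_proxy_option("proxy.example", 3128))
```

Avoid storing real credentials in shared settings where possible; an unauthenticated internal proxy keeps the option free of secrets.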