R and Dataiku

Learn how to integrate R code into Dataiku.

Tip

To validate your knowledge of this area, register for the Shared Code course, part of the Developer learning path, on the Dataiku Academy.

Reference | Upgrading and rolling back the R version used in Dataiku

Prerequisites

Upgrading the base R version used in a particular Dataiku environment is generally a two-step process:

• Upgrading the R distribution itself on the server (typically with the system package manager, such as yum or apt, depending on the Linux distribution)

• Rebuilding the default R environment and all managed code environments (that is, reinstalling all R packages in each environment)

The second step is needed because binary compatibility between different versions of R is not guaranteed. If the R packages are not reinstalled and the R environments are not rebuilt, packages can stop working. In particular, upgrading R from v3.4 to v3.5 has been known to break all installed packages. One such example can be seen in this GitHub thread.
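As a rough way to see which installed packages predate an upgrade, the sketch below compares the Built field reported by installed.packages() with the running R version. The helper names (major_minor, stale_packages) are hypothetical, and treating a major.minor mismatch as the signal is an assumption, not an official compatibility rule. Run it in the R session that Dataiku uses so that the library paths match.

```r
# Reduce a version string such as "3.4.4" to its major.minor part ("3.4").
major_minor <- function(v) {
  parts <- unlist(strsplit(as.character(v), ".", fixed = TRUE))
  paste(parts[1:2], collapse = ".")
}

# Return the packages whose "Built" version differs (major.minor) from the
# running R version. `pkgs` is a character matrix shaped like the result of
# installed.packages(); `running` defaults to the current R version.
stale_packages <- function(pkgs = installed.packages(), running = getRversion()) {
  built <- vapply(pkgs[, "Built"], major_minor, character(1))
  stale <- built != major_minor(running)
  data.frame(
    Package = unname(pkgs[stale, "Package"]),
    LibPath = unname(pkgs[stale, "LibPath"]),
    Built = unname(pkgs[stale, "Built"]),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# List candidates for reinstallation in the current session.
print(stale_packages())
```

An empty result suggests that every visible package was built under the current major.minor release. A non-empty result shows which library (LibPath) holds the outdated packages.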

Rebuilding the default R environment and managed code environments

To rebuild the default R environment (found under <dss_data_directory>/R.lib), rename or remove this directory and then re-run the install-r-integration script. Renaming rather than removing keeps the previous packages available in case you need to roll back. For more detailed instructions, refer to the Rebuilding the R environment subsection of the R integration documentation.

Managed code environments can be rebuilt through the Dataiku user interface: navigate to Administration > Code Envs, click the code environment, and select the “Rebuild env” option when updating it.

Please note that if you have manually installed additional packages in the system’s library (as root), they will also need to be rebuilt, as mentioned in the product documentation.

Rolling back to a previous version of R

If you saved the previously installed packages (by renaming the <dss_data_directory>/R.lib directory instead of deleting it, as suggested above), rolling back should be as simple as reinstalling the previous version of R with the appropriate system package manager and then restoring the moved-away packages. Otherwise, these packages will need to be reinstalled.
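After a rollback (or a rebuild), one way to confirm that packages actually load is to try loading each namespace and collect the failures. This is a minimal sketch: failing_packages is a hypothetical helper, and a package with a severe binary mismatch could crash the session instead of failing cleanly, so a throwaway R session is assumed.

```r
# Try to load the namespace of each package and return the names that fail.
# requireNamespace() returns FALSE instead of raising an error on failure.
failing_packages <- function(pkgs) {
  ok <- vapply(pkgs, function(p) {
    suppressWarnings(requireNamespace(p, quietly = TRUE))
  }, logical(1))
  unname(pkgs[!ok])
}

# Example call on two base packages, which should always load.
print(failing_packages(c("stats", "utils")))

# To check a whole library instead (this can be slow), something like:
# failing_packages(rownames(installed.packages(lib.loc = .libPaths()[1])))
```

Any name returned here is a candidate for reinstallation in the environment that owns it.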

Troubleshoot | My R recipes aren’t working after upgrading or migrating my instance

The first thing to check is whether the R version has been upgraded on your instance. If so, because R does not maintain binary compatibility between versions, first try rebuilding your default R environment and managed code environments: the packages in R.lib and/or your code environments are likely outdated.

If that doesn’t work, faulty packages have likely been installed at a global level and need to be removed. To do so:

• Run R from the terminal of your Dataiku server, from the Dataiku data directory.

./bin/R

• Check for the problematic package(s).

find.package("TheBrokenPackage")

• If a global path (such as /usr/share, /usr/lib, or /usr/local) is returned, open an R shell as the root user and remove the package(s).

remove.packages(c("TheBrokenPackage"))

• Repeat the previous steps until no broken packages remain.

Warning

Please make sure to replace TheBrokenPackage with the name of the actual package.
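When several packages are suspect, the location check from the steps above can be sketched as a small helper that reports where each package resolves and flags system-wide locations. The names global_prefixes, is_global_path, and locate_packages are hypothetical, and the prefix list only mirrors the example paths given above. The helper removes nothing: removal would still be done with remove.packages() from an R shell run as root.

```r
# Example system-wide prefixes, taken from the paths mentioned in the steps.
# Note that "/usr/lib" also matches "/usr/lib64".
global_prefixes <- c("/usr/share", "/usr/lib", "/usr/local")

# TRUE if `path` starts with any of the given prefixes.
is_global_path <- function(path, prefixes = global_prefixes) {
  any(startsWith(path, prefixes))
}

# For each package name, report the first path returned by find.package()
# (NA if the package is not found) and whether that path looks global.
locate_packages <- function(pkgs) {
  paths <- vapply(pkgs, function(p) {
    found <- find.package(p, quiet = TRUE)
    if (length(found) == 0) NA_character_ else found[1]
  }, character(1))
  is_global <- vapply(paths, function(p) !is.na(p) && is_global_path(p), logical(1))
  data.frame(
    package = pkgs,
    path = unname(paths),
    global = unname(is_global),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Replace TheBrokenPackage with the actual package name(s).
print(locate_packages(c("TheBrokenPackage")))
```

Rows with global set to TRUE correspond to the packages that the steps above suggest removing as root.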