Concept: Plugins in Dataiku
Dataiku contains native visual components that allow you to connect to data, process data, train models, and so on. At the same time, Dataiku allows you the flexibility of implementing custom components and sharing them with others. These custom components are packaged as plugins.
There are four ways to access plugins in Dataiku:
by installing them from the Dataiku plugin store;
by developing them within Dataiku;
by uploading .zip files that package plugins; and
by fetching them from a git repository.
To develop a plugin, you program the backend in a language such as Python or R. Then, you create the user interface by configuring parameters in .json files. This lesson focuses on plugins that are already available in the plugin store.
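Although this lesson does not cover plugin development, the split between a declarative interface and a coded backend can be sketched in a few lines of Python. This is a minimal illustration, not Dataiku's actual plugin API: the descriptor is shown as a Python dictionary rather than a .json file, and the component, its parameter names, and the helpers `default_config` and `run_backend` are all hypothetical.

```python
import json

# Hypothetical descriptor for a plugin component. In a real plugin this would
# live in a .json file; the field names here are assumptions for illustration.
RECIPE_DESCRIPTOR = {
    "meta": {
        "label": "Uppercase a column",
        "description": "Illustrative component; not part of any real plugin.",
    },
    "params": [
        {"name": "column", "label": "Column to transform", "type": "STRING"},
        {"name": "suffix", "label": "Suffix to append", "type": "STRING",
         "defaultValue": ""},
    ],
}


def default_config(descriptor):
    """Collect the default value of every declared parameter (None if absent)."""
    return {p["name"]: p.get("defaultValue") for p in descriptor["params"]}


def run_backend(config, rows):
    """Hypothetical backend: plain Python that reads the values from the form."""
    column = config["column"]
    suffix = config.get("suffix") or ""
    return [{**row, column: str(row[column]).upper() + suffix} for row in rows]


if __name__ == "__main__":
    # The descriptor is plain data, so it serializes directly to JSON.
    print(json.dumps(RECIPE_DESCRIPTOR, indent=2))

    # Simulate a user filling in the generated form, then running the component.
    config = default_config(RECIPE_DESCRIPTOR)
    config["column"] = "state"
    print(run_backend(config, [{"state": "ny", "population": 1}]))
```

The point of the design is that the descriptor only declares which parameters exist, while the backend only consumes their values, so the form can change without touching the code and vice versa.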
A plugin in Dataiku can contain one or more related components. Each plugin component consists of a graphical user interface (GUI) wrapper around code, and it exposes a single type of Dataiku element, such as a dataset, recipe, webapp, or processor.
As an example, let’s take a look at the US Census plugin, which consists of six components: three visual recipes and three dataset connectors.
We can use the visual recipes from this plugin to enrich a dataset with one of the hundreds of socio-demographic variables from the US Census Bureau. We can also use the dataset connectors to build and use the US Census data directly within Dataiku.
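The relationship between a plugin and its components can be pictured as a small data model. The sketch below is purely illustrative: the `Plugin` and `Component` classes and the component names (`recipe-1`, `dataset-1`, and so on) are hypothetical placeholders, not the real identifiers used by the US Census plugin or by Dataiku.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    name: str          # hypothetical identifier, for illustration only
    element_type: str  # the single type of Dataiku element the component exposes


@dataclass
class Plugin:
    name: str
    components: list = field(default_factory=list)

    def count_by_type(self):
        """Count how many components expose each element type."""
        return Counter(c.element_type for c in self.components)


# Model the plugin as described in the text: three visual recipes and
# three dataset connectors. The names are placeholders.
us_census = Plugin(
    name="US Census",
    components=[Component(f"recipe-{i}", "recipe") for i in range(1, 4)]
    + [Component(f"dataset-{i}", "dataset") for i in range(1, 4)],
)

if __name__ == "__main__":
    print(len(us_census.components))
    print(dict(us_census.count_by_type()))
```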
A plugin’s documentation is available on the Plugins page of the Dataiku website. The plugin’s page also includes a link to its source code on GitHub. Note that many plugins from the Dataiku plugin store are open source.
Once installed, a plugin is available to all users of the Dataiku instance where it was installed. For example, the installed US Census plugin can be used directly and visually from the Flow to enrich some input data or connect to US census data. Let’s now explore how to use some plugin components.
Using a Visual Recipe Component
The plugin’s visual recipe components are available from the +Recipe button in the Flow or from the right panel.
The plugin’s visual recipe works like any other visual recipe. That is, we can select input parameters in the recipe settings and then run the recipe.
Using a Dataset Connector Component
To use one of the dataset connector components, such as Census USA, we can click the + Dataset button, which lists the new dataset connectors for connecting to US census data.
Then we enter values for parameters, such as the states for which we want to get data, the geography granularity, and the fields we want to extract. On the backend, Dataiku fetches the data from the US Census website.
Using a Processor Component
The Zipcode geocoding plugin includes a processor component called zipcode geocoding that is accessible from the Prepare recipe.
Using this processor, we can extract geographic coordinates from location data such as country and zip code. This processor step works just like any other in the Prepare recipe!