Hands-On Tutorial: The Public API in Dataiku
The Dataiku APIs give coders the flexibility to complete both routine and complex tasks with code instead of the visual interface.
In this tutorial, you will gain a better understanding of the difference between the HTTP REST API and the Python client for the public API. In addition, you’ll learn how to programmatically interact with Dataiku objects such as projects, datasets, variables, flows, jobs, and scenarios.
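To make the distinction concrete, here is a minimal sketch, not taken from the tutorial notebook, that lists project keys in two ways: once with a raw HTTP call and once through the Python client. The host URL, the API key, and the helper function names are hypothetical placeholders. The sketch assumes the REST API exposes a project listing at /public/api/projects/ and accepts the API key as the HTTP Basic username with an empty password, so check both against the documentation for your Dataiku version.

```python
import base64


def projects_url(host):
    """Build the (assumed) REST endpoint that lists projects."""
    return host.rstrip("/") + "/public/api/projects/"


def basic_auth_header(api_key):
    """Encode the API key as an HTTP Basic username with an empty password."""
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return {"Authorization": "Basic " + token}


def list_project_keys_rest(host, api_key):
    """HTTP REST API: send the request yourself and parse the JSON reply."""
    import requests  # third-party package listed in the prerequisites

    response = requests.get(projects_url(host), headers=basic_auth_header(api_key))
    response.raise_for_status()
    # Assumption: each returned item carries a "projectKey" field.
    return [item["projectKey"] for item in response.json()]


def list_project_keys_client(host, api_key):
    """Python client: the same request wrapped in a method call."""
    import dataikuapi  # Dataiku's Python client package

    client = dataikuapi.DSSClient(host, api_key)
    return client.list_project_keys()


# Hypothetical values, for illustration only.
HOST = "http://localhost:11200"
print(projects_url(HOST))
```

Both functions should return the same information. The Python client hides the URL construction, authentication, and JSON parsing that the REST version handles by hand.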
Let’s Get Started
The use case for this project is simple: run a scenario to print a message informing us whose birthday is today. However, rather than relying on the visual interface, you’ll be using the public API to execute this and a number of other tasks.
To complete this tutorial, you will need:
Dataiku - version 9.0 or above.
A Python environment that includes the requests package (the datetime module is part of the Python standard library).
This tutorial was tested using a Python 3.6 code environment, but other Python versions may also be compatible.
You can find instructions for creating a code environment compatible with all courses in the Developer learning path in this article.
When you have completed the tutorial, you will have built the Flow pictured below and much more (all without touching the visual tools):
Create a Project
To get started, create the project below.
From the Dataiku homepage, click on +New Project > DSS Tutorials > Developer > APIs in Dataiku DSS.
You can also download the starter project from this website and import it as a zip file.
The starting Flow of this project is very simple.
The birthdates dataset contains records of names and birth dates.
The Prepare recipe extracts the components of the birth dates (day, month, year).
The Filter recipe filters the dataset for the current date.
We could manually build this Flow, but instead, we are going to use the public API to automate this task.
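For intuition about what the two recipes compute, the following plain-Python sketch mimics them outside of Dataiku. It is not the recipes' actual implementation. The column names (name, birthdate), the ISO date format, the sample rows, and the rule that a birthday matches on month and day only are all assumptions made for illustration.

```python
from datetime import date, datetime


def extract_components(birthdate):
    """Mimic the Prepare step: split a YYYY-MM-DD string into its parts."""
    parsed = datetime.strptime(birthdate, "%Y-%m-%d").date()
    return {"year": parsed.year, "month": parsed.month, "day": parsed.day}


def birthdays_on(records, today):
    """Mimic the Filter step: keep names whose month and day match today."""
    matches = []
    for record in records:
        parts = extract_components(record["birthdate"])
        if (parts["month"], parts["day"]) == (today.month, today.day):
            matches.append(record["name"])
    return matches


# Hypothetical sample rows; the real birthdates dataset may look different.
records = [
    {"name": "Ada", "birthdate": "1990-03-14"},
    {"name": "Grace", "birthdate": "1985-12-09"},
]

for name in birthdays_on(records, date.today()):
    print(f"Happy birthday, {name}!")
```

Passing a fixed date, such as date(2024, 3, 14), instead of date.today() makes the behavior easy to check by hand.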
Open the Notebook
Unlike most Academy tutorials, most of the instructions for this tutorial are contained in a pre-existing Python notebook.
Navigate to the Notebooks page (G+N).
Open the notebook Tutorial Instructions, and begin running the cells one at a time.
Depending on the kernels available on your instance, you may need to select a different kernel inside the notebook.