Concept: APIs in Dataiku DSS

In this lesson, you will learn about APIs in Dataiku DSS.

The Value of APIs

Your first introduction to Dataiku DSS may have been through its visual representations of datasets, recipes, and models.

For many use cases, these visual, point-and-click tools can quickly get the job done, while also opening avenues for collaboration with a much broader set of colleagues.

A slide recapping the value of visual tools in Dataiku.

As a coder, though, you never want a visual interface to restrict the scope of what’s possible. Dataiku offers APIs that let coders interact with DSS objects, and with the instance itself, purely through code.

The APIs provide coders the flexibility to accomplish the same tasks offered by the visual interface (and much more) by writing code.

A slide recapping the power of APIs in Dataiku.

Actions with the Dataiku DSS APIs

For example, the APIs give you the ability to:

  • read and write datasets;

  • interact with folders and saved models;

  • and perform dynamic SQL queries.

For these kinds of actions, the DSS APIs take care of the low-level engineering needed to connect to DSS objects. However, this is just the beginning of what’s possible.

You can also use the APIs to:

  • create and manage projects, groups, and users;

  • build Flows and check the status of jobs and scenarios;

  • and manage the schema, metadata, and partitioning of datasets.

For these kinds of actions, the DSS APIs allow you to programmatically drive the operation of your instance and the projects kept there.
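As a rough illustration of this second kind of action, the sketch below lists the projects on an instance and reads the schema of one dataset. It assumes a client object from Dataiku's Python API client is passed in; the helper names list_projects and dataset_columns are hypothetical, chosen only for this example.

```python
def list_projects(client):
    """Return the keys of all projects visible to this client, sorted."""
    # `client` is assumed to be a DSS API client object.
    return sorted(client.list_project_keys())


def dataset_columns(client, project_key, dataset_name):
    """Return (column name, column type) pairs for one dataset's schema."""
    project = client.get_project(project_key)
    dataset = project.get_dataset(dataset_name)
    schema = dataset.get_schema()  # a dict with a "columns" list
    return [(col["name"], col["type"]) for col in schema["columns"]]
```

Called as dataset_columns(client, "SALES", "orders"), where the project key and dataset name are placeholders, this would return the column names and types without opening the dataset in the visual interface.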

A slide introducing some of the actions possible with the APIs.

Let’s begin with the first set of actions, such as reading and writing datasets.

Whether you use Python or R, these functions are found in a package named dataiku. You may have already seen it imported at the top of any default code recipe.
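A minimal Python sketch of reading and writing datasets with the dataiku package might look like the following. The function name copy_non_empty_rows and the cleaning step are illustrative assumptions, not part of the lesson; the import sits inside the function only so the sketch can be loaded where the package is not installed.

```python
def copy_non_empty_rows(input_name, output_name):
    """Read a DSS dataset, drop fully empty rows, write the result.

    Returns the number of rows written.
    """
    # Imported lazily: the dataiku package is only importable where it is
    # installed, for example inside a DSS code recipe or notebook.
    import dataiku

    source = dataiku.Dataset(input_name)
    df = source.get_dataframe()  # loads the dataset as a pandas DataFrame

    # Illustrative transformation: drop rows where every value is missing.
    cleaned = df.dropna(how="all")

    target = dataiku.Dataset(output_name)
    target.write_with_schema(cleaned)  # writes the rows and sets the schema
    return len(cleaned)
```

In a recipe, this could be called as copy_non_empty_rows("orders", "orders_prepared"), where both dataset names are placeholders for a recipe's declared input and output.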

A slide introducing some of the functions found in the dataiku package.

The second set of actions, which allows you to drive the operation of your instance or projects, is part of the public API. The public API is an HTTP REST API, but the recommended way to access it is through the Python API client called dataikuapi.

  • If working inside Dataiku DSS, you won’t even need to explicitly load this package. You’ll be using it “under the hood”.

  • If working outside Dataiku DSS, know that the dataikuapi package is available for public download. You will learn more about this in another lesson on using the APIs outside of DSS.
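The two situations above can be sketched in one small helper. This is a hedged example rather than a prescribed pattern: get_client is a hypothetical name, and the host URL and API key are values you would supply for your own instance.

```python
def get_client(host=None, api_key=None):
    """Return a DSS API client, from inside or outside DSS.

    With no arguments, assume the code runs inside DSS.
    With a host and API key, connect from outside DSS.
    """
    if host is None:
        # Inside DSS: the dataiku package hands back a ready-made client.
        import dataiku
        return dataiku.api_client()

    if api_key is None:
        raise ValueError("an API key is required when connecting from outside DSS")

    # Outside DSS: build the client explicitly with dataikuapi.
    import dataikuapi
    return dataikuapi.DSSClient(host, api_key)
```

Inside DSS, get_client() is enough. From outside, a call such as get_client("https://dss.example.com:11200", "YOUR_API_KEY") would be needed, where both arguments are placeholders.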

A slide introducing the public API.

APIs outside Dataiku DSS

No matter which API you are calling, the choice of coding within Dataiku DSS or outside of it remains yours.

The APIs are available “out-of-the-box” inside DSS, but you can also import these packages outside of DSS. We’ll show how this can be done in another lesson.

A slide introducing how Dataiku APIs can be used inside or outside of the platform.

Now that you have a high-level understanding of APIs in Dataiku DSS, let’s start investigating how to use methods from the dataiku package while working inside the platform.