Deliver More AI Projects and More Value

Dataiku 9 builds on core strengths related to Designing, Deploying, and Distributing AI & Analytics.

Dataiku 9 features at a glance

Faster, More Transparent Model Development

Dataiku 9 gives data analysts and novice data scientists more tools to create high-quality machine learning models. The release includes best-practice guardrails to prevent common pitfalls, model assertions to capture and test known use cases, and what-if analysis to interactively test model sensitivity. Everyone, including advanced data scientists, can also benefit from faster model development with distributed hyperparameter search on Kubernetes. As a result, people with different backgrounds can get involved in data science projects and drive success and value.
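To make the ideas of model assertions and what-if analysis more concrete, here is a minimal, generic sketch in plain Python. It is not Dataiku's API: the toy `predict` function, the `check_assertion` and `what_if` helpers, and the sample rows are all hypothetical names and data chosen for illustration. An assertion is treated here as "among rows matching a condition, at least a given share should receive an expected prediction", and what-if analysis as "change one feature and see how the prediction responds".

```python
def predict(row):
    """Toy stand-in for a trained model (hypothetical rule, not a real model)."""
    if row["tenure_years"] >= 3 and row["monthly_spend"] >= 50:
        return "loyal"
    return "at_risk"


def check_assertion(model, rows, condition, expected, min_ratio):
    """Check that at least min_ratio of the rows matching condition get the expected prediction."""
    matching = [r for r in rows if condition(r)]
    if not matching:
        return None  # nothing to check: no row satisfies the condition
    hits = sum(model(r) == expected for r in matching)
    ratio = hits / len(matching)
    return ratio >= min_ratio, ratio


def what_if(model, row, feature, values):
    """Re-score one row while varying a single feature, leaving the others unchanged."""
    return {v: model({**row, feature: v}) for v in values}


# Illustrative data only.
rows = [
    {"tenure_years": 5, "monthly_spend": 80},
    {"tenure_years": 4, "monthly_spend": 60},
    {"tenure_years": 1, "monthly_spend": 90},
    {"tenure_years": 6, "monthly_spend": 20},
]

# Domain knowledge: most long-tenure customers should be predicted "loyal".
passed, ratio = check_assertion(
    predict, rows, lambda r: r["tenure_years"] >= 4, expected="loyal", min_ratio=0.6
)

# How sensitive is the last customer's prediction to monthly spend?
scenarios = what_if(predict, rows[3], "monthly_spend", [20, 50, 100])

print(passed, round(ratio, 2))
print(scenarios)
```

Keeping the assertion as a condition, an expected outcome, and a minimum ratio makes it easy for a domain expert to state a known use case without touching the model itself.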

Easier, More Scalable Production Management

The new Unified Deployer provides an easy-to-use, self-service interface that lets data team members deploy AI projects, in the form of bundles, from Design nodes to Automation nodes. A data scientist, for example, can bundle a project and then pick the IT-configured Automation node where they would like to deploy it. Programmatic access to the deployment process is also enhanced, allowing integration with CI/CD systems for project deployment and management.

More Capabilities for Business Analysts and Users

There are new capabilities for business teams engaged in data preparation tasks. The fuzzy join recipe makes it easy to join on close-but-not-equal key values. The formula editor has been updated to make it easier to learn and use, and new date handling improvements streamline time and date preparation. A smart pattern builder has been added to make constructing regular expressions easier.
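As an illustration of what joining on close-but-not-equal key values means, the following sketch uses Python's standard `difflib` module. It is a generic approximation of the idea, not the implementation of the fuzzy join recipe: the `fuzzy_join` helper, the `right_` column prefix, the `cutoff` value, and the sample records are assumptions made for this example.

```python
from difflib import get_close_matches


def fuzzy_join(left, right, key, cutoff=0.8):
    """Inner-join two lists of dicts on approximately equal string keys.

    cutoff is a similarity threshold between 0 and 1 (hypothetical default).
    """
    right_by_key = {row[key]: row for row in right}
    candidates = list(right_by_key)
    joined = []
    for row in left:
        # Keep only the single best candidate above the similarity threshold.
        matches = get_close_matches(row[key], candidates, n=1, cutoff=cutoff)
        if matches:
            match = right_by_key[matches[0]]
            # Prefix right-hand columns so they do not overwrite left-hand ones.
            merged = {**row, **{f"right_{k}": v for k, v in match.items()}}
            joined.append(merged)
    return joined


# Illustrative data only: the company name is spelled slightly differently.
customers = [
    {"name": "Acme Corp", "city": "Paris"},
    {"name": "Globex", "city": "Berlin"},
]
orders = [
    {"name": "Acme Corp.", "amount": 120},
    {"name": "Initech", "amount": 75},
]

result = fuzzy_join(customers, orders, key="name")
print(result)
```

An exact join on `name` would return no rows here, because "Acme Corp" and "Acme Corp." differ by one character; the similarity threshold is what lets the two records pair up while unrelated names stay unmatched.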

Other Noteworthy Features

Adding to its web application capabilities, Dataiku 9 introduces support for the popular Dash application framework, enabling organizations to use the application development framework that best meets their needs.

Dataiku now supports streaming analytics, with a new set of dedicated visual recipes that let everyone in the organization analyze and collaborate on real-time event data. Streaming endpoints can send event data to and receive it from multiple sources, such as Apache Kafka, and a dedicated monitoring dashboard lets you monitor continuous jobs.

A new Python-based automated time series forecasting plugin offers high-performance univariate and multivariate forecasting, including the latest deep learning approaches via Deep Learning

Notebooks gain Git versioning, along with the ability to import one or multiple notebooks from, and export them to, an externally managed Git repository such as GitHub, Bitbucket, or GitLab.