Dataiku Academy
You are viewing the Knowledge Base for version 11 of Dataiku.

Preparing Data with Visual Recipes

This section collects concept articles, tutorials, and how-tos on Dataiku's visual recipes for data preparation.

Concepts

  • Concept: Distinct Recipe
  • Concept: Group Recipe
  • Concept: Join Recipe
  • Concept: Pivot Recipe
  • Concept: Prepare Recipe
  • Concept: Date Handling in Dataiku
  • Concept: Formulas in Dataiku
  • Concept: Filter Recipe
  • Concept: Sample Recipe
  • Concept: Sort Recipe
  • Concept: Split Recipe
  • Concept: Stack Recipe
  • Concept: Top N Recipe
  • Concept: Window Recipe

Tutorials

  • Advanced Prepare Recipe Usage
  • The Pivot Recipe
  • Hands-On Tutorial: Join Datasets
  • Hands-On Tutorial: Window Recipe
  • Hands-On Tutorial: Window Recipe (Deep Dive)
  • Hands-On Tutorial: Fuzzy Join Recipe

How-tos & Tips

  • How to reorder or hide the columns of a dataset
  • How to segment your data using statistical quantiles

© Copyright 2022, Dataiku.
