Dataiku Knowledge
 
Dataiku Academy
You are viewing the Knowledge Base for version 10.0 of DSS.

Advanced Prepare Recipe Usage

Although it is easy to use, the Prepare recipe is also packed with powerful functionality that may not be immediately obvious.

The following articles build on the material already introduced in the Basics Courses.

This section focuses on the Prepare recipe, but most of the material also applies to visual analyses in the Lab, which can be deployed to the Flow as Prepare recipes.

Articles

  • Handling Decimal Notations
  • Enriching Web Logs
  • Applying Prepare Steps to Multiple Columns
  • Performing Joins in the Prepare Recipe
  • Become a Master of Dataiku Formulas
  • Custom Python functions in the Prepare Recipe
  • How to standardize text fields using fuzzy values clustering
  • How to fill empty cells of a column with the value of the corresponding row from another column
  • How to remove scientific notation in a column
  • How to pad a number with leading zeros
  • Safe sums across columns in Dataiku DSS Formulas
  • In a Formula, how to check if a variable belongs to a set of values?
  • How to copy-paste Prepare recipe steps
  • Dealing with Accounting-style negative numbers
  • How-To: Filter and Process Dates Interactively
  • How-To: Extract Patterns With the Smart Pattern Builder
  • Hands-On Tutorial: Visual Logic for Data Preparation

© Copyright 2022, Dataiku.
