Dataiku Knowledge Base

User Guide

  • Getting Started
    • Quick Starts
      • Quick Start | Dataiku for data preparation
      • Quick Start | Dataiku for machine learning
      • Quick Start | Dataiku for MLOps
      • Quick Start | Dataiku for AI collaboration
      • Quick Start | Excel to Dataiku
        • Concept | From Excel to Dataiku
      • Quick Start | Alteryx to Dataiku
      • Quick Start | Dataiku for manufacturing data preparation and visualization
    • Dataiku User Interface
      • Concept | Dataiku Cloud Launchpad
      • Concept | Dataiku Design homepage
      • Concept | Project
      • Concept | Flow
      • Concept | Searching in Dataiku
      • Concept | Flow views, search, and filter
      • Tutorial | Explore the Flow
      • Tutorial | Flow zones
      • Reference | Navigation bar
      • Reference | Right panel navigation
      • How-to | Duplicate a Dataiku project
      • How-to | Find the Dataiku version
      • How-to | Rearrange Flow zones
      • Tip | Flow navigation shortcuts
      • Tip | Anchoring for Flow management
      • Tip | Hide or show Flow items
      • Tip | Using project folders
  • Data Sourcing
    • Data Connections
      • Concept | Data connections
      • Concept | Architecture model for databases
      • Concept | Connection changes
      • Tutorial | Configure a connection between Dataiku and an SQL database
      • Tutorial | Data transfer with visual recipes
      • Reference | A primer on connecting to data sources
      • How-to | Remap a connection when importing a project to a Dataiku instance
      • How-to | Utilize MS Access
    • Dataiku Datasets
      • Concept | Dataiku datasets
      • Concept | Dataset characteristics
      • Concept | Sampling on datasets
      • Concept | Dataset conditional formatting
      • Concept | Analyze data quality in the Explore tab
      • Tutorial | Getting started with datasets
      • How-to | Rename a dataset
      • How-to | Reorder or hide dataset columns
      • How-to | Export a filtered dataset
      • How-to | Apply a filter to summary statistics in the Analyze window
      • Tip | Good dataset naming schemes
      • FAQ | Why can’t I drag a folder into Dataiku?
      • FAQ | Where can I see how many records are in my entire dataset?
  • Data Preparation
    • Visual Recipes
      • Concept | Recipes in Dataiku
      • Concept | Sync recipe
      • Concept | Group recipe
      • Concept | Join recipe
      • Concept | Distinct recipe
      • Concept | Pivot recipe
      • Concept | Sample/Filter recipe
      • Concept | Sort recipe
      • Concept | Split recipe
      • Concept | Stack recipe
      • Concept | Top N recipe
      • Concept | Window recipe
      • Concept | Fuzzy join recipe
      • Concept | Geo join recipe
      • Concept | Labeling recipe
      • Concept | Common steps in visual recipes: Pre-filter, Post-filter, & Computed columns
      • Concept | Dynamic dataset and recipe repeat
      • Concept | Generate recipes using Generative AI
      • Tutorial | Group recipe
      • Tutorial | Join recipe
      • Tutorial | Distinct recipe
      • Tutorial | Pivot recipe
      • Tutorial | Top N recipe
      • Tutorial | Window recipe
      • Tutorial | Fuzzy join recipe
      • Tutorial | In-database data visualization and preparation
      • Tutorial | Geo join recipe
      • Tutorial | Compute isochrones and routes with the Geo Router plugin
      • Tutorial | Working with shapefiles and US census data
      • Tutorial | Dynamic recipe repeat
      • How-to | Insert or delete a recipe within the Flow
      • How-to | Segment your data using statistical quantiles
    • Prepare Recipe
      • Concept | Prepare recipe
      • Tutorial | Prepare recipe
      • Tutorial | Smart pattern builder for string pattern extraction
      • Tutorial | Visual logic processors for data preparation
      • Tutorial | Geographic processors
      • Tutorial | Enrich web logs in the Prepare recipe
      • Reference | Performing joins in the Prepare recipe
      • Reference | Using custom Python functions in the Prepare recipe
      • Reference | Handling decimal notations
      • How-to | Normalize number formats in a Prepare recipe
      • How-to | Handle accounting-style negative numbers
      • How-to | Copy-paste Prepare recipe steps
      • How-to | Apply Prepare steps to multiple columns
      • How-to | Standardize text fields using fuzzy values clustering
      • How-to | Reshape data from wide to long format
      • How-to | Generate Prepare recipe steps with AI
    • Dataiku Formulas
      • Concept | Dataiku formulas
      • Concept | Dataiku formulas cheat sheet
      • Concept | Safe sums across columns in Dataiku formulas
      • Tutorial | Relative referencing in Dataiku formulas
      • How-to | Remove scientific notation in a column
      • How-to | Pad a number with leading zeros
      • How-to | Fill empty cells of a column with the value of the corresponding row from another column
      • FAQ | In a formula, how can I check if a variable belongs to a set of values?
    • Data Pipelines & Computation Engines
      • Concept | Computation engines
      • Concept | Build modes
      • Concept | Data pipeline optimization
      • Concept | Where computation happens in Dataiku
      • Tutorial | Build modes
      • Tutorial | Recipe engines
      • How-to | Access job information
      • How-to | Enable SQL pipelines in the Flow
    • The Lab
      • Concept | Visual analyses in the Lab
      • Tutorial | Visual analyses in the Lab
    • Managing Dates
      • Concept | Date handling in Dataiku
      • Reference | How Dataiku handles and displays date and time
    • From Excel to Dataiku
      • Tutorial | Relative referencing in Dataiku formulas
      • How-to | Work with editable datasets
      • How-to | Import an Excel workbook
      • Reference | Data cleaning
      • Reference | Using formulas
      • Reference | Working with dates
      • Reference | Removing duplicates
      • Reference | Filtering rows
      • Reference | Sampling rows
      • Reference | Split a dataset
      • Reference | Append datasets
      • Reference | Joining datasets
      • Reference | Aggregate and pivot
      • Reference | Sorting values
      • Reference | Top values
    • From Alteryx to Dataiku
      • Reference | Alteryx to Dataiku concept mapping
  • Data Visualization
    • Charts
      • Concept | Charts
      • Concept | In-database charts
      • Tutorial | Charts
      • Tutorial | Pivot tables
      • Tutorial | Paneled and animated charts
      • Tutorial | Custom aggregation for charts
      • Tutorial | No-code maps
      • FAQ | How do I display non-aggregated metrics in charts?
      • FAQ | How do I sort on a measure not displayed in charts?
    • Dashboards
      • Concept | Dashboards
      • Tutorial | Use dashboards to build reports
      • Tutorial | Dashboard management
      • How-to | Manage sampling on insights
      • Reference | Understand source data for filters
      • Troubleshoot | Can’t display a web content insight in a dashboard
    • Webapps
      • Concept | Webapps
      • How-to | Display an image in a Bokeh webapp
    • Static Insights
      • Concept | Static insights
      • Tutorial | Static insights
    • Visualization Plugins
      • Concept | Data visualization plugins
  • Collaboration
    • Collaboration Overview
      • Concept | Collaboration
    • Wikis & Flow Documentation
      • Concept | Explain the Flow with generative AI
      • Concept | Workflow documentation in a wiki
      • Reference | Using the project wiki
      • Reference | Sharing and promoting wikis
      • How-to | Create a wiki article
      • How-to | Export a wiki to a PDF
      • How-to | Generate and export Flow documentation
      • Tip | Link Dataiku objects in a wiki article
    • Tags & Object Descriptions
      • Concept | Tags
      • Tip | Suggestions for using tags
      • Tip | Commenting to document Dataiku objects
    • Sharing Projects & Dataiku Assets
      • Concept | Project permissions and asset sharing
      • Concept | Data Catalog
      • Reference | Managing project access
      • How-to | Set up limited access to projects
      • How-to | Manage project access requests
      • How-to | Share project to non-Dataiku users
      • How-to | Manage object sharing
      • How-to | Enable quick sharing of datasets and objects
      • How-to | Copy Flow items to a new or existing project
    • Discussions
      • Concept | Discussions
      • Reference | Managing discussions
      • How-to | Start discussions in a Dataiku object
    • Workspaces
      • Concept | Workspaces
      • Reference | Centralized versus delegated workspaces
      • How-to | Create a workspace
      • How-to | Share a workspace to non-Dataiku users
    • Project Version Control
      • Concept | Version control for Dataiku projects
      • Tutorial | Git for projects
      • How-to | Undo actions in Dataiku
    • Stories
      • Concept | Dataiku stories
      • Tutorial | Dataiku stories with Generative AI
      • Tutorial | Dataiku stories
      • Reference | Story user interface
      • How-to | Enable Story AI
      • How-to | Import a story
  • Data Quality & Automation
    • Variables
      • Concept | Variables in Dataiku
      • Tutorial | Project variables in visual recipes
      • Tutorial | Coding with variables
    • Data Quality
      • Concept | Metrics
      • Concept | Checks
      • Concept | Data quality rules
      • Concept | Metrics & checks (pre-12.6)
      • Concept | Data lineage
      • Tutorial | Data quality
      • Tutorial | Custom metrics, checks, and data quality rules
      • Tutorial | Data quality and SQL metrics
      • FAQ | What’s the difference between distinct and unique value count metrics?
    • Automation Scenarios
      • Concept | Automation scenarios
      • Concept | Custom metrics, checks, data quality rules & scenarios
      • Tutorial | Automation scenarios
      • Tutorial | Scenario reporters
      • Tutorial | Webhook reporters in scenarios
      • Tutorial | Custom step-based scenarios
      • Tutorial | Custom script scenarios
      • How-to | Automate documentation exports in a scenario
      • How-to | Build missing partitions with a scenario
      • Code Sample | Set a timeout for a scenario build step
      • Code Sample | Set email recipients in a “Send email” reporter
      • FAQ | Can I control which datasets in my Flow get rebuilt during a scenario?
    • Dataiku Applications
      • Concept | Dataiku applications
      • Tutorial | Dataiku applications
      • Reference | Use cases of Dataiku applications
    • Partitioning
      • Concept | Partitioning
      • Concept | How partitioning adds value
      • Concept | Partitioned datasets
      • Concept | Jobs with partitioned datasets
      • Concept | Partitioning by pattern
      • Concept | Partitioning in a scenario
      • Concept | Partition redispatch and collection
      • Tutorial | File-based partitioning
      • Tutorial | Column-based partitioning
      • Tutorial | Partitioning in a scenario
      • Tutorial | Repartition a non-partitioned dataset
      • Tip | Interacting with partitioned datasets using the Python API
  • Machine Learning & Analytics
    • Interactive Statistics
      • Concept | Statistics worksheets
      • Concept | Statistics cards
      • Concept | Generate statistics recipe
      • Concept | Variable types for interactive statistics
      • Concept | Factor and response roles in statistics cards
      • Concept | Statistics cards for fit curves and distributions
      • Concept | Correlation matrices in statistical worksheets
      • Concept | Principal Component Analysis (PCA)
      • Concept | Hypothesis testing
      • Concept | Hypothesis test categories
      • Concept | Grouping variables in statistical testing
      • Concept | Adjustment methods for hypothesis test cards
      • Tutorial | Interactive statistics
      • How-to | Export a statistics card as a recipe
    • Machine Learning Concepts
      • Concept | Introduction to machine learning
      • Concept | Predictive modeling
      • Concept | Model validation
      • Concept | Model evaluation
      • Concept | Regression algorithms
      • Concept | Classification algorithms
      • Concept | Clustering algorithms
    • Feature Engineering
      • Concept | Data preparation for machine learning
      • Concept | Generate Features recipe
      • Tutorial | Generate Features recipe
      • Tutorial | Events aggregator plugin
    • AutoML Model Design
      • Concept | Quick models in Dataiku
      • Concept | The Design tab within the visual ML tool
      • Concept | Features handling
      • Concept | Multimodal ML using LLMs
      • Concept | Feature generation & reduction
      • Concept | Algorithm and hyperparameter selection
      • Concept | ML diagnostics
      • Concept | ML assertions
      • Tutorial | Machine learning basics
      • Tutorial | Model overrides
      • Tutorial | ML diagnostics
      • Tutorial | ML assertions
      • Tutorial | Clustering (unsupervised) models with visual ML
      • Tutorial | MLlib with Dataiku
      • How-to | Distributed hyperparameter search
      • FAQ | How does the AutoML tool automatically select or reject features when training a model?
      • Troubleshoot | In visual ML, I get the error “All values of the target are equal” when they’re not
    • AutoML Model Results
      • Concept | The Result tab within the visual ML tool
      • Concept | Model summaries within the visual ML tool
      • Concept | Explainable AI
      • Concept | Partial dependence plots
      • Concept | Subpopulation analysis
      • Concept | Individual prediction explanations
      • Concept | What if? analysis
      • Concept | Advanced What if? simulators
      • Concept | Interpretation of regression model output
      • Tutorial | Advanced What if simulators
      • Tutorial | Exporting a model’s preprocessed data with a Jupyter notebook
      • How-to | Set up What if analysis for a dashboard consumer
      • FAQ | Why don’t the values in the Visual ML chart match the final scores for each algorithm?
    • Model Scoring
      • Concept | Model deployment to the Flow
      • Concept | Scoring data
      • Concept | Model validation and evaluation
      • Tutorial | Model scoring basics
    • Custom Models in Visual ML
      • Concept | Custom preprocessing within the visual ML tool
      • Concept | Custom modeling within the visual ML tool
      • Concept | Tuning XGBoost models in Python
      • Tutorial | Custom preprocessing & modeling within visual ML
      • Tutorial | Azure AutoML from a Dataiku notebook
    • Time Series
      • Concept | Introduction to time series
      • Concept | Time series data types and formats
      • Concept | Time series components
      • Concept | Objectives of time series analysis
      • Concept | Time series analysis with interactive statistics
      • Concept | Time series preparation
      • Concept | Time series resampling
      • Concept | Time series interval extraction
      • Concept | Time series windowing
      • Concept | Time series extrema extraction
      • Concept | Time series forecasting
      • Tutorial | Time series analysis
      • Tutorial | Time series forecasting (Visual ML)
      • Tutorial | Time series preparation
      • Tutorial | Forecasting time series data with R and Dataiku
      • Tutorial | Deep learning for time series
      • Tutorial | Export preprocessed data (for time series models)
    • Causal Prediction
      • Concept | Causal prediction
      • Tutorial | Causal prediction
    • Text Processing
      • Concept | Regular expressions in Dataiku
      • Concept | Introduction to natural language processing (NLP)
      • Concept | Challenges of natural language processing (NLP)
      • Concept | Cleaning text data
      • Concept | Handling text features for machine learning
      • Tutorial | Build a text classification model
    • Images
      • Concept | Pre-trained image classification models
      • Concept | Optimization of image classification models
      • Concept | Object detection
      • Tutorial | Image classification without code
      • Tutorial | Image classification with code
      • Tutorial | Object detection without code
      • How-to | Prepare images for use in a model
    • Geospatial Analytics
      • Concept | Geo join recipe
      • Tutorial | Geographic processors
      • Tutorial | No-code maps
      • Tutorial | Geo join recipe
      • Tutorial | Compute isochrones and routes with the Geo Router plugin
      • Tutorial | Working with shapefiles and US census data
      • Reference | Overview of Dataiku’s geospatial features
    • Partitioned Models
      • Concept | Partitioned models
      • Tutorial | Partitioned models
      • How-to | Train a stratified or partitioned model
    • Deep Learning
      • Tutorial | Deep learning within visual ML
      • Tutorial | Deep learning for time series
    • Active Learning
      • Tutorial | Active learning for classification problems
      • Tutorial | Active learning for object detection problems
      • Tutorial | Help on active learning webapp
      • Tutorial | Active learning for object detection problems using Dataiku apps
      • Tutorial | Active learning for tabular data classification problems using Dataiku apps
    • Responsible AI
      • Concept | Responsible AI
      • Concept | Dangers of irresponsible AI
      • Concept | Responsible AI in the data science practice
      • Concept | Basics of bias
      • Concept | Model fairness
      • Concept | Evaluating group fairness
      • Concept | Interpretability
      • Concept | Model transparency
      • Concept | Deployment biases
      • Tutorial | Responsible AI training
      • Reference | RAI further reading
  • Generative AI and Large Language Models (LLMs)
    • LLM Administration
      • Concept | LLM connections
      • Concept | Guardrails against risks from Generative AI and LLMs
    • Text Processing with Visual LLM Recipes
      • Concept | Large language models and the LLM Mesh
      • Concept | Classify text recipe
      • Concept | Summarize text recipe
      • Concept | Prompt Studios and Prompt recipe
      • Tutorial | Classify text with Generative AI
      • Tutorial | Summarize text with Generative AI
      • Tutorial | Prompt engineering with LLMs
      • Tutorial | Processing text with the Prompt recipe
    • Retrieval Augmented Generation (RAG)
      • Concept | Embed recipes and Retrieval Augmented Generation (RAG)
      • Tutorial | Retrieval Augmented Generation (RAG) with the Embed dataset recipe
      • Tutorial | Build a multimodal knowledge bank for a RAG project
      • Tutorial | Build a conversational interface with Dataiku Answers
    • LLMOps
      • Tutorial | LLM evaluation
  • Code
    • Getting Started with Code in Dataiku
      • Concept | Code notebooks
      • Concept | Code recipes
      • Tutorial | Code notebooks and recipes
    • Python and Dataiku
      • Tutorial | Code notebooks and recipes
      • Tutorial | SQL from a Python recipe in Dataiku
      • Tutorial | Sessionization in SQL, Hive, Python, and Pig
      • Tutorial | PySpark in Dataiku
      • Reference | Reading or writing a dataset with custom Python code
      • How-to | Enable auto-completion in a Jupyter notebook
      • Code Sample | Access info about datasets
    • SQL and Dataiku
      • Concept | SQL notebooks
      • Concept | SQL code recipes
      • Concept | AI SQL Assistant
      • Tutorial | SQL notebooks and recipes
    • R and Dataiku
      • Tutorial | Dataiku for R users
      • Tutorial | R Markdown reports
      • Tutorial | Forecasting time series data with R and Dataiku
      • Tutorial | R Shiny webapps
      • Reference | Upgrading and rolling back the R version used in Dataiku
      • How-to | Edit Dataiku recipes in RStudio
      • Troubleshoot | R recipes aren’t working after upgrading or migrating the instance
    • Work Environment
      • Concept | Code environments
      • Concept | External IDE integrations
      • Tutorial | My first Code Studio
      • How-to | Create a code environment
      • How-to | Set a code environment
      • How-to | Edit Dataiku projects and plugins in VS Code
      • How-to | Edit Dataiku projects and plugins in PyCharm
      • How-to | Edit Dataiku projects and plugins in Sublime
      • How-to | Edit Dataiku recipes in RStudio
      • FAQ | Why should I use a code environment?
    • Shared Code
      • Concept | Introduction to shared code
      • Concept | Shared code libraries
      • Concept | Importing code from a remote Git repository
      • Concept | Code samples
      • Tutorial | Shared code
      • Tutorial | Cloning a library from a remote Git repository
      • How-to | Import a notebook from GitHub
      • Tip | Best practices for notebook development between GitHub and Dataiku
    • Dataiku APIs
      • Concept | Dataiku APIs
      • Concept | The dataiku package
      • Concept | Dataiku public API
      • Concept | Usage of Dataiku APIs outside of Dataiku
      • Tutorial | Dataiku public API
      • Tip | Using the API within Dataiku (Basics)
      • Tip | Automating work in Dataiku with the API
      • Tip | Administering Dataiku remotely
    • Managed Folders
      • Concept | Managed folders
      • Tutorial | Managed folders
  • MLOps & Operationalization
    • MLOps Architecture
      • Concept | Definition, challenges, and principles of MLOps
      • Concept | How model development impacts MLOps
      • Concept | Model packaging for deployment
      • Concept | Dataiku architecture for MLOps
    • Batch Deployment
      • Concept | Automation node preparation
      • Concept | Batch deployment
      • Tutorial | Batch deployment
    • Test Scenarios
      • Tutorial | Test scenarios
    • API Deployment
      • Concept | Real-time APIs
      • Concept | API endpoints
      • Concept | API query enrichments
      • Concept | API Deployer
      • Tutorial | Real-time API deployment
    • Model Monitoring
      • Concept | Process governance for MLOps
      • Concept | Model comparisons
      • Concept | Model evaluation stores
      • Concept | Monitoring model performance and drift in production
      • Concept | Monitoring and feedback in the AI project lifecycle
      • Tutorial | Model monitoring with a model evaluation store
      • Tutorial | API endpoint monitoring
      • Tutorial | Model monitoring in different contexts
      • Tutorial | Deployment automation
      • FAQ | How can I get model monitoring metrics in a dataset format?
    • External Models
      • Tutorial | Surface external models within Dataiku
    • Dataiku Govern
      • Concept | Introducing Dataiku Govern
      • Concept | Centralization in Dataiku Govern
      • Concept | Governance layers
      • Concept | Govern item pages
      • Concept | Workflows and project qualification
      • Concept | Governed projects
      • Concept | Business initiatives
      • Concept | Sign-off process
      • Concept | Model maintenance in Dataiku Govern
      • Concept | Govern roles and permissions
      • Concept | Customizing a Dataiku Govern instance
      • Tutorial | Dataiku Govern framework
      • Tutorial | Govern roles and permissions
      • Tutorial | Blueprint Designer
      • Tutorial | Custom Pages Designer
      • Tutorial | Use imported templates in the Blueprint Designer
      • How-to | Export Govern items
      • How-to | Switch artifact templates (blueprint versions)
      • How-to | Subscribe to email notifications
      • How-to | Export and import blueprint and blueprint versions
      • How-to | Add role assignment rules to a Govern item
      • Tip | Embed a dashboard in Dataiku Govern
    • CI/CD Pipelines
      • Tutorial | Getting started with CI/CD pipelines with Dataiku
      • Tutorial | Jenkins pipeline for API services in Dataiku
      • Tutorial | Jenkins pipeline for Dataiku with the Project Deployer
      • Tutorial | Azure pipeline for Dataiku with the Project Deployer
      • Tutorial | Jenkins pipeline for Dataiku without the Project Deployer
    • Feature Store
      • Tutorial | Building your feature store in Dataiku
      • How-to | Add a dataset to the feature store
      • How-to | Add a feature group to the Flow
  • Plugins
    • Plugin Usage
      • Concept | Plugin management
      • Concept | Plugins in Dataiku
      • How-to | Install a plugin
      • How-to | Update a plugin
      • FAQ | Are plugins supported?
      • FAQ | Where can I find the details of a plugin?
    • Plugin Development
      • Concept | Plugin development
      • Concept | Development plugins
      • Concept | Git integration for plugins
      • Reference | Plugin naming policies and conventions
      • Reference | IDE setup to develop Dataiku plugins
      • How-to | Clone a plugin from a remote git repository
      • How-to | Share a plugin as a zip archive
      • How-to | Edit a plugin
      • FAQ | Why should I create plugins?
      • FAQ | What are some examples of plugins?
      • FAQ | Where can I find the code for a plugin?

Dataiku Cloud

  • Space Management
    • Free Trials of Dataiku Cloud
      • How-to | Begin a free trial from Dataiku
      • How-to | Begin a free trial from Snowflake Partner Connect
      • Tip | Working with Snowflake Partner Connect sample projects
    • Users, Profiles & Groups on Dataiku Cloud
      • Reference | Permission management on Dataiku Cloud
      • How-to | Invite users to your Dataiku Cloud space
      • How-to | Automatically attribute profiles and groups to users
      • How-to | Automatically invite users to your instance
      • How-to | Use trial seats
      • How-to | Activate single sign-on (SSO)
      • Troubleshoot | The invited user didn’t receive an email
    • Support on Dataiku Cloud
      • How-to | Contact support on Dataiku Cloud
      • How-to | Grant Dataiku support access to your instance
      • FAQ | Should I email support -at- dataiku -dot- com if I need help?
    • Production Nodes on Dataiku Cloud
      • How-to | Install the Automation node
      • How-to | Install the API node
      • How-to | Use the referenced data deployment mode on Dataiku Cloud
      • How-to | Deploy an API service from the Automation node on Dataiku Cloud
  • Data Transfer and Security on Dataiku Cloud
    • Reference | Relocatable datasets
    • Reference | Data transfer between cloud storage locations
    • How-to | Secure data connections through AWS PrivateLink
    • How-to | Secure data connections through Azure Private Link
    • How-to | Secure data connections through GCP Private Service Connect
    • How-to | Restrict access to Dataiku Cloud IP addresses
    • How-to | Access data sources through a VPN server
  • Compute and Resource Quotas on Dataiku Cloud
    • Reference | Overview of compute engines on Dataiku Cloud
    • Reference | Leveraging fully managed elastic AI compute
    • Reference | Managing elastic AI compute capacity
    • Reference | Managing containerized execution configurations
    • Reference | Resource quota management
    • Tip | Choosing container sizes
    • Tip | Using Spark
    • Troubleshoot | The job takes an unusually long time to complete
    • Troubleshoot | The job queues for a long time and then fails without ever starting

Additional Offerings

  • Dataiku Solutions
    • Retail & CPG
      • Solution | Customer Satisfaction Reviews
      • Solution | Demand Forecast
      • Solution | Distribution Spatial Footprint
      • Solution | Market Basket Analysis
      • Solution | Product Recommendation
      • Solution | Customer Lifetime Value Forecasting
      • Solution | RFM Segmentation
      • Solution | Inventory Allocation Optimization with Grid Dynamics
      • Solution | Markdown Optimization
      • Solution | Store Segmentation
    • Financial Services & Insurance
      • Solution | AML Alerts Triage
      • Solution | Credit Card Fraud
      • Solution | Insurance Claims Modeling
      • Solution | Credit Scoring
      • Solution | Customer Segmentation for Banking
      • Solution | Interactive Document Intelligence for ESG
      • Solution | News Sentiment Stock Alert System
      • Solution | Next Best Offer for Banking
      • Solution | Credit Risk Stress Testing (CECL, IFRS9)
      • Solution | Lead Scoring
    • Health & Life Sciences
      • Solution | Optimizing Omnichannel Marketing
      • Solution | Pharmacovigilance

Discussions

Learn how to use discussions on Dataiku objects to share knowledge with collaborators.

Concepts

  • Concept | Discussions

References

  • Reference | Managing discussions

How-tos

  • How-to | Start discussions in a Dataiku object
Copyright © 2025, Dataiku