You are viewing the Knowledge Base for version 8.0 of DSS.
From Excel To Dataiku DSS
Learn how to perform familiar Excel operations in Dataiku DSS.
Articles
Introduction
Data Cleaning
Using Formulas
Working with Dates
Removing Duplicates
Filtering Rows
Sampling Rows
Split a Dataset
Append Datasets
Joining Datasets
Aggregate and Pivot
Sorting Values
Top Values