Get started
Recent advancements in generative AI have made it easy to apply for jobs. But be careful! Scammers have also been known to create fake job postings in the hopes of stealing personal information. Let’s see if you, with Dataiku’s help, can tell a real job posting from a fake one!
Objectives
In this quick start, you’ll:
Interactively explore a dataset.
Clean data and create new features.
Import new data into a Dataiku project.
Join two datasets together.
Optional: Write code in a notebook and insert it into a visual Flow.
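For readers curious about the optional code step, the cleaning, feature creation, and join listed above can be sketched in pandas. This is only an illustrative sketch under stated assumptions: the table names, the column names (`job_id`, `description`, `required_education`, `fraudulent`, `median_salary`), and every value are made up for the example and are not taken from the quick start project.

```python
import pandas as pd

# Toy stand-in for a labeled job postings dataset (hypothetical columns and values).
postings = pd.DataFrame({
    "job_id": [1, 2, 3],
    "description": ["Analyze sales data.", None, "Provide patient care."],
    "required_education": ["Bachelor", "None", "Associate"],
    "fraudulent": [0, 1, 0],
})

# Toy stand-in for a second dataset to join in (hypothetical columns and values).
salaries = pd.DataFrame({
    "required_education": ["Bachelor", "Associate"],
    "median_salary": [60000, 45000],
})


def prepare(postings: pd.DataFrame, salaries: pd.DataFrame) -> pd.DataFrame:
    """Clean the text column, add one feature, and join the lookup table."""
    cleaned = postings.copy()
    # Clean: replace missing descriptions with an empty string.
    cleaned["description"] = cleaned["description"].fillna("")
    # New feature: description length in characters.
    cleaned["description_length"] = cleaned["description"].str.len()
    # Join: a left join keeps every posting, even without a match in the lookup.
    return cleaned.merge(salaries, on="required_education", how="left")


prepared = prepare(postings, salaries)
print(prepared)
```

A left join is used here so that postings with no matching row in the second table are kept, with missing values in the joined column, rather than silently dropped.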
Tip
To check your work, you can review a completed version of this entire project from data preparation through MLOps on the Dataiku gallery.
Create an account
To follow along with the steps in this tutorial, you need access to a Dataiku instance running version 12.0 or later. If you do not already have access, you can get started in one of two ways:
Start a 14-day free trial. See How-to | Begin a free trial from Dataiku for help if needed.
Install the free edition locally for your operating system.
Open Dataiku
The first step is getting to the homepage of your Dataiku Design node.
Go to the Launchpad.
Once your instance has powered up, click Open Instance in the Design node tile of the Overview panel.
Important
If you are using a self-managed version of Dataiku, including the locally downloaded free edition on Mac or Windows, open the Dataiku Design node directly in your browser.
Let’s start by creating a Dataiku project that already includes a labeled dataset of real and fake job postings.
Create the project
From the Dataiku Design homepage, click + New Project.
Select Learning projects.
Search for and select Data Preparation Quick Start.
Click Install.
From the project homepage, click Go to Flow (or g + f).
If the + New Project menu on your instance lists DSS tutorials rather than Learning projects, follow these steps instead:
From the Dataiku Design homepage, click + New Project.
Select DSS tutorials.
Filter by Quick Starts.
Select Data Preparation Quick Start.
From the project homepage, click Go to Flow (or g + f).
Note
You can also download the starter project from this website and import it as a zip file.