Get started

Recent advances in generative AI have made it easy to apply for jobs. But be careful! Scammers have also been known to create fake job postings in the hopes of stealing personal information. Let’s see if you, with Dataiku’s help, can tell a real job posting from a fake one!

Objectives

In this quick start, you’ll:

  • Interactively explore a dataset.

  • Clean data and create new features.

  • Import new data into a Dataiku project.

  • Join two datasets together.

  • Optional: Write code in a notebook and insert it into a visual Flow.

Tip

To check your work, you can review a completed version of this entire project from data preparation through MLOps on the Dataiku gallery.

Create an account

To follow along with the steps in this tutorial, you need access to a Dataiku instance running version 12.0 or later. If you do not already have access, you can get started in one of two ways:

  • Start a 14-day free trial. See this how-to for help if needed.

  • Install the free edition locally for your operating system.

Open Dataiku

The first step is getting to the homepage of your Dataiku Design node.

  1. Go to the Launchpad.

  2. Once your instance has powered up, click Open Instance in the Design node tile of the Overview panel.

  3. See this how-to if you encounter any difficulties.

Important

If you are using a self-managed version of Dataiku, including the locally downloaded free edition on Mac or Windows, open the Dataiku Design node directly in your browser.

Create the project

Let’s start by creating a Dataiku project that already includes a labeled dataset of real and fake job postings.

  1. From the Dataiku Design homepage, click + New Project.

  2. Click DSS tutorials in the dropdown menu.

  3. In the dialog, click Quick Starts on the left-hand panel.

  4. Choose Data Preparation Quick Start, and then click OK.

Dataiku screenshot of the dialog for creating a new project.

Note

You can also download the starter project from this website and import it as a zip file.
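If you prefer to script the zip import rather than use the interface, the sketch below shows one possible approach with Dataiku's public Python API client. It assumes the `dataikuapi` package (installed with `pip install dataiku-api-client`) and its `DSSClient.prepare_project_import` method, which uploads an archive and returns a handle whose `execute()` runs the import. The helper name `import_starter_project`, the host URL, the API key, and the zip file name are all illustrative placeholders, not values from this tutorial.

```python
from pathlib import Path


def import_starter_project(client, zip_path):
    """Upload a project export (.zip) and import it into a Dataiku instance.

    `client` is assumed to be a dataikuapi.DSSClient, or any object that
    exposes the same prepare_project_import(file) method.
    """
    zip_path = Path(zip_path)
    if zip_path.suffix.lower() != ".zip":
        raise ValueError(f"Expected a .zip project export, got: {zip_path.name}")

    with zip_path.open("rb") as archive:
        # prepare_project_import uploads the archive and returns a handle;
        # execute() then runs the import with default settings.
        handle = client.prepare_project_import(archive)
        return handle.execute()


# Hypothetical usage (placeholders: host, API key, and file name):
#
#   import dataikuapi
#   client = dataikuapi.DSSClient("http://localhost:11200", "YOUR_API_KEY")
#   import_starter_project(client, "data-preparation-quick-start.zip")
```

The helper takes the client as a parameter so that the connection details stay outside the import logic. Importing a project this way typically requires an account that is allowed to create projects on the instance.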