Creating the Project and Importing Datasets

First, we will import the three input files found in the Supporting Data section of the previous lesson into Dataiku DSS. Detailed import steps are provided below. If you are comfortable importing flat files, feel free to skip to the Preparing the Usage Dataset lesson.

Note

Workflows are organized as projects on a DSS instance. The project home page contains metadata (description, tags), records of activities, and much more.

  1. From the Dataiku homepage, go to + New Project > Blank Project. Name it Predictive Maintenance. Note that the new project is automatically assigned a project key. We can leave the default or assign something else in its place, but it cannot be changed once the project is created.

  2. Let’s create the first dataset and name it usage. Here’s one method to do that:

  • Click on the + Import Your First Dataset button on the project homepage.

  • Select Files > Upload your files.

  • Upload the usage.csv.gz file downloaded from the Supporting Data section of the previous lesson.

  • Click on Preview to view the default import settings used by Dataiku DSS.

    • These settings can be adjusted as necessary. For example, if the data were stored in JSON format instead of CSV, this is the step where that format would be detected and could be confirmed.

  • If the import settings look OK, click Create.

  3. Repeat these actions to import the two remaining files as datasets: failure.csv.gz and maintenance.csv.gz.

Once all three datasets are present in the Flow, you are ready to proceed.
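
If you would like to sanity-check a downloaded file before uploading it, the kind of inspection done in the Preview step (guessing the separator, reading the header, looking at the first few rows) can be roughly approximated outside of DSS. The following is a minimal sketch using only the Python standard library; it is not how Dataiku DSS performs its own detection. The function name preview_csv_gz, the sample file, and its column names are all made up for illustration, and the sketch assumes a UTF-8 encoded, gzip-compressed CSV.

```python
import csv
import gzip
import itertools
import os
import tempfile


def preview_csv_gz(path, n_rows=5):
    """Return (delimiter, header, first n_rows data rows) of a gzipped CSV.

    Hypothetical helper for a quick local check; assumes UTF-8 text.
    """
    # Read a small sample to guess the separator.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        sample = handle.read(4096)
    try:
        dialect = csv.Sniffer().sniff(sample, delimiters=",;\t|")
    except csv.Error:
        # Fall back to a plain comma-separated dialect if guessing fails.
        dialect = csv.excel

    # Re-open the file and parse it with the guessed dialect.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, dialect)
        header = next(reader, [])
        rows = list(itertools.islice(reader, n_rows))
    return dialect.delimiter, header, rows


if __name__ == "__main__":
    # Build a tiny stand-in file; the columns below are invented for the demo.
    with tempfile.TemporaryDirectory() as tmp_dir:
        demo_path = os.path.join(tmp_dir, "demo.csv.gz")
        with gzip.open(demo_path, mode="wt", encoding="utf-8") as out:
            out.write("asset,time,use\nA1,1,10.5\nA2,2,11.0\nA3,3,9.75\n")
        delimiter, header, rows = preview_csv_gz(demo_path, n_rows=2)
        print("separator:", repr(delimiter))
        print("header:", header)
        for row in rows:
            print(row)
```

To try it on a real download, call preview_csv_gz with the local path to usage.csv.gz (or either of the other two files) and compare the header with what the Preview step shows.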