How to | Prepare images for use in a model

To include images as input for certain tasks in Dataiku, you must first extract the image metadata into a dataset.

You can do this using the List Contents recipe, which creates datasets for input into labeling tasks, computer vision models, or large language models (LLMs) that accept images as input.

  1. Upload the images into a managed folder.

  2. Select the folder in the Flow.

  3. In the Actions panel, select the List Contents recipe.

  4. Choose the output dataset name and storage options, and click Create Recipe.

  5. In the recipe Settings tab, choose the columns needed in the output dataset:
    • path: the path to the file in the folder
    • basename: the filename without the extension
    • extension: the extension of the file
    • last_modified: the date the file was last modified
    • size: the size of the file in bytes

  6. Optionally, add a folder level mapping, which creates columns containing the names of specific levels in the folder hierarchy. To do so, click + Add Level Mapping, then set the folder level and the column name(s).

  7. Click Run to execute the recipe and create the output dataset.

Using the List Contents recipe to prepare image files for processing.
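To make the shape of the output dataset concrete, the following is a minimal sketch in plain Python that builds comparable rows from a local directory. It is not Dataiku's implementation and does not use the Dataiku API. The function name list_contents, the level_mapping argument, the leading slash on path, and the convention that level 1 is the first directory below the root are all assumptions made for illustration.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def list_contents(root, level_mapping=None):
    """Return one dict per file under `root`, with the five columns above.

    `level_mapping` is a hypothetical {level: column_name} dict. In this
    sketch, level 1 means the first directory below `root` (an assumption,
    not necessarily the recipe's own numbering).
    """
    root = Path(root)
    level_mapping = level_mapping or {}
    rows = []
    # Walk the folder recursively and keep regular files only.
    for file_path in sorted(p for p in root.rglob("*") if p.is_file()):
        relative = file_path.relative_to(root)
        stat = file_path.stat()
        row = {
            # Leading slash is an assumed convention for folder-relative paths.
            "path": "/" + relative.as_posix(),
            "basename": file_path.stem,
            "extension": file_path.suffix.lstrip("."),
            "last_modified": datetime.fromtimestamp(
                stat.st_mtime, tz=timezone.utc
            ).isoformat(),
            "size": stat.st_size,
        }
        # Directory names between the root and the file, outermost first.
        folders = relative.parts[:-1]
        for level, column in level_mapping.items():
            in_range = 1 <= level <= len(folders)
            row[column] = folders[level - 1] if in_range else None
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Build a tiny throwaway folder so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("cats/cat_001.jpg", "dogs/dog_001.png"):
            target = Path(tmp) / name
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(b"placeholder image bytes")
        for row in list_contents(tmp, level_mapping={1: "label"}):
            print(row)
```

When images are organized into one subfolder per class, a level mapping like the one in the usage example turns the subfolder name into a label column, which is a common starting point for image classification datasets.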