How to | Prepare images for use in a model
To include images as input for certain tasks in Dataiku, you must first extract the image metadata into a dataset.
You can do this using the List Contents recipe, which creates datasets for input into labeling tasks, computer vision models, or large language models (LLMs) that accept images as input.
Upload the images into a managed folder.
Select the folder in the Flow.
In the Actions panel, select the List Contents recipe.
Choose the output dataset name and storage options, and click Create Recipe.
In the recipe Settings tab, choose the columns needed in the output dataset:
path: path to the file in the folder
basename: filename without the extension
extension: extension of the file
last_modified: date the file was last modified
size: size of the file in bytes
Optionally, add a folder level mapping, which creates columns containing the names of specific levels in the folder hierarchy. To do this, click + Add Level Mapping, then set the folder level and the column name.
Click Run to execute the recipe and create the output dataset.
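To make the shape of the output dataset concrete, here is a minimal sketch in plain Python that builds the same kind of rows from a local directory. It is not how the List Contents recipe is implemented and it does not use the Dataiku API. The function name `list_contents`, the `level_mapping` argument, the leading `/` on paths, and the 1-based level numbering are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from pathlib import Path, PurePosixPath


def list_contents(root, level_mapping=None):
    """Return one dict per file under `root`, with the columns described above.

    level_mapping is a hypothetical {folder_level: column_name} dict, where
    level 1 is assumed to be the first directory below the folder root.
    """
    level_mapping = level_mapping or {}
    root = Path(root)
    rows = []
    for file in sorted(p for p in root.rglob("*") if p.is_file()):
        # Path relative to the folder root, with forward slashes on every OS.
        rel = PurePosixPath(file.relative_to(root).as_posix())
        stat = file.stat()
        row = {
            "path": "/" + str(rel),                   # assumed leading slash
            "basename": rel.stem,                     # filename without extension
            "extension": rel.suffix.lstrip("."),      # e.g. "jpg"
            "last_modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc),
            "size": stat.st_size,                     # bytes
        }
        # Folder level mapping: copy the directory name at each requested level.
        dirs = rel.parts[:-1]
        for level, column in level_mapping.items():
            row[column] = dirs[level - 1] if 1 <= level <= len(dirs) else None
        rows.append(row)
    return rows
```

For a folder laid out as `train/cats/cat.01.jpg`, calling `list_contents(folder, {1: "split", 2: "label"})` would yield a row whose `split` is `train` and whose `label` is `cats`. This is the typical reason to add a level mapping: the directory names often already encode the target class for a computer vision model.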