Concept | Summarize text recipe#

The Summarize text recipe uses large language models (LLMs) to condense text into shorter summaries.

If the input text is too long for the LLM to handle in one pass, the recipe:

  1. Intelligently chunks the long input text into manageable sections.

  2. Processes each section separately.

  3. Compiles a coherent summary.
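The three steps above can be sketched in Python as follows. This is only an illustration of the general chunk, summarize, and combine pattern, not the recipe's actual implementation. The function names `chunk_text` and `summarize_long_text` are hypothetical, the `summarize` argument stands in for a call to whichever LLM you select, and the word count is used as a rough proxy for the model's real token limit.

```python
from typing import Callable, List


def chunk_text(text: str, max_words: int) -> List[str]:
    """Split text into chunks of at most max_words words.

    Paragraphs (separated by blank lines) are kept together when they fit.
    Word count is an assumed stand-in for a real token count.
    """
    chunks: List[str] = []
    current: List[str] = []
    for paragraph in text.split("\n\n"):
        words = paragraph.split()
        # A paragraph longer than the limit is cut into word windows.
        pieces = [words[i:i + max_words] for i in range(0, len(words), max_words)]
        for piece in pieces:
            # Start a new chunk if adding this piece would exceed the limit.
            if current and len(current) + len(piece) > max_words:
                chunks.append(" ".join(current))
                current = []
            current.extend(piece)
    if current:
        chunks.append(" ".join(current))
    return chunks


def summarize_long_text(
    text: str,
    summarize: Callable[[str], str],
    max_words: int = 200,
) -> str:
    """Summarize each chunk separately, then summarize the partial summaries."""
    chunks = chunk_text(text, max_words)
    if not chunks:
        return ""
    if len(chunks) == 1:
        # Short input: a single call is enough.
        return summarize(chunks[0])
    partial_summaries = [summarize(chunk) for chunk in chunks]
    # Final pass: compile the partial summaries into one summary.
    return summarize("\n".join(partial_summaries))


def first_words(text: str, limit: int = 3) -> str:
    """Toy stand-in for an LLM call: keeps only the first few words."""
    return " ".join(text.split()[:limit])


if __name__ == "__main__":
    sample = "one two three four five\n\nsix seven eight nine ten"
    print(summarize_long_text(sample, first_words, max_words=5))
```

In this sketch the final pass runs only once. A production approach might repeat the process if the combined partial summaries are still too long for the model.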

Text summarization in Dataiku using Generative AI.

Recipe settings#

In the recipe settings page, you:

  1. Select which LLM to use. The dropdown lists only the LLM connections available on the current instance. Some models are designed specifically for summarization, but you can also use general-purpose text generation models such as OpenAI GPT-3.5.

  2. Select the text column to use as input. This is the column that contains the text to summarize.

  3. Optionally, specify a language in which the summary should be written.

  4. Optionally, set the length of the desired summary. Depending on the model, you can set a minimum and maximum summary length expressed in tokens.

Screenshot of the settings page of a Summarize text recipe.

Note

The settings may vary from one model to another.

Output dataset#

After you run the recipe, the output dataset contains:

  • The summary for each row in the summarized_text column.

  • Any error messages in the llm_error_message column. This column is filled in only when the model returns an error. Otherwise, it is left blank.
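As a minimal sketch of how you might check these two columns after a run, the Python snippet below separates successfully summarized rows from rows that reported an error. The column names come from the output described above. The helper name `split_by_error` and the sample rows are invented for illustration, and in a real project you would load the rows from the output dataset rather than define them inline.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[str]]


def split_by_error(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Return (rows without an error, rows with an error message)."""
    succeeded: List[Row] = []
    failed: List[Row] = []
    for row in rows:
        # A missing, empty, or whitespace-only value means no error was reported.
        error = (row.get("llm_error_message") or "").strip()
        if error:
            failed.append(row)
        else:
            succeeded.append(row)
    return succeeded, failed


# Hypothetical sample rows; the error text is made up for this example.
rows: List[Row] = [
    {"summarized_text": "A short summary.", "llm_error_message": ""},
    {"summarized_text": "", "llm_error_message": "example error text"},
]

succeeded, failed = split_by_error(rows)
print(f"{len(succeeded)} row(s) summarized, {len(failed)} row(s) with errors")
```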

Screenshot of an output dataset of the Summarize text recipe.

What’s next?#

Continue learning about text summarization with LLMs by working through the Tutorial | Summarize text with Generative AI article.