Concept | Prompt Studios and Prompt recipe#

Large language models (LLMs) are a type of generative AI that specializes in generating coherent responses to input text, known as prompts. With an extensive understanding of language patterns and nuances, LLMs can be used for a wide range of natural language processing tasks, such as classification, summarization, translation, question answering, entity extraction, spell checking, and more.

When interacting with a generative model, you can modify and craft your prompt to achieve a desired outcome. This process is known as prompt engineering. Prompts guide the model’s behavior and output, but they don’t modify the underlying model. This makes prompt engineering a quicker, easier way to tailor a model for your business needs without fine-tuning, which can be time-consuming and costly.

Prompt engineering in Dataiku#

In Dataiku, Prompt Studios and the Prompt recipe work together: you test and iterate on LLM prompts in a studio, then operationalize them in your data pipeline with the recipe.

Prompt Studios and the Prompt recipe allow you to securely connect to LLMs, whether through a generative AI service or a privately hosted model, including open-source models. You first experiment with different prompt designs in Prompt Studios, then deploy the chosen prompt in your data pipeline with the Prompt recipe.

The prompt engineering workflow in Dataiku.

Important

To use Prompt Studios and the Prompt recipe, your administrator must configure at least one connection to a supported LLM provider and ensure that instruct/prompt models are available.

Prompt Studios#

Prompt Studios is the playground where you can iterate on and test different prompts and models to engineer the optimal prompt for a specific use case. You can find Prompt Studios in the top navigation bar under the Visual Analyses menu.

Within a Prompt Studio, you can create multiple prompts and iterations. For each new prompt, you can create either a Single-shot prompt or Prompt template, which can be used later in the Prompt recipe. You can also choose from sample prompts created by Dataiku.

The new prompt info window where you choose from prompt templates or single-shot prompts.

Single-shot prompt#

A single-shot prompt is a one-off prompt that queries the LLM and is designed for quick experimentation. Single-shot prompts are not reusable, meaning they can't be converted into a Prompt recipe and their results cannot be compared in the Prompt Studio.

A single-shot prompt in Prompt Studios.

Prompt template#

Prompt templates offer advanced capabilities to create reusable, production-ready prompts. You can use datasets for test cases, compare the results between prompts, and convert the template into a Prompt recipe.

Prompt templates have two modes for different kinds of tasks: text prompts and structured prompts.

Text prompts#

Think of text prompts as the manual mode for writing prompts. You write the entire prompt that will be sent to the LLM, using placeholders (in double curly brackets) to define parameters. The placeholders generate inputs that you then map to a dataset column or enter manually.

In the following example, the prompt asks the LLM to "Give the topic of each article based on the {{headline}} and short {{description}}." The inputs headline and description are automatically created, and you can then map them to columns in a dataset.

Dataiku sends the text prompt to the LLM exactly as written, substituting values from the dataset for the given parameters.
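The placeholder mechanism described above can be sketched in plain Python. This is an illustrative stand-in, not Dataiku's actual implementation; the function name and the sample row values are hypothetical.

```python
import re

def fill_prompt(template: str, row: dict) -> str:
    """Replace each {{name}} placeholder with the matching value from a dataset row."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(row[m.group(1)]), template)

# The prompt from the example above, with hypothetical row values
# standing in for mapped dataset columns.
template = "Give the topic of each article based on the {{headline}} and short {{description}}."
row = {
    "headline": "Markets rally after rate decision",
    "description": "Stocks climbed following the announcement.",
}

filled = fill_prompt(template, row)
print(filled)
```

Each dataset row produces one filled-in prompt, which is what gets sent to the model.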

A text prompt and corresponding inputs.

Structured prompts#

With structured prompts, you write the task in plain text, and then optionally define the inputs and examples to include. You can enter inputs manually or map them to a dataset’s columns. Then Dataiku integrates all the information into a structured prompt to send to the LLM.

The following prompt is the same as the example above, but using structured prompt mode. The prompt is written in plain text, with no parameters defined. Then inputs are added manually and mapped to columns using the Prompt Studio interface.

A structured prompt and corresponding inputs.
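Conceptually, structured mode assembles the plain-text task and the named inputs into one final prompt on your behalf. The following sketch shows one plausible way such assembly could work; the function name and the exact output layout are assumptions, not Dataiku's internal format.

```python
def build_structured_prompt(task: str, inputs: dict) -> str:
    """Combine a plain-text task and named inputs into a single prompt string."""
    lines = [task, ""]
    lines += [f"{name}: {value}" for name, value in inputs.items()]
    return "\n".join(lines)

# Hypothetical inputs mapped from dataset columns.
prompt = build_structured_prompt(
    "Give the topic of each article based on the headline and short description.",
    {
        "headline": "Markets rally after rate decision",
        "description": "Stocks climbed following the announcement.",
    },
)
print(prompt)
```

The key difference from text mode is that you never write the placeholders yourself; the tool integrates the inputs for you.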

In either structured or text mode, you can quickly iterate on your prompt design, and add new inputs, examples, and test cases, before deploying your prompt.

Examples#

Examples let you provide sample inputs together with the desired output, so the LLM learns what the output should look like. You can add one or more examples in structured prompts.

A structured prompt with one provided example of input and output.
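This technique is commonly called few-shot prompting. One common way to pass examples to a chat-based LLM is as alternating user/assistant messages before the real query; the sketch below illustrates that pattern with hypothetical content and is not specific to Dataiku's internal format.

```python
def few_shot_messages(task: str, examples: list, query: str) -> list:
    """Build a chat-style message list: the task, then example
    input/output pairs, then the real query to answer."""
    messages = [{"role": "system", "content": task}]
    for example_input, example_output in examples:
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": query})
    return messages

# One provided example of input and output, then a new query.
messages = few_shot_messages(
    "Give the topic of the article.",
    [("headline: Markets rally after rate decision", "finance")],
    "headline: New vaccine shows promise in trials",
)
```

Adding even a single example like this often makes the model's output format far more consistent.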

Test cases#

Prompt Studios will run a small number of test cases through the LLM to help you gauge the performance of your prompt. You can manually write test cases or use eight rows from a dataset as test cases.

The test cases help you see the actual output the model will return given your dataset.

Each time you run inference on your prompt and test cases, you will see an estimated cost for 1,000 records above the results. LLM providers are not free: each API call has an associated cost that typically depends on the length of the prompt and of the response.

Managing costs is an important part of prompt engineering, which is why Dataiku gives you full transparency on how much this setup would cost to run on 1,000 records.

A structured prompt and corresponding inputs.
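The arithmetic behind such an estimate is straightforward: provider prices are typically quoted per 1,000 tokens, separately for input and output. The sketch below shows the calculation with hypothetical token counts and prices; it is not Dataiku's estimator, and real provider rates vary.

```python
def estimated_cost_per_1000_records(prompt_tokens: int, output_tokens: int,
                                    input_price_per_1k: float,
                                    output_price_per_1k: float) -> float:
    """Estimate the cost of running one prompt over 1,000 dataset records,
    given per-1,000-token prices for input and output."""
    per_record = (prompt_tokens / 1000.0) * input_price_per_1k \
               + (output_tokens / 1000.0) * output_price_per_1k
    return 1000 * per_record

# Hypothetical numbers: a 500-token prompt, 50-token answers,
# $0.50 / $1.50 per 1k input/output tokens.
cost = estimated_cost_per_1000_records(500, 50, 0.5, 1.5)
print(f"${cost:.2f} per 1,000 records")
```

Shortening the prompt or capping the output length directly reduces this figure, which is one practical motivation for iterating in the studio before deployment.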

Settings#

You can change hyperparameter settings to control the behavior of the LLM:

  • Temperature: The temperature controls the randomness, or creativity, of the model. The higher the temperature, the more random or creative the responses will be. You can set the value between 0 and 2.

  • Max output tokens: LLMs process text by converting words and subwords into tokens. The number of max output tokens roughly equates to the maximum number of words returned by the model.
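The effect of temperature can be illustrated with the standard softmax-with-temperature formulation used in token sampling. This is a generic sketch of the underlying idea, not any provider's actual sampling code.

```python
import math

def sample_probabilities(logits: list, temperature: float) -> list:
    """Softmax with temperature: dividing logits by a higher temperature
    flattens the distribution, so less likely tokens are picked more often
    (more 'creative' output); a lower temperature sharpens it."""
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]
low = sample_probabilities(logits, 0.5)   # sharp: the top token dominates
high = sample_probabilities(logits, 2.0)  # flat: choices closer to uniform
```

At low temperature the model almost always picks the highest-scoring token; at high temperature the alternatives get a real chance, which is why responses become more varied.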

You can also set validation and formatting rules, such as validating the expected output format as a JSON object or forbidding certain words from the output. Note that these validation and formatting rules won't change the LLM response; you will simply receive a warning if validation fails.

Prompt settings for validation and formatting.
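The JSON validation rule, for instance, amounts to checking whether the response parses as JSON and flagging it when it does not. The sketch below illustrates that check in plain Python; the function name is hypothetical and the response text is never modified, matching the behavior described above.

```python
import json

def check_json_output(response_text: str):
    """Validate that an LLM response parses as JSON. The response itself is
    left unchanged; an invalid response only produces a warning message."""
    try:
        json.loads(response_text)
        return True, None
    except json.JSONDecodeError as err:
        return False, f"Validation warning: output is not valid JSON ({err.msg})"
```

For example, `check_json_output('{"topic": "finance"}')` passes, while a free-text answer like "finance" triggers the warning instead of being rejected or rewritten.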

Comparing prompts#

Prompt Studios tracks your prompts in the left sidebar. You can navigate between prompts, view each prompt's history, restore previous versions, and duplicate prompts.

When using prompt templates, you can also compare the output of multiple prompts by selecting them in the sidebar and clicking Compare. Prompt Studios will build a table comparing the outputs and costs of the selected prompts.

A comparison of the inputs and outputs of two similar prompts.

Prompt recipe#

The Prompt recipe puts your prompt into action on a dataset. It generates an output dataset and appears in the Flow.

You can create a new Prompt recipe directly from Prompt Studios by saving a prompt template that is mapped to a dataset as a recipe. This allows you to experiment with your prompts before operationalizing them with the recipe.

You can also create a new Prompt recipe directly from the Flow or from the Actions panel of a dataset. With this method, you can write a Structured prompt or a Text prompt directly in the recipe, with the same settings as a prompt template in a Prompt Studio.

The Prompt recipe settings screen.
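Conceptually, the recipe fills the prompt template for every input row and appends the model's response as a new column. The sketch below is a plain-Python stand-in for that behavior, with a stub in place of the real LLM call through a configured connection; none of the names are Dataiku's API.

```python
import re

def run_prompt_over_dataset(rows: list, template: str, call_llm) -> list:
    """Apply a prompt template to every input row and append the model's
    response as a new column, mimicking what a Prompt recipe produces."""
    def fill(row):
        return re.sub(r"\{\{(\w+)\}\}", lambda m: str(row[m.group(1)]), template)
    return [{**row, "llm_output": call_llm(fill(row))} for row in rows]

rows = [{"headline": "Markets rally", "description": "Stocks climbed."}]
template = "Give the topic based on {{headline}} and {{description}}."

# Stub standing in for a real LLM call.
result = run_prompt_over_dataset(rows, template, lambda prompt: "finance")
```

The output keeps every input column and adds the generated one, which is what lets the recipe's result flow into downstream steps of the pipeline.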

What’s next?#

Continue learning about prompt engineering with LLMs by working through the Tutorial | Prompt engineering with LLMs article.