Tutorial | Prompt engineering with LLMs#

Designing high-quality prompts is the quickest and most efficient way to leverage generative AI models and adapt them for specific business purposes. This technique, known as prompt engineering, allows you to modify the behavior of a large language model without costly and highly technical fine-tuning methods.

In Dataiku, Prompt Studios allow you to test and iterate on prompts until the model produces your desired response, and the Prompt recipe allows you to operationalize the prompt.

Get started#

Objectives#

In this tutorial, you will:

  • Use Prompt Studios to engineer prompts for a large language model to summarize and classify text.

  • Push your final prompt to the Flow using the Prompt recipe.

Prerequisites#

To complete this tutorial, you will need:

  • A Dataiku instance (version 12.5 or above); Dataiku Cloud is also compatible.

  • A connection to at least one supported generative AI model. Your administrator must configure the connection beforehand in the Administration panel > Connections > New connection > LLM Mesh. Supported connections include providers such as OpenAI, Hugging Face, and Cohere.

  • No prior knowledge of large language models (LLMs) is required, though it may be useful to read the article Concept | Prompt Studios and Prompt recipe before completing this tutorial.

Create the project#

This tutorial uses a dataset of articles from the Reuters news agency that is publicly available on Kaggle. We’ll work with a subset of 1,000 articles to reduce computation cost.

To create the project:

  1. From the Dataiku Design homepage, click + New Project > DSS tutorials > ML Practitioner > LLM - Prompt Engineering.

    Screenshot of the dataset, which includes 1,000 news article headlines and descriptions.

Note

You can also download the starter project from this website and import it as a zip file.

The dataset includes a headline, date, and short description of each article. Let’s say we want to know the main subjects of each article, and for the results to be returned in JSON format so we can easily use them in downstream recipes and models. We can use Prompt Studio to build a prompt that automates this process.

Note

To view the full text in the Headlines or Description columns, right-click on the value and select Show complete value, or use the keyboard shortcut Shift + V.

Open a new Prompt Studio#

We’ll start by creating a new Prompt Studio to experiment with and iterate on the prompts.

  1. In the top navigation bar, select Visual Analyses > Prompt Studios.

  2. Click New Prompt Studio in the top right.

  3. Give the new studio the name Financial headlines, then click Create.

Screenshot of the steps to create a new Prompt Studio.

In the new Prompt Studio, you can choose between three modes to start a new prompt. We’ll use Managed mode, which will generate the prompt for us after we enter some details. This mode also allows us to create a Prompt recipe after we design the prompt.

  1. In the Add a new prompt window, select Managed mode.

  2. From the Templates that appear below, leave the default Blank template.

  3. Select Create.

Screenshot of new prompt window, with Prompt template selected.

The next page, the Prompt design page, first requires selecting a large language model (LLM) to run the prompt. The available models depend on the connections set up by your administrator. For example, if you are connected to OpenAI, you can use GPT models.

  1. Choose the LLM you want to use with your prompt.

Screenshot of choosing an llm connection.

Design the prompt#

On the Prompt design page, you can add your prompt text, provide examples with the desired output, and run test cases using the LLM, before deploying the prompt on your entire dataset.

Screenshot of the prompt design page in Prompt Studio.

Our prompt will instruct the model to determine the topic of each provided news article. To prevent it from creating too many topics, we’ll also give it a list of potential topics we’re interested in.

To write the first iteration of the prompt:

  1. In the first input window, copy and paste the following prompt, replacing the text Explain here what the model must do. Use the Copy button at the top right of the block for easier copying.

    Determine whether each topic of the following list of topics is covered in the financial news article provided.
    
    List of topics: fed and central banks, company and product news, corporate debt and earnings, energy and oil, currencies, gold and metals, IPO, legal and regulation, M&A and investments, markets, politics, stock movement.
    
  2. Create two Inputs: Headline and Text preview. We'll then add two Test cases taken from the headline dataset to gauge how the model handles the prompt as written.

  3. Click on the button with three lines to the right of the inputs to switch the mode to Write test cases directly.

    Steps to create a prompt and inputs.
  4. Click + Add test case and copy and paste the following text into the corresponding boxes:

Headline:

Manufacturing, vaccine data power stocks higher; U.S. dollar dips

Text preview:

Stocks across the globe rose on Wednesday following data pointing to a recovery in manufacturing and on bets for a COVID-19 vaccine, while the risk-on mood pushed the U.S. dollar lower.
  5. Add another test case with the following text in the boxes:

Headline:

U.S. weekly jobless claims up slightly; leading indicator rises

Text preview:

The number of Americans filing for unemployment benefits rose just marginally last week, suggesting strong job growth in March that should underpin consumer spending.
  6. Select Run to pass the prompt and test cases to your selected model.

Screenshot of the prompt design, showing results from the first iteration.

Depending on the model you selected, you'll see different results. In this case, using GPT-3.5 Turbo, we can see several problems with the responses: the model answered in complete sentences, and each response is in a slightly different format. Neither is very useful for further analysis.

You might notice other issues depending on your results. For example, the model might return topics that were not in your initial list.

We can fix all of these issues with a bit of prompt engineering.
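Prompt Studios are the quickest way to iterate, but if you prefer experimenting in code, the same draft prompt can be sent through the LLM Mesh from a Python notebook. Below is a minimal sketch, assuming the dataiku package on version 12.5+ and a hypothetical LLM ID; project.list_llms() shows the IDs your administrator's connections actually expose.

import dataiku

client = dataiku.api_client()
project = client.get_default_project()

# Hypothetical LLM ID: use project.list_llms() to find yours.
llm = project.get_llm("openai:my-connection:gpt-3.5-turbo")

completion = llm.new_completion()
completion.with_message(
    "Determine whether each topic of the following list of topics is covered "
    "in the financial news article provided.\n\n"
    "List of topics: fed and central banks, company and product news, "
    "corporate debt and earnings, energy and oil, currencies, gold and metals, "
    "IPO, legal and regulation, M&A and investments, markets, politics, "
    "stock movement.\n\n"
    "Headline: Manufacturing, vaccine data power stocks higher; U.S. dollar dips\n"
    "Text preview: Stocks across the globe rose on Wednesday following data "
    "pointing to a recovery in manufacturing and on bets for a COVID-19 vaccine, "
    "while the risk-on mood pushed the U.S. dollar lower."
)

resp = completion.execute()
if resp.success:
    print(resp.text)  # The raw model response, as in the studio's result pane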

Iterate on the prompt#

First, let’s specify the format for the model’s results. The most useful format for downstream recipes would be a JSON object where topics are listed as the keys along with 0 or 1 values to indicate whether they are present. Later, we could easily parse the JSON using a Prepare recipe and use the results in other recipes or machine learning models.

We can also instruct the model to bucket any topics not in our initial list under an “other” topic, to keep our results clean and consistent with the topics we’re interested in.

  1. Copy and paste the following text into your prompt, under the list of topics:

Format your response as a JSON object with each topic as the keys, and 0 or 1 as values. Add the “other” key to list potential topics not listed above. Use an array as value.

Another way to help the model understand what's expected is to provide examples pairing an input with the desired output. We can do this in the Examples section, located above the test cases.

  1. In the Examples area, click Add example.

  2. Copy and paste the following text into the corresponding boxes:

Input: Headline

Travel stocks soar as encouraging vaccine study lifts Europe

Input: Text preview

European stocks closed at over a five-week high on Wednesday, with travel stocks surfing a wave of optimism following reports of progress in developing a COVID-19 vaccine.

Output

{
"company and product news": 0,
"corporate debt and earnings": 0,
"IPO": 0,
"M&A and investments": 0,
"stock movement": 1,
"markets": 1,
"legal and regulation": 0,
"politics": 0,
"currencies": 0,
"gold and metals": 0,
"energy and oil": 0,
"fed and central banks": 0,
"other": []
}
  3. Add another example with the following text:

Input: Headline

Oil climbs 2% on U.S. stock draw but gains capped as OPEC+ set to ease cuts

Input: Text preview

Oil prices rose 2% on Wednesday, supported by a sharp drop in U.S. crude inventories, but further gains were limited as OPEC and its allies are set to ease supply curbs from August as the global economy gradually recovers from the coronavirus pandemic.

Output

{
"company and product news": 0,
"corporate debt and earnings": 0,
"IPO": 0,
"M&A and investments": 0,
"stock movement": 0,
"markets": 0,
"legal and regulation": 0,
"politics": 0,
"currencies": 0,
"gold and metals": 0,
"energy and oil": 1,
"fed and central banks": 0,
"other": []
}
  4. Run the prompt again and review the results.

Screenshot of the final prompt with examples and results from test cases.

Results from the test cases should be much improved. The output is in JSON format, which can be easily parsed, and any topics not in our original list are included within the “other” array. If there are no other topics in the output, the “other” array will be empty.
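For instance, once the output reaches a Python step downstream, each response parses with a single json.loads call. The response string below is illustrative, shaped like the tutorial's example outputs rather than taken from an actual run:

import json

# Illustrative response, shaped like the example outputs above.
response = '{"stock movement": 1, "markets": 1, "currencies": 1, "other": []}'

topics = json.loads(response)
flagged = [name for name, value in topics.items() if name != "other" and value == 1]
print(flagged)          # ['stock movement', 'markets', 'currencies']
print(topics["other"])  # [] when the model found no topics outside our list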

Use the dataset for test cases#

Now that we have an efficient and useful prompt, we can run it on more test cases from the dataset.

  1. Click on Use a dataset for test cases in the top right.

  2. Next to Mapped from columns in, select reuters_headlines as the dataset.

  3. Map the Headline input to the Headlines column and the Text preview input to the Description column.

Screenshot with the columns mapped from the reuters_headlines dataset.

We can also set a validation rule to check that the output conforms to our expected JSON object format.

  1. Next to the Prompt design and model selection, click Settings.

  2. Under Validation & formatting in the settings, set the Expected output format to JSON object.

  3. Save the settings.

Screenshot with JSON object selected in the prompt settings.
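Conceptually, the JSON object rule behaves like the small check below: the response must parse as JSON, and the top-level value must be an object. This is a sketch of the idea, not Dataiku's actual implementation.

import json

def is_json_object(text):
    # True if text parses as JSON with an object (dict) at the top level.
    try:
        return isinstance(json.loads(text), dict)
    except (TypeError, json.JSONDecodeError):
        return False

print(is_json_object('{"markets": 1, "other": []}'))      # True
print(is_json_object('The topics covered are markets.'))  # False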

When we run the model again, it will use a small sample of test cases from the dataset instead of the test cases we entered manually. Results will be based on the same prompt and examples.

  1. Click Run.

  2. Explore the results.

The model ran on eight test cases selected from the dataset (you can change the number of test cases by clicking the gear button next to Run). It also returns an estimated cost per 1,000 records, so you can gauge the cost of running the prompt on the full dataset. Results are colored green if they pass the JSON object validation test and red if they fail.

Tip

If your results don’t pass the validation test, make sure your example outputs are in the correct format for JSON objects.

Screenshot with results that passed the validation test.

We can see that all the results passed the validation test, and we’re ready to save our prompt as a recipe so we can deploy it to the Flow.

Deploy a Prompt recipe#

You can save your crafted prompt as a recipe directly from Prompt Studio. The recipe will process the entire dataset using the prompt we created.

  1. In the top right, select Save as recipe, then click Create in the window that appears to create a new Prompt recipe.

  2. In the New prompt recipe window, make sure the Input dataset is reuters_headlines.

  3. Give the Output dataset a name or use the default, and choose a storage connection and output file format.

  4. Click Create recipe.

The New prompt recipe info window.

Dataiku creates a new Prompt recipe with the settings pre-filled from your work in Prompt Studio. You’ll see the prompt text, input columns, and examples with output.

Screenshot of the Prompt recipe with settings pre-filled from the Prompt Studio.

If you’d like, you can run the recipe to use the model on the entire headlines dataset and create an output dataset ready to be used for further analysis or model training.

Important

If you're using a commercial model, running the recipe will incur charges from the model provider for processing all 1,000 rows.

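Downstream, a Prepare recipe can unnest the JSON column, or a short Python recipe can expand it into one indicator column per topic for model training. The sketch below assumes hypothetical dataset and column names (reuters_headlines_topics and llm_output); substitute whatever your recipe actually produced.

import json
import dataiku
import pandas as pd

# Assumed names: replace with your recipe's actual output dataset and column.
df = dataiku.Dataset("reuters_headlines_topics").get_dataframe()

def parse(raw):
    # Parse the model's JSON answer; fall back to an empty dict if malformed.
    try:
        obj = json.loads(raw)
        return obj if isinstance(obj, dict) else {}
    except (TypeError, json.JSONDecodeError):
        return {}

# One 0/1 column per topic, plus the "other" array from each response.
topic_cols = df["llm_output"].apply(parse).apply(pd.Series)
wide = pd.concat([df.drop(columns=["llm_output"]), topic_cols], axis=1)

dataiku.Dataset("reuters_headlines_topics_wide").write_with_schema(wide)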

What’s next?#

Congratulations! You have designed an effective prompt using Prompt Studios and deployed it to the Flow using the Prompt recipe.

You can explore other concepts and tutorials about using LLMs in the Knowledge Base.