FAQ | Are the LLM connections secure?#

An LLM connection acts as a secure API gateway between your applications and the underlying services. It sits between the Dataiku components and the API endpoint, forwarding the required requests to the relevant service.

LLM connection security.

This gateway is set up by an administrator and allows you to:

  • Control user access.

  • Enable security measures such as rate limiting, network settings, and tuning parameters (number of retries, timeout, and maximum parallelism).

  • Apply security policies.
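To make the tuning parameters concrete, here is a minimal sketch of how retries, a per-request timeout, and a parallelism cap interact on the client side. All names and values are illustrative assumptions, not the product's configuration keys.

```python
import concurrent.futures
import time

# Hypothetical tuning parameters, mirroring those set at the connection level.
MAX_RETRIES = 3       # retries before giving up on a request
TIMEOUT_S = 10        # per-request timeout, in seconds
MAX_PARALLELISM = 4   # maximum number of concurrent requests

def call_with_retries(send_request, payload):
    """Retry a request up to MAX_RETRIES times, backing off between attempts."""
    for attempt in range(MAX_RETRIES + 1):
        try:
            return send_request(payload, timeout=TIMEOUT_S)
        except TimeoutError:
            if attempt == MAX_RETRIES:
                raise
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s...

def call_many(send_request, payloads):
    """Fan out requests while capping concurrency at MAX_PARALLELISM."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=MAX_PARALLELISM) as pool:
        return list(pool.map(lambda p: call_with_retries(send_request, p), payloads))
```

In the actual gateway these limits are enforced centrally, so every application using the connection inherits them without client-side code.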

Centrally manage access keys, models, and permissions#

When configuring an LLM connection, the administrator provides an API key (or references an S3 connection for AWS Bedrock) and defines which models are made available.

Note

A connection to Hugging Face does not require any credentials or API key.

This ensures that access keys are centrally managed and that only approved services are used.

Administrators also configure the security settings at the connection level to define which user groups can use the connection, ensuring that access to the underlying services is properly managed.

Screenshot of the security settings for an LLM connection.

Use filters to detect and manage sensitive data#

Also at the connection level, administrators can define a number of filters and moderation techniques to restrict the sensitive data that is sent to the underlying services (for example, by allowing it only on self-hosted, private LLMs).

The LLM connections currently provide two main filters to detect and manage sensitive data:

  • PII (Personally Identifiable Information) detection

  • Forbidden terms

PII detection#

PII detection filters every query and lets you define an action to take when PII is found (reject the query, replace the PII with a placeholder, etc.).

It leverages the Presidio library.
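The mechanics can be sketched as follows. This is a deliberately simplified illustration using two regex patterns; the actual filter relies on Presidio, which detects many more PII entity types, and the function names and action labels here are assumptions.

```python
import re

# Illustrative patterns only; Presidio covers far more entity types than these.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def filter_query(text, action="replace"):
    """Apply the configured action when PII is detected in a query."""
    found = any(p.search(text) for p in PII_PATTERNS.values())
    if not found:
        return text
    if action == "reject":
        raise ValueError("Query rejected: PII detected")
    # "replace" mode: substitute each match with a typed placeholder
    for name, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{name}>", text)
    return text
```

For example, `filter_query("Contact alice@example.com")` would return `Contact <EMAIL>`, while the same input with `action="reject"` would block the query entirely.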

Screenshot of the PII detection settings in LLM connections.

Forbidden terms#

The administrator can add forbidden-term filters on both queries and responses to remove specific data from inputs or outputs.

The filter references the source dataset that contains the forbidden terms and specifies a matching mode.
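A minimal sketch of how such a filter could behave, assuming a small in-memory term list standing in for the source dataset; the matching-mode names below are illustrative assumptions, not the product's exact option labels.

```python
import re

def load_forbidden_terms():
    # Stand-in for the source dataset configured on the filter;
    # in practice the terms would come from a dataset column.
    return ["project apollo", "codename x"]

def contains_forbidden(text, terms, mode="substring"):
    """Check a query or response against the forbidden terms.

    Two illustrative matching modes: plain substring, or whole words only.
    """
    lowered = text.lower()
    if mode == "substring":
        return any(term in lowered for term in terms)
    if mode == "word":
        return any(re.search(rf"\b{re.escape(term)}\b", lowered) for term in terms)
    raise ValueError(f"Unknown mode: {mode}")

def redact(text, terms):
    """Remove forbidden terms from an input or output."""
    for term in terms:
        text = re.sub(re.escape(term), "[REDACTED]", text, flags=re.IGNORECASE)
    return text
```

The matching mode matters: in substring mode, "codename x" also flags "codename xy", while whole-word matching does not.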

Screenshot of the Forbidden terms settings in LLM connections.