How-to | Secure data connections through GCP Private Service Connect#

For certain plans, Dataiku enables customers to secure access to specific data sources through GCP Private Service Connect.

Important

GCP Private Service Connect isn’t available in all Dataiku plans. You can’t configure it through the Dataiku Cloud Launchpad. Please contact the Dataiku support team to complete this configuration.

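Dataiku Cloud supports GCP Private Service Connect with:

  • Google Cloud Storage and BigQuery

  • A Google Cloud SQL database

  • A GCP-hosted Snowflake database

  • A GCP-hosted arbitrary data source

  • An on-premises data source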

Google Cloud Storage and BigQuery#

To connect to Google Cloud Storage (GCS) or BigQuery through GCP Private Service Connect, please contact the Dataiku Cloud support team to enable the Private Service Connect endpoint for GCP services and APIs.

You will need to share your Dataiku Cloud space ID. Your space ID can be found in the Settings panel of your Launchpad.

This is a one-time setup that will allow you to leverage GCP Private Service Connect for all your GCS and BigQuery connections.

A Google Cloud SQL database#

To connect to a Google Cloud SQL database through GCP Private Service Connect, you will need to share the following information with the Dataiku Cloud support team:

  • The GCP region of your Cloud SQL instance. If the region you need isn’t available yet, the support team will let you know once it is enabled.

  • The DNS name of your Cloud SQL instance.

  • The service attachment of your Cloud SQL instance. This is a single URI that’s automatically assigned to an instance with Private Service Connect (PSC) enabled.
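
Before sending these three values, it can help to check that they are consistent with each other. The following is a minimal sketch, assuming the service attachment follows the usual GCP resource path `projects/<project>/regions/<region>/serviceAttachments/<name>`. The function names and the sample values are hypothetical and only serve as an illustration.

```python
import re

# Assumption: a service attachment URI ends with the standard GCP resource
# path, optionally preceded by an API URL prefix.
_SERVICE_ATTACHMENT_RE = re.compile(
    r"(?:^|/)projects/(?P<project>[^/]+)"
    r"/regions/(?P<region>[^/]+)"
    r"/serviceAttachments/(?P<name>[^/]+)$"
)


def parse_service_attachment(uri: str) -> dict:
    """Split a service attachment URI into project, region, and name."""
    match = _SERVICE_ATTACHMENT_RE.search(uri.strip())
    if match is None:
        raise ValueError(f"Not a service attachment URI: {uri!r}")
    return match.groupdict()


def check_request(region: str, dns_name: str, service_attachment: str) -> dict:
    """Return the three values to share, after basic consistency checks."""
    parts = parse_service_attachment(service_attachment)
    if parts["region"] != region:
        raise ValueError(
            f"Region mismatch: {region!r} vs {parts['region']!r} in the URI"
        )
    if not dns_name.strip():
        raise ValueError("The DNS name is empty")
    return {
        "region": region,
        "dns_name": dns_name.strip(),
        "service_attachment": service_attachment.strip(),
    }


# Hypothetical placeholder values, not real resources.
request = check_request(
    region="europe-west1",
    dns_name="instance-id.project-label.europe-west1.sql.goog.",
    service_attachment=(
        "projects/example-project/regions/europe-west1/"
        "serviceAttachments/example-attachment"
    ),
)
print(request["service_attachment"])
```

A region mismatch between the first and third value is the kind of mistake this catches early, before the support team starts creating the endpoint.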

After you share this information, the support team will provide you with the Dataiku GCP project name to add to the allowed projects list of your Cloud SQL instance.

Once the support team confirms that the endpoint is created on the Dataiku side, you will be able to reach your Cloud SQL instance from your Dataiku instance using the DNS name you provided.

A GCP-hosted Snowflake database#

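To connect to a GCP-hosted Snowflake database through GCP Private Service Connect, you will need to share the following information with the Dataiku Cloud support team:

  • The Private Service Connect config from Snowflake. The section “Retrieve the Private Service Connect config from Snowflake” explains how to get it.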

After sharing the needed information, you will need to ask Snowflake support to allow GCP Private Service Connect from Dataiku’s GCP project. The section of the same name below explains how.

Finally, the Dataiku Cloud support team will let you know when the Private Service Connect connection is enabled and share the endpoint to use in your Snowflake connection. The section “Use the GCP Snowflake endpoint in your Snowflake connections” explains how to apply it.

Retrieve the Private Service Connect config from Snowflake#

  1. In Snowflake, create a new SQL worksheet.

  2. Run the following SQL command with the ACCOUNTADMIN role:

    select SYSTEM$GET_PRIVATELINK_CONFIG();
    
  3. Click on the output to open a new panel on the right.

  4. Click on the Click to Copy icon to copy the JSON result.
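
Before sending the copied result to the support team, you may want to confirm that it is complete, valid JSON. The sketch below only checks that the text parses as a non-empty JSON object and lists any blank fields. The helper names and the keys in the sample are hypothetical. The real key names come from Snowflake's output.

```python
import json


def load_privatelink_config(raw: str) -> dict:
    """Parse the copied output and confirm it is a non-empty JSON object."""
    config = json.loads(raw)
    if not isinstance(config, dict) or not config:
        raise ValueError("Expected a non-empty JSON object")
    return config


def empty_fields(config: dict) -> list:
    """List keys whose value is missing or blank, in sorted order."""
    return sorted(k for k, v in config.items() if v in (None, ""))


# Hypothetical sample: these keys are placeholders, not Snowflake's real keys.
sample = '{"example-account-url": "host.example.com", "example-attachment": ""}'
config = load_privatelink_config(sample)
print(empty_fields(config))  # ['example-attachment']
```

A truncated copy and paste fails at `json.loads`, which is usually the quickest way to notice it.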

Ask Snowflake support to allow GCP Private Service Connect from Dataiku’s GCP project#

  1. In the Snowflake console, go to the Support section in the left panel.

  2. Create a new support case by clicking on Support Case in the top right corner.

  3. Give the case a meaningful title, for example Enable GCP Private Service Connect.

  4. In the details section of your Snowflake support case, request that Snowflake allow-list Dataiku’s GCP account, providing the project ID shared by the Dataiku Cloud support team.

  5. In the Where did the issue occur? section, select GCP Private Service Connect under the Managing Security & Authentication category, leave the severity at Sev-4, and click on Create Case.

  6. Wait for Snowflake support to allow-list Dataiku’s GCP account before continuing to the next set of instructions.

Use the GCP Snowflake endpoint in your Snowflake connections#

You can use the Private Service Connect endpoint shared by the Dataiku Cloud support team in both new and existing Snowflake connections. To do that:

  1. In the Dataiku Cloud Launchpad, navigate to a new or existing Snowflake connection.

  2. In the host field, enter the Private Service Connect endpoint.
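
If the connection test fails after you change the host, a plain TCP check can help tell a network problem from a credentials problem. This is a minimal sketch, assuming you run it from a Python notebook inside your Dataiku instance and that the endpoint listens on port 443. The function name is hypothetical.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host name and opens the socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling `can_connect` with the endpoint shared by the support team should return `True` once the Private Service Connect connection is enabled. A `False` result points to name resolution or network reachability rather than to the Snowflake credentials.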

A GCP-hosted arbitrary data source#

Administrators can leverage GCP Private Service Connect to expose any service running inside their VPCs to Dataiku Cloud. To connect to a VM running in your managed instance group hosted in a GCP project, you will need to publish a service by using Private Service Connect to make it accessible to Dataiku.

Create the GCP service attachment#

You will need a GCP service attachment in one of the regions supported by Dataiku. To create it:

  1. Set up a regional internal proxy Network Load Balancer with VM instance group backends. Follow the GCP documentation to create all the necessary components.

  2. Follow the GCP documentation to publish the service by using Private Service Connect.

Allow Dataiku to access your service attachment#

The Dataiku support team will provide you with the Dataiku GCP project ID or network URI. This information is required to configure access permissions for your service attachment, enabling secure connectivity between the Dataiku environment and your services.
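
One way to grant this access is to add the Dataiku project ID or network URI to the consumer accept list of your service attachment. The sketch below only builds a `gcloud compute service-attachments update` command so you can review it before running it yourself. The helper name, the attachment name, the region, the project ID, and the connection limit are all hypothetical placeholders. Setting the accept list may overwrite existing entries, so check the gcloud reference before running the command.

```python
import shlex


def accept_consumer_command(
    attachment: str, region: str, consumer: str, limit: int = 10
) -> list:
    """Build the gcloud arguments that put a consumer on the accept list."""
    if limit < 1:
        raise ValueError("The connection limit must be at least 1")
    return [
        "gcloud", "compute", "service-attachments", "update", attachment,
        f"--region={region}",
        # Only consumers on the accept list may connect.
        "--connection-preference=ACCEPT_MANUAL",
        # Format: PROJECT_OR_NETWORK=LIMIT
        f"--consumer-accept-list={consumer}={limit}",
    ]


# Placeholder values: replace with your attachment and the ID Dataiku provides.
command = accept_consumer_command(
    attachment="my-attachment",
    region="europe-west1",
    consumer="dataiku-project-id",
)
print(shlex.join(command))
```

Printing the command instead of executing it keeps the change under your control, which matters because the accept list governs who can reach your service.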

An on-premises data source#

You can configure GCP Private Service Connect for on-premises data sources if you have access to a GCP account:

  1. Connect your on-premises data source to your VPC as described in the GCP documentation on hybrid connectivity.

  2. Follow the steps from A GCP-hosted arbitrary data source to connect to your data source.