How-to | Access data sources through a VPN server#

Dataiku Cloud offers multiple ways to reach your data sources through a VPN:

OpenVPN#

You can configure an OpenVPN tunnel between Dataiku Cloud and your network to access your private data sources. The OpenVPN server is under your control, and it exposes your data sources. Dataiku uses an OpenVPN client to establish the VPN connection and reach them.

Important

  • OpenVPN is not available in all Dataiku plans. You may need to reach out to your Dataiku Account Manager or Customer Success Manager.

  • The private subnets exposed by your OpenVPN server must not overlap any of the following CIDR ranges: 10.0.0.0/16, 10.1.0.0/16, 172.20.0.0/16, and 10.94.0.0/16.
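
Before submitting the configuration, you may want to check your subnets against these ranges. The following is a minimal sketch using Python's standard `ipaddress` module; the function name `find_conflicts` and the example subnets are hypothetical and not part of any Dataiku tooling.

```python
import ipaddress

# CIDR ranges listed above; exposed subnets must not overlap them.
RESERVED = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/16", "10.1.0.0/16", "172.20.0.0/16", "10.94.0.0/16")
]


def find_conflicts(subnets):
    """Return (subnet, reserved_range) pairs that overlap."""
    conflicts = []
    for cidr in subnets:
        # strict=False tolerates host bits, e.g. "10.0.5.1/24".
        net = ipaddress.ip_network(cidr, strict=False)
        for reserved in RESERVED:
            if net.overlaps(reserved):
                conflicts.append((str(net), str(reserved)))
    return conflicts


if __name__ == "__main__":
    # Hypothetical subnets exposed by an OpenVPN server.
    print(find_conflicts(["10.0.5.0/24", "192.168.20.0/24"]))
    # prints [('10.0.5.0/24', '10.0.0.0/16')]
```

An empty list means none of the subnets you passed collide with the listed ranges.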

To configure the VPN:

  1. Go to Launchpad’s Extensions panel.

  2. Add the VPN extension.

  3. Provide an OpenVPN configuration file for clients.

You can choose between:

Routing all traffic

If this option is selected, all outgoing traffic from Dataiku will go through the VPN tunnel. In this case, ensure that all your data sources are accessible from your VPN server, and that your VPN server can also route traffic to the internet so your Cloud instance can function properly.

Routing the traffic to a list of IP ranges

If you deselect the option to route all traffic, you must list every IP address or range whose traffic should be routed through the VPN.

Optionally, you can use a private DNS server to resolve the domains of the private data sources that are accessed through the VPN. Provide the IP address of this DNS server and the list of domains it should resolve. All other domains are still resolved by the regular Dataiku DNS servers.
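
To reason about which destinations and hostnames these settings affect, a small model can help. The sketch below is illustrative only: the function names are hypothetical, and the suffix-based domain matching is an assumption, since the exact matching rule applied by Dataiku Cloud is not described here.

```python
import ipaddress


def routed_through_vpn(ip, vpn_ranges):
    """True if ip falls inside one of the listed addresses or ranges."""
    addr = ipaddress.ip_address(ip)
    return any(
        addr in ipaddress.ip_network(entry, strict=False) for entry in vpn_ranges
    )


def uses_private_dns(hostname, private_domains):
    """True if hostname equals, or is a subdomain of, a listed domain.

    Assumption: domains are matched by suffix on label boundaries.
    """
    host = hostname.rstrip(".").lower()
    for domain in private_domains:
        d = domain.strip(".").lower()
        if host == d or host.endswith("." + d):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical values for illustration.
    ranges = ["192.168.20.0/24", "10.50.0.10"]
    domains = ["corp.example.com"]
    print(routed_through_vpn("192.168.20.15", ranges))
    print(uses_private_dns("db1.corp.example.com", domains))
```

Under this model, a data source is reachable by name only if its domain is in the DNS list and the address it resolves to is in the routed ranges (or all traffic is routed).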

Note

To enable VPN tunneling, the Dataiku instance needs to be restarted. This operation could take up to 15 minutes.

VPN IPsec#

You can connect to your on-premises data sources using a site-to-site VPN based on the IPsec protocol.

Important

  • VPN IPsec is not available in all Dataiku plans. You may need to reach out to your Dataiku Account Manager or Customer Success Manager.

  • You must allow Dataiku Cloud Public IPs to reach your VPN Gateway device. These IP addresses depend on your Dataiku Cloud region and are listed in the Launchpad connection forms.

Create your configuration file#

Each VPN tunnel configuration tends to have unique edge cases. We strongly recommend that you contact the Dataiku account team to complete this configuration.

Create your secrets file#

Currently, only secret key (pre-shared key, PSK) authentication is supported for IPsec. The secrets.conf file should look like this:

# ipsec.secrets - this file holds shared secrets for authentication.
: PSK "your_secret_key"
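
If you need to generate a key and produce this line programmatically, the following is one possible sketch using Python's standard `secrets` module. The helper names and the 32-byte key length are assumptions, not Dataiku requirements, and the same key must also be configured on your VPN gateway.

```python
import secrets


def generate_psk(nbytes=32):
    """Return a random URL-safe key (letters, digits, '-' and '_')."""
    return secrets.token_urlsafe(nbytes)


def render_psk_line(psk):
    """Format the key as the PSK line shown above."""
    if '"' in psk or "\n" in psk:
        # A quote or newline would break the quoted string.
        raise ValueError("key must not contain double quotes or newlines")
    return f': PSK "{psk}"\n'


if __name__ == "__main__":
    print(render_psk_line(generate_psk()), end="")
```

Treat the generated key like a password: avoid committing it to version control or sharing it over unencrypted channels.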

Add the extension on your Launchpad#

Once you have created the required configuration and secrets files, you can add the extension on your Launchpad. If you need tunnels for both the Design and Automation nodes, you must create a separate configuration for each node.

To configure the VPN:

  1. Go to Launchpad’s Extensions panel.

  2. Add the IPsec extension.

  3. Provide the configuration and secrets files you created for each required node.

  4. If needed, enable source address translation. This lets you choose the specific source IP address used by traffic from your Dataiku Cloud instance.

  5. Set the internal IP of your database as the host parameter for the connections that use the VPN tunnel.

Note

  • The translated address should be in the format A.B.C.D/M=X.Y.Z.T, where A.B.C.D/M is the target subnet and X.Y.Z.T is the NAT gateway address.

  • For example, if your target subnet is 192.168.20.0/24 and you expect traffic to come from 192.168.10.1, the translated address should be 192.168.20.0/24=192.168.10.1.
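
To catch formatting mistakes before entering a translated address in the Launchpad, you could validate it locally. This is a minimal sketch based only on the format described in the note above; `parse_translation` is a hypothetical helper, and the Launchpad may apply additional checks of its own.

```python
import ipaddress


def parse_translation(value):
    """Split 'A.B.C.D/M=X.Y.Z.T' into (target subnet, NAT gateway address).

    Raises ValueError if either part is missing or malformed.
    """
    subnet_text, sep, gateway_text = value.partition("=")
    if not sep:
        raise ValueError("expected the form A.B.C.D/M=X.Y.Z.T")
    if "/" not in subnet_text:
        raise ValueError("target subnet must include a prefix length, e.g. /24")
    # IPv4Network rejects host bits by default, e.g. 192.168.20.5/24.
    subnet = ipaddress.IPv4Network(subnet_text)
    gateway = ipaddress.IPv4Address(gateway_text)
    return subnet, gateway


if __name__ == "__main__":
    subnet, gateway = parse_translation("192.168.20.0/24=192.168.10.1")
    print(subnet, gateway)  # prints 192.168.20.0/24 192.168.10.1
```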