Tutorial | Use Datadog to monitor Dataiku-managed Elastic AI clusters#
Get started#
Dataiku leverages Elastic AI clusters powered by Kubernetes to horizontally scale its processing capabilities. Dataiku administrators can thus make use of powerful AIOps tools to monitor cluster and workload health, raise alerts, and perform automated and human-assisted remediation of incidents.
This tutorial describes how to integrate a Dataiku-managed Elastic AI cluster with one such AIOps tool: Datadog.
Objectives#
In this tutorial, you will:
Create a Datadog API key.
Deploy the Datadog agent to a Dataiku-managed Elastic AI cluster.
View the cluster metrics in the Datadog portal.
Prerequisites#
This tutorial is aimed at Dataiku administrators working for an organization that leverages (or is evaluating) the use of Datadog for Kubernetes cluster monitoring. It requires a basic understanding of SSH, the Dataiku Elastic AI ecosystem, and Datadog Kubernetes monitoring.
To reproduce the steps in this tutorial, you’ll need:
Dataiku 12.0 or later
Administrative rights on Dataiku and Datadog
A self-hosted Dataiku installation with an attached Dataiku-managed Elastic AI cluster
SSH access to the Dataiku server
Retrieve a Datadog API Key#
Prior to deploying the Datadog agent to a Dataiku-managed Elastic AI Cluster, a Datadog administrator must retrieve an API key from Datadog. Please refer to the Datadog documentation on API keys for instructions on how to create and retrieve an API key.
Deploy the Datadog Kubernetes agent to an Elastic AI cluster#
The following procedure details how to deploy the Datadog agent to a Dataiku-managed Elastic AI cluster. Note that this procedure requires SSH access to the machine.
SSH onto the Dataiku server.
Switch to the dssuser.
sudo su - dssuser
Set the KUBECONFIG environment variable to target the Elastic AI cluster onto which the Datadog agent should be deployed:
export KUBECONFIG=/PATH/TO/DATADIR/clusters/<Elastic AI cluster name>/exec/kube_config
where /PATH/TO/DATADIR is the path to the Dataiku instance data directory, and <Elastic AI cluster name> is the name of the Elastic AI cluster.
Install the Datadog operator onto the Elastic AI cluster:
helm repo add datadog https://helm.datadoghq.com
helm install datadog-operator datadog/datadog-operator
Important
Helm must be installed prior to running this command.
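A quick preflight check can catch a missing tool or an unset KUBECONFIG before anything is installed. The following is a minimal sketch, not part of the Dataiku or Datadog tooling; `missing_tools` is a hypothetical helper name introduced here for illustration.

```shell
# Preflight sketch: report which required tools are missing from PATH.
# missing_tools is a hypothetical helper, not a Dataiku or Datadog command.
missing_tools() {
  local tool missing=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      # Append the tool name, separating entries with a single space.
      missing="${missing:+$missing }$tool"
    fi
  done
  printf '%s' "$missing"
}

missing="$(missing_tools helm kubectl)"
if [ -n "$missing" ]; then
  echo "Missing required tools: $missing" >&2
elif [ -z "${KUBECONFIG:-}" ] || [ ! -r "$KUBECONFIG" ]; then
  echo "KUBECONFIG is unset or does not point to a readable file" >&2
else
  echo "Preflight OK: helm and kubectl found, using $KUBECONFIG"
fi
```

The script only prints diagnostics and does not change the cluster, so it can be run repeatedly as the dssuser.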
Create a Kubernetes secret for the Datadog API key retrieved in the previous section.
kubectl create secret generic datadog-secret --from-literal api-key=<secret value>
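To avoid typing the key directly into the command, the same secret can be created from an environment variable. This is a sketch under assumptions: `DD_API_KEY` is an assumed variable name and `create_datadog_secret` is a hypothetical wrapper; only the `kubectl create secret generic` call comes from the step above.

```shell
# Sketch: create the secret from an environment variable instead of a literal.
# DD_API_KEY is an assumed variable name; export it before calling the function.
create_datadog_secret() {
  if [ -z "${DD_API_KEY:-}" ]; then
    echo "DD_API_KEY is not set; export it first" >&2
    return 1
  fi
  # Same secret name and key name as the command in the tutorial step.
  kubectl create secret generic datadog-secret \
    --from-literal=api-key="$DD_API_KEY"
}
```

After exporting the key, call `create_datadog_secret`, then run `kubectl get secret datadog-secret` to confirm that the secret exists.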
Create a YAML file named datadog-agent.yaml with appropriate configuration values for the Datadog Agent. The following file is a representative example for an Elastic AI cluster on EKS:
apiVersion: "datadoghq.com/v2alpha1"
kind: "DatadogAgent"
metadata:
  name: "datadog"
spec:
  global:
    clusterName: "<Elastic AI cluster name>"
    site: "DATADOG DOMAIN"
    registry: "public.ecr.aws/datadog"
    credentials:
      apiSecret:
        secretName: "datadog-secret"
        keyName: "api-key"
See also
For example YAML files for other Kubernetes distributions, refer to the Datadog documentation on Kubernetes distributions.
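If the same manifest is needed for several clusters, it can be generated from shell arguments rather than edited by hand. The sketch below assumes the EKS example values shown earlier; `write_agent_manifest` is a hypothetical helper, and the cluster name and Datadog site passed to it are placeholders to replace with your own values.

```shell
# Sketch: write the EKS example manifest with the cluster name and site filled in.
# Usage: write_agent_manifest CLUSTER_NAME DATADOG_SITE [OUTPUT_FILE]
# write_agent_manifest is a hypothetical helper, not a Dataiku or Datadog command.
write_agent_manifest() {
  cluster_name="$1"
  datadog_site="$2"
  output_file="${3:-datadog-agent.yaml}"
  # Unquoted heredoc delimiter so the two shell variables are expanded.
  cat > "$output_file" <<EOF
apiVersion: "datadoghq.com/v2alpha1"
kind: "DatadogAgent"
metadata:
  name: "datadog"
spec:
  global:
    clusterName: "${cluster_name}"
    site: "${datadog_site}"
    registry: "public.ecr.aws/datadog"
    credentials:
      apiSecret:
        secretName: "datadog-secret"
        keyName: "api-key"
EOF
}
```

For example, `write_agent_manifest my-cluster datadoghq.com` writes datadog-agent.yaml in the current directory; both argument values here are illustrative.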
Deploy the Datadog agent to the cluster.
kubectl apply -f datadog-agent.yaml
See also
For additional information on deploying and configuring the Datadog agent on Kubernetes, please refer to the Datadog documentation on Kubernetes.
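Before switching to the Datadog portal, it can be useful to confirm from the command line that the resources were created. This is a minimal sketch: `check_datadog_agent` is a hypothetical wrapper, and it assumes the agent was deployed with the name `datadog` in the namespace that the current KUBECONFIG context targets.

```shell
# Sketch: inspect the deployed Datadog resources with kubectl.
# check_datadog_agent is a hypothetical helper name.
check_datadog_agent() {
  # The DatadogAgent custom resource; "datadog" matches metadata.name in the manifest.
  kubectl get datadogagent datadog || return 1
  # Pods in the current namespace; the operator and agent pods should be listed.
  kubectl get pods
}
```

Run `check_datadog_agent` after `kubectl apply`. Pods that stay in a non-running state usually warrant a closer look with `kubectl describe pod`.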
What’s next?#
Once the Datadog agent has finished deploying, the Elastic AI cluster logs, metrics, and traces should be available for review in the Datadog portal.
Dataiku administrators can now leverage the full suite of AIOps functionality provided by Datadog, including cluster monitoring, incident management, and dashboards and visualizations.