Tutorial | Forward Dataiku logs to Splunk Cloud Platform#

Get started#

Dataiku nodes generate a comprehensive set of application, audit, and resource consumption logs. These logs are generated in a format that allows for ingestion by leading AIOps tools.

This tutorial describes how to ingest these logs into one such AIOps tool: Splunk Cloud Platform.

Objectives#

In this tutorial, you will:

  • Create a source type and index on the Splunk Cloud Platform.

  • Install the Splunk Universal Forwarder on a Dataiku server.

  • Configure the Splunk Universal Forwarder to forward Dataiku audit logs to the Splunk Cloud Platform.

Note

Organizations leveraging Splunk will likely already have established practices on how to deploy the Splunk Universal Forwarder. This tutorial should therefore not be taken as a deployment reference, but rather as a guide on how to configure Splunk and the Splunk Universal Forwarder in the context of Dataiku.

Prerequisites#

This tutorial is aimed at Dataiku administrators working in conjunction with Splunk Cloud Platform administrators. It requires a basic understanding of Dataiku logging and Splunk’s log aggregation functionality.

To reproduce the steps in this tutorial, you’ll need:

  • Dataiku 12.0 or later

  • Administrative rights on Dataiku and Splunk Cloud Platform

  • A self-hosted Dataiku installation

  • SSH access to the Dataiku server (or other means of deploying and configuring the Splunk Universal Forwarder)

Configure Splunk Cloud Platform#

Prior to adding Dataiku logs to the Splunk Cloud Platform, a Splunk administrator needs to create a source type and an index. They will also need to retrieve credentials for the Splunk Universal Forwarder.

  1. Create a source type (for example, called dku_logs) with the timestamp format as "%Y-%m-%dT%H:%M:%S.%3q%z" and timestamp prefix .*"timestamp":.

    Splunk Source Type configuration.
  2. Create an index (for example, called dataiku). For testing purposes, set a 10GB maximum raw data size and 7 days retention.

    Splunk Index.
  3. Download the Splunk Universal Forwarder credentials package (splunkclouduf.spl), as described in the Splunk documentation.
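Before relying on the source type, it can help to confirm that a log line carries a timestamp in the expected shape. The following is a minimal sketch: the sample line is made up for illustration (real Dataiku audit events contain other fields), and the helper names extract_timestamp and matches_format are hypothetical. It mimics the timestamp prefix and format from step 1 with sed and grep; it does not call Splunk.

```shell
# Hypothetical sample line, for illustration only; real audit events differ.
sample='{"severity":"INFO","message":{"msgType":"login"},"timestamp":"2024-05-14T09:31:07.123+0000"}'

# Return the quoted value that follows the "timestamp": prefix.
extract_timestamp() {
    printf '%s\n' "$1" | sed -n 's/.*"timestamp":[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Succeed if the value looks like %Y-%m-%dT%H:%M:%S.<3 digits><+hhmm or -hhmm>.
matches_format() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}[+-][0-9]{4}$'
}

ts=$(extract_timestamp "$sample")
if matches_format "$ts"; then
    echo "timestamp matches the source type format: $ts"
else
    echo "timestamp does not match the source type format: $ts"
fi
```

To check a real event, replace the sample with a line copied from a file in the audit directory of the Dataiku data directory.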

Deploy the Splunk Universal Forwarder to Dataiku#

The following procedure details how to deploy the Splunk Universal Forwarder to a Linux server hosting a Dataiku node. Note that this procedure requires SSH and root access to the machine.

  1. SSH onto the Dataiku server.

  2. Switch to the root user.

    sudo su -
    
  3. Create the unprivileged Linux user under which the Splunk Universal Forwarder will run.

    groupadd splunkfwd
    useradd -m -g splunkfwd splunkfwd
    
  4. Create the Splunk home directory, download the Splunk Universal Forwarder within it, untar it, and ensure the unprivileged Linux user created in the previous step owns the resulting files.

    mkdir /opt/splunk
    cd /opt/splunk
    
    wget -O splunkforwarder-9.4.0-6b4ebe426ca6-linux-amd64.tgz "https://download.splunk.com/products/universalforwarder/releases/9.4.0/linux/splunkforwarder-9.4.0-6b4ebe426ca6-linux-amd64.tgz"
    
    tar xzf splunkforwarder-9.4.0-6b4ebe426ca6-linux-amd64.tgz
    chown -R splunkfwd:splunkfwd /opt/splunk
    
  5. Add a file to store the local Splunk credentials:

    export SPLUNK_HOME="/opt/splunk/splunkforwarder"
    vi $SPLUNK_HOME/etc/system/local/user-seed.conf
    

    with:

    [user_info]
    USERNAME = admin
    PASSWORD = <local splunk password>
    
  6. Start the Splunk Universal Forwarder and authenticate it with the Splunk Cloud Platform:

    $SPLUNK_HOME/bin/splunk start --accept-license
    $SPLUNK_HOME/bin/splunk install app /PATH/TO/splunkclouduf.spl -auth admin:<local splunk password>
    $SPLUNK_HOME/bin/splunk restart
    

    Important

    The splunkclouduf.spl file is the Splunk Cloud Platform credentials package downloaded in the previous section. It must be copied onto the Dataiku server (via SCP or otherwise) before running this command.

See also

For additional information, refer to Splunk’s documentation on deploying and configuring the Splunk Universal Forwarder on Linux.
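After the deployment steps above, a quick sanity check can confirm that the forwarder is in place. The sketch below assumes the install path used in this tutorial and a Linux host with GNU stat; the function name check_forwarder is hypothetical. It reports the owner of the install directory and runs the forwarder's status command.

```shell
# Default install location used in this tutorial; override if yours differs.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk/splunkforwarder}"

# Print the owner of the install directory and the forwarder status.
check_forwarder() {
    home="$1"
    if [ ! -x "$home/bin/splunk" ]; then
        echo "forwarder binary not found under $home"
        return 1
    fi
    # GNU stat (Linux): %U prints the owning user name.
    echo "owner: $(stat -c '%U' "$home")"
    # Reports whether splunkd is running.
    "$home/bin/splunk" status
}

check_forwarder "$SPLUNK_HOME" || echo "forwarder check failed"
```

If the status looks healthy, the forwarder's list forward-server command (run with the local admin credentials) should show whether the Splunk Cloud Platform destination is active.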

Configure the Splunk Universal Forwarder to forward Dataiku Logs#

Now that the Splunk Universal Forwarder is installed, it can be configured to forward Dataiku logs to the Splunk Cloud Platform, following the Splunk Cloud Platform Admin Manual.

For example, to configure the Forwarder to send Dataiku audit logs to Splunk, run the following command:

$SPLUNK_HOME/bin/splunk add monitor /PATH/TO/DATADIR/run/audit/ -index dataiku -sourcetype dku_logs -host $HOSTNAME -auth admin:<local splunk password>

where /PATH/TO/DATADIR is the path to the Dataiku instance data directory.

If the command runs successfully, the Dataiku audit logs should appear under the dataiku index in the Splunk Cloud Platform almost immediately.

Dataiku logs in Splunk.
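If no events show up, one simple first check is whether the monitored path exists and actually contains files. The following is a minimal sketch, assuming DATADIR is set to the Dataiku data directory; the function name check_audit_dir is hypothetical.

```shell
# Placeholder path; set DATADIR to the real Dataiku data directory.
DATADIR="${DATADIR:-/PATH/TO/DATADIR}"

# Report whether a directory exists and how many regular files it holds.
check_audit_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    count=$(find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' ')
    echo "found $count file(s) in $dir"
}

check_audit_dir "$DATADIR/run/audit" || echo "fix the path before troubleshooting Splunk"

# To see which inputs the forwarder is currently monitoring:
# "$SPLUNK_HOME/bin/splunk" list monitor -auth admin:<local splunk password>
```

A missing or empty directory points to a wrong data directory path rather than a Splunk-side problem. It is also worth confirming that the user running the forwarder can read the files.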

What’s next?#

Now that Dataiku logs are available in Splunk, IT operations managers can take full advantage of all the AIOps capabilities offered by Splunk.