Tip | The impact of instance template modifications on disk sizes

A DSS instance has two disks: a data disk and an operating system (OS) disk. The data disk holds all the state that DSS needs to run, which is why Fleet Manager snapshots only the data disk. It is also the only disk that matters when provisioning or reprovisioning an instance, because the OS disk is always replaced at provisioning time.

Caution

Avoid storing anything outside the data disk: when you upgrade or reprovision an instance, everything stored outside the data disk is lost.

Data disk

The data disk contains all the DSS configuration and its data files. Fleet Manager uses Amazon Elastic Block Store (EBS) volumes as the storage layer for the data disk.

You can set both a starting size for the data disk and the maximum size it is allowed to reach. The Fleet Manager agent in the DSS instance automatically grows the disk whenever used space reaches 80%, and keeps doing so until the disk reaches the maximum allowed size.
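The growth rule can be illustrated with a minimal sketch. The 80% threshold and the maximum size come from the description above. The function name `next_disk_size` and the `step_gb` increment are hypothetical: the amount by which the agent actually grows the disk is not stated here, so treat the step as an assumption and not as the agent's real behavior.

```python
def next_disk_size(used_gb: float, size_gb: float, max_gb: float,
                   step_gb: float = 50, threshold: float = 0.80) -> float:
    """Return the disk size after one growth check.

    step_gb is an assumed increment for illustration only; the real
    agent's growth increment is not documented in this article.
    """
    if size_gb >= max_gb:
        # Already at the maximum allowed size: never grow further.
        return size_gb
    if used_gb / size_gb < threshold:
        # Below the 80% usage threshold: nothing to do.
        return size_gb
    # At or above the threshold: grow, but cap at the maximum size.
    return min(size_gb + step_gb, max_gb)


if __name__ == "__main__":
    # 80 GB used on a 100 GB disk hits the threshold, so the disk grows.
    print(next_disk_size(used_gb=80, size_gb=100, max_gb=200))
    # 190 GB used on a 200 GB disk: the maximum is reached, so no growth.
    print(next_disk_size(used_gb=190, size_gb=200, max_gb=200))
```

The second call shows why the maximum size matters: once the disk reaches it, the disk can still fill up, so the maximum should leave room for logs, code environments, and any local datasets.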

Even though it’s not best practice to store data in local filesystem connections, sometimes it’s convenient for small datasets or lookups. Furthermore, DSS will need a reasonably sized data disk to store logs, code environments, and anything else that cannot be offloaded to cloud storage.

OS disk

The OS disk is where the OS and other binaries are installed. It can be considered temporary because it is replaced every time the instance is reprovisioned. Still, there is a good reason to give it a reasonable size (20 GB to 50 GB): Python and R packages, along with ML models, might write temporary files to the OS's default temp folder. There are ways to alter this behavior, but unfortunately not all packages and tools abide by the same conventions.
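To see where temporary files land on a given instance, a short check along the following lines may help. It is a sketch that uses only the Python standard library. The helper name `temp_dir_report` is hypothetical, and the check only reflects what Python's own `tempfile` module would use, not what every package or tool does.

```python
import shutil
import tempfile


def temp_dir_report() -> dict:
    """Report the default temp folder and the capacity of the disk holding it."""
    path = tempfile.gettempdir()      # folder Python uses for temporary files
    usage = shutil.disk_usage(path)   # total/used/free bytes of that filesystem
    gib = 1024 ** 3
    return {
        "path": path,
        "total_gb": round(usage.total / gib, 1),
        "free_gb": round(usage.free / gib, 1),
    }


if __name__ == "__main__":
    report = temp_dir_report()
    print(f"Temp folder: {report['path']}")
    print(f"Free space:  {report['free_gb']} GB of {report['total_gb']} GB")
```

Python's `tempfile` module honors the `TMPDIR`, `TEMP`, and `TMP` environment variables, which is one of the ways to alter the default location. Other tools may ignore these variables, which is why a reasonably sized OS disk remains the safer choice.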