OCP on AWS - Using Instance Disks for ephemeral storage

This document describes the steps used to evaluate the performance of different disk types on EC2 instances in AWS. The disk types include ephemeral (instance store, local disks) and the EBS volume types gp2, gp3, io1, and io2.

The tool used is FIO, and the intention is to stress the disks, using the gp2 burst I/O credit balance to define the total duration of the tests. For example, if a 200 GiB gp2 EBS volume takes 20 minutes to consume its entire burst balance under the stress test, we run the other disks, which do not have that limitation, for the same duration plus 5 minutes.
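As a back-of-the-envelope check on that duration: gp2 volumes draw from a 5.4 million I/O credit bucket, refill at a baseline of 3 IOPS per GiB, and burst up to 3,000 IOPS. The sketch below estimates the drain time from those published defaults; the real duration also depends on how much of the balance was already consumed before the test:

# Estimate how long a gp2 volume can sustain its burst before the
# credit bucket drains (credits are spent at burst - baseline IOPS).
SIZE_GIB=200
BASELINE_IOPS=$(( SIZE_GIB * 3 ))   # gp2 baseline: 3 IOPS per GiB
BURST_IOPS=3000                     # gp2 burst ceiling
CREDITS=5400000                     # initial/maximum I/O credit balance
echo "Burst drains in ~$(( CREDITS / (BURST_IOPS - BASELINE_IOPS) / 60 )) minutes"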

Table Of Contents:

  • Create the environment
    • Create the MachineConfig
    • Create MachineSet for ephemeral disk Instance
    • Create MachineSet for gp2 disk Instance
    • Create MachineSet for gp3 disk Instance
    • Create MachineSet for io1 disk Instance
    • Create MachineSet for io2 disk Instance
  • Run the Benchmark
  • Analyse the Results
  • Review

Create the environment

Create the MachineConfig

Steps to create the MachineConfig that formats and mounts the extra device: see the detailed section "Create the MachineConfig" below.

Create MachineSet for ephemeral disk Instance

Create the MachineSet using the create_machineset function defined in the "Create the MachineSet" section below. Ephemeral (instance store) disks come with the instance type, so no extra EBS device is declared:

export DISK_TYPE="ephemeral"
export INSTANCE_TYPE="m6id.xlarge"
unset EXTRA_BLOCK_DEVICES
create_machineset

Create MachineSet for gp2 disk Instance

Create the MachineSet with an extra gp2 EBS volume attached as a second block device:

export DISK_TYPE="gp2"
export INSTANCE_TYPE="m6i.xlarge"
export EXTRA_BLOCK_DEVICES="
      - deviceName: /dev/xvdb
        ebs:
          volumeType: gp2
          volumeSize: 230
"
create_machineset

Create MachineSet for gp3 disk Instance

Create the MachineSet with an extra gp3 EBS volume attached as a second block device:

export DISK_TYPE="gp3"
export INSTANCE_TYPE="m6i.xlarge"
export EXTRA_BLOCK_DEVICES="
      - deviceName: /dev/xvdb
        ebs:
          volumeType: gp3
          volumeSize: 230
"
create_machineset

Create MachineSet for io1 disk Instance

Create the MachineSet with an extra io1 EBS volume with provisioned IOPS:

export DISK_TYPE="io1"
export INSTANCE_TYPE="m6i.xlarge"
export EXTRA_BLOCK_DEVICES="
      - deviceName: /dev/xvdb
        ebs:
          volumeType: io1
          volumeSize: 230
          iops: 3000
"
create_machineset

Create MachineSet for io2 disk Instance

Create the MachineSet with an extra io2 EBS volume with provisioned IOPS:

export DISK_TYPE="io2"
export INSTANCE_TYPE="m6i.xlarge"
export EXTRA_BLOCK_DEVICES="
      - deviceName: /dev/xvdb
        ebs:
          volumeType: io2
          volumeSize: 230
          iops: 3000
"
create_machineset

Run the Benchmark

See the sections "Running fio-etcd" and "Running stress test with FIO" below.

Analyse the Results

TODO

Review

TODO

Results

TODO

Costs

TODO


Create the MachineConfig

The MachineConfig should create the systemd units to:

  • create the filesystem on the new device
  • mount the device on the path /var/lib/containers
  • restore the SELinux context

Steps:

  • Export the device name (without the /dev/ prefix) presented to your instance for the ephemeral device (in general nvme1n1):
export DEVICE_NAME=nvme1n1
  • Create the MachineConfig manifest
cat <<EOF | envsubst | oc create -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 98-var-lib-containers
spec:
  config:
    ignition:
      version: 3.1.0
    systemd:
      units:
      - contents: |
          [Unit]
          Description=Make File System on /dev/${DEVICE_NAME}
          DefaultDependencies=no
          BindsTo=dev-${DEVICE_NAME}.device
          After=dev-${DEVICE_NAME}.device var.mount
          Before=systemd-fsck@dev-${DEVICE_NAME}.service

          [Service]
          Type=oneshot
          RemainAfterExit=yes
          ExecStart=-/bin/bash -c "/bin/rm -rf /var/lib/containers/*"
          ExecStart=/usr/lib/systemd/systemd-makefs xfs /dev/${DEVICE_NAME}
          TimeoutSec=0

          [Install]
          WantedBy=var-lib-containers.mount
        enabled: true
        name: systemd-mkfs@dev-${DEVICE_NAME}.service
      - contents: |
          [Unit]
          Description=Mount /dev/${DEVICE_NAME} to /var/lib/containers
          Before=local-fs.target
          Requires=systemd-mkfs@dev-${DEVICE_NAME}.service
          After=systemd-mkfs@dev-${DEVICE_NAME}.service

          [Mount]
          What=/dev/${DEVICE_NAME}
          Where=/var/lib/containers
          Type=xfs
          Options=defaults,prjquota

          [Install]
          WantedBy=local-fs.target
        enabled: true
        name: var-lib-containers.mount
      - contents: |
          [Unit]
          Description=Restore recursive SELinux security contexts
          DefaultDependencies=no
          After=var-lib-containers.mount
          Before=crio.service

          [Service]
          Type=oneshot
          RemainAfterExit=yes
          ExecStart=/sbin/restorecon -R /var/lib/containers/
          TimeoutSec=0

          [Install]
          WantedBy=multi-user.target graphical.target
        enabled: true
        name: restorecon-var-lib-containers.service
EOF
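After the MachineConfig is created, the Machine Config Operator rolls it out to the worker pool, rebooting the nodes in it. You can watch the rollout finish before moving on:

oc get machineconfigpool worker -w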

Create the MachineSet

The second step is to create the MachineSet to launch instances with ephemeral disks available. You should choose one from the AWS offering. In general, instance types with ephemeral disks end the type part with the letter "d": for example, the Compute-optimized family (C), 6th generation (6), with Intel processors (i) and ephemeral storage, is the type c6id.

In my case I will use the instance type and size c6id.xlarge, which provides 237 GB of NVMe SSD ephemeral storage.
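If you want to enumerate which instance types in your region provide instance storage, the AWS CLI can filter on that attribute; a sketch, assuming AWS credentials are configured:

aws ec2 describe-instance-types \
    --filters Name=instance-storage-supported,Values=true \
    --query 'InstanceTypes[].InstanceType' \
    --output text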

export INSTANCE_TYPE=c6id.xlarge

Get the CLUSTER_ID:

export CLUSTER_ID="$(oc get infrastructure cluster \
    -o jsonpath='{.status.infrastructureName}')"
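The MachineSet template below hard-codes an AMI ID (ami-0722eb0819717090f). On your own cluster you can reuse the AMI from an existing worker MachineSet instead, and substitute it into the manifest; a sketch, assuming at least one MachineSet already exists:

export AMI_ID="$(oc get machineset -n openshift-machine-api \
    -o jsonpath='{.items[0].spec.template.spec.providerSpec.value.ami.id}')"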

Create the MachineSet:

create_machineset() {
  # Required environment variables:
  ## DISK_TYPE           : Used to create the node label and the name suffix of the MachineSet
  ## CLUSTER_ID          : Can be obtained from the infrastructure object
  ## INSTANCE_TYPE       : EC2 instance type
  # Optional environment variables:
  ## EXTRA_BLOCK_DEVICES : Extra block device entries appended to blockDevices (default: '')
  ## AWS_REGION          : AWS Region (default: us-east-1)
  ## AWS_ZONE            : Availability Zone within AWS_REGION (default: us-east-1a)
  cat <<EOF | envsubst | oc create -f -
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: ${CLUSTER_ID}
  name: ${CLUSTER_ID}-worker-${DISK_TYPE}
  namespace: openshift-machine-api
spec:
  replicas: 0
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: ${CLUSTER_ID}
      machine.openshift.io/cluster-api-machineset: ${CLUSTER_ID}-worker-${DISK_TYPE}
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: ${CLUSTER_ID}
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
        machine.openshift.io/cluster-api-machineset: ${CLUSTER_ID}-worker-${DISK_TYPE}
    spec:
      metadata:
        labels:
          disk_type: "${DISK_TYPE}"
      providerSpec:
        value:
          ami:
            id: ami-0722eb0819717090f
          apiVersion: machine.openshift.io/v1beta1
          blockDevices:
          - ebs:
              encrypted: true
              iops: 0
              kmsKey:
                arn: ""
              volumeSize: 120
              volumeType: gp3
${EXTRA_BLOCK_DEVICES:-}
          credentialsSecret:
            name: aws-cloud-credentials
          deviceIndex: 0
          iamInstanceProfile:
            id: ${CLUSTER_ID}-worker-profile
          instanceType: ${INSTANCE_TYPE}
          kind: AWSMachineProviderConfig
          placement:
            availabilityZone: ${AWS_ZONE:-us-east-1a}
            region:  ${AWS_REGION:-us-east-1}
          securityGroups:
          - filters:
            - name: tag:Name
              values:
              - ${CLUSTER_ID}-worker-sg
          subnet:
            filters:
            - name: tag:Name
              values:
              - ${CLUSTER_ID}-private-${AWS_ZONE:-us-east-1a}
          tags:
          - name: kubernetes.io/cluster/${CLUSTER_ID}
            value: owned
          userDataSecret:
            name: worker-user-data
EOF
}
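Note that the MachineSet is created with replicas: 0, so no machine is launched until you scale it up. A usage sketch for the ephemeral case:

export DISK_TYPE=ephemeral
export INSTANCE_TYPE=c6id.xlarge
create_machineset
oc scale machineset ${CLUSTER_ID}-worker-${DISK_TYPE} \
    -n openshift-machine-api --replicas=1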

Wait for the node to be created:

oc get node -l disk_type=ephemeral -w

Make sure the device has been mounted correctly at the mount path /var/lib/containers:

oc debug node/$(oc get nodes -l disk_type=${DISK_TYPE} -o jsonpath='{.items[0].metadata.name}') -- chroot /host /bin/bash -c "df -h /var/lib/containers"
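You can also confirm the filesystem type and the mount options applied by the mount unit (xfs with prjquota):

oc debug node/$(oc get nodes -l disk_type=${DISK_TYPE} -o jsonpath='{.items[0].metadata.name}') -- chroot /host /bin/bash -c "findmnt /var/lib/containers"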

Run the Benchmark

Running fio-etcd

We will use the quick FIO test, running the container image that is commonly used on OpenShift to evaluate whether a disk is fast enough for etcd (https://access.redhat.com/articles/6271341).

export label_disk=ephemeral
export node_name=$(oc get nodes -l disk_type=${label_disk} -o jsonpath='{.items[0].metadata.name}')
export base_path="/var/lib/containers/_benchmark_fio"

Run quick FIO test (used for etcd):

  • Running on ephemeral device
export disk_type=ephemeral
export base_path="/var/lib/containers/_benchmark_fio"

oc debug node/${node_name} -- chroot /host /bin/bash -c \
    "mkdir -p ${base_path}; podman run --volume ${base_path}:/var/lib/etcd:Z quay.io/openshift-scale/etcd-perf" > ./results-${disk_type}-fio_etcd.txt
  • Running on the root volume (EBS):
export disk_type=ebs
export base_path="/var/lib/misc/_benchmark_fio"

oc debug node/${node_name} -- chroot /host /bin/bash -c \
    "mkdir -p ${base_path}; podman run --volume ${base_path}:/var/lib/etcd:Z quay.io/openshift-scale/etcd-perf" > ./results-${disk_type}-fio_etcd.txt

Running stress test with FIO

Run the stress FIO test, using the FIO parameters recommended in the AWS documentation for General Purpose (gp) volumes. The script appends its stdout to the file referenced by ${log_stdout}, so define it first (this file name is only a suggestion):

export log_stdout=./fio_stress-${disk_type}-${node_name}.log

oc debug node/${node_name} -- chroot /host /bin/bash -c \
    "echo \"[0] <=> \$(hostname) <=> \$(date) <=> \$(uptime) \"; \
    lsblk; \
    mkdir -p ${base_path}; \
    for offset in {1..2} ; do \
        echo \"Running [\$offset]\"; \
        podman run --rm \
            -v ${base_path}:/benchmark:Z \
            ljishen/fio \
                --ioengine=psync \
                --rw=randwrite \
                --direct=1 \
                --bs=16k \
                --size=1G \
                --numjobs=5 \
                --time_based \
                --runtime=60 \
                --group_reporting \
                --norandommap \
                --directory=/benchmark \
                --name=data_${disk_type}_\${offset} \
                --output-format=json \
                --output=/benchmark/result_\$(hostname)-${disk_type}-\${offset}.json ;\
        sleep 10; \
        rm -f ${base_path}/data_${disk_type}_* ||true ; \
        echo \"[\$offset] <=> \$(hostname) <=> \$(date) <=> \$(uptime) \"; \
    done; \
    tar cfz /tmp/benchmark-${disk_type}.tar.gz ${base_path}*/*.json" \
    2>/dev/null | tee -a ${log_stdout}

oc debug node/${node_name} -- chroot /host /bin/bash -c \
    "cat /tmp/benchmark-${disk_type}.tar.gz" \
    2>/dev/null > ./results-fio_stress-${disk_type}-${node_name}.tar.gz
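To start analysing the results, extract the tarball and pull the headline numbers out of the fio JSON files. tar strips the leading "/" when archiving, so the files land under /tmp${base_path} after extraction; the jq field paths below follow fio's JSON output format:

tar xzf ./results-fio_stress-${disk_type}-${node_name}.tar.gz -C /tmp
jq '{job: .jobs[0].jobname,
     write_iops: .jobs[0].write.iops,
     write_lat_ms: (.jobs[0].write.lat_ns.mean / 1e6)}' \
    /tmp${base_path}/result_*.json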

References

  • FIO test used on OpenShift to evaluate disks for etcd: https://access.redhat.com/articles/6271341
  • Amazon EC2 User Guide: recommended FIO parameters for benchmarking EBS volumes