Introduction to XtremIO Quality of Service
Enterprises are leveraging Dell EMC XtremIO high-performance storage to consolidate multiple applications and environments in a single cluster. In such consolidated deployments there are tens of applications, each with different performance needs and importance to the organization, running dozens of workloads against the storage array. Customers may want to limit some low-priority applications so that they do not consume too much of the storage array’s resources (the “noisy neighbor” problem), thereby ensuring that high-priority workloads are better serviced by the cluster.
Another strong use case of XtremIO is its integrated Copy Data Management (iCDM) capabilities, which allow customers to create multiple copies of their production data within the same storage array for their test, development and data analytics environments, with close to no extra capacity used for those copies. This consolidates entire application life cycles in the same cluster and saves a lot of power and space. When doing so, customers may want to make sure that their non-production copies are not overloading the array and that there are enough storage resources to service production needs.
For these and similar reasons we present a new feature in XtremIO storage clusters: Quality of Service!
With this feature, starting in version 6.2, customers can restrict a single Volume, an entire Consistency Group or an Initiator Group so that it does not exceed a certain limit in terms of Bandwidth used or IOPS run, allowing other workloads to utilize more of the cluster’s resources.
To use this feature, all that is needed is to configure a new QoS Policy in the new QoS Policies tab under Configuration in the XMS WebUI and assign it to the entity that needs to be restricted.
Three inputs are relevant when setting a QoS Policy:
- The Limit Type
- The Limit itself
- Burst Percentage
The Limit Type is one of two values: Fixed or Adaptive. A Fixed QoS Policy sets a hard limit for the XtremIO entity it is assigned to, regardless of the entity’s size. For instance, 200MB/s Max BW for a Volume. An Adaptive QoS Policy, on the other hand, sets a limit that scales with the XtremIO entity’s size (in GB). For instance, 1MB/s Max BW per 1GB of size. An Adaptive QoS Policy can be assigned to either a Volume or a Consistency Group (it cannot be assigned to an Initiator Group, since an Initiator Group has no storage capacity). When an Adaptive QoS Policy is assigned to a Consistency Group, the size considered is the sum of the sizes of all of its Volumes.
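The difference between the two limit types can be sketched in a few lines of Python. This is only an illustration, not XtremIO code: the function name `effective_max_bw_mb` and its parameters are hypothetical, and the sketch assumes that an Adaptive limit scales linearly with the size in GB, as the 1MB/s per 1GB example suggests.

```python
def effective_max_bw_mb(limit_type, limit_mb_s, volume_sizes_gb):
    """Return the effective Max BW (MB/s) of an entity under a QoS Policy.

    limit_type      -- "fixed" or "adaptive"
    limit_mb_s      -- MB/s for a fixed policy, MB/s per GB for an adaptive one
    volume_sizes_gb -- sizes of the entity's Volumes in GB
                       (a single-item list for one Volume)
    """
    if limit_type == "fixed":
        # A fixed limit ignores the size of the entity.
        return limit_mb_s
    if limit_type == "adaptive":
        # Assumption: the limit scales linearly with total size;
        # for a Consistency Group that is the sum of its Volumes.
        return limit_mb_s * sum(volume_sizes_gb)
    raise ValueError(f"unknown limit type: {limit_type!r}")


# Fixed 200MB/s on a 500GB Volume stays at 200MB/s.
fixed_bw = effective_max_bw_mb("fixed", 200, [500])

# Adaptive 1MB/s per GB on a Consistency Group of 100GB + 200GB + 200GB.
adaptive_bw = effective_max_bw_mb("adaptive", 1, [100, 200, 200])

print(fixed_bw, adaptive_bw)  # 200 500
```

Under these assumptions, growing a Volume raises its Adaptive limit proportionally, while a Fixed limit stays where it was set.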
The Limit itself can be either a specific Max Bandwidth value (to be set in either MB/s or KB/s), or a calculated Max Bandwidth value using a chosen IO Size and a requested Max IOPS limit:
(MAX_BW = MAX_IOPS x IO_SIZE)
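As a quick sanity check of this formula, here is a minimal Python sketch. The function name `max_bw_kb_s` is hypothetical, and the conversion assumes 1MB = 1024KB, which is the convention under which 6,400 IOPS of 16KB blocks comes out to exactly 100MB/s.

```python
def max_bw_kb_s(max_iops, io_size_kb):
    """MAX_BW = MAX_IOPS x IO_SIZE, with the result in KB/s."""
    return max_iops * io_size_kb


# 6,400 IOPS of 16KB blocks
bw_kb = max_bw_kb_s(6400, 16)
bw_mb = bw_kb / 1024  # assumption: 1MB = 1024KB

print(bw_kb, bw_mb)  # 102400 100.0
```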
Let’s see a couple of QoS Policies configured in the WebUI, followed by their corresponding XMCLI commands:
On the left: an Adaptive QoS Policy limiting the assigned XtremIO entity to Max 1MB/s BW per 1GB with 50% Burst;
On the right: a Fixed QoS Policy limiting the assigned XtremIO entity to Max 6,400 IOPS of 16KB blocks (=100MB/s)
And the XMCLI commands to create such Policies:
> add-qos-policy qos-policy-name="QoS_Adaptive_BW" limit-type=adaptive max-bw="1m" burst-percentage=50
> add-qos-policy qos-policy-name="QoS_Fixed_IOPS" limit-type=fixed max-iops=6400 io-size="16k"
For our example we will set a Fixed BW QoS Policy of 200MB/s (as can be seen in the image below) and assign it to one of our Volumes (we will touch on Burst Percentage in the second part of this post).
A Fixed 200MB/s QoS Policy
To view existing QoS Policies from the XMCLI, run show-qos-policies:
XMCLI show-qos-policies output
Let’s put our QoS Policy to use. We are running a deployment with 4 VMs, each on a separate XtremIO Volume, and each VM runs a similar intensive IO workload against our single-brick XtremIO cluster.
A 4 Volume intensive workload
We will identify the VM running on vol4 as a low-priority workload in our environment and assign our new QoS Policy to it, to restrict its resource consumption and free up storage resources for the other Volumes in the system.
We will first assign the Policy to the Volume in Monitor Only mode so we can view its expected effect on the Volume before enforcing it. This is done through the Volumes tab in the Configuration section:
Assigning a QoS Policy to a Volume
Or from the XMCLI:
> modify-volume vol-id="vol4" qos-policy-id="QoS_Fixed_200MB_BW" qos-enabled-mode=monitor_only
(Assigning a QoS Policy to other entities from the XMCLI is done the same way with the respective modify command for the entity – modify-consistency-group and modify-initiator-group.)
We created a custom report in the XMS Reports section to see the expected effect of the QoS Policy on vol4. In the graph we can see the actual BW of the Volume, the QoS Effective Max BW, which is the limit set by the QoS Policy we assigned to the Volume (for now, in “Monitor Only” mode), and the QoS Exceeded BW (= Actual_BW – QoS_Effective_BW), which is the bandwidth the Volume currently consumes beyond the QoS Policy limit.
Assigning the QoS Policy in “Monitor Only” mode
After reviewing the effects of the QoS Policy on the Volume, we decided that we do want to enable it. We select Modify Assigned Policy for the Volume, change the QoS State to Enabled and click Apply. The effects show up in the performance graphs instantly:
Assigning the QoS Policy in “Enabled” mode – vol4 graph
Assigning the QoS Policy in “Enabled” mode – performance dashboard
We can see in our custom report for vol4 that the actual BW of the Volume decreased to align with the QoS Effective Max BW, and that the QoS Exceeded BW reset to zero.
In the general Performance Dashboard of all 4 Volumes we can see that vol4’s BW, which is now limited by a QoS Policy, is reduced to 200MB/s, while the other Volumes’ BW has increased, utilizing the array resources that were freed when we limited vol4. This is precisely the goal we aimed for!
We can also view QoS related properties for the Volume from the XMCLI with the show-volume command, for instance:
> show-volume vol-id="vol4" prop-list=[qos-policy-id,qos-enabled-mode,qos_effective_max_bw,bw]
XMCLI show-volume output
When defining a QoS Policy on XtremIO, we can also allow limited entities to have momentary bursts of IOs above the maximum configured BW allowed by the Policy. This is done using the Burst % input of the QoS Policy.
When an XtremIO Volume is limited by a QoS Policy with a Burst Percentage above 0%, it accumulates IO “credits” whenever it runs a load against the storage cluster that is lower than its QoS limit. Those credits are calculated as follows: if L is the QoS limit and X is the actual BW used, then for every second of the workload the Volume gets L – X credits, and it can accumulate up to B credits (where B is the QoS Effective Burst).
In the example below, the actual BW at the beginning of the load is about 160MB/s, while the QoS limit is 200MB/s and the Effective Burst is 800MB (configured as a Burst Percentage of 400%, since 400% of 200MB/s is 800MB). This means that during every second at the beginning of the load the Volume “earns” 40MB of credits, and after about 20 seconds it reaches the maximum of 800MB of credits.
The next time the Volume runs a load that exceeds its QoS Policy, it is allowed to spend its IO credits and run above its QoS limit until all of the credits are used (in our example, 800MB of aggregated BW), at which point the Volume falls back to the QoS BW limit of 200MB/s.
Burst IOs in use under a QoS Policy
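The credit mechanism described above can be approximated with a small Python simulation. This is a sketch under stated assumptions, not the array’s actual implementation: the function name `simulate_burst` is hypothetical, accounting is done once per second, and a bursting Volume is assumed to spend as many credits per second as its demand requires (the post does not say whether the burst rate itself is capped).

```python
def simulate_burst(limit_mb_s, burst_pct, demand_mb_s):
    """Simulate per-second QoS burst credits.

    limit_mb_s  -- QoS limit L in MB/s
    burst_pct   -- Burst Percentage; Effective Burst B = L * burst_pct / 100 (MB)
    demand_mb_s -- requested BW for each second of the workload
    Returns a list of (served_bw, credits) tuples, one per second.
    """
    max_credits = limit_mb_s * burst_pct / 100
    credits = 0.0
    timeline = []
    for demand in demand_mb_s:
        if demand <= limit_mb_s:
            # Below the limit: earn L - X credits, capped at B.
            served = demand
            credits = min(max_credits, credits + (limit_mb_s - demand))
        else:
            # Above the limit: spend credits (assumed uncapped per second).
            extra = min(demand - limit_mb_s, credits)
            served = limit_mb_s + extra
            credits -= extra
        timeline.append((served, credits))
    return timeline


# 200MB/s limit with 400% burst: 25 seconds at 160MB/s, then 6 seconds at 400MB/s.
timeline = simulate_burst(200, 400, [160] * 25 + [400] * 6)

print(timeline[19])  # (160, 800.0)   credits reach the 800MB cap after 20 seconds
print(timeline[25])  # (400.0, 600.0) first burst second spends 200MB of credits
print(timeline[29])  # (200.0, 0.0)   credits exhausted, back to the 200MB/s limit
```

With these inputs the Volume earns 40MB of credits per second, reaches the 800MB cap after 20 seconds, bursts at 400MB/s for 4 seconds, and then returns to 200MB/s.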
The XtremIO 6.2 QoS feature gives you, our customers and partners, peace of mind when it comes to preventing individual workloads from dominating the array’s performance, or when you want to guarantee a specific bandwidth to each of your internal customers. As demonstrated, the feature is very powerful yet very easy to configure.
You can watch a demo showing everything by clicking the link below.