In the Part1 of this series, I covered the challenges around Storage Chargeback and Showback in Virtualized environments running on traditional storage platforms that use LUN/Volume Abstraction Layers and in Part2, I covered how Tintri simplifies and brings more accuracy to the Capacity Centric Model with its VM centric design.
In this post I will cover how Tintri makes it easy to incorporate a more accurate Performance based model into Chargeback/Showback that can be used in combination with the Capacity Centric Model.
As I stated in my first post – although everyone would like to add a performance centric model to Chargeback/Showback, it is not that popular given the complexities around implementing something like that on the storage side in a virtualized environment (refer my first post in the series).
As we all know, one of the big advantages (and differentiator) that Tintri has is that its abstraction layer is a vDisk (VMDK, VHD etc.) and not a LUN or a Volume. This allows it to see at the right level of abstraction (rather than looking at a LUN or a Volume). The other advantage that it has is that because of its tight API integration with various hypervisors, it sees a vDisk as a vDisk and not as another file. This ability allows it to create an IO Profile of every vDisk in the system. This IO profile gives Tintri an understanding of the type of IO taking place in side a vDisk (Random, Sequential, %Reads, %Writes, Block Size) based on which it assigns ‘Performance Reserves‘ to every vDisk. The Performance Reserves are a combination various resources in the storage platform – like Flash, CPU shares, Network buffers etc. and are shown by the Tintri Management interface in % values.
So if we consider a Database (DB) VM with C: Drive, D: Drive (DB files) and E: Drive (Logs), Tintri would look at those individual drives, learn about the IO profiles (say D: is random IO, E: is sequential and C: not much IO) and then automatically assigns Performance Reserves to these with a goal to deliver 99-10% IO from flash and sub-ms response times. The first benefit (as I have discussed in my other posts) of this approach is that Tintri automatically tunes itself unlike some other storage products that just show some values (most of them at a LUN/Volume level) and require the admin to take action – like add/buy more flash, reshuffle the VMs, reduce CPU/cache load etc. The second benefit, the one that we will discuss here in more detail, is that IT/Service Providers now have a metric against which they can do a Chargeback/Showback.
Performance Reserves are independent of IOPs and are a measure of how much Storage resources are consumed (or will be consumed) by the vDisk in order to get 99-100% IO from flash and sub-ms response times. Traditional Performance centric models that use IOPs as the measure don’t take into consideration the amount of Storage resources consumed by the vDisk to do for a particular type/size of IO.
The problem here is the traditional underlying storage architecture that continues to use LUNs/Volumes as the abstraction layer. These Storage products in some cases don’t have an ability to provide a Quality of Service to even those abstraction layers, let alone looking at individual vDisks and automatically tuning various storage resources for a workload.
Here we look at a simple example of how three different VMs having different type of IO characteristics get their Performance Reserves assigned by Tintri.
The VM ss_testld_1 in the first screenshot is doing 2,751 IOPs (Ave. 8K block size) with 90% Reads and has 5.2% Performance Reserves allocated to it.
The VM ss_testld_2 in the second screenshot is doing 1,112 IOPs (Ave. 81.6K block size) with 90% Reads and has 10.8% Performance Reserves allocated to it.
The VM ss_testld_3 in the third screenshot is doing 1,458 IOPs (Ave. 8K block size) with 90% Writes and has 4.6% Performance Reserves allocated to it.
So what we do we see here?
The VM ss_testld_1 is doing 90% reads just like the VM ss_testld_2 and is doing almost 2.5x IOPs (2,751) than the VM ss_testld_2 (1,112) but still has lower Performance Reserves/Footprint (5.2% Vs 10.8%) because it has a much smaller block size (Ave. 8K Vs Ave. 81.6K)
So, if we look at just IOPs, the cost of running ss_testld_1 seems higher but in reality the cost of running ss_testld_2 is more than double that of ss_testld_1.
In the same way the VM ss_testld_1 is doing almost double the IOPs (2,751) than the VM ss_testld_3 (1,458) with the same block size (Ave. 8K) but they are using almost similar performance reserves 5.2% and 4.6%) because the latter is heavy on write (90% Writes).
Here if we look at just the IOPs, the cost of running ss_testld_1 again seems higher but in reality the cost of running ss_testld_3 is almost the same as ss_testld_1 even with ss_testld_3 doing half the IOPs.
Taking a very simplistic example for Chargeback/Showback, if we consider the cost of 100% Reserves at say $100,000 based on cost to acquire/install/support Tintri
- ss_testld_1 would cost around $5.2K to run
- ss_testld_2 would cost around $10.8K to run
- ss_testld_3 would cost around $4.6K to run
If we had taken a IOPs centric model, the costs would have been completely different with no relation to the amount of storage resources consumed to run a particular type of workload.
The other big challenge with the IOPs centric model is its unpredictability. Let’s say we sized a storage platform for X IOPs (say 8K) with some read:write, rand:seq mix and then priced per IOP accordingly. What if the platform doesn’t deliver those IOPs (say it delivers Y IOPs) because our assumptions were wrong or because of completely unpredictable workloads. The difference (X-Y) now has to be absorbed somewhere. So to cover that, IT/Service Provider would charge more next time and become uncompetitive.
The Performance Reserve metric is independent of the IOPs and can be combined with a Capacity Centric Model (for capacity hungry VMs) to give a more accurate Chargeback/Showback model.
The cool thing about all of Tintri’s metrics is that they are all exposed through our REST APIs and Powershell Integration. Therefore these can plug into any customized model as well, giving a more predictable, simplified and accurate Chargeback/Showback Model.
Thanks for reading…