Storage capacity planning is a critical issue for cloud and hosting service providers, and getting it wrong can hurt business growth and profitability. To serve performance-sensitive workloads and support a broader set of business applications in multitenant environments, service providers are turning to all-flash arrays. But capacity scaling poses significant risks with most all-flash arrays on the market today, because they are largely based on a scale-up model.
Because of their high performance, most all-flash arrays will run out of capacity before they run into performance issues. In a scale-up model, adding capacity means adding new storage systems (controllers plus disks), which comes at a significant cost. It also generally involves data migration, with the associated risk of data loss, and leaves providers with more and more islands of storage to manage.
Scale-up designs, in which performance and capacity are co-dependent, also suffer from the well-known problem that as capacity is added, the system’s controller performance is spread over more data and applications. Thus scale-up designs are prone to noisy-neighbor problems in which a small number of applications can monopolize all controller resources, significantly degrading performance for all other applications on the array.
The Scale-Out Comparison
In comparison, a good scale-out architecture enables service providers to scale capacity up or down simply by adding or removing nodes within the existing system. A well-designed system should require neither data migration nor an increase in storage management time. To eliminate noisy-neighbor problems, a good scale-out design manages capacity independently from performance, using capabilities like Quality of Service (QoS) to guarantee performance to each application. Performance doesn’t degrade as more capacity is added, and service providers do not need to add capacity (that may ultimately go unused) to mitigate performance issues.
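The contrast between the two models can be sketched as simple arithmetic. The snippet below is only an illustration under stated assumptions: the function names and every IOPS and terabyte figure are hypothetical, and it treats performance as scaling linearly with node count, which real systems only approximate.

```python
def scale_up_iops_per_tb(controller_iops, shelf_tb, shelves):
    """Scale-up: one fixed controller serves every shelf of capacity added."""
    return controller_iops / (shelf_tb * shelves)


def scale_out_iops_per_tb(node_iops, node_tb, nodes):
    """Scale-out: each node contributes both performance and capacity."""
    return (node_iops * nodes) / (node_tb * nodes)


# Hypothetical figures, chosen only to show the trend.
for count in (1, 2, 4):
    up = scale_up_iops_per_tb(controller_iops=200_000, shelf_tb=20, shelves=count)
    out = scale_out_iops_per_tb(node_iops=50_000, node_tb=10, nodes=count)
    print(f"units={count}  scale-up={up:.0f} IOPS/TB  scale-out={out:.0f} IOPS/TB")
```

In this toy model the scale-up figure falls each time capacity is added, while the scale-out figure stays flat because every added node brings its own performance.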
Scale-out designs that meet the requirements of service providers must have the ability to set minimum, maximum, and burst levels of performance. Of these three fine-grained controls, the minimum is the most important because it provides a hard floor for each application, guaranteeing that the provisioned IOPS can always be delivered. For service providers who spend time troubleshooting customer performance issues, a scale-out design with QoS is imperative for reducing troubleshooting headaches and improving customer satisfaction.
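To make the three controls concrete, here is a minimal sketch of how a per-application QoS policy might be modeled. It is not any vendor's API: `QosPolicy`, `can_admit`, and the IOPS figures are hypothetical names and values, and the admission rule (the sum of all minimums must fit within what the cluster can deliver) is one simple assumption about how a hard floor could be honored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QosPolicy:
    """Hypothetical per-application QoS settings, all in IOPS."""
    min_iops: int    # hard floor the application can always count on
    max_iops: int    # sustained ceiling
    burst_iops: int  # short-term ceiling above the sustained maximum

    def __post_init__(self):
        if not (0 < self.min_iops <= self.max_iops <= self.burst_iops):
            raise ValueError("expected 0 < min_iops <= max_iops <= burst_iops")


def can_admit(existing, new_policy, cluster_iops):
    """Admit a new application only if every minimum can still be met."""
    committed = sum(policy.min_iops for policy in existing)
    return committed + new_policy.min_iops <= cluster_iops


tenants = [QosPolicy(1000, 5000, 8000), QosPolicy(2000, 4000, 6000)]
print(can_admit(tenants, QosPolicy(500, 1000, 2000), cluster_iops=4000))   # True
print(can_admit(tenants, QosPolicy(1500, 3000, 4000), cluster_iops=4000))  # False
```

Only the minimums are counted against cluster performance here, which reflects why the floor matters most: maximum and burst levels cap what an application may consume, but the minimum is the promise the provider has to keep.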
End-of-life upgrades pose another set of risks for service providers. In a scale-up architecture, moving from one controller-based storage system to a new generation after a three- to five-year service cycle is the bane of storage administrators and customers alike. It can take six months or more of planning, testing, and execution to complete, and often requires expensive professional services contracts. The end result is often customer downtime.
With scale-out architectures that allow the mixing of hardware generations, hardware upgrades become a trivial process. Service providers can simply add new nodes to the cluster and remove the old ones without migration or downtime and without rebalancing, restriping, or volume reallocation. The ability to mix hardware generations, as well as nodes of different capacity, performance, and protocol support, within the same cluster means that new storage nodes offering higher capacity and performance can be added as providers grow. It also enables service providers to swap capacity and nodes between their data centers. If one data center has unused capacity while another needs more, a node can be moved from one location to the other without workload disruption.
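The add-then-remove upgrade path comes down to a capacity check before any old node is retired or moved. The sketch below is a simplified illustration: `Node`, `can_retire`, and all capacities are hypothetical, and it ignores redundancy overhead and performance headroom, which a real cluster would also have to account for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical storage node in a mixed-generation cluster."""
    name: str
    generation: int
    capacity_tb: float


def can_retire(cluster, retiring_names, used_tb):
    """True if the nodes left behind can still hold the data in use."""
    remaining = [node for node in cluster if node.name not in retiring_names]
    return sum(node.capacity_tb for node in remaining) >= used_tb


old_nodes = [Node(f"old{i}", generation=1, capacity_tb=10) for i in range(1, 5)]
new_nodes = [Node(f"new{i}", generation=2, capacity_tb=20) for i in range(1, 3)]

# Retiring two old nodes first would leave 20 TB for 30 TB of data.
print(can_retire(old_nodes, {"old1", "old2"}, used_tb=30))              # False
# Adding the newer nodes first leaves 60 TB after the same retirement.
print(can_retire(old_nodes + new_nodes, {"old1", "old2"}, used_tb=30))  # True
```

The same check applies when a node is moved between data centers: the donor site must still cover its own data once the node is gone.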
Upgrading capacity, performance, or both without the right technology is a headache. If capacity planning is not 100 percent accurate, providers will need to upgrade. Without the right technology, those upgrades require that services be taken offline at planned maintenance intervals, which generates customer complaints. Only with a scale-out, resilient, and easy-to-manage architecture can service providers plan and scale their capacity without the risk of high cost, data migration, or downtime.
MARA MCMAHON is segment director at Boulder, Colo.-based SolidFire, now part of NetApp Inc., overseeing all go-to-market programs and initiatives for service providers. Reach her on Twitter @mcmahonmara.