Cost of Cloud for High-Availability Applications
Shortly after contracting with a cloud service provider, a bill arrives that causes sticker shock. There are unexpected and seemingly excessive charges, and those responsible seem unable to explain how this could have happened. The situation is urgent because the amount threatens to bust the IT budget unless cost-saving changes are made immediately. So how do we manage the cost of cloud for high-availability applications?
This cloud services sticker shock is often caused by mission-critical database applications, which tend to be the most costly for a variety of reasons. These applications need to run 24/7. They require redundancy, which involves replicating the data and provisioning standby server instances. Data replication requires data movement, including across the wide area network (WAN). And providing high availability can result in higher costs to license Windows to get Windows Server Failover Clustering (versus using free open source Linux), or to license the Enterprise Edition of SQL Server to get Always On Availability Groups.
Before offering suggestions for managing the cost of cloud for high-availability applications, it is important to note that the goal here is not to minimize those costs, but rather to optimize the price/performance of each application. In other words, it is appropriate to pay more when provisioning resources for those applications that require higher uptime and throughput performance. It is also important to note that a hybrid cloud infrastructure—with applications running in whole or in part in both the private and public cloud—will likely be the best way to achieve optimal price/performance.
Understanding Cloud Service Provider Business And Pricing Models
The sticker shock experience demonstrates the need to thoroughly understand how cloud services are priced in order to manage the cost of cloud for high-availability applications. Only then can the available services be utilized in the most cost-effective manner.
All cloud service providers (CSPs) publish their pricing. Unless specified in the service agreement, that pricing is constantly changing. All hardware-based resources, including physical and virtual compute, storage, and networking services, inevitably have some direct or indirect cost. These are all based to some extent on the space, power, and cooling these systems consume. For software, open source is generally free. But all commercial operating systems and/or application software will incur a licensing fee. And be forewarned that some software licensing and pricing models can be quite complicated. So be sure to study them carefully.
In addition to these basic charges for hardware and software, there are potential à la carte costs for various value-added services. This includes security, load-balancing, and data protection provisions. There may also be “hidden” costs for I/O to storage or among distributed microservices, or for peak utilization that occurs only rarely during “bursts.”
Because every CSP has its own unique business and pricing model, the discussion here must be generalized. And, in general, the most expensive resources involve compute, software licensing, and data movement. Together they can account for 80% or more of the total costs. Data movement might also incur separate WAN charges that are not included in the bill from the CSP.
Storage and networking within the CSP’s infrastructure are usually the least costly resources. Solid state drives (SSDs) normally cost more than spinning media on a per-terabyte basis. But SSDs also deliver superior performance, so their price/performance may be comparable or even better. And while moving data back to the enterprise can be expensive, moving data from the enterprise to the public cloud can usually be done cost-free (notwithstanding the separate WAN charges).
Formulating Strategies For Optimizing Price/Performance
Managing the cost of cloud for high-availability applications requires careful attention. Here are some suggestions for managing resource utilization in the public cloud in ways that can lower costs while maintaining appropriate service levels for all applications, including those that are mission-critical and require high uptime and throughput.
In general, right-sizing is the foundational principle for managing resource utilization for optimal price/performance. When Willie Sutton was purportedly asked why he robbed banks, he replied, “Because that’s where the money is”. In the cloud, the money is in compute resources, so that should be the highest priority for right-sizing.
For new applications, start with minimal virtual machine configurations for compute resources. Add CPU cores, memory, and/or I/O only as required to achieve satisfactory performance. All virtual machines for existing applications should eventually be right-sized, beginning with those that cost the most. Reduce allocations gradually while monitoring performance constantly until reaching the point of diminishing returns.
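As a rough illustration of where to begin, the sketch below flags compute instances whose sustained utilization suggests they are over-provisioned. It assumes an AWS environment with boto3 and CloudWatch metrics enabled; the 14-day window and 20% threshold are purely illustrative, not recommendations.

```python
# Right-sizing sketch: list EC2 instances whose average CPU utilization over
# the past two weeks suggests they are candidates for a smaller instance type.
# Assumes AWS credentials are configured and boto3 is installed.
from datetime import datetime, timedelta, timezone

import boto3

ec2 = boto3.client("ec2")
cloudwatch = boto3.client("cloudwatch")

end = datetime.now(timezone.utc)
start = end - timedelta(days=14)

for reservation in ec2.describe_instances()["Reservations"]:
    for instance in reservation["Instances"]:
        instance_id = instance["InstanceId"]
        stats = cloudwatch.get_metric_statistics(
            Namespace="AWS/EC2",
            MetricName="CPUUtilization",
            Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
            StartTime=start,
            EndTime=end,
            Period=3600,           # one datapoint per hour
            Statistics=["Average"],
        )
        datapoints = stats["Datapoints"]
        if not datapoints:
            continue
        avg_cpu = sum(d["Average"] for d in datapoints) / len(datapoints)
        if avg_cpu < 20:           # illustrative threshold
            print(f"{instance_id}: avg CPU {avg_cpu:.1f}% -> downsizing candidate")
```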
It is worth noting that a major risk associated with right-sizing is the potential for under-sizing, which can result in unacceptably poor performance. Unfortunately, the best way to assess an application’s actual performance is with a production workload, making the real world the right place to right-size. Fortunately, the cloud mitigates this risk by making it easy to quickly resize configurations on demand. So right-size aggressively where needed, but be prepared to react quickly in response to each change.
Storage, in direct contrast to compute, is generally relatively inexpensive in the cloud. But be careful using cheap storage, because I/O might incur a separate—and costly—charge with some services. If so, make use of potentially more cost-effective performance-enhancing technologies such as tiered storage, caching, and/or in-memory databases, where available, to optimize the utilization of all resources.
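As a simple illustration of the caching idea, the sketch below shows how a small in-memory read-through cache keeps repeated reads of hot data from reaching a billable storage tier. The storage read here is a hypothetical stand-in for any per-request or per-I/O charged operation.

```python
# Read-through cache sketch (the storage read is a hypothetical stand-in for
# a billable operation such as an object GET). Hot keys are served from
# process memory, so only the first read of each key incurs an I/O charge.
from functools import lru_cache

BILLABLE_READS = 0  # counts how many reads actually reach the storage tier

def fetch_from_storage_tier(key: str) -> str:
    # Hypothetical billable read against cloud storage.
    global BILLABLE_READS
    BILLABLE_READS += 1
    return f"payload-for-{key}"

@lru_cache(maxsize=1024)
def read_object(key: str) -> str:
    # First read of each key reaches the storage service; repeats do not.
    return fetch_from_storage_tier(key)

if __name__ == "__main__":
    for _ in range(100):
        read_object("customer:42")   # 100 application reads...
    print(BILLABLE_READS)            # ...but only 1 billable read
```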
Software licenses can be a significant expense in both private and public clouds. For this reason, many organizations are migrating from Windows to Linux, and from SQL Server to less-expensive commercial and/or open source databases. But for those applications for which “premium” operating system and/or application software is warranted, check different CSPs to see if any pricing models might afford some savings for the configurations required.
Finally, all CSPs offer discounts, and combinations of these can sometimes achieve a savings of up to 50%. Examples include pre-paying for services, making service commitments, and/or relocating applications to another region.
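As a back-of-the-envelope illustration with purely hypothetical rates, keep in mind that stacked discounts compound rather than add:

```python
# Hypothetical discount stacking: a 30% commitment discount plus a further
# 20% prepayment discount compounds to roughly 44% off on-demand pricing.
on_demand_monthly = 10_000.00   # hypothetical on-demand spend (USD)
commitment_discount = 0.30      # hypothetical 1- or 3-year commitment discount
prepay_discount = 0.20          # hypothetical additional prepayment discount

effective = on_demand_monthly * (1 - commitment_discount) * (1 - prepay_discount)
savings = 1 - effective / on_demand_monthly
print(f"effective monthly cost: ${effective:,.2f} ({savings:.0%} savings)")
```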
Creating And Enforcing Cost Containment Controls
Self-provisioning for cloud services might be popular with users. But without appropriate controls, this convenience makes it too easy to over-utilize resources, including those that cost the most.
Begin the effort to gain better control by taking full advantage of the monitoring and management tools all CSPs offer. This is likely to involve a learning curve, of course, because the CSP’s tools may be very different from, and potentially more sophisticated than, those being used in the private cloud.
One of the more useful cost containment tools involves the tagging of resources. Tags are key/value metadata associated with individual resources, and some can be quite granular. For example, each virtual machine, along with the CPU, memory, I/O, and other billable resources it uses, might have a tag. Other useful tags might show which applications are in a production versus development environment, or to which cost center or department each is assigned. Collectively, these tags can account for the total resource utilization reflected in the bill.
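The sketch below shows how a standard set of cost-allocation tags might be applied programmatically. It is AWS-flavored and uses boto3; the tag keys and values are only examples.

```python
# Tagging sketch: apply a consistent set of cost-allocation tags to a VM so
# that every line item on the bill can be traced back to an owner.
import boto3

ec2 = boto3.client("ec2")

COST_TAGS = [
    {"Key": "environment", "Value": "production"},   # vs. "development"
    {"Key": "cost-center", "Value": "db-platform"},  # owning department
    {"Key": "application", "Value": "orders-sql"},   # workload name
]

def tag_instance(instance_id: str) -> None:
    # The same tags can also be applied to volumes, snapshots, and load balancers.
    ec2.create_tags(Resources=[instance_id], Tags=COST_TAGS)

# Example call (hypothetical instance ID):
# tag_instance("i-0123456789abcdef0")
```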
Organizations that make extensive use of public cloud services might also be well served to create a script that loads information from all available monitoring, management, and tagging tools into a spreadsheet or similar application for detailed analyses and other uses, such as chargeback, compliance, and trending/budgeting. Ideally, information from all CSPs and the private cloud would be normalized for inclusion in a holistic view to enable optimizing price/performance for all applications running throughout the hybrid cloud.
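A minimal version of such an export script might look like the sketch below. It assumes AWS Cost Explorer and a pre-existing "cost-center" cost-allocation tag; other CSPs offer comparable billing-export APIs.

```python
# Export sketch: pull one month of spend grouped by a cost-allocation tag and
# write it to a CSV that can be opened in a spreadsheet for chargeback,
# compliance, or trending/budgeting.
import csv

import boto3

ce = boto3.client("ce")  # AWS Cost Explorer

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-01-01", "End": "2024-02-01"},  # example month
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "cost-center"}],          # assumes this tag exists
)

with open("cloud_costs_by_cost_center.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["cost_center", "unblended_cost_usd"])
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            tag_value = group["Keys"][0]   # e.g. "cost-center$db-platform"
            amount = group["Metrics"]["UnblendedCost"]["Amount"]
            writer.writerow([tag_value, amount])
```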
Handling The Worst-Case Use Case: High Availability Applications
In addition to the reasons cited in the introduction for why high-availability applications are often the most costly, all three major CSPs—Google, Microsoft, and Amazon—have at least some high availability-related limitations. Examples include failovers normally being triggered only by zone outages and not by many other common failures; master instances only being able to create a single failover replica; and the use of event logs to replicate data, which creates a “replication lag” that can result in temporary outages during a failover.
None of these limitations is insurmountable, of course—with a sufficiently large budget. The challenge is finding a common and cost-effective solution for implementing high availability across public, private, and hybrid clouds. Among the most versatile and affordable of such solutions is the storage area network (SAN)-less failover cluster. These high-availability solutions are implemented entirely in software that is purpose-built to create, as the name implies, a shared-nothing cluster of servers and storage with automatic failover across the local area network and/or WAN to assure high availability at the application level. Most of these solutions provide a combination of real-time block-level data replication, continuous application monitoring, and configurable failover/failback recovery policies.
Some of the more robust SAN-less failover clusters also offer advanced capabilities, such as WAN optimization to maximize performance and minimize bandwidth utilization, support for the less-expensive Standard Edition of SQL Server, manual switchover of primary and secondary server assignments for planned maintenance, and the ability to perform routine backups without disrupting the applications.
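To make the monitoring and failover behavior concrete, the sketch below outlines the kind of threshold-based health-check loop such clusters automate. Everything in it (the probe, the promotion step, and the thresholds) is hypothetical and stands in for product-specific logic; it is not the implementation of any particular product.

```python
# Highly simplified sketch of a threshold-based monitoring/failover loop.
import time

FAILURE_THRESHOLD = 3        # consecutive failed health checks before failover
CHECK_INTERVAL_SECONDS = 5   # how often the primary is probed

def check_primary_health() -> bool:
    # Hypothetical probe, e.g., connect to the primary and run a test query.
    raise NotImplementedError("site-specific health check")

def promote_standby() -> None:
    # Hypothetical failover: bring the replicated volume and database service
    # online on the standby node, then redirect clients.
    raise NotImplementedError("site-specific failover action")

def monitor() -> None:
    failures = 0
    while True:
        try:
            healthy = check_primary_health()
        except Exception:
            healthy = False
        failures = 0 if healthy else failures + 1
        if failures >= FAILURE_THRESHOLD:
            promote_standby()
            break
        time.sleep(CHECK_INTERVAL_SECONDS)
```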
Maintaining The Proper Perspective
While trying out some of these suggestions in your hybrid cloud, endeavor to keep the monthly CSP bill in its proper perspective. With the public cloud, all costs appear on a single invoice. By contrast, the total cost to operate a private cloud is rarely presented in such a complete, consolidated fashion. And if it were, that total cost might also cause sticker shock. A useful exercise, therefore, might be to understand the all-in cost of operating the private cloud—taking nothing for granted—as if it were a standalone business such as that of a cloud service provider. Then those bills from the CSP for your mission-critical applications might not seem so shocking after all.