Fifty Ways to Improve Your High Availability

Date: April 5, 2021

Tags: Application availability, High Availability, high availability - SAP, SQL Server High Availability

I love the start of another year. Well, most of it. I love the optimism, the mystery, the potential, and the hope that seem to usher their way into life as the calendar flips to another year. But there are some downsides to the turn of the calendar. Every New Year brings a flood of “____ ways to do ____” lists. My inbox is always filled with “Twenty ways to lose weight,” “Ten ways to build your portfolio,” “Three tips for managing stress,” and “Nineteen ways to use your new iPhone.” Lists for self-improvement, culture change, stress management, and weight loss abound for nearly every area of life and work, including “Thirteen ways to improve your home office.” But what about high availability? You only have so much time every week, so how do you make your HA solution more efficient and robust than ever? Where is your list? Here it is, fifty ways to make your high availability architecture and solution better:

Get more information from the cluster faster
Set up alerts for key monitoring metrics
Add analytics. Multiply your knowledge
Establish a succinct architecture from an authoritative perspective
Connect more resources. Link up with similar partners and other HA professionals
Hire a consultant who specializes in high availability
100x existing coverage. Expand what you protect
Centralize your log and management platforms
Remove busywork
Remove hacks and workarounds
Create solid repeatable solution architectures
Utilize your platforms: Public, private, hybrid or multi-cloud
Discover your gaps
Search for Single Points of Failure (SPOFs)
Refuse to implement incomplete solutions
Crowdsource ideas and enhancements
Go commercial and purpose-built
Establish a clear strategy for each life cycle phase
Clarify your decision-making process
Document your processes
Document your operational playbook
Document your architecture
Plan staffing rotation
Plan maintenance
Perform regular maintenance (patches, updates, security fixes)
Define and refine onboarding strategies
Clarify responsibility
Improve your lines of communication
Overcommunicate with stakeholders
Implement crisis resolution before a crisis
Upgrade your infrastructure
Upsize your VM: CPU, memory, and IOPS
Add redundancy at the zone or region level
Add data replication and disaster recovery
Go OS and Cloud agnostic
Get training for the team (cloud, OS, HA solution, etc.)
Keep training the team
Explore chaos testing
Imitate the best-in-class architectures
Be creative. Innovation expands what you can protect and automate.
Increase your automation
Tune your systems
Listen more
Implement strict change management
Deploy QA clusters. Test everything before updating/upgrading production
Conduct root cause analysis exercises on any failures
Address RCA and Closed Loop Corrective Action reports
Learn your lesson the first time. Reuse key learnings.
Declutter. Don’t run unnecessary services or applications on production clusters
Be persistent. Keep working at it.
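
Some of these items can start as a few lines of script. As one illustration, the search for Single Points of Failure could begin with a simple inventory check like the minimal sketch below. The component names, node names, and the `INVENTORY` layout are all hypothetical, invented here for illustration; a real environment would build this map from its own cluster or CMDB data.

```python
# Hypothetical inventory: each component maps to the nodes able to run it.
# All names below are made up for this example.
INVENTORY = {
    "database": ["node-a", "node-b"],
    "app-server": ["node-a", "node-b"],
    "load-balancer": ["node-c"],
    "shared-storage": ["san-1"],
}


def find_spofs(inventory):
    """Return components that fewer than two distinct nodes can provide."""
    return sorted(
        name for name, nodes in inventory.items() if len(set(nodes)) < 2
    )


def components_lost_with(inventory, node):
    """Return components that become unavailable if `node` alone fails."""
    return sorted(
        name
        for name, nodes in inventory.items()
        if nodes and set(nodes) <= {node}
    )


if __name__ == "__main__":
    for name in find_spofs(INVENTORY):
        print(f"SPOF: {name} runs only on {INVENTORY[name][0]}")
```

A check like this only covers the dependencies you list, so it complements rather than replaces an architecture review, but it can run on a schedule and flag new gaps as the environment changes.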

So, what are the ideas and ways that you have learned to increase and improve your enterprise availability? Let us know!

-Cassius Rhue, VP, Customer Experience

Reproduced from SIOS