March 7, 2023 |
Video: High Availability for State, local government, and education (SLED)
In this video, Dave Bermingham, SIOS Director of Customer Success, discusses how the company provides high availability solutions to state, local government, and education (SLED) organizations. He explains why high availability matters to SLED organizations, citing the communication and collaboration tools used by emergency services, financial management systems, student information systems, and learning management systems, all of which need to be constantly accessible. He then outlines the key features a high availability solution should have: it should be cost-effective, reliable, and scalable; provide redundancy; maintain high performance; detect failures and perform recovery actions; and integrate with existing systems and infrastructure. Bermingham gives two examples of SIOS’s SANless clustering solution in action. In the first, SIOS provided high availability at both the application and data center level to eliminate downtime during university enrollment. In the second, SIOS worked with an integrator to ensure that a call center CAD system remained highly available and able to dispatch police, fire, or rescue teams during multiple disasters. It is important to consider adding a high availability clustering solution like SIOS that addresses application-level high availability needs, which in turn helps maintain application performance. Reproduced with permission from SIOS |
March 2, 2023 |
8 Changes That Can Undermine Your High Availability Solution
As VP of Customer Experience, I have observed that most organizations are careful about deploying any tool or process that could affect their high availability. These companies typically take great care with HA, including strict change vetting for any HA clustering, DR, security, or backup solution. Most companies understand that changes to these tools need to be carefully considered and tested to avoid affecting overall application availability and system stability. IT administrators are aware that even the most inconspicuous change in their HA clustering, disaster recovery, security, or backup solution can lead to a major disruption. However, changes to other workplace and productivity tools are often not considered with the same diligence. Eight changes that can undermine your HA solution:
Your existing tools often encapsulate a lot of documentation around the company, decisions, integrations, and overall HA architecture. As teams transition to new tools, these documents are often lost, or access becomes blocked or hampered. Suggested Improvement: Export and import all existing documents into the new tool. Use archive storage and backups to retain complete copies of the data before the import.
Similar to the lost documents, requirements are often the first thing to be lost when transferring tools. Suggested Improvement: Document known requirements and export requirements-related documents from any existing productivity tools.
Almost as important as the documentation and requirements is the history behind changes, revisions, and decisions. Many organizations keep historical information within workplace and office productivity tools. Such information could include decisions around tools and solutions that have been previously evaluated. When these workplace tools are changed or transitioned, this type of history can be lost. Existing tools often carry a lot of tacit knowledge as well. As the new tools are integrated, that knowledge and mindshare disappears. Two decades ago our team migrated bug tracking solutions. The knowledge gap between the tools was huge and impacted multiple departments, including the IT team now tasked with managing, backing up, and resolving issues. Suggested Improvement: Be sure to train adequately and to transfer mindshare and knowledge from the old tools to the new ones. Be sure that the history, context, and decisions around the current and previous tools are documented before terminating the current tool.
Every new tool has a different set of security and access rules. Often in the transition teams end up with too many admins, not enough admins, or too many restrictions on permissions. Suggested Improvement: Map access and user controls, based on requirements and security rules, in advance and have a process for quick resolution.
Email and contact system migrations are rarely seamless. Even upgrades between existing versions can have consequences. One downside of migrating from one tool to another (for example, Exchange to Gmail) could be lost contacts. Our team worked with a customer who once called our support team for help obtaining their partner contacts. Their email system transition had stalled, and access to critical contacts was delayed. Suggested Improvement: Plan for contact migration and validation. Be sure that any critical contacts for your HA cluster are part of a validated migration step.
Broken integrations are a very common problem that affects high availability, monitoring, and alerting. As companies move toward newer productivity tools, existing integrations may no longer work and may require additional development. As an example, a company that previously used Skype for messaging moved to Slack. Many of the tools that delivered messages via Skype needed to be adjusted. In your HA environment, a broken integration between dashboards or alert systems could mean critical notifications are not received in a timely manner. Suggested Improvement: Map out any automated workflows to help identify integration points between tools. Also work to identify any new requirements and integration opportunities. Plan and test integrations during the proof of concept or controlled deployment phase.
Every tool set has a champion and a critic. The champion may or may not be the same as your administrator. The role of the champion changes within each organization and often with each tool, but what is common among them is their willingness to address issues, problems, or challenges with the new productivity tool for the benefit of themselves and others. The champion is the first to find the new features, uncover and report new issues, and help onboard new people to the toolset. Champions go beyond mindshare and history. Often with the changing of tool sets, your team will lose a champion.
New tools, even those not directly related to HA, have an impact on your team’s productivity. Even tools related to priority management, development, and code repositories require ramp-up and onboarding time. This time often translates into lost productivity, which can translate into risks to your cluster. Make sure your processes related to all of the existing and new tools are well documented so that the change to a new tool does not cause confusion, break process flow, and lead to even greater losses of productivity. Suggested Improvement: Reduce the risk of lost productivity by using training tools, leveraging a product champion, and making sure that rollout focuses on shortening the learning curve.
Changing workplace productivity tools without undermining your high availability solution requires capturing requirements, identifying key documents, transferring mindshare, mapping dependencies, testing and configuring proper access, and identifying a toolset champion. It also means making sure that your new tools actually improve productivity rather than pull your key resources away from maintaining uptime.
Cassius Rhue, VP Customer Experience
Reproduced with permission from SIOS |
February 28, 2023 |
High Availability Options for SQL Server on Azure VMs
Microsoft Azure infrastructure is designed to provide high availability for your applications and data. Azure offers a variety of infrastructure options for achieving high availability, including Availability Zones, Paired Regions, redundant storage, and high-speed, low-latency network connectivity. All of these services are backed by Service Level Agreements (SLAs) to ensure the availability of your business-critical applications. This blog post will focus on high availability options when running SQL Server in Azure Virtual Machines.

Azure Infrastructure
Before we jump into the high availability options for SQL Server, let’s discuss the vital infrastructure that must be in place. Availability Zones, Regions, and Paired Regions are key concepts in Azure infrastructure that are important to understand when planning for the high availability of your applications and data. Availability Zones are physically separate locations within a region that provide redundant power, cooling, and networking. Each Availability Zone consists of one or more data centers. By placing your resources in different Availability Zones, you can protect your applications and data from outages caused by planned or unplanned maintenance, hardware failures, or natural disasters. When leveraging Availability Zones for your SQL Server deployment, you qualify for the 99.99% availability SLA for Virtual Machines. Regions are geographic locations where Azure services are available. Azure currently has more than 60 regions worldwide, many of which offer multiple Availability Zones. By placing your resources in different regions, you can provide even greater protection against outages caused by natural disasters or other significant events. Paired Regions are pre-defined region pairs that have unique relationships.
Most notably, Paired Regions replicate data to each other when geo-redundant storage is in use. The other benefits of Paired Regions are region recovery sequence, sequential updating, physical isolation, and data residency. When designing your disaster recovery plan, it is advisable to use Paired Regions for your primary and disaster recovery locations. Using Availability Zones and Paired Regions in conjunction with high availability options such as Availability Groups and Failover Cluster Instances, you can create highly available, resilient SQL Server deployments that can withstand a wide range of failures, minimizing downtime.

SQL Server Availability Groups and Failover Cluster Instances
SQL Server Availability Groups (AGs) and SQL Server Failover Cluster Instances (FCIs) are both high availability (HA) and disaster recovery (DR) solutions for SQL Server, but they work in different ways. An AG is a feature of SQL Server Enterprise edition that provides an HA solution by replicating a database across multiple servers (called replicas) to ensure that the database is always available in case of failure. AGs can be used to provide HA for both a single database and multiple databases. SQL Server Standard Edition supports something called a Basic AG. There are some limitations to Basic AGs in SQL Server. Firstly, a Basic AG only supports a single database. You need an AG for each database and the associated IP address and load balancer if you have more than one database. Additionally, Basic AGs do not support read-only replicas. While Basic AGs provide a simple way to implement HA for a single database, they may not be suitable for more complex scenarios. On the other hand, a SQL Server FCI is a Windows Server Failover Cluster (WSFC) that provides an HA solution by creating a cluster of multiple servers (called nodes) that use shared storage. In the event of a failure, the SQL Server instance running on one node can fail over to another.
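The idea of an instance failing over from one node to another can be pictured with a toy model. The sketch below is purely illustrative: the `Node` and `ToyCluster` names, the `healthy` flag, and the "first healthy node" rule are assumptions made for this example, and they do not represent how WSFC or SQL Server actually choose a failover target.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical cluster node; 'healthy' stands in for real health checks."""
    name: str
    healthy: bool = True


@dataclass
class ToyCluster:
    """Toy model of a failover cluster hosting a single instance."""
    nodes: List[Node]
    active: str  # name of the node currently hosting the instance

    def fail_over(self) -> str:
        """Move the instance to the first healthy node other than the active one."""
        for node in self.nodes:
            if node.healthy and node.name != self.active:
                self.active = node.name
                return self.active
        raise RuntimeError("no healthy node available for failover")


# Two-node example: the active node is marked unhealthy, so the instance moves.
cluster = ToyCluster(nodes=[Node("node-a"), Node("node-b")], active="node-a")
cluster.nodes[0].healthy = False
new_active = cluster.fail_over()
```

In a real FCI the surviving node also needs access to the same data, which is why the storage question matters as much as the failover logic itself.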
In SQL Server 2022 Enterprise Edition, the new Contained Availability Groups (CAG) address some of the AG limitations by allowing users to include system databases in the CAG, which can then be replicated. CAG eliminates the need to synchronize things like SQL logins and SQL Agent jobs manually. Availability Groups and Failover Cluster Instances have their own pros and cons. AGs have advanced features like readable secondaries and synchronous and asynchronous replication. However, AGs require the Enterprise Edition of SQL Server, which can be cost-prohibitive, particularly if you don’t need any other Enterprise Edition features. FCIs protect the entire SQL Server instance, including all user-defined databases and system databases. FCIs make management easier since all changes, including those made to SQL Server Agent jobs, user accounts and passwords, and database additions and deletions, are automatically reconciled on all versions of SQL Server, not just SQL 2022 with CAG. FCIs are available with SQL Server Standard Edition, which makes them more cost-effective. However, FCIs require shared storage, which presents challenges when deploying in environments that span Availability Zones, Regions, or hybrid cloud configurations. Read more about how SIOS software enables high availability for SQL Server.

Storage Options for SQL Server Failover Cluster Instances
For SQL Server Failover Cluster Instances that span Availability Zones, there are three storage options: Azure File Share, Azure Shared Disk with Zone Redundant Storage, and SIOS DataKeeper Cluster Edition. There is a fourth option, Storage Spaces Direct (S2D), but it is limited to single-AZ deployments, so clusters based on S2D would not qualify for the 99.99% SLA and would be susceptible to failures that impact an entire AZ.
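To put the 99.99% SLA figure in perspective, it can help to convert an availability percentage into the downtime it allows. The sketch below is a simple arithmetic illustration, not an SLA calculator: the function name is made up for this example, a 365-day year is assumed, and the actual SLA terms define how downtime is measured.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def downtime_budget_minutes(sla_percent: float,
                            period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the downtime (in minutes) permitted by an availability percentage."""
    return period_minutes * (1 - sla_percent / 100)


# 99.99% leaves a budget of roughly 52.6 minutes per year.
budget_four_nines = downtime_budget_minutes(99.99)

# For comparison only: 99.9% would leave roughly ten times as much.
budget_three_nines = downtime_budget_minutes(99.9)
```

Seen this way, the gap between a single-AZ design and one that spans Availability Zones is the difference between a budget of under an hour per year and one of several hours.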
Azure File Share
Azure File Share with zonal redundancy (ZRS) is a feature that allows you to store multiple copies of your data across different availability zones in an Azure region, providing increased durability and availability. This data can then be shared as a CIFS file share, and the cluster connects to it using the SMB 3 protocol.

Azure Shared Disk
Azure Shared Disk with Zone Redundant Storage (ZRS) is a shared disk that can store SQL Server data for use in a cluster. SCSI persistent reservations ensure that only the active cluster node can access the data. If a primary Availability Zone fails, the data in the standby availability zone becomes active. Shared Disk with ZRS is only available in the West US 2, West Europe, North Europe, and France Central regions.

SIOS DataKeeper Cluster Edition
SIOS DataKeeper Cluster Edition is a storage HA solution that supports SQL Server Failover Clusters in Azure. It is available in all regions and is the only FCI storage option that supports cross Availability Zone failover and cross Region failover. It also enables hybrid cloud configurations that span on-prem to cloud configurations. DataKeeper is a software solution that keeps locally attached storage in sync across all the cluster nodes. It integrates with WSFC as a third-party storage class cluster resource called a DataKeeper volume. Failover Cluster controls all the management of the DataKeeper volume, making the experience seamless for the end user. Learn more about SIOS DataKeeper.

Summary
In conclusion, Azure provides various infrastructure options for achieving high availability for SQL Server deployments, such as Availability Zones, Regions, and Paired Regions. By leveraging these options, in conjunction with high availability solutions like Availability Groups and Failover Cluster Instances, you can create a highly available, resilient SQL Server deployment that can withstand a wide range of failures and minimize downtime.
Understanding the infrastructure required and the pros and cons of each option is essential before choosing the best solution for your specific needs. It’s advisable to consult with a SQL and Azure expert to guide you through the process and also review the Azure documentation and best practices. With the proper planning and implementation, you can ensure that your SQL Server deployments on Azure are always available to support your business-critical applications. Contact us for more information about our high availability solutions. Reproduced with permission from SIOS |
February 24, 2023 |
Exploring High Availability Use Cases in Regulated Industries
While downtime in business-critical systems, databases, and applications imposes costs on every organization, different industries have different consequences associated with unplanned downtime. In this post, we explore high availability use cases and SIOS customer success stories in the financial services, healthcare, manufacturing, and education industries.

High Availability for Financial Services
Ranging from small credit unions to regional banks to global investment firms, financial services is a highly regulated, fast-paced industry in which billions of dollars in electronic transactions occur every second. Thus, the average cost of downtime ($300,000 for a single hour of downtime according to the ITIC 2021 Hourly Cost of Downtime Survey) can be significantly higher for a financial services firm than for other industries.

Large Financial Services Firm Adds High Availability/Disaster Recovery for Critical Securities Applications on Oracle Databases
One of the oldest financial services firms in China provides securities and futures brokerage as well as investment banking, asset management, private equity, alternative investments, and financial leasing services. It is listed on both the Shanghai and the Hong Kong Stock Exchanges. The firm has 343 branches spanning 30 provinces, municipalities, and autonomous regions in the People’s Republic of China. It also operates across 14 countries and major international cities, serving approximately 18 million customers.

High Availability for Healthcare
Downtime for applications and storage in the healthcare industry can literally be a matter of life and death.
It’s imperative to assure reliable access to critical systems used in hospitals and surgery centers, as well as electronic health records (EHR) and medical imaging technology such as picture archiving and communication systems (PACS). The healthcare industry has also increasingly been targeted in ransomware attacks, leading to significant downtime.

Lifehouse Hospital Ensures High Availability in Amazon Web Services with SIOS DataKeeper
Chris O’Brien Lifehouse Hospital (www.mylifehouse.org.au) specializes in state-of-the-art research and treatment of rare and complex cancer cases. The not-for-profit hospital sees more than 40,000 patients annually for screening, diagnosis, and treatment.

THE CHALLENGE
Lifehouse chose Amazon Web Services (AWS) and had hoped to “lift and shift” its environment directly to the AWS cloud. To simulate its on-premises configuration, Singer chose a “cloud volumes” service available in the AWS Marketplace. Failover clusters were configured using Amazon FSx software-defined storage volumes to share data between active and standby instances. However, the software-defined cloud volumes had a substantial adverse impact on throughput performance. With the “No Protection” option, the cloud volumes performed well, but “no protection” wasn’t really an option for the mission-critical MEDITECH application and its database.

High Availability for Manufacturing
There’s a great deal of attention on the supply chain today. Although much of the recent focus has been on logistics issues (such as congestion in the Port of Los Angeles) and cyberattacks (such as the Colonial Pipeline ransomware attack), many critical challenges extend further up the supply chain. For more than 50 years, lean manufacturing (also known as just-in-time inventory management) has been a hallmark of efficiency in the manufacturing industry. However, “just-in-time” means exactly that and there’s no room for system or application downtime in manufacturing.
SIOS DataKeeper Cluster Edition Protects Van de Lande Data Systems
Van de Lande BV (VDL) specializes in the manufacture of PVC-U and PE pressure fittings and valves for plastic piping systems, both made from tube and injection molded. Its products are used all over the world in industrial and technical installations. What sets VDL apart is its impressive range of product types and sizes, and its continuous commitment to product improvement and enhancement. As a result, VDL has been the brand of choice for builders of systems and installations for more than 50 years.

THE CHALLENGE
Before implementing the SIOS DataKeeper solution, VDL relied on shared storage (SAN) for its main storage. To improve performance, they decided to move to local storage based on solid-state disk (SSD) instead of traditional spinning disks. However, VDL relies heavily on the availability of its ERP database. With only one primary data processing system, VDL needed a reliable, comprehensive disaster recovery solution to ensure the availability of its systems in the event of a sitewide disaster. To prevent downtime, the company needed its servers to replicate data to a backup server for disaster protection. If one server fails, the other server takes over operation. This failover process sustains operations, maximizes uptime, and enables user productivity.

THE SOLUTION
VDL uses SIOS DataKeeper Cluster Edition software to ensure continuous availability of applications, databases, and web services. SIOS DataKeeper software integrates with WSFC to create a “mirrored” server system between two Windows cluster nodes. If the primary node fails, WSFC transfers all operations to the other node while enabling continuous access to applications and data (which is protected at the volume level). SIOS DataKeeper software enables disaster recovery without the long downtime and recovery time associated with traditional backup and restore technology.
SIOS DataKeeper works with Microsoft WSFC to monitor system and application health, maintain client connectivity, and provide uninterrupted data access, giving VDL the reliable, fault-resilient system the company needed. SIOS DataKeeper Cluster Edition further extends the capabilities of Microsoft Cluster Services and Windows Server Failover Clustering. SIOS DataKeeper Cluster Edition also supports real-time replication of Hyper-V virtual machines between physical servers across either LAN or WAN connections. For companies like VDL, SIOS DataKeeper Cluster Edition software reduces the cost of deploying clusters by enabling them to create a SANless cluster that eliminates the cost, complexity, and single point of failure risk of a SAN in a traditional shared storage cluster. The cluster implementation ran smoothly and took less than a day. Following a thorough evaluation of the VDL server configuration and testing, the installation team found that the SANless cluster with SIOS DataKeeper Cluster Edition software met all of their criteria for disaster recovery, performance, and high availability. During the system failover test, the network services team failed over and failed back the system quickly and easily.

THE RESULTS
One two-node cluster works as a file server and iSCSI server, while the other supports a SQL Server cluster and Dynamics NAV web services. The IT infrastructure consists of three Hyper-V hosts with 60 VMs installed on them, one BackupExec server, 50 desktop users, and 25 mobile barcode scanners, which are connected via web services to the ERP system. Every host contains 240GB SSDs in a RAID 60 configuration with a total of 3TB local storage. The systems are connected through 10 Gigabit interfaces.

High Availability for Education
In the wake of the global pandemic, distance learning has become a key teaching format in postsecondary education, as well as primary and secondary education.
In postsecondary education, distance learning enables global outreach for colleges and universities to attract a diverse student body. Thus, uptime has become increasingly important in education, with students and professors requiring access to various systems including library databases, student records, and high-performance computing (HPC), for example to support medical research, testing applications, and more. Downtime can also be costly as students (potentially from around the world) rush to register online, vying for limited class space.

Major University Gives SIOS LifeKeeper for Linux Top Marks
When a leading university in New York decided to revamp its enterprise resource planning (ERP) system, it hoped to improve performance, especially during peak registration periods, and reduce overall total cost of ownership (TCO). The university serves more than 10,000 students and uses an Oracle database to maintain all the information for student registration on an HP/UX SAN-based storage environment with replication of its full SAN architecture between two fabrics in a cluster.

THE SOLUTION
Employing servers configured with SanDisk Fusion ioMemory-based IO accelerators, the university could replace its bulky and expensive SAN-based setup with a streamlined server set running Linux. The integration of ioMemory-based IO accelerators with both the servers and SIOS LifeKeeper for Linux offers better performance and availability than traditional legacy solutions, making the SIOS solution a perfect fit.

Learn more about SIOS high availability solutions. Reproduced with permission from SIOS |
February 20, 2023 |
New SIOS Documentation Site
Features New Easy-to-Use Site Layout and Improved Navigation
The SIOS Product Management, Product Marketing, and Technical Documentation teams are excited to announce our new documentation site on a new, easier-to-use platform. Check out the new site here: docs.us.sios.com. The new layout of our documentation site has improved the following functionalities:
We’d love to hear your feedback! Within our documentation, please provide feedback by posting a comment on specific topics to help us keep our content as up-to-date and relevant as possible. Please see our improved “Solutions” sections for answers to common questions or concerns by searching “Solutions” within our documentation pages.

Overall Navigation:
Most of the items on the new page are “anchored”, so selecting a button takes you to the selected section. First, select your operating system. This brings you to the products offered for that operating system (Windows or Linux). Next, select your product. Below each product (DataKeeper/LifeKeeper/Evaluation Guides/Step-by-Step Guides) are short descriptions of each solution within that product. To see the description of each product, hover over the Product Information dropdown. Once a product is selected, you will see the most commonly used topics for the solutions within that product. By hovering over a topic for a few moments, you will see a brief description of what the topic is about. After a topic is selected, you will land on the most recent version of that topic. If the most recent version is NOT the version you are running, please use the dropdown menu at the top to reach the version you are looking for. SIOS recommends upgrading to the latest version for the newest features, bug fixes, and overall improvements in our product.

Navigational Tips:
Let’s scroll back up to the top of our new documentation page layout. Below the operating system selection is a link to navigation tips to keep in mind when using the new documentation site. Here you will see a list of general terms describing what each topic is about, as well as information on how to view the general terms by hovering over a topic for a few moments. Below General Terms we have Navigation Tips:
You can always get back to our main documentation page from the navigational tips page by selecting the home or back button. Below the Operating System text, you can follow the link to the versions via “Product Support Schedule”. (Note: After a product is released, it is supported for at least 3 years.) For support assistance, please visit “support.us.sios.com” for information on contacting support. This will lead up to the new documentation page. For our customers in Japan, please click the link here to view our new page in Japanese. I hope this helps you navigate our new documentation site. Thank you for choosing SIOS! Reproduced with permission from SIOS |