June 19, 2022
How does Data Replication between Nodes Work?

In a traditional datacenter, data is commonly stored on a storage area network (SAN). Cloud environments do not typically support shared storage. SIOS DataKeeper presents "shared" storage by using replication technology to maintain a copy of the currently active data. It creates a NetRAID device that behaves like a RAID1 device (data mirrored across devices). Data changes are replicated from the Mirror Source (the disk device on the active node, Node A in the diagram below) to the Mirror Target (the disk device on the standby node, Node B in the diagram below).

To guarantee data consistency across both devices, only the active node has write access to the replicated device (the /datakeeper mount point in the example below). Access to the replicated device is not allowed while it is a Mirror Target (i.e., on the standby node).

Reproduced with permission from SIOS
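To make the "network RAID1" idea concrete, the following is a minimal sketch built on DRBD, an open-source replicated block device, used purely as an analogy for the NetRAID mirror described above. It is not SIOS DataKeeper, and the node names (node-a/node-b), backing disk, IP addresses, and replication port are assumptions:

```bash
# Illustration only: DRBD is an open-source replicated block device used here
# as an analogy for the NetRAID concept; it is NOT SIOS DataKeeper.
# Node names, backing disks, IP addresses, and the port are assumptions.

# 1. Define the replicated resource (same file on both nodes).
cat > /etc/drbd.d/r0.res <<'EOF'
resource r0 {
  device    /dev/drbd0;     # the mirrored ("network RAID1") block device
  disk      /dev/sdb1;      # local backing disk on each node
  meta-disk internal;
  on node-a { address 10.20.1.10:7789; }   # Mirror Source (active node)
  on node-b { address 10.20.2.10:7789; }   # Mirror Target (standby node)
}
EOF

# 2. Initialise metadata and bring the mirror up (run on both nodes).
drbdadm create-md r0
drbdadm up r0

# 3. On the active node only: promote the device, create a filesystem, mount it.
drbdadm primary --force r0
mkfs.ext4 /dev/drbd0
mount /dev/drbd0 /datakeeper   # only the active (Primary) node may mount and write
```

As with the NetRAID device described above, only the node currently holding the active (Mirror Source) role may mount and write to the replicated device; the standby copy stays read-inaccessible until failover.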
June 15, 2022
How a Client Connects to the Active Node

As discussed earlier, once a High Availability cluster has been configured, two or more nodes run simultaneously and users connect to the "active" node. When an issue occurs on the active node, a "failover" condition occurs and the "standby" node becomes the new "active" node. When a failover occurs, there must be a mechanism that either allows a client to detect the failover condition and reconnect, or seamlessly transfers the user's active client session to the new active node.

A Virtual IP Address

Usually a "virtual" IP address is created when a cluster is configured, and the client communicates with the active node using that virtual IP address. When a failover occurs, the virtual IP address is reassigned to the new active node and the client reconnects to the same virtual IP address. As an example, assume there are two nodes, A and B, with IP addresses 10.20.1.10 and 10.20.2.10. In this example, we define a virtual IP address of 10.20.0.10, which is considered to be assigned to the current active node. This is similar to assigning a second IP address to one network interface card on one node. If the command ip a is entered on the active node, both IP addresses will appear in the output.

The ARP Protocol

When a client attempts to find a server using an IP address, the client typically uses ARP (Address Resolution Protocol) to find the MAC (Media Access Control) address of the target machine. When a client broadcasts a message to find the target IP address, the active node answers with its MAC address, and the client resolves the request and connects to it.

ARP Alternatives for the Cloud Environment

In the cloud environment, however, it is not possible to identify the active node using ARP because many layers are abstracted in the virtual environment. An alternative method based on the network infrastructure in use in the specific cloud environment may be required. There are normally several options, and a selection should be made from the following list.

Reproduced with permission from SIOS
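On a traditional (non-cloud) network, the virtual IP mechanism described above can be sketched with standard Linux tooling. The interface name (eth0) and the /16 prefix are assumptions; the addresses follow the example above:

```bash
# Sketch of a manual virtual-IP failover on a traditional network.
# The interface name (eth0) and /16 prefix are assumptions; addresses follow
# the example above (virtual IP 10.20.0.10).

VIP=10.20.0.10/16
IFACE=eth0

# On the old active node (if it is still reachable): release the virtual IP.
ip addr del "$VIP" dev "$IFACE"

# On the new active node: add the virtual IP as a second address on the NIC.
ip addr add "$VIP" dev "$IFACE"

# Send gratuitous (unsolicited) ARP so switches and clients update their ARP
# caches and start sending traffic for 10.20.0.10 to the new node's MAC address.
arping -U -I "$IFACE" -c 3 10.20.0.10
```

Cluster software normally automates these steps. In the cloud, where ARP announcements are not honoured, the equivalent is typically a provider API call, for example moving a secondary private IP or updating a route table to point at the new active node.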
June 11, 2022
Public Cloud Platforms and their Network Structure Differences

There are several public cloud platforms, including Amazon Web Services (AWS), Microsoft Azure, and Google Cloud. While there are many similarities in their infrastructures, there are some differences. In many cases a VPC (Virtual Private Cloud) or a VNET (Virtual Network) tied to a region is created, and one or more VPCs may be defined for a logical group of applications. In this way, different systems are divided into separate, unconnected networks unless the VPCs are specifically connected.

Under a VPC, many different subnets can be defined. Depending on their purpose, some subnets are configured as "public" subnets, which are accessible from the internet, and some as "private" subnets, which are not. Some cloud providers (such as Azure and Google Cloud) allow subnets to span Availability Zones (different datacenters), while others (such as AWS) do not. In the latter case, a subnet must be defined for each Availability Zone.

In this guide, we'll use a different Availability Zone for each node. Once the basic functionality of the SIOS product is understood, it might be appropriate to explore different scenarios (similar to those in use in your own network infrastructure) that involve distributing workloads across different subnets, modifying the IP ranges for these subnets, changing the manner in which the network is connected to the Internet, etc.

Reproduced with permission from SIOS
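As a concrete illustration of the AWS case (one subnet per Availability Zone inside a single VPC), here is a brief AWS CLI sketch. The CIDR ranges, region, and AZ names are assumptions chosen to match the earlier addressing example:

```bash
# AWS CLI sketch: one VPC with one subnet per Availability Zone.
# CIDR ranges, region, and AZ names (us-west-2a/us-west-2b) are assumptions.

# Create the VPC that will hold both cluster nodes.
VPC_ID=$(aws ec2 create-vpc --cidr-block 10.20.0.0/16 \
           --query 'Vpc.VpcId' --output text)

# AWS subnets cannot span Availability Zones, so define one subnet per AZ.
aws ec2 create-subnet --vpc-id "$VPC_ID" --cidr-block 10.20.1.0/24 \
    --availability-zone us-west-2a    # subnet for Node A
aws ec2 create-subnet --vpc-id "$VPC_ID" --cidr-block 10.20.2.0/24 \
    --availability-zone us-west-2b    # subnet for Node B
```

On Azure or Google Cloud, by contrast, a single subnet could serve nodes placed in different zones, since those platforms do not tie a subnet to one zone.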
June 7, 2022
How Workloads Should be Distributed when Migrating to a Cloud Environment

Determining how workloads (nodes) should be distributed is a common topic of discussion when migrating to the public cloud with High Availability in mind. If workloads are located in an on-premises environment, their locations are more often than not dictated by the location(s) of established datacenters, and in many cases choosing another location in which to host a workload is simply not an option. A public cloud offering, by contrast, provides a wide range of geographical regions as well as Availability Zones to choose from. An Availability Zone is generally analogous to one or more datacenters (physical locations) located in the same physical region (e.g., in California). These datacenters may be in different areas but are connected by high-speed networks to minimize connection latency between them. (Note that hosting services across several datacenters within an availability region should be transparent to the user.)

As a general rule, the greater the physical distance between workloads, the more resilient the environment becomes. It is reasonable to assume that natural disasters such as earthquakes will not affect different regions at the same time (e.g., both the U.S. west coast and east coast). However, there is still a chance of service outages occurring across different regions simultaneously due to system-wide failures (some cloud providers have previously reported simultaneous cross-region outages, such as in the US and Australia). It may therefore be appropriate to consider a DR (disaster recovery) plan that spans different cloud providers.

Another perspective worth considering is the cost of protecting the resources. Generally, the greater the distance between workloads, the higher the cost of data transfer. In many cases, data transfer between nodes within the same datacenter (Availability Zone) is free, while it might cost $0.01/GB or more to transfer data across Availability Zones. This cost might double (or more) when data is transferred across regions (i.e., $0.02/GB). In addition, because of the increased physical distance between workloads, greater latency between nodes should be anticipated.

Weighing these factors, it is generally recommended to distribute workloads across Availability Zones within the same Region.

Reproduced with permission from SIOS
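As a back-of-the-envelope illustration of the cost trade-off, the short calculation below applies the per-GB rates quoted above to an assumed replication volume of 500 GB per month (the volume is purely an assumption):

```bash
# Rough monthly data-transfer cost for replication traffic, using the per-GB
# rates quoted above. The 500 GB/month volume is an assumption.
GB=500
awk -v gb="$GB" 'BEGIN {
  printf "Same AZ:       $%.2f (intra-AZ transfer is typically free)\n", 0
  printf "Cross-AZ:      $%.2f (at $0.01/GB)\n", gb * 0.01
  printf "Cross-region:  $%.2f (at $0.02/GB)\n", gb * 0.02
}'
```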
June 3, 2022
Benefits of SIOS Protection Suite/LifeKeeper for Linux
Reproduced with permission from SIOS