Thursday, 27 April 2017

Enterprise Readiness with SAP HANA – Host Auto-failover, System Replication, Storage Replication

Continuing from last year's blog on Backup & Recovery, this post picks up the topic in 2017 with the Building phase for data centers. It focuses on what happens inside the data center, specifically the high availability capabilities and options that SAP HANA offers, which can be deployed according to the requirements of the IT manager's landscape. Particular attention goes to the System Replication feature, which is explained in detail.

High Availability within Data Centers

SAP HANA comes with three different high availability and disaster recovery deployment modes that can be used:
  1. Host auto-failover
  2. System replication
  3. Storage replication

1. Host auto-failover

This method is appropriate for systems within a single data center. Host auto-failover focuses on replacing failed parts of a system, such as hosts or nodes, with a standby server. In this method, the main memory of the standby system is not preloaded with data from SAP HANA. When a failure occurs, this configuration option selects one or more hosts that are currently running as standbys for immediate takeover (see Figure 1). This configuration can also scale out to a larger configuration with multiple hosts.

This feature is managed by the name service in SAP HANA within a scale-out cluster, and multiple standby hosts can be configured. The name service runs a regular check on each cluster member to determine whether the node is still active. When a failure is detected, SAP HANA initiates a fully automated takeover by the standby hardware. With multiple standby servers, multiple takeovers can also be executed.
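The check-and-takeover cycle described above can be pictured with a small toy model. This is only an illustrative sketch, not SAP HANA's actual name service: the `Host` class, the `run_health_check` function, and the 10-second timeout are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    role: str              # "worker", "standby", or "failed"
    last_heartbeat: float  # time of the last successful check, in seconds


def run_health_check(hosts, now, timeout=10.0):
    """Return (failed_host, standby_host) name pairs for every takeover triggered.

    The timeout value is an assumption for this sketch, not a SAP HANA default.
    """
    def alive(h):
        return now - h.last_heartbeat <= timeout

    takeovers = []
    workers = [h for h in hosts if h.role == "worker"]  # snapshot before roles change
    for worker in workers:
        if alive(worker):
            continue
        # Pick the first healthy standby, if any is left.
        standby = next((h for h in hosts if h.role == "standby" and alive(h)), None)
        if standby is None:
            break  # no healthy standby left; remaining failures stay unhandled
        worker.role = "failed"
        standby.role = "worker"
        takeovers.append((worker.name, standby.name))
    return takeovers


hosts = [
    Host("node1", "worker", last_heartbeat=100.0),
    Host("node2", "worker", last_heartbeat=95.0),
    Host("node3", "standby", last_heartbeat=100.0),
]
print(run_health_check(hosts, now=108.0))
```

In this run, node2 has been silent for longer than the timeout, so the standby node3 takes over its role. With more standby hosts in the list, several failed workers could be replaced in the same check.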

Figure 1. Host auto-failover


2. System Replication

For enterprises with multiple data centers, system replication is another recommended approach for ensuring fast takeovers and minimal performance ramp-up after a system failure. With system replication, both data and log content are transferred by the SAP HANA database kernel. SAP HANA itself handles the replication of data and logs, replicating the information immediately after transactions are executed. When a transaction commits, both sites (primary and secondary) must acknowledge the commit before the transaction is finished.
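The rule that both sites must acknowledge a commit can be sketched in a few lines. This is a simplified illustration under stated assumptions, not SAP HANA's implementation: the `Site` class and `commit` function are hypothetical names, and the sketch simply reports a transaction as not committed when the secondary is unreachable, without modelling timeouts or rollback.

```python
class Site:
    """A replication site that persists log records (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.log = []
        self.reachable = True

    def persist(self, record) -> bool:
        """Write the record to the local log and acknowledge it."""
        if not self.reachable:
            return False
        self.log.append(record)
        return True


def commit(record, primary, secondary) -> bool:
    """The transaction counts as committed only if both sites acknowledge."""
    acks = [primary.persist(record), secondary.persist(record)]
    return all(acks)


primary, secondary = Site("primary"), Site("secondary")
print(commit("tx1", primary, secondary))   # both sites acknowledge

secondary.reachable = False
print(commit("tx2", primary, secondary))   # secondary cannot acknowledge
```

The design point is that an acknowledged commit is guaranteed to exist on both sites, which is what allows a takeover without losing committed transactions.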

Figure 2 shows an example of system replication that is optimized for performance. An initial data load is first performed on the secondary system so that both systems hold the same data. From there, two setup options can be used: the continuous log replay setup, which has been available since the SPS11 release of SAP HANA, or the delta data shipping setup.

To maximize throughput efficiency and takeover performance, a “hot standby” continuous log replay setup can be used. After the initial data load on the secondary system, the log is transferred in a steady stream from the primary to the secondary server, where it is replayed (redo) in the secondary SAP HANA system. This feature reduces takeover times as well as network traffic.
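To make the idea of log replay concrete, here is a minimal sketch of a secondary that applies redo records as they arrive. The record format `(operation, key, value)` and the `replay` function are assumptions invented for this illustration; SAP HANA's real log format and replay mechanism are not shown here.

```python
def replay(state: dict, log_records) -> dict:
    """Apply redo log records, in order, to an in-memory state."""
    for op, key, value in log_records:
        if op == "set":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)
    return state


# After the initial data load, the secondary starts from the same state
# as the primary (empty in this toy example).
secondary_state = {}

# Log records arrive in a steady stream and are replayed batch by batch.
replay(secondary_state, [("set", "a", 1), ("set", "b", 2)])
replay(secondary_state, [("delete", "a", None), ("set", "b", 3)])

print(secondary_state)
```

Because the secondary keeps its state current by replaying each batch, it has little left to catch up on at takeover time, which is the intuition behind the shorter takeover times mentioned above.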

Figure 2. System Replication


Difference in Footprint between Delta Data Shipping and Continuous Log Replay

In the delta data shipping setup, the secondary server operates only as a shadow of the primary server. Data and log streams in this case are received only to be stored locally.

In the continuous log replay setup, the operation on the secondary server has a higher footprint. This is because additional resources are needed on the shadow production instance to run a continuous replay of the logs from the primary server. As a result, the non-production instance in this case is smaller than in the delta data shipping setup.

Multiple options for System Replication

To put it all together, the table below shows the four options available with SAP HANA system replication as well as their differences. It also shows the different configurations that can be modified to run nonproduction workloads, and further optimize the use of hardware assets.

Table 1. System replication options

3. Storage Replication

Lastly, storage replication is useful for single or multiple data centers. This feature provisions a whole system on a replacement or alternative set of disks or storage system. Because this replication mode is managed remotely, it does not support preloading of data into the main memory of SAP HANA. Storage replication will be covered further in the next chapter on running the data center.

Balancing between the HA/DR Options

The three options above should be weighed against requirements, balancing cost against performance. The two key metrics that help guide the decision are the recovery point objective (RPO) and the recovery time objective (RTO). As a rough guide, the table below compares the options according to these priorities.

Table 2. Balancing High Availability and Disaster Recovery Options
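The two metrics can be made concrete with a short calculation. The recovery point objective limits how much recent data may be lost, and the recovery time objective limits how long the service may be down. The sketch below measures both for a single incident and compares them with targets; the function names, timestamps, and target values are hypothetical examples, not figures from SAP.

```python
from datetime import datetime, timedelta


def achieved_recovery(last_replicated_commit, failure, service_restored):
    """Return (data_loss_window, downtime) for one incident."""
    data_loss_window = failure - last_replicated_commit  # compare with the RPO
    downtime = service_restored - failure                # compare with the RTO
    return data_loss_window, downtime


def meets_objectives(data_loss_window, downtime, rpo, rto):
    """True if the incident stayed within both objectives."""
    return data_loss_window <= rpo and downtime <= rto


# Hypothetical incident timeline.
loss, down = achieved_recovery(
    last_replicated_commit=datetime(2017, 4, 27, 10, 0, 0),
    failure=datetime(2017, 4, 27, 10, 0, 30),
    service_restored=datetime(2017, 4, 27, 10, 5, 30),
)
print(loss, down)

# Hypothetical targets: at most 1 minute of data loss, 10 minutes of downtime.
print(meets_objectives(loss, down, rpo=timedelta(minutes=1), rto=timedelta(minutes=10)))
```

Tighter targets for either metric generally push toward the more resource-intensive options, which is the cost and performance trade-off discussed above.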


Moving Forward to System Replication Active/Active

SAP also offers a high availability solution in which transactions run on the primary server while analytics run on the standby server. This feature helps maximize hardware assets and processing performance, and it is one of SAP's latest releases from 2016. For more about this feature, see the other blog, “Active/Active System Replication.”