Learning vSphere 6.5, Part 10: VCHA Failover Testing


The last two posts of this series covered the high availability feature for vCenter that was introduced in vSphere 6.5. We discussed the VCHA architecture and learned how to configure VCHA.

In this post we will test the HA feature and see what happens when the Active node of the VCHA cluster goes down.

If you missed the earlier posts of this series, you can read them from the links below:

1: Installing and Configuring ESXi

2: VCSA Overview

3: vCenter Server and PSC Deployment Types

4: System Requirements for Installing vCenter Server

5: Installing vCenter Server on Windows

6: Deploying vCSA with embedded PSC

7: Deploying External PSC for vCSA

8: Understanding vCenter Server High Availability (VCHA)

9: Configuring vCenter Server High Availability (VCHA)

Let's jump into the lab and test this awesome feature.

We will test failover using two methods:

  • Automated failover (let the system do the magic)
  • Manual failover (the user intentionally brings down the active node of the VCHA cluster)

Automated Failover Testing

1: To test the failover, log in to the vCenter Web Client and navigate to Configuration > vCenter HA. Before performing a failover, look at the Active/Passive node info and note which IP is active at the moment.

To start the failover, click the Initiate Failover button at the top right corner.

haf-1

2: The system asks you to confirm that you want to initiate the failover, and offers a checkbox to start the failover immediately. If you check this box, the most recent DB changes on the Active node will not be replicated to the Passive node.

I think it is best to let the database commit the running transactions to the Passive node, so I left the immediate failover option unchecked.

haf-2
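To make the trade-off behind that checkbox concrete, here is a toy Python model. It is not VMware code: the `Node` class, the `fail_over` function and the transaction names are all made up for illustration, and it only assumes that the Active node may hold some changes that have not yet reached the Passive node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a VCHA node (hypothetical, not a VMware object)."""
    name: str
    committed: list = field(default_factory=list)  # changes this node already holds
    pending: list = field(default_factory=list)    # changes not yet sent to the peer


def fail_over(active: Node, passive: Node, immediate: bool) -> list:
    """Hand the active role to `passive` and return the changes that were lost.

    immediate=False models leaving the box unchecked: pending changes are
    replicated to the passive node before the roles switch.
    immediate=True models checking the box: pending changes are skipped.
    """
    lost = []
    if immediate:
        lost = list(active.pending)
    else:
        passive.committed.extend(active.pending)
    active.pending.clear()
    return lost


if __name__ == "__main__":
    active = Node("node-a", committed=["tx1"], pending=["tx2", "tx3"])
    passive = Node("node-b", committed=["tx1"])
    lost = fail_over(active, passive, immediate=False)
    print("lost:", lost)
    print("passive now holds:", passive.committed)
```

With `immediate=True` the same call would return the two pending transactions as lost, which is the situation the unchecked box is meant to avoid.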

 

3: Once the failover is initiated, if you try to access the IP/FQDN of the active node you will see the screen below.

haf-3
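If you would rather script the wait than keep refreshing the browser, a small polling loop is enough. The sketch below is generic Python, not part of any VMware tooling: `wait_until_ready` is a hypothetical helper, and `probe` is whatever check you trust to mean "vCenter accepts logins again" (for example a function that attempts a login and returns True on success). The timeout and interval defaults are arbitrary.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout: float = 600.0,
    interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call `probe` every `interval` seconds until it returns True.

    Returns True as soon as the probe succeeds, or False once `timeout`
    seconds have passed without success. `sleep` and `clock` are injectable
    so the loop can be exercised without really waiting.
    """
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Note that a plain check on port 443 is probably not a good probe here, because the "Failover in progress" page is itself served over that port.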

4: Once the failover is completed, you will notice that the previous passive node has now become active. Compare this screenshot with the one shown in step 1 and you will notice the difference in the IP of the active node.

haf-4
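The before/after comparison from this step can also be written down as a tiny check. This is only an illustration: `roles_swapped` is a made-up helper, and the IP addresses are placeholders from the documentation range, not the ones in the screenshots.

```python
def roles_swapped(before: dict, after: dict) -> bool:
    """True when the active and passive IPs have traded places.

    Both arguments map "active" and "passive" to the IP noted in the
    vCenter HA view before and after the failover.
    """
    return (
        before["active"] != before["passive"]
        and after["active"] == before["passive"]
        and after["passive"] == before["active"]
    )


if __name__ == "__main__":
    before = {"active": "192.0.2.11", "passive": "192.0.2.12"}  # placeholder IPs
    after = {"active": "192.0.2.12", "passive": "192.0.2.11"}
    print(roles_swapped(before, after))
```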

5: If you click on the HA monitoring tab, the system reports that all nodes are up and running, that the overall health status of the VCHA cluster is good, and that application state, DB replication, etc. are all in place and working fine.

So what happens in an automated failover test is that the system forces the Active node to fail so that the Passive node becomes Active. The former Active node becomes Passive once it has recovered from the failed state (the recovery is done by the system itself).

haf-5
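The same failover can most likely be triggered through the vSphere API instead of the button. The sketch below assumes a pyVmomi-style session object and assumes that `content.failoverClusterManager` exposes `initiateFailover_Task(planned)`, as the vSphere 6.5 API reference describes it. Treat those names as something to verify against your SDK version. `trigger_failover` itself is a hypothetical wrapper, not part of any library.

```python
def trigger_failover(service_instance, planned: bool = True):
    """Ask vCenter HA to fail over and return the task object the API hands back.

    planned=True is assumed to match leaving the "immediate" box unchecked
    (the Active node flushes its state first); planned=False is assumed to
    match checking it. Verify both against the API reference before relying
    on this.
    """
    # Assumed property name from the vSphere 6.5 API; may be None when
    # connected to an endpoint without vCenter HA support.
    manager = service_instance.content.failoverClusterManager
    if manager is None:
        raise RuntimeError("this endpoint does not expose a vCenter HA failover manager")
    return manager.initiateFailover_Task(planned=planned)
```

In a real session `service_instance` would come from `pyVim.connect.SmartConnect(...)`, and the returned task would need to be watched until it completes.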

Manual Failover Testing

A: To perform a manual failover test, let's power off the Active node intentionally.

haf-6

B: A few seconds after powering off the Active node, if you try to access the vCSA IP you will see a “Failover in Progress” message.

haf-3

C: Once the failover is completed and the vSphere Web Client allows you to log in again, you will observe that the health status of the VCHA cluster has deteriorated. You now have a new active node, and the previous active node has become passive and is currently down (because we have not powered it back on yet).

haf-7

D: Also, if you go to the Monitoring tab for VCHA, you will see the system reporting that the VCHA cluster has lost one of its nodes and that DB replication between the Active node and the Passive node is not happening. You will also notice that the application state is out of sync.

haf-8
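What the Monitoring tab shows after the manual test can be summarised with a small model. This is a simplification written for this post, not how vCenter computes health: `cluster_status` is a made-up function, the "healthy"/"degraded" labels are illustrative, and the only assumptions are that a VCHA cluster has an active, a passive and a witness node, and that replication needs both the active and the passive node to be up.

```python
def cluster_status(node_up: dict) -> dict:
    """Summarise a three-node VCHA cluster from per-node up/down flags.

    node_up maps "active", "passive" and "witness" to True (up) or
    False (down). Missing roles are treated as down.
    """
    roles = ("active", "passive", "witness")
    up = {role: bool(node_up.get(role, False)) for role in roles}
    return {
        "health": "healthy" if all(up.values()) else "degraded",
        "replicating": up["active"] and up["passive"],
        "down_nodes": [role for role in roles if not up[role]],
    }


if __name__ == "__main__":
    # State after the manual test: the old active node is now passive and powered off.
    print(cluster_status({"active": True, "passive": False, "witness": True}))
```

Once the powered-off node is started again, all three flags are True and the model reports a healthy, replicating cluster, which matches what we saw at the end of the automated test.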

At this point I hope you have understood the difference between the automated failover test and the manual failover test, and what happens during a failover.

I hope you found this post informative. Feel free to share it on social media if it is worth sharing. Be sociable 🙂

About Alex Hunt

Hi All I am Manish Kumar Jha aka Alex Hunt. I am currently working in VMware Software India Pvt Ltd as Operations System Engineer (vCloud Air Operations). I have around 5 Years of IT experience and have exposure on VMware vSphere, vCloud Director, RHEL and modern data center technologies like Cisco UCS and Cisco Nexus 1000v and NSX. If you find any post informational to you please press like and share it across social media and leave your comments if you want to discuss further on any post. Disclaimer: All the information on this website is published in good faith and for general information purpose only. I don’t make any warranties about the completeness, reliability and accuracy of this information. Any action you take upon the information you find on this blog is strictly at your own risk. The Views and opinions published on this blog are my own and not the opinions of my employer or any of the vendors of the product discussed.
