When you replace a damaged or non-functional Zyxel USG FLEX H-Series firewall in a High Availability (HA) setup, following best practices minimizes downtime and restores HA functionality reliably. Replacing a failed device essentially means reconfiguring the HA setup from scratch, so this guide outlines the key steps, best practices, and important nuances that streamline the process and help you avoid common pitfalls.
This step-by-step guide assumes the following scenario:
- You have two USG FLEX H-Series firewalls (Primary and Secondary).
- The faulty "Primary" device will be replaced with a new, working unit of the same model.
- The secondary/standby unit is still functioning and currently active.
HA Topology:
- USG FLEX 500H - Site5(FLEX500HP_1) - Primary (The main site where all devices on our site are physically registered and the main firewall)
- USG FLEX 500H - Site5(FLEX500HP_2) - Secondary (A site with only a backup firewall device physically registered)
This setup can be somewhat confusing because the two firewalls are registered to different sites. The firewall registered to the secondary (backup/passive) site always appears offline, while the firewall registered to the primary (main) site consistently shows as online, regardless of which firewall is actively handling traffic at any given time.
Let's assume the Primary firewall registered to the main site has failed, and the backup (Secondary) firewall is currently handling all traffic. One option is to replace the failed firewall registered to the main site directly. Alternatively, we can move the surviving passive device to the main site, making it the main device, and only then add the new device as the passive device.
Step 1: Remove the Defective HA Peer
- Physically unplug the damaged device and remove it from the system.
- On the active (functioning) device, navigate to: Left Menu > System > Device HA
- If HA is currently enabled but cannot sync with the broken peer, click Disable HA.
- Save and apply the changes to complete this step.
Back Up the Configuration on the Active Device (not mandatory, but best practice to avoid losing your settings; ideally, you should always keep a copy of the configuration)
- Log in to the Active (working) device.
- Go to Maintenance > File Manager > Configuration File.
- Download the lastgood and startup configuration files of the Active device (in case anything goes wrong).
Step 2 - Prepare the Active Device for HA
In this example, we’ll proceed by assigning the currently active device to Site 1 as the Primary device. The replacement (new) device will be assigned to Site 2 as the Backup (Secondary) device.
Since we disabled HA on the active device in the previous steps, the device now appears as online on Site 2. We'll now reassign this device to Site 1 and designate it as the Primary device.
- First, remove the damaged firewall from the main site.
- Then assign your active device to the main site, thereby making it the main one.
You can now verify that the previously Secondary device has been successfully added to the main site and has naturally assumed the role of the Primary device.
Step 3 - Preparing for HA of the passive device
- Unbox and power up the new/replacement device, but do not connect it to the network yet.
- Ensure the firmware version matches the Active device (check under Maintenance > Firmware).
- If not, upgrade the firmware.
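The firmware comparison above can also be scripted if you record the version strings from both devices. The following is a minimal sketch, not a Zyxel tool: it assumes the version strings look like `V1.31(ABZH.0)` (the sample values and the exact format are assumptions; adjust the pattern to what Maintenance > Firmware shows on your units), and the helper names `parse_firmware` and `firmware_matches` are hypothetical.

```python
import re
from typing import NamedTuple


class Firmware(NamedTuple):
    """Parsed firmware version; field names are illustrative."""
    major: int
    minor: int
    model_code: str
    patch: int


# Assumed format: optional "V", major.minor, then "(MODELCODE.patch)".
_PATTERN = re.compile(r"^V?(\d+)\.(\d+)\((\w+)\.(\d+)\)$", re.IGNORECASE)


def parse_firmware(text: str) -> Firmware:
    """Parse a version string copied from the firmware page."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised firmware string: {text!r}")
    return Firmware(
        int(match[1]), int(match[2]), match[3].upper(), int(match[4])
    )


def firmware_matches(active: str, replacement: str) -> bool:
    """Return True only if both devices report the same firmware."""
    return parse_firmware(active) == parse_firmware(replacement)
```

Comparing parsed fields rather than raw text avoids false mismatches caused by stray whitespace or letter case when the strings are copied by hand.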
Factory Reset the New Device (if previously used)
- Hold the RESET button for ~10 seconds until the SYS LED blinks amber.
- Let the device reboot completely.
Connect and Configure the New Replacement Device
- Connect your PC to the LAN port of the new firewall.
- You will also need Internet access on the new device to register it and add it to the Nebula site.
- Log in using the default IP: 192.168.168.1 (admin / 1234).
- Initialise the device, and during the initialisation process, register the device in the same organisation.
- Add your new device to the second site; the new device will be our backup/secondary device
Step 4 - Pair the Devices Again for HA
- On the Primary Device, go to System > Device HA > HA Configuration
- Click Enable HA and set:
- On the Secondary Device, go to System > Device HA > HA Configuration
- Click Enable HA (on a passive device, there is no need to configure any HA settings; just activate this function).
Also, note that the heartbeat cable should be disconnected for now.
Step 5 - Synchronisation and Testing
- Once HA is enabled on both sides:
- The primary should push its config to the new device.
- Wait for sync to complete. This can take a few minutes.
- Check HA status:
System > Device HA > HA Log
Step 6 - Failover Testing Procedure:
Simulate Primary Failure
Temporarily shut down or disconnect the primary system from the WAN to simulate a failure scenario.
Verify Secondary Takeover
Confirm that the newly designated secondary system successfully assumes the primary role without service interruption.
Restore Primary and Validate HA Status
Reconnect the primary, then reboot the currently active device so that the original primary device becomes active again. Verify that the high availability (HA) status normalises and the roles return to their original assignment.
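To put a number on "without service interruption" during the failover test, you can probe a service behind the firewall pair from a client while you disconnect the primary. The sketch below is one possible approach, not part of the Zyxel procedure: it opens a TCP connection once per interval and reports the windows during which the target was unreachable. The function names, the target address, and the port are all assumptions you would replace with your own.

```python
import socket
import time
from typing import Iterable, List, Tuple


def probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outages(
    samples: Iterable[Tuple[float, bool]]
) -> List[Tuple[float, float]]:
    """Convert (timestamp, reachable) samples into (start, end) windows.

    A window opens at the first failed sample and closes at the next
    successful one. A window still open when the samples end is closed
    at the last timestamp seen.
    """
    windows: List[Tuple[float, float]] = []
    start = None
    last_ts = None
    for ts, ok in samples:
        last_ts = ts
        if not ok and start is None:
            start = ts
        elif ok and start is not None:
            windows.append((start, ts))
            start = None
    if start is not None and last_ts is not None:
        windows.append((start, last_ts))
    return windows


def monitor(
    host: str, port: int, duration: float = 120.0, interval: float = 1.0
) -> List[Tuple[float, bool]]:
    """Probe the target repeatedly and collect timestamped results."""
    samples: List[Tuple[float, bool]] = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        samples.append((time.monotonic(), probe(host, port)))
        time.sleep(interval)
    return samples
```

For example, start `monitor("192.0.2.10", 443)` (a placeholder address) just before disconnecting the primary, then pass the result to `outages` and subtract each window's start from its end to estimate how long the takeover took. The resolution is limited by the probe interval and timeout, so treat the result as approximate.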
If the synchronisation status appears incorrect, despite logs and settings indicating proper operation, you can resolve the issue via the Web Console by executing the following command:
usgflex500h> cmd device-ha force-sync full
OK
usgflex500h>