Faster failover with cold spare?

I have a pair of redundant gateways on 8.3.2. The backup is set up as a ‘cold’ spare. The customer would like to do updates/reboots on the primary/backup during production hours without losing access for very long. The problem is that when we fail over to the backup gateway, the primary’s reboot is so fast that it takes over again before the backup gateway has spun up, so either way we end up waiting several minutes for a gateway restart.

I’d like to be able to fail over manually and quickly with the spare in ‘cold’ mode: ‘warm up’ the spare so it can take over, then shut down the primary gateway and actually fail over, so there’s very little downtime, as though the spare were warm.

During regular production we need to keep the spare in ‘cold’ mode to reduce tag polling, but once a month it’d be nice to be able to fail over quickly.

Clearly, I could change the gateway configuration to a ‘warm’ spare, do the failovers, then change it back. But I feel like that’s prone to human error, and it leaves active tag polling running on both gateways until it’s manually turned off. That’s another thing it would be nice to turn on and off automatically when a manual failover is engaged.

What’s my best course of action? Maybe I’m missing something?

You can configure redundancy for manual recovery instead of automatic recovery, so the primary/master server doesn't take control back on its own when it comes back up. Then, before doing your reboot, force a failover from the master to the backup. Reboot the master server and, once it's up and settled, have it assume control again. If you want automatic recovery, turn it back on afterward.
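
For what it's worth, here is that order of operations as a rough Python sketch. It's a toy model only: `RedundantPair`, `set_recovery`, `force_failover`, and `reboot_master` are made-up names for illustration, not Ignition scripting functions, and the real steps are whatever your gateway offers for each action.

```python
from dataclasses import dataclass, field


@dataclass
class RedundantPair:
    """Toy model of a master/backup pair (hypothetical, not an Ignition API)."""
    active: str = "master"             # node that currently has control
    recovery_mode: str = "automatic"   # "automatic" or "manual"
    log: list = field(default_factory=list)

    def set_recovery(self, mode):
        self.recovery_mode = mode
        self.log.append(f"recovery={mode}")

    def force_failover(self, to):
        self.active = to
        self.log.append(f"active={to}")

    def reboot_master(self):
        # Enforce the order: control must already be on the backup.
        if self.active == "master":
            raise RuntimeError("fail over to the backup before rebooting the master")
        self.log.append("master down")
        self.log.append("master up")
        # With automatic recovery, the master would take control back on return.
        if self.recovery_mode == "automatic":
            self.force_failover("master")


def planned_maintenance(pair, restore_automatic=True):
    """Run the maintenance sequence and return the ordered log of steps."""
    pair.set_recovery("manual")        # stop the master reclaiming control by itself
    pair.force_failover("backup")      # graceful, intentional failover
    pair.reboot_master()               # update/reboot while the backup is in control
    pair.force_failover("master")      # hand control back once the master has settled
    if restore_automatic:
        pair.set_recovery("automatic")
    return pair.log


if __name__ == "__main__":
    for step in planned_maintenance(RedundantPair()):
        print(step)
```

The only point of the model is the ordering: recovery goes to manual first, so the rebooted master waits to be told to take over instead of racing the backup.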

If they're doing this so frequently, I'd almost say just leave recovery in the manual setting.

This is only for maintenance purposes: failovers with intent. Nothing unintentional is going on. So you’re saying that if you manually fail over with a cold spare, you don’t get long wait times to swap?

In my experience, it fails over much faster on a manual failover because it's able to do so gracefully, rather than reacting to an abrupt loss of communications.
