Has anyone done any implementations using load balancers? Any particular software that you would recommend? We were looking into Nginx, but would prefer a Windows-based application rather than Linux. Would like to hear about some of your experiences.
The gold standard for load balancers, world wide, is HAProxy.
Mwah, hah, ha, ha!
The aphorism that best describes this is "cutting off your nose to spite your face."
This looks to be a huge can of worms to simply direct web clients to the active gateway in a redundant pair. The "Load Balancer" formerly shown on Inductive Automation's redundant architecture diagrams is no longer shown. Perhaps the redundant pairs handle this now?
No, in a redundant pair, only one of the servers is "live", but they need no load balancer. The clients (Workstation, Vision Client Launcher) figure it out by preconfiguration. A running client session (of any kind) started on an active server should transition automatically to the other.
{ Jonah's OP doesn't mention redundancy, and I don't think that was meant. }
Thanks. 100% Perspective web clients. Looking to increase availability by providing "redundant" web server gateways. I'm still wrapping my head around how backup gateways become primary when only a portion of the gateway's functionality fails, and how clients get redirected to the new primary web server. My experience is from the DCS world, where redundant pairs look after themselves: if any service fails, the backup is stood up in place of the failed primary, or a redirection manager is provided.
Ignition does native redundancy only with dedicated pairs, with one "live" and the other up but "idle". This is necessary for reasonable switchover times when plant floor device connections are involved. There is no "instant" switchover with this, though some drivers can be "warm" in the idle server.
This kind of redundancy is fundamentally different from typical datacenter redundancy concepts, where servers can be spun up and spun down as needed, with an orchestration system (like HAProxy) pointing web clients at live servers. Ignition front-end gateways in a front-end/back-end split architecture can be treated this way, as long as they are all configured identically, with identical connections to (a/the) backend(s).
Datacenter style high availability is simply not practical with backend (I/O) Ignition gateways.
To double-down on what Phil said, using an Ignition-style redundant setup behind a load balancer will only make your job more difficult.
I definitely shot myself in the foot with my combined front-end/back-end redundant setup behind an AWS load balancer. My original architecture used the AWS LB purely for TLS termination and port forwarding, but having that LB in front made adding redundancy difficult. Trying to make it so customers could keep using a single DNS name and not need to know a redundant server even existed made it a total bear.
A quirky bear.
But I'm not going to hijack this thread with details unless anyone else asks.
The architecture I will be implementing will have a load balancer with 2 front-end gateways. It will also have 4 tag gateways (2 master, 2 backup).
Feel free to hijack; I'm sure it would be useful for anyone looking to learn more about this style of implementation. I have looked into setting up a load balancer for a primary/redundant gateway setup before, but never implemented it. Would be interested in hearing about some of the quirks.
The quirks are mostly based on two facts:
- Clients (Perspective and Designer) may try to talk to both the master and backup server at the same time under normal operation. If only one is actually accessible, things will generally work, but expect random delays as the client waits for the connection to the inaccessible server to time out. Even weirder, if the "Public Address" setting on both servers advertises the same IP/DNS location and you use that IP/DNS to connect the Designer, expect the Designer to barf as it kills its own session token by opening two connections to the same IP/DNS while thinking it's talking to two different servers. It took a few iterations of web server, redundancy, and network firewall settings to minimize these issues. (I suspect there may be room to improve this more, but I'm in an okay state at the moment.)
- Both active and inactive redundant servers will respond to HTTP requests properly, making normal LB health checks unusable. I ended up creating a tiny service that runs alongside Ignition on port 8089 and responds to HTTP requests with either a 200 or 403 result code depending on the contents fetched from http://localhost:8088/system/gwinfo (rough sketch below the list).
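Roughly like this, as a minimal sketch. The exact contents of the gwinfo response vary by Ignition version, so treat the "Active" marker and the ports as placeholders you'd adjust for your own gateways:

```python
# Minimal sketch of the sidecar health check.
# Assumption: the gwinfo payload contains a marker such as "Active" only on
# the live node of the redundant pair; adjust ACTIVE_MARKER and ports to suit.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

GWINFO_URL = "http://localhost:8088/system/gwinfo"
ACTIVE_MARKER = "Active"   # placeholder: substring expected only on the active node
LISTEN_PORT = 8089

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            body = urlopen(GWINFO_URL, timeout=2).read().decode("utf-8", "replace")
            status = 200 if ACTIVE_MARKER in body else 403
        except OSError:
            status = 403   # local gateway unreachable: report unhealthy to the LB
        self.send_response(status)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass   # keep LB health-check noise out of stderr

if __name__ == "__main__":
    HTTPServer(("", LISTEN_PORT), HealthHandler).serve_forever()
```

The load balancer then points its health check at port 8089 and only marks a node healthy when the sidecar returns 200, i.e. only when that node is actually the active member of the pair.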
My general advice is that if you can give the master and backup unique DNS names and separate network paths, do it. Trying to force both through a common DNS name or IP address or making only one accessible at a time is not how IA expected a redundant pair to be used.
Thanks Justin!
I have to ask, what are Inductive Automation's recommendations for Perspective client access to redundant gateways? Last I remember, Perspective Workstation and Vision keep track of both redundant gateways, but Perspective clients do not.
Hmm, is that true?
(genuinely asking aloud)
As far as I understand:
Since it's just a web browser connection, the initial launch against http://PrimaryGateway:8088 will obviously fail if PrimaryGateway is down.
However, once a session has launched, it gets a list of possible addresses to try for the backend (similar to Vision), such that "failover" can happen semi-transparently for the end user in their already open session.
Workstation and the mobile app are able to be a bit "smarter": they cache the primary and backup addresses in their own storage, so they can actually contact and choose the correct URL.
Yeah, that was part of why I went down my LB path. I didn't want our users to have to remember two different URLs depending on which gateway was active, and I would really prefer they not even know that two existed. But the only way I could achieve that was to put some sort of redirection in front that would choose the appropriate server discreetly, which led to my issues.
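For what it's worth, the redirect-to-the-active-gateway idea in miniature (not what I actually built; mine was the AWS LB doing TLS termination and forwarding) would look something like the sketch below. The gateway hostnames and the "Active" check are placeholders, not real settings:

```python
# Toy illustration of "send browsers to whichever gateway is currently active".
# GATEWAYS and the "Active" marker are hypothetical; adjust for your environment.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

GATEWAYS = ["http://gw-a:8088", "http://gw-b:8088"]  # hypothetical hostnames

def find_active():
    """Return the base URL of the first gateway whose gwinfo looks active."""
    for base in GATEWAYS:
        try:
            info = urlopen(base + "/system/gwinfo", timeout=2).read().decode("utf-8", "replace")
            if "Active" in info:   # placeholder check for the active node
                return base
        except OSError:
            continue   # unreachable gateway: try the next one
    return None

class Redirector(BaseHTTPRequestHandler):
    def do_GET(self):
        target = find_active()
        if target is None:
            self.send_response(503)   # neither gateway reachable/active
        else:
            self.send_response(302)
            self.send_header("Location", target + self.path)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), Redirector).serve_forever()
```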
The ideal solution, if you have the resources and know-how, is to split the front-end Perspective off into multiple identical servers (not using IA redundancy) that can sit behind a single customer-facing load balancer, and to put the back end on an IA-style redundant pair which is inaccessible to users but which the front ends can see via the gateway network.
Yes, this. Production (OT) systems need redundancy, and production-critical, in-plant clients should be using Workstation or Vision. Management and analytics (IT) doesn't need redundancy--it needs scale-out.
There are many reasons for redundancy. Our biggest is patching the OS. For critical instances in the field, we use Edge, since it doesn't need the central Gateway(s) to survive. I'm realizing the simple solution is to force the use of Perspective Workstation on central Gateway clients.
I read this too and think it misled me in the past. I'm curious how it actually does this?
This is the course I took: two active FEs with an LB in front. I've heard this can still have quirky issues, though, so for now we are really just using the LB as a DR failover.
Planning to attempt sticky sessions to actually load balance between the two active FEs.
Even with that, I've heard that Ignition doesn't handle this very well, because authentication doesn't carry over between the two separate FEs.
To @pturmel's point, the LB is for users who aren't on Perspective Workstation. Since we didn't do the "standby" IA FE, I don't think the Perspective Workstation failover will work for us?
Interested in other quirks people have run into while attempting LBs with multiple FEs that are independent gateways.
Thanks for starting a thread on this topic, @jonah.haekel.
How do you keep the multiple identical servers identical? EAM?
I'd like to make this as foolproof as possible, considering operators may (and will) use the Designer to modify project resources.