Data loss issue with gateway redundancy over MQTT

Hello, I have an issue with data loss while communicating over MQTT. The test architecture is as follows: one main gateway (8.1.44) with the Ignition Distributor module (4.0.25) and the MQTT Engine module (4.0.25), plus two Ignition Edge gateways (8.1.47), both running MQTT Transmission (4.0.29). One Edge gateway is the Master and the other is the Backup (the gateway network between them is working fine). An S7 PLC sends data to the Master Edge gateway.
The test I perform is to check whether data is lost when the Master Edge gateway is disconnected (redundancy is activated). When the Master is disconnected (by pulling the cable), the Engine client on the main gateway stays disconnected for about 10 seconds. Then the Backup client connects and the connection is restored (new data arrives at the main gateway). The data from the disconnect period is lost at the main platform level. The target tag is historized on the main gateway, and the data is NOT present in the historian either.
I experimented with the settings in both the Transmission and Engine modules (including the history settings), but I was not able to avoid this data loss. In the Engine settings, Store Historical Events is disabled: I want the Transmission module to backfill all the tag values directly into the tag, because I need some tag change events to be triggered.
The question is: is it possible to avoid this data loss? Or do I need to accept that data will be lost while the Backup gateway takes over the communication?
Any suggestions would be very useful.
Thank you

Ignition Redundancy does not claim to be lossless. You should plan on using a ring buffer and handshaking to ensure critical data is recorded.

Thanks for the quick reply. Before this, I ran a similar test with a single Edge gateway (with the Transmission module) and a main gateway (with the Engine module). In that configuration almost no data was lost: the Transmission module buffered the data while disconnected and then sent all the values to the target tag on the main gateway. I expected similar behavior with redundancy, but it seems the blind interval is longer in that case. Does a ring buffer or handshaking mean custom logic (scripting) between the Edge gateway(s) and the main platform?

Ring buffer in the PLC, handshaking between PLC and gateway that has the database. Difficult to implement with Edge, due to the extra hop.
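
To make the suggestion above concrete, here is a minimal sketch of the ring buffer plus handshake idea, written in Python purely for illustration. Everything in it is hypothetical: `PlcRingBuffer` stands in for a fixed-size buffer in PLC memory, `Gateway` stands in for the gateway that owns the database, and the sequence-number acknowledgement is just one possible handshake, not an Ignition or S7 API.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    seq: int      # monotonically increasing sequence number assigned by the PLC
    value: float


class PlcRingBuffer:
    """Hypothetical stand-in for a fixed-size buffer held in PLC memory."""

    def __init__(self, capacity):
        # A full deque with maxlen silently drops the oldest sample,
        # so capacity must cover the longest outage you want to survive.
        self._buf = deque(maxlen=capacity)
        self._next_seq = 0

    def record(self, value):
        self._buf.append(Sample(self._next_seq, value))
        self._next_seq += 1

    def pending(self):
        # Samples the gateway has not acknowledged yet, oldest first.
        return list(self._buf)

    def acknowledge(self, seq):
        # Handshake: release only samples the gateway confirmed as stored.
        while self._buf and self._buf[0].seq <= seq:
            self._buf.popleft()


class Gateway:
    """Hypothetical stand-in for the gateway that writes to the database."""

    def __init__(self):
        self.stored = []
        self.online = True

    def poll(self, plc):
        if not self.online:
            return  # during failover nothing is read, so nothing is acknowledged
        samples = plc.pending()
        if samples:
            self.stored.extend(samples)
            plc.acknowledge(samples[-1].seq)


plc = PlcRingBuffer(capacity=100)
gw = Gateway()

for v in (1.0, 2.0, 3.0):
    plc.record(v)
gw.poll(plc)

gw.online = False            # simulated failover window
for v in (4.0, 5.0):
    plc.record(v)
    gw.poll(plc)

gw.online = True             # backup has taken over
gw.poll(plc)
print([s.value for s in gw.stored])
```

The point of the design is that the PLC never discards a sample until the receiving side has confirmed it, so a gap on the gateway side only delays the data instead of losing it, as long as the buffer does not overflow.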

What standby activity level did you configure redundancy with?

The project I am working on is for the pharma industry, so changing the PLC code is better left out of the discussion...

I am not sure what this relates to. Is it about the timeout settings on the Engine module? That is set to the minimum (5 seconds), and my feeling is that the standby time is roughly double that (10 seconds): the Master Edge Transmission disconnects, and then it takes about 8-10 seconds for the Backup Edge to take over. (The historian settings are configured to store the data locally, but I assume MQTT Transmission is not doing that on the Backup Edge while the Master Edge is online.)

Ignition redundancy settings, nothing to do with MQTT or transmission at all.

OK, I get it. The Startup Connection Allowance is 1500 ms and the Sync Timeout is 5. In the Backup Node Settings, the Ping Rate is 1000 ms, the timeout is 300, and Ping Max Missed is 2. The Standby Activity Level is set to Warm.
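
As a rough back-of-the-envelope check on these numbers, the sketch below estimates how long the backup might need just to notice that the master is gone. The formula is an assumption made for illustration (the backup gives up after Ping Max Missed consecutive pings, each sent at the Ping Rate, with the last one allowed to time out), not Ignition's documented behavior, and the timeout of 300 is assumed to be in milliseconds. `estimated_detection_ms` is a hypothetical helper name.

```python
def estimated_detection_ms(ping_rate_ms, ping_timeout_ms, max_missed):
    """Assumed model: max_missed ping intervals plus one final timeout."""
    return ping_rate_ms * max_missed + ping_timeout_ms


# Values quoted in the post above; the unit of the timeout is an assumption.
detection_ms = estimated_detection_ms(ping_rate_ms=1000,
                                      ping_timeout_ms=300,
                                      max_missed=2)
observed_gap_ms = 10_000  # upper end of the 8-10 s gap reported in this thread

print(detection_ms)                    # 2300
print(observed_gap_ms - detection_ms)  # 7700
```

Under that assumed model, ping-based detection would account for only part of the observed gap. The sketch does not model whatever happens after detection, such as the backup becoming active and its MQTT client connecting.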

Okay, just checking - Warm is as good as it gets already. If you were on Cold there might have been some room for improvement.

I agree with @Kevin.Herron. Your issue has nothing to do with MQTT. That's just the time it takes for a backup gateway to realize it needs to take control. There is not much you can do about it. But I will tell you this: try to find another SCADA platform where redundancy is as easy to configure as in Ignition, and that detects a failure and switches over as quickly.

I also agree with @pturmel, if loss of data is unacceptable, don't rely on SCADA. Store data locally in the controller or in a datalogger closer to the source.

Hi. To be clear: I never intended my post as a statement against the functionality available in Ignition. I am at the beginning of my learning journey with Ignition, and I agree that the redundancy works quite well. To be honest, I did not expect zero data loss from the start, because it is normal for the switchover to take some time. I opened the post because I assumed some improvements might be possible (tricks that you may not find in the documentation). Thank you for the recommendations.

I didn't take your post as negative against Ignition, so all good! Cheers!
