Redundancy - Synchronize active backup node to master node

Sorry, it’s me again. I’m trying to add redundancy support to my driver module, and I’m stuck at transferring the state of the active backup node to the passive master node.
Using the persistent record system works as expected, as long as the master node is active. When the master shuts down, the backup node becomes active and the internal db on the backup is in sync with the master.
Now the backup node starts deleting and inserting persistent records in the db. I expected those records to be updated on the master once it becomes available again, but this does not happen.
When I request the master to become the responsible node, it starts with the old database, which is no longer in sync with the backup. Sometimes (I think when I try to delete a record on the master that is already deleted on the backup) the backup node notices this inconsistency and restarts the service. After the restart the backup is in sync again, but all changes made to the backup database are lost.

What am I missing? I searched the Javadoc for a method to access the backup database from the master or to trigger a sync manually, but this does not seem to be possible.

I just did some more testing, and it seems to be a problem with my configuration rather than with the API.
I created a SQL-Tag (Memory Tag) that behaves the same way as my driver. The current value is transferred from Master to Backup, but not back from Backup to Master.
I’m not sure what to change to make this work. Are there any limitations to redundancy in the trial version?

Hi,

The redundancy system does not allow changes to the database on the backup. So, when you make edits to the internal db, they will be lost.

HOWEVER, there is a system in place that is supposed to allow for “runtime data” to be synchronized back to the master. It’s important to understand the difference here between “config data” and “runtime data”. Config changes go to the internal database, and are queued up to send to the backup. Runtime data is held in memory until sent. The backup is allowed to send runtime data to the master. Once the node receives runtime data, it may write that to the internal database (which it should do using “getLocalPersistenceInterface”, therefore avoiding a potential config data change).
The memory tag is supposed to use this runtime data mechanism, so perhaps you could explain a bit about what you’re doing/how you’re testing.

Here’s what’s supposed to happen:

  • When the value of a memory tag is changed through getTagManager().write(…), it is applied to the tag, stored in the internal database, and queued as a runtime change.
  • The backup node receives that change, applies it to the tag, and stores it locally.
  • The master goes down.
  • The tag is written to on the backup node: the value is written locally and queued as a runtime change.
  • ON RECONNECT: The master compares its start time with the start time of the backup. IF the backup has been running longer, the master gets the runtime changes. IF NOT (if the master has been running the whole time), it will send the current state of any runtime data compatible objects to the backup.
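
The reconnect rule in the last step can be sketched as a small, self-contained model. This is only an illustration of the start-time comparison described above, not Ignition code: `ReconnectSketch`, `Node`, `runtimeState` and `reconcile` are hypothetical names, and copying the whole map is an assumed simplification of "gets the runtime changes".

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of the reconnect rule; none of these types are Ignition SDK classes. */
class ReconnectSketch {

    /** Hypothetical redundancy node holding in-memory runtime data. */
    static final class Node {
        final String name;
        final long startTimeMillis;   // when the node's service was started
        final Map<String, Object> runtimeState = new HashMap<>();

        Node(String name, long startTimeMillis) {
            this.name = name;
            this.startTimeMillis = startTimeMillis;
        }
    }

    /**
     * Copies runtime state from the longer-running node to the other one
     * and returns the node whose state was used as the source.
     */
    static Node reconcile(Node master, Node backup) {
        // An earlier start time means the node has been running longer.
        boolean backupRanLonger = backup.startTimeMillis < master.startTimeMillis;
        Node source = backupRanLonger ? backup : master;
        Node target = backupRanLonger ? master : backup;
        target.runtimeState.putAll(source.runtimeState);
        return source;
    }

    public static void main(String[] args) {
        Node master = new Node("master", 2_000L);   // restarted after the outage
        Node backup = new Node("backup", 1_000L);   // kept running the whole time
        master.runtimeState.put("memoryTag", "old value");
        backup.runtimeState.put("memoryTag", "new value");

        Node source = reconcile(master, backup);
        System.out.println(source.name + " -> " + master.runtimeState.get("memoryTag"));
    }
}
```

In this model a master that was only cut off from the network keeps its earlier start time, so its state is the one that gets pushed to the backup.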

SO, if you’re testing by simply disconnecting the master from the backup (unplug network cable, for example), you won’t see the change come through. If you stop the master service, on startup, the memory tag value should be transferred. If this isn’t working, perhaps I can have you turn on logging on the master and we’ll see what’s happening.

Regards,

Hi,

Thank you for this information. Is that ‘runtime data synchronisation’ available to custom modules? It sounds like exactly what I need.

Here is what I did to test the MemoryTag:
I have two Gateways (7.5.2) running in VMs with a simple test project containing only a memory tag and an input field to change the tag’s value.

  • Client is started, connects to the Master gateway.
  • Enter a value for the tag
  • Shutdown the Master gateway by stopping the service
  • Client is transferred to the Backup gateway, input field shows the value entered before
  • Enter a new value for the MemoryTag
  • Restart the Master Gateway, Switchover mode is set to ‘manual’ so nothing happens.
  • Request the Master gateway to takeover responsibility
  • Client is transferred to the Master gateway
  • Value in the input field switches back to the value it had when the Master Gateway was shut down

Just let me know if I can turn on any loggers to narrow this down.

By the way, when I stop the Backup node while the Master is not active, nothing happens. The Master does not become active, and the link to make it responsible disappears. According to the manual, the master should automatically become active in this situation.

Hi,

Thanks for the detailed information. I’ll have to look at whether manual mode is interfering with the runtime synchronization, and I’ll certainly look at why it’s not becoming active automatically.

As for using the runtime state system, yes, you certainly can. Do the following:

  1. Implement a RuntimeStateProvider.
  2. Register that with GatewayContext.getRuntimeStateManager().registerRuntimeProvider(…).
  3. Send updates with GatewayContext.getRuntimeStateManager().postRuntimeUpdate(…).

When you post runtime updates with postRuntimeUpdate(…), they are only sent to the other node; they are not delivered again locally. In other words, you should execute any actions that modify state first, and then post the update. I point this out because it differs from the way config works, where you post a task, it executes locally first, and is then forwarded if there isn’t an error.
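
The "modify state first, then post" ordering can be illustrated with a minimal, self-contained model. The thread does not show the real RuntimeStateProvider or postRuntimeUpdate(…) signatures, so this sketch does not use them: `RuntimePostSketch`, `NodeState`, `changeLocally` and `onUpdateFromPeer` are hypothetical names, and a direct method call stands in for the transfer to the other node.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of runtime updates; none of these types are Ignition SDK classes. */
class RuntimePostSketch {

    /** Hypothetical per-node state of a driver module. */
    static final class NodeState {
        final Map<String, String> records = new HashMap<>();
        NodeState peer;        // the other redundancy node, wired up by the caller
        int postsSent = 0;     // counts updates posted to the peer

        /** A change originating on this node: apply it locally, then post it. */
        void changeLocally(String key, String value) {
            records.put(key, value);                // 1. modify local state yourself
            if (peer != null) {
                postsSent++;
                peer.onUpdateFromPeer(key, value);  // 2. post; nothing is echoed back here
            }
        }

        /** An update arriving from the other node: apply it, but do not re-post it. */
        void onUpdateFromPeer(String key, String value) {
            records.put(key, value);
        }
    }

    public static void main(String[] args) {
        NodeState master = new NodeState();
        NodeState backup = new NodeState();
        master.peer = backup;
        backup.peer = master;

        backup.changeLocally("record-1", "inserted on backup");
        System.out.println(master.records);
    }
}
```

Because the receiving side only applies the update and never posts it again, the two nodes cannot bounce the same change back and forth.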

Regards,

The RuntimeStateManager works perfectly, thank you for this information.
Now that I know this class, the corresponding section in the programmer’s guide makes sense. Maybe someone can add a small example to the next version of the guide; it is a bit difficult to see the link between the guide and the Javadocs.

Concerning the SQLTag sync:
I reinstalled both test gateways to make sure my messing around with the internal database did not cause any issues, but the backup still does not synchronize back to the master. Manual mode is not the problem; it does not work in automatic mode either.

Hi,

I think we’ve tracked down the problem with tag values. It appears to only affect those, not other parts of the runtime state system. It will be fixed in 7.5.3 beta 3.

Regards,