Question about the redundancy setup

I have the redundancy setup up and running on the Ignition side. My question is about detecting whether my OPC server is dead. My OPC server has a heartbeat that I can monitor to verify that it is live. Is there a way to invoke a change from master to backup using a script that monitors the status of the OPC server that I will have running on each server?

Hi,

For this initial 7.2 release we did not put in a way to trigger a switch-over based on outside conditions. That is, right now it’s purely based on whether the backup can see the master from a networking point of view. We expected to get feature requests of this kind, so I suspect that they’ll be coming along sometime soon.

Regards,

OK, we would definitely need this. The OPC server is just as critical as the Ignition interface.

Is there a way to stop the gateway, or fault the gateway, based on a scripted condition as a workaround for now?

Yes, there is a way. You can create a Global Event Script (Gateway), such as a Timer Script or Tag Change script, that can shut down the local Ignition service.

So, for example, you can have a timer script that runs on the gateway every 5 seconds or so and checks your OPC conditions. If it needs to shut down Ignition, it can do the following:

    system.util.execute(["C:\\stop.bat"])

The batch file looks like:

    net stop ignition

**** WARNING ****: This is not recommended, since you can shut down both servers, and the only way to get one of them up again is to start the service manually.

I should be able to write a script that looks at, say, the computer name, so that the script only executes on the one computer, correct?

like say

if computerName == "server1" and opcServerHeartbeatDead:
    system.util.execute(["C:\\stop.bat"])

I’m not worried about having to go back and restart the gateway; I can set up an alert or something to notify me when the master crashes. I just need it to swap over if it senses that the OPC server has died and switch to the backup, where the other OPC server resides.

thanks for the help on this.

Yes, you can look at the computer name and only do that code on one machine.
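
A minimal sketch of that computer-name guard, written as plain Python so it can be tried outside Ignition. The names `should_stop_gateway`, `check_and_stop`, `master_host`, and the injected `execute` argument are hypothetical; in a gateway timer script, `execute` would presumably be `system.util.execute`, and the host name and heartbeat state would come from wherever you already obtain them.

```python
def should_stop_gateway(host_name, master_host, heartbeat_dead):
    """Return True only on the designated master when its OPC heartbeat is dead."""
    return host_name.lower() == master_host.lower() and bool(heartbeat_dead)


def check_and_stop(host_name, heartbeat_dead, execute,
                   master_host="server1", stop_command=("C:\\stop.bat",)):
    """Run the stop command through `execute` if the guard passes.

    `execute` is injected (assumption) so the logic is testable without
    Ignition; inside Ignition it would be system.util.execute.
    Returns True if the stop command was issued.
    """
    if should_stop_gateway(host_name, master_host, heartbeat_dead):
        execute(list(stop_command))
        return True
    return False
```

Because the host name is compared against a single machine, the script can never stop both nodes at once, which is the failure mode the warning above describes.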

If you’re going to try to do something along those lines, the best thing to do would be to try and restart the service, use a failover timeout shorter than the restart period, and use Manual recovery mode, so that the master doesn’t automatically take over again after reboot.

It seems kind of risky, but I suppose if you really need it, it’s better than nothing. We will likely add support for this type of stuff in 7.3, but perhaps can put an “advanced” scripting function in earlier that forces the other node to take over, so you could roll your own a bit more easily. It could also do things like verify that the other node is available before shutting down the current one.

Regards,

Yeah that’s an option I will look at also. I’m ok with not automatically swapping back to the master until I restart the master gateway. That would give me a chance to look at my opc server alarm log to see what happened.

If the gateway on the master is stopped, it shouldn’t be an issue to go a few hours before I find the problem with the OPC server and restart it along with the master gateway, correct? Just trying to see if there are any downsides to doing it like that.

Thanks

No, there’s no problem; the backup system can run for however long you want as the active node. I was just pointing out that the Master Recovery Mode dictates whether or not the master will automatically take back responsibility once started, and for what you want to do, it should be set to manual.

Regards,

Well, I have the backup part of it working, but my dang OPC server seems to be restarting automatically when we connect to it via OPC. Even when I manually stop the service in the administrative/services console, it starts right back up as long as something OPC is connected to it. It doesn’t do this when we connect to it via DDE/SuiteLink with Wonderware. Aarghh.

Anyway, my OPC server provides a heartbeat that changes every few seconds and counts from 0-9, so if the number in the register goes stale, the server has crashed. What’s the best way to look at that in Ignition’s scripting?

Yes, the OPC server will likely be started when another app tries to connect… though funny enough, if the OPC server is the part that’s crashing, that might be enough to make it work again.

Anyhow, I think the easiest thing to do would be to bring the heartbeat in as a SQLTag and reference that. Or, you could read the value directly using the system.opc.readValue() function.
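
Either way, the stale check itself is just "has the value changed recently?". Here is a rough sketch of that logic in plain Python. The class name `HeartbeatWatchdog` and the 15-second default are assumptions for illustration, not anything built into Ignition; the value passed to `update()` would be whatever you read from the SQLTag or from `system.opc.readValue()`.

```python
import time


class HeartbeatWatchdog(object):
    """Flags the heartbeat as stale when its value stops changing."""

    def __init__(self, stale_after_seconds=15.0, clock=time.time):
        self.stale_after_seconds = stale_after_seconds
        self.clock = clock          # injectable so the logic can be tested
        self.last_value = None
        self.last_change = None     # time of the most recent value change

    def update(self, value):
        """Record the latest reading; return True if the heartbeat is stale."""
        now = self.clock()
        if self.last_change is None or value != self.last_value:
            # First reading, or the counter moved: the server is alive.
            self.last_value = value
            self.last_change = now
            return False
        return (now - self.last_change) >= self.stale_after_seconds
```

The watchdog has to keep its state between timer executions, so it would need to live somewhere persistent (for example, at module level in a script module). Also note that since the counter wraps from 9 back to 0, a sampling interval that happens to match the full counter cycle could read the same number every time and look stale when it isn't, so pick the timer rate with that in mind.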

Regards,

I know I’m jumping in late here - but shouldn’t you be using an OPC redundancy product for this? Detecting a failure in a 3rd party OPC server and then switching to an Ignition backup doesn’t totally make sense to me. What indication do you have that the backup is going to be talking to a different OPC server, and that that other OPC server is working?

I have an OPC server running on each Ignition server. The OPC server provides me some diagnostics and also a bit to toggle it from backup mode to live mode. Tomorrow I am going to try to get DCOM set up so that I can look at the status tags on each OPC server and, if needed, toggle on the backup OPC server when I detect a reason to do the switchover. I am then going to stop the Ignition gateway and the OPC server if necessary. I’m not worried about switching back if I have a failure on the master side until we can log in and see what happened.

are there any system tags to monitor redundancy status? sort of like I see on the gateway page.

There is one System tag that shows the role of the gateway it is looking at.
System/Client/Network/GatewayRedundancyRole

Possible values are:
Independent
Master
Backup
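
As a small sketch of how that role value might be used in a script: only a node that reports Master has a backup to hand over to, so a forced shutdown could be gated on it. The function name `may_force_failover` is hypothetical, and this assumes the tag value arrives as one of the three strings listed above; how the value is read (it sits under the Client folder) is left to the caller.

```python
KNOWN_ROLES = ("Independent", "Master", "Backup")


def may_force_failover(role):
    """Return True only when this node reports the Master role."""
    if role not in KNOWN_ROLES:
        # Fail loudly rather than guess at an unknown role string.
        raise ValueError("unexpected redundancy role: %r" % (role,))
    return role == "Master"
```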

OK, I was finally able to get the redundancy stuff to work on the OPC side. It took a little bit of scripting to get it to work like I needed. Basically, I had to set up an OPC-UA wrapper on the backup so that I could turn the OPC server on/off depending on the computer name (I couldn’t get OPC-DA to connect remotely to the backup machine for some reason). If I am on the master, I use the OPC-UA link to write to the backup server’s off tag to turn it off. If I have a failure on the master, once it swaps over, the backup writes via the normal OPC-DA connection to turn its OPC server back on, since this OPC-DA connection stays active during the swapover. I then have a script that looks at the status of the master OPC server, and if the heartbeat fails, it stops the gateway (which also stops the master OPC server from continuing to poll), which forces a swap to the backup. Once I start the master up again, the computer name script turns the backup OPC server off via the OPC-UA link.

Sounds confusing, but it works! This is just a test scenario; if implemented, I would probably also add some stuff in scripting to make sure that the backup OPC server and gateway were active before trying to swap over.
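
That extra check could look something like the sketch below. It is plain Python with hypothetical names (`decide_master_action` and the three boolean inputs); how each status is actually obtained depends on the OPC server's diagnostics and is not shown.

```python
def decide_master_action(heartbeat_ok, backup_gateway_ok, backup_opc_ok):
    """Return one of 'none', 'stop_gateway', or 'alarm_only'."""
    if heartbeat_ok:
        return "none"
    if backup_gateway_ok and backup_opc_ok:
        # Master OPC is dead and the backup side looks healthy: swap over.
        return "stop_gateway"
    # Stopping now would leave no working node, so only raise an alarm.
    return "alarm_only"
```

For example, `decide_master_action(False, True, False)` gives `'alarm_only'`, since stopping the master gateway while the backup OPC server is down would not improve anything.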

Glad you got it working. As I mentioned, I suspect that soon we’ll add a scripting function that can be used to request that the other node take over, but beyond that, if you have any suggestions feel free to post them in the feature request forum.

Regards,

Thanks, I’ll come up with a little something. We will probably make a video and post it on YouTube for our customer to see. I’ll post a link so y’all can see what we did and whether there is anything that would make it easier.

I guess the mobile module doesn’t work with redundancy?

Hi,

No, it currently does not. We hope to add some sort of support shortly, but have a few things to work out.

Regards,

So it’s definitely in the works? Just wanted to verify in case a customer asks.