Brief Comm Loss b/w Vision Client & Gateway

I have a deployment with a gateway running on a server and two thick clients running Vision, all communicating over an Ethernet switch. I have a memory tag for each client that functions as a heartbeat between the client and the gateway.

The gateway runs a 15-second timer script that iterates through every client’s memory tag. If the memory tag has a value of 1, this is considered ‘good’ and the gateway timer script writes a 0 to the tag. Meanwhile, a client tag change event script tied to the same memory tag executes as a result of the gateway writing the 0, and writes a 1 back into the tag. Then, at the next 15-second interval, the gateway once again reads the tag, sees a 1, and writes a 0.

If the gateway timer script reads the tag and does NOT see a 1, it treats this as a comm issue between the gateway and the client (i.e. the client script never executed and reset the heartbeat tag within that 15-second period). In this case it relinquishes any ‘control’ that the client had over equipment in the field, requiring an operator to manually take control again. That is a nuisance for the operators and ultimately the reason for this post.
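
As a rough sketch of the gateway-side check described so far (this is not the actual timer script; `check_heartbeats` and its parameters are hypothetical names, and tag access is passed in as plain callables so the logic can run outside Ignition):

```python
def check_heartbeats(read_tag, write_tag, release_control, client_paths):
    """One pass of the gateway timer logic described above.

    read_tag, write_tag and release_control are stand-ins (assumptions)
    for whatever tag reads/writes and control logic the real script uses.
    Returns the list of paths whose heartbeat was not 1.
    """
    lost = []
    for path in client_paths:
        if read_tag(path) == 1:
            # Heartbeat is good: reset it and expect the client's
            # tag change script to write 1 again before the next pass.
            write_tag(path, 0)
        else:
            # The client did not set the tag back to 1 in time.
            release_control(path)
            lost.append(path)
    return lost


# Dict-backed example: client "A" answered, client "B" did not.
tags = {"A": 1, "B": 0}
released = []
lost = check_heartbeats(tags.get, tags.__setitem__, released.append, ["A", "B"])
```

With this shape, a single missed or late client write within one interval is enough to release control, which matches the behaviour described.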

I am trying to understand why this comm issue is occurring. One observation is that I see this in the log every time this occurs:

INFO | jvm 1 | 2026/01/28 08:49:22 | I [c.i.i.g.s.g.f.Projects$ProjectChangeMonitor] [13:49:21]: Starting up client project monitor. project=GGS-21140 scope=4 request-origin=10.50.162.6, session-user=Operator, session-project=GGS-21140, session-id=76D049DA

Other notes:

The heartbeat tends to fix itself by the next 15-second interval, but by then the ‘control’ has already been relinquished, i.e. the damage is done.

There is a different 15-second client event timer script that is used to track inactivity and force logout of privileged users, but this script has nothing to do with the ‘control’ of the field equipment or the corresponding memory tags I described above. I only mention this because in my research I have seen that timer scripts can sometimes cause issues like what I am describing.

Don't write from two directions. I recommend a timer event in your client, at a 1 or 2 second pace, that writes the current client timestamp to the memory tag. In the gateway, monitor for that value going stale, i.e. falling too far in the past.
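
A minimal sketch of that one-directional heartbeat, assuming the tag holds epoch milliseconds. The function names, the 5-second threshold, and the injected `read_tag` / `write_tag` / `clock` callables are all illustrative assumptions, not Ignition API calls:

```python
import time

STALE_AFTER_MS = 5000  # assumed threshold; tune to the client timer pace


def now_ms():
    """Current time in epoch milliseconds."""
    return int(time.time() * 1000)


def client_heartbeat(write_tag, path, clock=now_ms):
    """Client timer event: only the client ever writes this tag."""
    write_tag(path, clock())


def stale_clients(read_tag, paths, clock=now_ms, stale_after_ms=STALE_AFTER_MS):
    """Gateway timer: only reads, and reports heartbeats that are too old."""
    now = clock()
    stale = []
    for path in paths:
        last = read_tag(path)
        if last is None or now - last > stale_after_ms:
            stale.append(path)
    return stale


# Example with a dict standing in for the tag provider and fixed clocks.
tags = {}
client_heartbeat(tags.__setitem__, "c1", clock=lambda: 1000)
fresh_check = stale_clients(tags.get, ["c1", "c2"], clock=lambda: 3000)
late_check = stale_clients(tags.get, ["c1"], clock=lambda: 7000)
```

Comparing a client timestamp against the gateway clock assumes the two clocks are reasonably in sync. With a fast client pace, the threshold can span several missed writes, so one delayed update no longer releases control.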

That’s certainly not a bad idea, and I fully agree there are better ways to handle a communication check like this. But I am still unclear as to what could cause the client tag change event script to fail to execute, seemingly at random.

Could this be as simple as something with the network switch or the thick client itself? I just want to be certain it is not a ‘programming issue’.

How many of these memory tags with event scripts on them are there? I assume that the gateway timer script is writing to all tags at once, not once per loop iteration. Either way, what does the script look like in the client tag change events? There are limits to how many events can run at once.

Right now the client tag change event script has three separate tag path triggers:

[default]MemoryTags/Clients/HSM-5525-DLC/Status
[default]MemoryTags/Clients/HSM-5517-DLC/Status
[default]MemoryTags/Clients/HSM-5536-DLC/Status

The two scripts are attached.

ClientTagChangeEvent Script.txt (375 Bytes)

GatewayTimer Script.txt (2.2 KB)

I would suggest that you use a logger and java.lang.System.nanoTime() to time the execution of the timer script.
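
Something along these lines, as a sketch only. It uses the standard `logging` and `time` modules so it runs anywhere; inside Ignition you would presumably swap in `system.util.getLogger(...)` and `java.lang.System.nanoTime()`. The `timed_call` helper and the logger name are made up for illustration:

```python
import logging
import time

# Stand-in for an Ignition logger (assumption: system.util.getLogger in Ignition).
logger = logging.getLogger("HeartbeatTimer")


def timed_call(func, *args, **kwargs):
    """Run func, log how long it took, and return (result, elapsed_ms)."""
    start = time.time()  # in Ignition: System.nanoTime()
    result = func(*args, **kwargs)
    elapsed_ms = (time.time() - start) * 1000.0
    name = getattr(func, "__name__", repr(func))
    logger.info("%s took %.3f ms", name, elapsed_ms)
    return result, elapsed_ms


# Example: wrap the body of the timer script in a function and time it.
result, elapsed_ms = timed_call(sum, [1, 2, 3])
```

If the logged duration ever approaches the 15-second period, that would point at the script itself rather than the network.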

Is the timer script set for Fixed Delay or Fixed Rate? If it is set for fixed rate and your script is taking longer than that rate to execute then you could be running into problems there.

system.tag.browse() is a very heavy function. I would also suggest moving these scripts to a project library script, and using a top level variable to hold the client list and controlTags. You may also want to look into system.tag.query() in place of system.tag.browse().
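
A rough sketch of the top-level variable idea, assuming the code lives in a project library script. `get_client_paths`, `invalidate_client_paths`, and the `browse` callable are hypothetical; `browse` stands in for whatever expensive browse or query the real script performs:

```python
# Top-level variable: in a project library script this should persist
# between timer executions instead of being rebuilt on every run.
_client_paths = None


def get_client_paths(browse):
    """Return the cached client tag paths, browsing only on first use."""
    global _client_paths
    if _client_paths is None:
        _client_paths = list(browse())
    return _client_paths


def invalidate_client_paths():
    """Force the next call to browse again, e.g. after clients are added."""
    global _client_paths
    _client_paths = None


# Example: count how often the expensive browse actually runs.
calls = []


def fake_browse():
    calls.append(1)
    return ["[default]MemoryTags/Clients/HSM-5525-DLC/Status"]


first = get_client_paths(fake_browse)
second = get_client_paths(fake_browse)
```

Here the second call returns the cached list without browsing again, so the timer script only pays the browse cost once.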