I have a strange problem with Ignition’s OPC-UA client. Some tags stop updating after working for some days. In the diagnostics i can see that Ignition thinks it has a subscription, but that subscription ist not present in the server. It looks like after an error, Igntion doesn’t renew the subscription.
Ignition version 8.1.5 on Linux
OPC-UA server: TANI PLC-Engine 2.0.2
At the same time the tag stopped updating, there is a log entry:
INFO | jvm 2 | 2021/06/18 09:20:57 | W [o.e.m.o.s.c.s.OpcUaSubscriptionManager] [07:20:57]: no item for clientHandle=290
INFO | jvm 2 | 2021/06/18 09:20:57 | W [o.e.m.o.s.c.s.OpcUaSubscriptionManager] [07:20:57]: no item for clientHandle=289
INFO | jvm 2 | 2021/06/18 09:20:57 | W [o.e.m.o.s.c.s.OpcUaSubscriptionManager] [07:20:57]: no item for clientHandle=288
INFO | jvm 2 | 2021/06/18 09:21:36 | W [o.e.m.o.s.c.s.OpcUaSubscriptionManager] [07:21:36]: no item for clientHandle=289
The tag is still displayed as subscribed in Ignitions’s client.
Those warnings generally mean a data change notification was received for an item id we don’t have on record any more. It can happen as a timing quirk when a lease change or items are removed and I’m not sure it’s related to the whole subscription being gone.
Leaving org.eclipse.milo.opcua.sdk.client.subscriptions.OpcUaSubscriptionManager on DEBUG might help, along with com.inductiveautomation.ignition.gateway.opcua.client.connection.OpcUaConnection. If the subscription times out or “disappears” some how the server should notify the client with a status change notification for that subscription, which should result in the subscription being removed / resynchronized in the client bookkeeping.
Hello Kevin,
i was able to log the sequence of events now. It looks like the OPC-UA server stops sending publish responses when values in a subscription are unchanged for about 30 hours. There is no notification to the client, the subscription simply dies silently. This seems to be a problem on the server side.
But shouldn’t Ignition notice the missing publish response and try ro re-subscribe when (timeout x keep alive count) expires?
Sorry for the delay, I’ve been out on vacation for the last month.
I’ll try to look into this closer in the coming weeks. I think you’re probably right though. Either the Milo or Ignition client should notice when a subscription’s lifetime has expired due to absence of keep alive notifications and do something, and I don’t think that’s happening right now.
We have the same problem with a customer of us. Did you come further with investigating the issue regard this?
We also noticed that polled tags keep updating while subcriptions are freezed, we set all tags from the opc server to a polled tag group to make sure they keep updating.
Also we saw the opc ua server itself blinking from faulted to connected, seems like missing a keep alive response as well, after resetting the opc ua connection within ignition it was connected without blinking to faulted.
Hopefully the 8.1.15 release will include a client subscription watchdog. It’s been implemented in the OPC UA library, which needs to be released and integrated into Ignition.
Is there a log setting to get more details about what is happening? Maybe a problem with the servers keep-alive implementation?
These messages are logged very often (5 to 60 seconds)
This is an absence of activity, so there’s not much in the logs, though it has been quite clear in Wireshark in the few cases I’ve observed so far.
I’d like to see Wireshark captures to rule out an implementation bug in the watchdog, but in every other case so far it has been an implementation problem in the server’s subscription / keep-alive logic.
Logger for org.eclipse.milo.opcua.sdk.client.subscriptions.OpcUaSubscriptionManager on DEBUG would log things when PublishRespones do arrive, though.
The expected keep-alive interval is agreed on when the subscription is created. If the server supports diagnostics you can view the Server diagnostics for that connection in Ignition and drill down into each subscription. Look for the maxKeepAliveCount and publishingInterval parameters:
This means the keep-alive interval is 500 * 120 = 60000 (milliseconds). So a keep-alive every 60s in the absence of data changes. The watchdog timer will fire after 125% of whatever this value is has elapsed.
Ignition tries to create subscriptions with a 60s keep-alive interval by default, but the server is allowed to revise the requested maxKeepAliveCount, so it may be something different.
Also there seem to be notifications that Ignition can’t relate:
INFO | jvm 1 | 2022/05/29 09:49:31 | W [o.e.m.o.s.c.s.OpcUaSubscriptionManager] [07:49:31]: no item for clientHandle=7
INFO | jvm 1 | 2022/05/29 09:49:31 | W [o.e.m.o.s.c.s.OpcUaSubscriptionManager] [07:49:31]: no item for clientHandle=13
INFO | jvm 1 | 2022/05/29 09:49:31 | W [o.e.m.o.s.c.s.OpcUaSubscriptionManager] [07:49:31]: no item for clientHandle=21
INFO | jvm 1 | 2022/05/29 09:49:31 | W [o.e.m.o.s.c.s.OpcUaSubscriptionManager] [07:49:31]: no item for clientHandle=29
INFO | jvm 1 | 2022/05/29 09:49:31 | W [o.e.m.o.s.c.s.OpcUaSubscriptionManager] [07:49:31]: no item for clientHandle=33
INFO | jvm 1 | 2022/05/29 09:49:31 | W [o.e.m.o.s.c.s.OpcUaSubscriptionManager] [07:49:31]: no item for clientHandle=37
INFO | jvm 1 | 2022/05/29 09:49:31 | W [c.i.i.g.o.c.c.OpcUaSubscriptionSynchronizer] [07:49:31]: Failed to create 2 items on subscription 'tag-group-default_emzaehlerlesen': requestId=1015782 timed out after 60000ms connection-name=TANI-PROD
INFO | jvm 1 | 2022/05/29 09:49:31 | W [o.e.m.o.s.c.s.OpcUaSubscriptionManager] [07:49:31]: no item for clientHandle=7
INFO | jvm 1 | 2022/05/29 09:49:31 | W [o.e.m.o.s.c.s.OpcUaSubscriptionManager] [07:49:31]: no item for clientHandle=13
INFO | jvm 1 | 2022/05/29 09:49:31 | W [o.e.m.o.s.c.s.OpcUaSubscriptionManager] [07:49:31]: no item for clientHandle=21
INFO | jvm 1 | 2022/05/29 09:49:31 | W [o.e.m.o.s.c.s.OpcUaSubscriptionManager] [07:49:31]: no item for clientHandle=29
I sent you the logs and a wireshark capture in a PM
The capture was a little bit difficult to decipher because the TANI server seems to re-use subscription ids… but you can see in this capture the server did fail to send a PublishResponse in time:
The filter applied in this case is opcua.SubscriptionId==65545 || opcua.SubscriptionIds==65545.
The CreateSubscriptionResponse returns and indicates the revised publishing interval is 1020 and the revised max keep alive count is 60. So after 61,200ms of no activity for that subscription a keep-alive PublishResponse should be de-queued and sent. The watchdog timer elapses at ~10:12:01, at which point Ignition issues a DeleteSubscriptionsRequest. The last PublishResponse for that subscription was received at ~10:10:45, a ~76,000ms gap, or at ~125% of the expected 61,200ms.
I think you need to contact TANI about this. It’s not very surprising, since a post a while back about how this same server sometimes just stopped sending subscription updates was one of the main motivators to implementing this client-side watchdog timer…
No, I need to think about whether to allow the “lateness” to be configurable or allow it to be disabled or both.
But right now it’s not something that can be disabled, because I expected to be catching servers with implementation bugs that caused a subscription to stop updating entirely, not just late (though we don’t know which is happening here), so at the time I figured it was safe to just have enabled.