OPC-UA - Reconnect to remote server fails

There is already a similar topic in this post, but my problem still happens in 7.8.4, so i started a new topic.

When my remote OPC-UA server restarts, Ignition fails to reconnect automatically. When i edit the connection settings in the gateway configuration, the connection is working again.
The reconnect itself seems to work, but something called TransferSubscriptionRequest seems to fail. Other clients reconnect in this situation without problems.
The full log is attached to this post.

[quote]INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [ClientChannelManager ] [13:30:02,053]: Reconnect succeeded, channelId=1142659188
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [ClientSessionManager ] [13:30:02,055]: (re)ActivateSession failed: UaServiceFaultException: status=Bad_SessionIdInvalid, message=The session id is not valid.
INFO | jvm 6 | 2016/10/17 13:30:02 | java.util.concurrent.CompletionException: UaServiceFaultException: status=Bad_SessionIdInvalid, message=The session id is not valid.
INFO | jvm 6 | 2016/10/17 13:30:02 | at java.util.concurrent.CompletableFuture.encodeRelay(Unknown Source)
INFO | jvm 6 | 2016/10/17 13:30:02 | at java.util.concurrent.CompletableFuture.completeRelay(Unknown Source)
INFO | jvm 6 | 2016/10/17 13:30:02 | at java.util.concurrent.CompletableFuture.uniRelay(Unknown Source)
INFO | jvm 6 | 2016/10/17 13:30:02 | at java.util.concurrent.CompletableFuture$UniRelay.tryFire(Unknown Source)
INFO | jvm 6 | 2016/10/17 13:30:02 | at java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
INFO | jvm 6 | 2016/10/17 13:30:02 | at java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source)
INFO | jvm 6 | 2016/10/17 13:30:02 | at org.eclipse.milo.opcua.stack.client.UaTcpStackClient.receiveResponse(UaTcpStackClient.java:283)
INFO | jvm 6 | 2016/10/17 13:30:02 | at org.eclipse.milo.opcua.stack.client.UaTcpStackClient.lambda$sendRequest$3(UaTcpStackClient.java:145)
INFO | jvm 6 | 2016/10/17 13:30:02 | at java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown Source)
INFO | jvm 6 | 2016/10/17 13:30:02 | at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown Source)
INFO | jvm 6 | 2016/10/17 13:30:02 | at java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
INFO | jvm 6 | 2016/10/17 13:30:02 | at java.util.concurrent.CompletableFuture.complete(Unknown Source)
INFO | jvm 6 | 2016/10/17 13:30:02 | at org.eclipse.milo.opcua.stack.client.handlers.UaTcpClientMessageHandler.lambda$null$19(UaTcpClientMessageHandler.java:528)
INFO | jvm 6 | 2016/10/17 13:30:02 | at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
INFO | jvm 6 | 2016/10/17 13:30:02 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
INFO | jvm 6 | 2016/10/17 13:30:02 | at java.lang.Thread.run(Unknown Source)
INFO | jvm 6 | 2016/10/17 13:30:02 | Caused by: UaServiceFaultException: status=Bad_SessionIdInvalid, message=The session id is not valid.
INFO | jvm 6 | 2016/10/17 13:30:02 | … 10 more
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [ClientSessionManager ] [13:30:02,064]: Sending CreateSessionRequest…
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [UaTcpClientMessageHandler ] [13:30:02,110]: OpenSecureChannel timeout scheduled for +5s
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [UaTcpClientMessageHandler ] [13:30:02,113]: OpenSecureChannel timeout scheduled for +5s
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [UaTcpClientMessageHandler ] [13:30:02,114]: Sent OpenSecureChannelRequest (Issue, id=0, currentToken=-1, previousToken=-1).
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [UaTcpClientMessageHandler ] [13:30:02,126]: OpenSecureChannel timeout canceled
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [UaTcpClientMessageHandler ] [13:30:02,132]: Sent OpenSecureChannelRequest (Issue, id=0, currentToken=-1, previousToken=-1).
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [ClientSessionManager ] [13:30:02,135]: Session created: NodeId{ns=1, id=1}
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [UaTcpClientMessageHandler ] [13:30:02,138]: OpenSecureChannel timeout canceled
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [UaTcpClientMessageHandler ] [13:30:02,142]: Received OpenSecureChannelResponse.
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [UaTcpClientMessageHandler ] [13:30:02,142]: SecureChannel id=1142659191, currentTokenId=1, previousTokenId=-1, lifetime=3600000ms, createdAt=DateTime{utcTime=131211774021381154, javaDate=Mon Oct 17 13:30:02 CEST 2016}
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [UaTcpClientMessageHandler ] [13:30:02,145]: 0 message(s) queued before handshake completed; sending now.
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [UaTcpClientMessageHandler ] [13:30:02,147]: Received OpenSecureChannelResponse.
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [UaTcpClientMessageHandler ] [13:30:02,148]: SecureChannel id=1142659190, currentTokenId=1, previousTokenId=-1, lifetime=3600000ms, createdAt=DateTime{utcTime=131211774021224898, javaDate=Mon Oct 17 13:30:02 CEST 2016}
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [UaTcpClientMessageHandler ] [13:30:02,149]: 0 message(s) queued before handshake completed; sending now.
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [ClientSessionManager ] [13:30:02,157]: Sending ActivateSessionRequest, secureChannelId=1142659188, channel=[id: 0x5e775f5a, L:/10.224.27.128:58161 - R:/10.224.27.128:4840]…
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [ClientChannelManager ] [13:30:02,164]: Channel bootstrap succeeded: localAddress=/10.224.27.128:58163, remoteAddress=/10.224.27.128:4840
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [ClientSessionManager ] [13:30:02,167]: Session activated: NodeId{ns=1, id=1}
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [ClientSessionManager ] [13:30:02,167]: subscriptionCount=1, transferNeeded=true
INFO | jvm 6 | 2016/10/17 13:30:02 | DEBUG [ClientSessionManager ] [13:30:02,167]: Sending TransferSubscriptionsRequest…
INFO | jvm 6 | 2016/10/17 13:30:02 | ERROR [UaTcpClientMessageHandler ] [13:30:02,170]: Received error message: ErrorMessage{error=StatusCode{name=Bad_NotFound, value=0x803E0000, quality=bad}, reason=null}
[/quote]

What server are you connected to?

The TransferSubscription service is allowed to fail as part of the reconnect process… but only in certain ways.

This is the OPC server from Tani

Well what’s happening is that the server is ignoring the TransferSubscriptionRequest all together (or erroring internally, or something) which is why the request ends up timing out. This is non-conformant behavior. It’s not required that the server support this request, but it is required that it responds (with, say… Bad_ServiceUnsupported).

So what’s happening here is that the client connection state machine, always making an effort to not lose any data, is assuming something went wrong with the connection so it reconnects, still preserving hope that the subscription transfer will succeed, because the server has not indicated that the service is not supported or the subscriptions no longer exists to be transferred.

Thank you. I will contact Tani support tomorrow. I hope they are as fast as you in answering :thumb_left:

FWIW we’ve seen this with another customer. It was with something built on National Instruments OPC-UA SDK, which itself may be built on top of one of the toolkits from the well known vendors… not sure.

If you can find out, I’m curious what toolkit this Tani server is built on.

FYI: The problem ist fixed in the current TANI version 1.12. There is now the Exception

INFO [OpcUaConnection[TaniOPC] ] [14:35:29,938]: Subscription transfer failed, id=196609, status=StatusCode{name=Bad_NotImplemented, value=0x80400000, quality=bad}
After that, the reconnect is successfull.

[quote=“chi”]FYI: The problem ist fixed in the current TANI version 1.12. There is now the Exception

INFO [OpcUaConnection[TaniOPC] ] [14:35:29,938]: Subscription transfer failed, id=196609, status=StatusCode{name=Bad_NotImplemented, value=0x80400000, quality=bad}
After that, the reconnect is successfull.[/quote]

Technically they should return Bad_ServiceUnsupported, but the client accepts Bad_NotImplemented, Bad_NotSupported, and Bad_OutOfService as well :thumb_left:

@Kevin.Herron
I’m running on 8.0.15 version. I have the same issue, reconnect to remote server fails. If i edit the connection, the connection is re-established.
Error message:

[remote=NOC-OWPP-SCADA/10.2.1.136:4862] errorMessage=ErrorMessage{error=StatusCode{name=Bad_ServiceUnsupported, value=0x800B0000, quality=bad}, reason=null}

Ignition 8.1.1 contains a workaround for these incorrectly implemented servers that should get the reconnect to happen.

Thanks @Kevin.Herron
We upgraded to 8.1.1 and it works.