Server keeps dropping Modbus connections to all PLCs

So it just started today. Ignition can’t keep a connection to a single PLC in the field. I keep getting a bunch of these:

INFO | jvm 1 | 2020/04/24 23:41:14 | ERROR [SocketIODelegate ] [22:41:14,503]: [hostname=,port=503] Socket connection closed, DriverState was Connected.
INFO | jvm 1 | 2020/04/24 23:41:14 | java.io.IOException: End of stream reached.
INFO | jvm 1 | 2020/04/24 23:41:14 | at com.inductiveautomation.iosession.socket.AsyncSocketIOSession.run(AsyncSocketIOSession.java:74)
INFO | jvm 1 | 2020/04/24 23:41:14 | at java.lang.Thread.run(Thread.java:745)

Any thoughts? We are dead in the water.

Thanks,

That means something outside Ignition closed the connection. It is common for Modbus TCP devices to have a 10-second inactivity timeout. You might need to have at least one tag polled faster than your devices’ timeouts to keep connections alive.

Do you still see that as a possibility if it’s been working for years and all of a sudden every plc, over a hundred, stopped working simultaneously?

Have you tried restarting the Modbus driver? Also, sometimes just editing the device and hitting save helps re-establish and maintain the connection. You said all of a sudden every PLC stopped working… is everything on a wired network or wireless?

I’ve restarted the server a couple of times. The servers are wired, the PLCs are on cellular. Been that way for years. I can still reach the modems the PLCs are attached to, and the configs on the modems look good.

Well, no. That suggests a change somewhere. OS change? OS Updates? OS Security Policy change? Network topology change? Network infrastructure OS updates?

When something goes from working to non-working, troubleshooting 101 is “what else changed at that time?”

Wireshark captures from a network node along the path from Ignition to the devices would be helpful, I suspect.

True, but the only change was an OS update about a week before the problem. That did jack up exim, which I just fixed an hour ago; I couldn’t receive emails as root. I still can’t get the emails through sendgrid, but at least they aren’t sitting in a queue forever on the server with exim. I’ll try to get a capture on the *nix side and see what it says.

Anyone smart enough to run exim should be able to figure this out… (:

{ Full disclosure: I run exim for both home and work domains. I may be biased. }

Hahaha, that’s what I figured :). So I ran Wireshark: nothing but keep-alives, nothing really suspicious going on.

1 0.000000000 <external> → <internal> TCP 55 502 → 51974 [ACK] Seq=1 Ack=1 Win=512 Len=1
2 0.000024649 <internal> → <external> TCP 54 51974 → 502 [ACK] Seq=1 Ack=2 Win=28400 Len=0
3 10.619541752 <external> → <internal> TCP 55 [TCP Keep-Alive] 502 → 51974 [ACK] Seq=1 Ack=1 Win=512 Len=1
4 10.619566099 <internal> → <external> TCP 54 [TCP Keep-Alive ACK] 51974 → 502 [ACK] Seq=1 Ack=2 Win=28400 Len=0
5 21.268434244 <external> → <internal> TCP 55 [TCP Keep-Alive] 502 → 51974 [ACK] Seq=1 Ack=1 Win=512 Len=1
6 21.268469819 <internal> → <external> TCP 54 [TCP Keep-Alive ACK] 51974 → 502 [ACK] Seq=1 Ack=2 Win=28400 Len=0
7 31.440785541 <external> → <internal> TCP 55 [TCP Keep-Alive] 502 → 51974 [ACK] Seq=1 Ack=1 Win=512 Len=1
8 31.440835644 <internal> → <external> TCP 54 [TCP Keep-Alive ACK] 51974 → 502 [ACK] Seq=1 Ack=2 Win=28400 Len=0
9 42.040864997 <external> → <internal> TCP 55 [TCP Keep-Alive] 502 → 51974 [ACK] Seq=1 Ack=1 Win=512 Len=1
10 42.040899699 <internal> → <external> TCP 54 [TCP Keep-Alive ACK] 51974 → 502 [ACK] Seq=1 Ack=2 Win=28400 Len=0
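
For reference, the spacing of the packets arriving from port 502 in that capture can be worked out from the timestamps. This is only a quick Python sketch: the timestamps are copied from the capture above, and the helper name `intervals` is made up for illustration.

```python
# Timestamps (seconds) of the packets sent from port 502 in the capture:
# frame 1 plus the four frames Wireshark labels [TCP Keep-Alive].
plc_packet_times = [
    0.000000000,
    10.619541752,
    21.268434244,
    31.440785541,
    42.040864997,
]


def intervals(times):
    """Return the gaps between consecutive timestamps."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


gaps = intervals(plc_packet_times)
for gap in gaps:
    print(f"{gap:.3f} s")
```

Every gap comes out a little over 10 seconds. That is at least consistent with a timer of roughly 10 seconds on the far end, though this short capture cannot say what that timer actually controls.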

Not familiar with the Linux side of things, but have you tried manually sending a Modbus read command to one of the PLCs from another PC, ideally on the same network as the Ignition server? That would show whether it is a network connection issue or Ignition itself. You could try Modbus Master or a similar tool. The PLCs are on cellular… I’m assuming through a VPN. Keep-alives are going through, so I’m assuming other commands should work as well.
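
If no Modbus tool is at hand, a manual read can also be sketched with nothing but the Python standard library. The sketch below builds a standard Modbus TCP "Read Holding Registers" (function 0x03) request and parses the reply. The function names, the unit ID, the register address and count, and the default port 502 are all assumptions for illustration; substitute whatever your devices actually use.

```python
import socket
import struct


def build_read_holding_registers(transaction_id, unit_id, start_address, count):
    """Build a Modbus TCP request frame for function 0x03."""
    # MBAP header: transaction id, protocol id (0), length of what follows, unit id.
    # PDU: function code 0x03, starting address, number of registers.
    return struct.pack(">HHHBBHH", transaction_id, 0, 6, unit_id, 0x03, start_address, count)


def parse_read_response(frame):
    """Return the register values from a function 0x03 response frame."""
    _tid, _pid, _length, _unit, function = struct.unpack(">HHHBB", frame[:8])
    if function & 0x80:
        # High bit set means the device answered with an exception code.
        raise ValueError(f"Modbus exception code {frame[8]}")
    byte_count = frame[8]
    return list(struct.unpack(f">{byte_count // 2}H", frame[9:9 + byte_count]))


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes first."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def read_holding_registers(host, port=502, unit_id=1, start_address=0, count=2, timeout=5.0):
    """Open a connection, send one read request, and return the register values."""
    request = build_read_holding_registers(1, unit_id, start_address, count)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        header = recv_exact(sock, 7)                  # MBAP header
        length = struct.unpack(">H", header[4:6])[0]  # counts unit id + PDU
        body = recv_exact(sock, length - 1)           # unit id was already read
        return parse_read_response(header + body)


# Example call (placeholder address, not a real device):
# print(read_holding_registers("192.0.2.10"))
```

A successful read from a second machine on the same network would point away from the network path; a `ConnectionError` or timeout would point toward it.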

modpoll returns the expected data

I’m seeing a whole lot of RST ACK coming in, any thoughts? Thanks

From which direction? Those should be preceded by a RST from the side that is deliberately closing the connection.

Actually, that is a combo packet where the sender is acknowledging the prior transmission from the peer, and also notifying the peer to close the connection. A deliberate connection close. More here:

Having nothing but keep-alives is suspicious. Where are your regular reads?

Did something change recently that reduced your poll rate on these PLCs?

Nothing changed on poll rates, so it seems like Ignition is closing the connections and then it’s getting an ACK from the PLC. What would cause that?

Which end is sending the RST,ACK packets? That is the end that is closing the connection.
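
To make that check concrete: RST and ACK are single bits in the TCP flags field (RST is 0x04, ACK is 0x10), so an RST,ACK packet has flags 0x14, and the source address of that packet identifies the end that closed. A small Python sketch follows; the helper names and the sample packet list are made up for illustration and are not taken from the capture.

```python
# Standard TCP flag bits.
TCP_FLAGS = {0x01: "FIN", 0x02: "SYN", 0x04: "RST", 0x08: "PSH", 0x10: "ACK", 0x20: "URG"}


def decode_tcp_flags(value):
    """Return the names of the flag bits set in a TCP flags value."""
    return [name for bit, name in TCP_FLAGS.items() if value & bit]


def closing_end(packets):
    """Return the source of the first packet with RST set, or None.

    `packets` is a list of (source, flags) tuples in capture order.
    """
    for source, flags in packets:
        if flags & 0x04:
            return source
    return None


# Hypothetical sample: a data packet (PSH,ACK) followed by an RST,ACK.
sample = [("ignition", 0x18), ("plc", 0x14)]
print(decode_tcp_flags(0x14))
print(closing_end(sample))
```

In Wireshark, the display filter `tcp.flags.reset == 1` narrows a capture to just these packets.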

Scratch that, the PLCs are sending the acknowledgement. I currently have the poll rate at 15 seconds; it has always been that way. I tried 10 seconds with the same issue.

Try at nine seconds.

Have your PLCs gotten a firmware upgrade? Or have a configurable idle timeout that some other team is responsible for (and changed recently)?