Major comms issues with Rockwell PLCs

All,

Over the weekend, we attempted to upgrade our system (3 data servers) from 7.9 to 8.1. We have a local Ignition install that connects to a large number of Rockwell PLCs (many different models) and Wonderware iHistorian. After the upgrade, the processors kept running, but we lost comms to about half of our PLCs (30-40), mostly CompactLogix L32 or L35 and ControlLogix with ENBT cards.

After a bit of troubleshooting, we decided to revert to 7.9, as we had to get the plant running and didn’t feel we had time to waste.

After reverting, the problem persisted. We found that all the processors were running just fine, but we still couldn’t communicate with many of them - either through RSLinx (RSWho), from our PanelViews, or from Ignition clients. We slowly began cycling power to entire PLC racks, one at a time - which seemed to fix comms for that particular PLC. Eventually, however, many of the PLCs we “fixed” stopped communicating again after a random amount of time. Some did not fail again.

We have set up Wireshark on about 20 different systems and have checked just about everything you can think to check. Sometimes the PLCs fix themselves after one reboot, sometimes it takes 3-4 reboots for them to establish comms.
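For anyone trying to track which PLCs drop off and when, a small script can log whether each one still accepts a TCP connection on port 44818 (the standard EtherNet/IP port). This is only a rough sketch: the function names are made up for illustration, and a successful TCP connect only shows the Ethernet interface is answering, not that CIP sessions are healthy.

```python
import socket

ENIP_PORT = 44818  # standard EtherNet/IP TCP port for explicit messaging


def can_connect(host, port=ENIP_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def sweep(hosts, port=ENIP_PORT, timeout=2.0):
    """Check each host once and return a {host: reachable} mapping."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

Calling `sweep(["192.168.1.10", "192.168.1.11"])` (placeholder addresses) on a schedule and logging the result with a timestamp would show whether the dropouts line up across PLCs or happen independently.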

We also have Wonderware iHistorian - which seems to be communicating with all the PLCs just fine.

We have tried pausing Wonderware connections on some of the controllers, pausing anti-virus on our clients, re-downloading the programs, updating firmware, and just about all of the usual things you would expect.

Lastly, we have looked at CIP connections, and most of them are at about 50% of the allowable limit. We have narrowed it down to mostly CompactLogix L32s and L35s and any ControlLogix with ENBT cards.

I am hoping someone has seen this before.

Either way, we are at a loss. We have been on the phone with Rockwell, Inductive, and some other third parties without luck. The entire plant is running just fine; we just have limited visibility into some of the processors and aren’t receiving faults from them. Our greatest risk is from not getting those faults - for example, at our boiler master controller. It’s not able to see the controllers on a few of our boilers, so it can’t control them as a group. That is a considerable risk, as we may lose steam, which would take the whole plant down.

I want to be clear that I don’t think Ignition is necessarily the source of the issue, but it seems to be the catalyst that started it.

Any thoughts or advice would be greatly appreciated.

What communications driver are you using with them? A while back I had issues with gateway CPU usage. After many hours of troubleshooting, I eventually came to the conclusion that the legacy AB drivers were causing our issues. We switched to the “Allen-Bradley Logix driver” and have not had issues since. It may be worth a shot to try the newer driver if you haven’t yet. The add devices page says it is for firmware versions 21+, but I have used it successfully with processors below that firmware version. The only “gotcha” is that you will have to edit your tag OPC paths and remove the “Global.” prefix from the tags for them to work. The easiest way to do this is to export the tags and do a search and replace for “Global.”.
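If a blind search and replace feels risky (it would also hit any tag name or description that happens to contain “Global.”), the replace can be limited to OPC item paths. The sketch below assumes the legacy paths look like `[DeviceName]Global.TagName` and only strips “Global.” when it directly follows the bracketed device name. The function name is made up for illustration, so check it against a few lines of your own export before trusting it.

```python
import re

# Assumption: legacy OPC item paths have the form "[DeviceName]Global.TagName".
# Only a "Global." that directly follows the bracketed device name is matched.
_GLOBAL_PREFIX = re.compile(r"(\[[^\]\[]+\])Global\.")


def strip_global_prefix(export_text):
    """Rewrite '[Device]Global.Tag' to '[Device]Tag'.

    Returns (new_text, number_of_replacements) so the count can be
    compared with the number of tags you expected to change.
    """
    return _GLOBAL_PREFIX.subn(r"\1", export_text)
```

To use it, read the exported tag file as text, pass the contents through `strip_global_prefix`, write the result to a new file, and import that file, keeping the original export as a backup.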


Were there any firmware updates on affected processors? The very last v20 from Rockwell breaks IA's legacy drivers. See this and many similar topics:

We’re going to give that a try real quick. What’s weird is that everything worked just fine prior to the upgrade, and even after reverting we still had the same issue.