Communication Issue with Allen Bradley PLC

rbachman · April 26, 2024, 9:59pm

We're having an issue with communication to a couple of Allen Bradley PLCs, and I'm trying to track down what might be happening. Here are a few details.

We originally used a "Subscribed" group for the tags, but we discovered that the tag value would change on the PLC without always being reflected in Ignition. The rate for the tag group was 500ms, and we saw changes on the PLC tag that would not be updated correctly for 30 seconds or more.
I changed the tag group to a Polled tag group to try to do some investigation, and I have discovered a couple of items, all of which should probably be expected.
1. When the communication issues occur, the tag will show up as Bad_Failure in the tag browser.
2. Tag history will show gaps with the tag when it's in this status.
I'm capturing traffic via Wireshark, and traffic appears to continue without issue even when we're having issues with communication.

How do I best dig into the Wireshark data to understand why the tag data is so intermittent when the network traffic itself appears to be fine? Any tips on what I should be looking for within the Wireshark packets?
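
One possible starting point, as a rough sketch only: export the timestamps of the read responses coming back from the PLC and look for pauses much longer than the tag group rate. The Python below assumes the timestamps have already been pulled out of the capture as seconds; the `find_gaps` helper, the sample values, and the 3x threshold are all made up for illustration.

```python
from typing import List, Tuple


def find_gaps(
    timestamps: List[float],
    expected_interval: float,
    factor: float = 3.0,
) -> List[Tuple[float, float, float]]:
    """Return (start, end, duration) for each pause longer than
    factor * expected_interval between consecutive timestamps."""
    threshold = expected_interval * factor
    ordered = sorted(timestamps)
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        delta = curr - prev
        if delta > threshold:
            gaps.append((prev, curr, delta))
    return gaps


# Hypothetical response times in seconds (not taken from a real capture).
sample = [0.0, 0.5, 1.0, 1.5, 16.5, 17.0, 17.5]
gaps = find_gaps(sample, expected_interval=0.5)
for start, end, duration in gaps:
    print(f"no responses from {start:.1f}s to {end:.1f}s ({duration:.1f}s)")
```

If other traffic to the same PLC keeps flowing during a gap like this, the capture as a whole can look healthy even though the tag reads themselves have paused.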

rbachman · April 26, 2024, 10:02pm

Also, I should mention that there are no errors in the logs for the connection to the PLC. If I had to guess, I would think that the PLC is sending some sort of malformed data from time to time, but I don't even know how to start looking for this.

pturmel · April 26, 2024, 10:43pm

Model? Firmware version? Driver? Driver settings?

Anything show in driver diagnostics?

rbachman · April 29, 2024, 4:42pm

The PLC is a ControlLogix 1756-L73 running firmware 32.011
Ignition communicates with the PLC via a 1756-EN2T card running firmware 10.007.
We are running 8.1.38 and using the standard Allen-Bradley Logix Driver.
We are not seeing any errors in the Ignition logs, and the load on the PLC (as reported by Ignition) consistently stays very low.

We're starting to think there is an issue with the EN2T card. The web UI is not always available, and one of our controls engineers recently saw the CPU pegged at 100% even though the CPU usage on the card is normally very low. The team has added a secondary card, and we're monitoring one of the key tags through that card. Once we see an event happen again, we might be able to confirm if there is an issue with the EN2T card.

bschroeder · April 29, 2024, 4:51pm

Are you doing any other comms through that card other than PLC tags? I.e., do you have messages, produced/consumed tags, or I/O connections?

MMaynard · April 29, 2024, 5:29pm

There is a known issue with that firmware and comms after doing certain PLC side edits live.
The solution is to upgrade the firmware on the PLC, or do a full download to the PLC of the program.

rbachman · April 29, 2024, 6:02pm

I just got off a call with IA, and I think we figured out what is happening, although I may need some more help figuring out how to fix it. When Ignition initiates a re-browse of the PLC, the PLC appears to withhold sending tag changes on existing tags until the PLC has completed sending all of the data for the re-browse.

This would fit with what we're seeing. Do you know what version of firmware the PLC should be upgraded to in order to avoid this issue?

rbachman · April 29, 2024, 6:08pm

Most of the communication should be using other network cards, so I don't think that should be causing this impact.

Kevin.Herron · April 29, 2024, 6:50pm

Let me clarify a couple things.

1. The PLC doesn't "withhold sending tag changes". Communication between Ignition and the PLC is request/response. Ignition polls the PLC.
2. That polling stops while a rebrowse happens is expected.

It's not safe to continue polling while a rebrowse is happening because the instance IDs that identify all your tags and UDT members may be about to change.
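
To illustrate why this shows up as stale values rather than errors, here is a toy model in Python. It is not the driver's actual implementation; the function names, the 500 ms rate, and the 15 second rebrowse window are assumptions chosen to match the numbers mentioned in this thread.

```python
def simulate_polls(duration_ms, poll_rate_ms, rebrowse_windows):
    """Return the times (ms) at which polls are sent, assuming polling is
    suspended for the whole of every (start, end) rebrowse window."""
    polls = []
    for t in range(0, duration_ms, poll_rate_ms):
        in_rebrowse = any(start <= t < end for start, end in rebrowse_windows)
        if not in_rebrowse:
            polls.append(t)
    return polls


def longest_stale_ms(polls):
    """Longest stretch between two consecutive polls, in milliseconds."""
    return max((b - a for a, b in zip(polls, polls[1:])), default=0)


# Hypothetical scenario: 500 ms polling, one rebrowse lasting 15 s.
polls = simulate_polls(20_000, 500, [(2_000, 17_000)])
print(longest_stale_ms(polls))  # 15500
```

In this model a tag that changes just after the last poll before the rebrowse is not seen again until the first poll after it, so the value can look frozen for slightly longer than the rebrowse itself.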

michael.flagler · April 29, 2024, 6:52pm

I just read through the release notes and didn't see anything like this mentioned. There are some denial of service issues that it says are fixed in v32.016, but I don't even see that version available to download in ControlFlash Plus. It also doesn't give any details on the issue, and I wonder if they're not disclosing it since it's a security vulnerability. It does look like newer versions (v33, v34, etc.) fix some of these issues as well.

Kevin.Herron · April 29, 2024, 6:54pm

If the actual problem here is just that tags aren't polled while a rebrowse happens it's not a firmware bug.

The only firmware bug I remember that affected Ignition's driver was one where some kinds of online edits would cause tags to disappear from the browse and a browse error about template instance IDs not being found to happen. Some kind of bookkeeping error in the PLC after the online edits. I don't know officially what versions it occurred in or was fixed in.

michael.flagler · April 29, 2024, 6:57pm

If they're having an issue on a device, would it be helpful for them to disable automatic re-browses? This would stop rebrowsing, so any updates to tags that are needed in Ignition would require a manual rebrowse (possibly by editing and saving the connection), but normally it would keep comms flowing without interruption?

Kevin.Herron · April 29, 2024, 6:58pm

No, disabling automatic rebrowse when you know there will be changes to the PLC program is dangerous. I'm not sure why the driver even has that option. Probably something sales engineering asked for and nobody said "no" even though they should have.

michael.flagler · April 29, 2024, 7:00pm

Ok, I didn't know if logic changes triggered a rebrowse even if tags in the PLC weren't changing. (I've never sat and watched what actually happens when making edits, and I've never heard any complaints either about data not updating when I'm making changes to a program).

Kevin.Herron · April 29, 2024, 7:01pm

I think in most cases a rebrowse is quick enough that nobody notices.

rbachman · April 29, 2024, 7:17pm

What would be possible causes of very slow re-browses? I just ran a test and it took 15 seconds for one of our connections that is monitoring a single tag.

Kevin.Herron · April 29, 2024, 7:20pm

Large quantity of tags defined in the PLC or a PLC that is too busy or doesn't have enough overhead for servicing comms. Or browsing over VPN or a remote link where everything is just slower.

michael.flagler · April 29, 2024, 7:58pm

Also, depending on PLC configuration, the L7x series still shared comms processor timeslices with the continuous task. If your system overhead timeslice is too low, it can cause slow comms performance. Also, if you're using lots of periodic tasks and overloading the processor with those, they take a higher precedence/priority than Class 3 comms which is what HMI applications use and comms performance will suffer no matter what you have your overhead timeslice set to.
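
As a rough worked example of the overhead timeslice, the Python sketch below uses a simplified model that is my understanding of how Rockwell documents it for controllers with a continuous task: at 50% or less, comms get about 1 ms at a time and the continuous task runs for the rest of the cycle; above 50%, the continuous task gets about 1 ms and comms get the rest. Treat the model and the `timeslice_split` helper as assumptions. It also ignores periodic tasks, which would preempt both.

```python
def timeslice_split(percent):
    """Return (continuous_ms, comms_ms) per cycle for a given system
    overhead timeslice percentage, under the simplified model above."""
    if not 0 < percent < 100:
        raise ValueError("percent must be between 0 and 100 (exclusive)")
    if percent <= 50:
        # Comms get ~1 ms; the continuous task runs for the remainder.
        return ((100 - percent) / percent, 1.0)
    # Continuous task gets ~1 ms; comms run for the remainder.
    return (1.0, percent / (100 - percent))


for pct in (10, 20, 50, 80):
    continuous_ms, comms_ms = timeslice_split(pct)
    print(f"{pct}%: continuous {continuous_ms:.0f} ms, comms {comms_ms:.0f} ms")
```

Under this model, going from 10% to 20% cuts the wait between comms slices from about 9 ms to about 4 ms, at the cost of a slower continuous task scan.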

We have customers upgrading from the L7x series to the L8x series just because of the increased comms performance, so if it's a big issue, it may be something to consider.

pturmel · April 29, 2024, 10:05pm

Heh, not for some crazy people trying to do 100ms sampling on a few thousand of their tens of thousands of tags.

FWIW, my driver does rebrowses in parallel with live traffic, as tag IDs often don't significantly change, unless the PLC is completely downloaded. (Editing tags on-line is limited to adding new tags and deleting unused tags.) And then everything stops anyways.

rbachman · April 29, 2024, 10:18pm

This particular connection is direct on the same LAN/VLAN, so I don't think this is the issue.

We've already mentioned this one time to our controls team, and I think we need to revisit this. I doubt that we'll really want to upgrade to the L8x series, so maybe we can optimize the comms with the L7x for now.

Thank you all for your help and input on this.