System Overhead Time Slice Missing? Debugging High Load Factor

Screenshot is attached below, but essentially we are trying to reduce the load factor on our PLC. We have an Allen-Bradley 1756-L83ES/B PLC connected to an Ignition 8.1.4 Gateway, and we are currently seeing a 350% load factor on one of our scan classes. The response time isn’t horrible, averaging around 25ms, but we’re still looking to reduce or at least optimize it a bit further.

In the process of trying to debug this, I noticed that the ‘System Overhead Time Slice’ setting in Logix is no longer there. Is there a new location for a similar setting? Or is this setting just irrelevant now on newer devices?

Any other tips on lowering the load factor? We are going to try 8.1.7 when we have a break on the line to see if the new load factor calculation shows acceptable data, but I have a feeling it’s not going to bring it down to a reasonable number.

With the L8X PLCs there isn’t a system overhead time slice anymore, as comms are now handled by a separate CPU on the PLC itself. Before, comms had to share resources with the same CPU that ran the PLC logic, but with the L8s that isn’t the case.

Are you using AOIs or UDTs? If AOIs, then you will probably want to start down the path of breaking the data that you want in the HMI into UDTs. See this long discussion: https://forum.inductiveautomation.com/t/device-load-factor

We are using quite a few UDTs on the PLC, with corresponding UDTs in Ignition. We’ve even set tags that we don’t need to refresh often to a separate 10s refresh rate to try and reduce the number of tags we have polling.

Good to know about the L8X. I’m still a bit of a noob on the PLC side of things, so I am not as up to speed on these changes, and the AB documentation I checked first didn’t call that out.

UDTs are fine; just make sure that all members of the UDT and its member UDTs have External Access set to Read Only or Read/Write (talking about in the PLC, not Ignition).

AOIs are problematic and not read as efficiently as UDTs, as mentioned in other threads here.

You might also upgrade to 8.1.7 to get the new diagnostics implementation. Nothing has changed about the polling, but you’ll probably see different load factor (overload factor, now) numbers, for better or worse :slight_smile:

I would double check on the UDT vs AOI just to make sure.

Don’t split the UDTs across multiple tag groups either. If you have BOOLs that are in two different tag groups, then Ignition will possibly subscribe to the packed ints from both. We saw that in our testing. I would try dropping to one single tag group and verifying that everything is being done via UDTs and not AOIs.

I’ve got a mirrored gateway set up to swap out on the line once the customer gives me some downtime, so I can see what the numbers look like. They don’t currently want to update beyond the version they are on, so it will be temporary for now.

We did just notice on the AB webpage that we are maxing out our Class 3 MSGs. Any suggestions on reducing this? We are wondering if we are going to have to get a dedicated network card for the Ignition connection if we are currently maxing the original out.

Does the PLC use a bunch of MSG instructions? Are there other SCADA systems connected to the PLC?

This is the only SCADA system connected to the line. I’m not sure what else would use these Class 3 MSGs.

If the PLC is using MSG instructions to communicate with other devices, i.e. other PLCs, drives, switches, etc., then that uses Class 3, IIRC. If there isn’t anything like that in the PLC, then you are probably looking at something that Ignition is trying to read that it isn’t able to optimize, and that is typically AOIs or, as @Kevin.Herron mentioned, the external access setting. Also, if there are AOIs in the UDTs that you are reading, that will cause issues.

Also, to properly optimize the UDT on the PLC side, the BOOLs need to be consecutive. The UDT implementation on the PLC side will pack the BOOLs into a single SINT (IIRC), and then Ignition will read that SINT rather than the individual BOOLs. If the BOOLs are scattered, though, then there is a TON of extra data being read. And if you have two BOOLs that are being read by different tag groups, but those BOOLs are in the same SINT in the UDT, then it’s possible that Ignition might subscribe twice to the same SINT.
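To illustrate why consecutive BOOLs matter, here’s a minimal Python sketch (not Ignition or Logix code). It assumes BOOLs pack 8 per SINT-sized (1-byte) container and that the driver reads each touched container whole; the counts are illustrative, not measured:

```python
# Sketch: count how many 1-byte "SINT" containers a set of BOOL bit
# positions touches, assuming 8 packed BOOLs per container.
def containers_touched(bool_positions, bits_per_container=8):
    """Return the number of distinct containers the BOOL positions span."""
    return len({p // bits_per_container for p in bool_positions})

# 16 consecutive BOOLs pack into just 2 containers...
consecutive = list(range(16))
# ...while 16 BOOLs scattered one-per-container force 16 reads.
scattered = [i * 8 for i in range(16)]

print(containers_touched(consecutive))  # 2
print(containers_touched(scattered))    # 16
```

Same BOOL count, 8x the data on the wire when they’re scattered.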

Poll slower or poll fewer tags. Not really much else to it. Your screenshot showed 727 requests in the 1000ms group; not sure if there are others, but that’s already a lot.

(better packing of UDTs for use by the HMI, using UDTs instead of AOIs, using arrays if possible, etc… are all ways of polling less tags in addition to just using less tags in Ignition)
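As a rough back-of-envelope (my own estimate, not the driver’s actual formula): if each request costs about the mean response time and up to `concurrency` requests are in flight at once, the time to finish one poll cycle is roughly `requests * mean_response / concurrency`, and load factor compares that to the sampling interval:

```python
# Hedged estimate of load factor; `concurrency` (requests in flight) is an
# assumed knob, not a figure from this thread.
def est_load_factor(requests, mean_response_ms, interval_ms, concurrency=2):
    cycle_ms = requests * mean_response_ms / concurrency
    return 100.0 * cycle_ms / interval_ms

# Thread numbers: 727 requests, ~25 ms mean response, 1000 ms tag group.
# An effective concurrency around 5 would roughly reproduce the observed ~350%.
print(round(est_load_factor(727, 25, 1000, concurrency=5)))  # 364
```

Either factor on top (fewer requests, or faster responses) brings the number down, which is why packing and slower polling both help.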

We aren’t referencing AOIs directly; the tags we are reading from the PLC are updated within AOIs, but are UDTs within the PLC. Does that make a difference, or is that still going to be a problem?

Good to know about grouping booleans; we aren’t doing that consistently now, so we have some room to improve there if we can get those UDTs updated on the PLC side. External Access for these tags is set to Read Only or Read/Write depending on the type of tag, so that should be good.

Would adding a dedicated Ethernet card for Ignition communication help with this issue? Or is the bottleneck going to be more on the processor itself?

Hmm…

With the UDTs, are you picking and choosing the tags in the UDT, or are you reading the entire UDT?
Do these UDTs have any AOIs in them?
It sounds like the way things are structured in the PLC isn’t able to be optimized properly. That points back to the BOOLs not being packed properly or to AOI tags being read. I’d take a detailed look at how your UDTs are structured and make sure they are arranged so that the BOOLs can all be packed.

We are reading the whole UDT, but that is because we built these UDTs to contain the tags we needed and nothing else.

We are looking to review that with our Controls team to see if we can update the UDTs to be a bit more optimized. Testing the calculation in 8.1.7 now as well while the team takes lunch.

Hmm, that’s ideal. Sounds like you are just reading a lot of tags, at a rate faster than the PLC can deliver.

I would say then that looking at how you have the BOOLS organized would be a good place to start.

Also, like I said, when we did our testing, having an extra tag group that read BOOLs that don’t change much (i.e. config stuff) actually hurt us, as it caused Ignition to subscribe to the same SINT twice. So I would start off by packing those BOOLs, then dropping that second tag group and seeing where you get.
You might also be able to play with the max concurrent requests and CIP size for the device and see if you can improve anything.


Also use the larger connection size: 4000 for L8x instead of the default 500.
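To see why the connection size matters, here’s a hedged sketch: more tag data fits in each request, so fewer requests are needed per cycle. The total byte count and per-request overhead below are made-up illustrative figures, not measurements from this system:

```python
import math

# Hypothetical sizing: estimate requests per poll cycle for a given CIP
# connection size, assuming a fixed per-request protocol overhead (assumed
# 50 bytes here) and that reads pack the remaining payload fully.
def requests_needed(total_bytes, connection_size, overhead=50):
    payload = connection_size - overhead
    return math.ceil(total_bytes / payload)

print(requests_needed(500_000, 500))   # 1112 requests at the default 500
print(requests_needed(500_000, 4000))  # 127 requests at 4000
```

Fewer, larger requests usually means less total overhead, though each request takes longer for the PLC to service.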

I’ll keep playing with those settings and see what happens.

The number in 8.1.7 looked very similar to what we are seeing in 8.1.4, but I do like seeing those queue time statistics; we are seeing about 3s of delay there, it seems. I had to revert since the customer hasn’t approved the update yet, but that at least confirms it isn’t just the calculation making it look worse than it is.

I did just notice something that seems to either confirm what you are saying about subscribing twice, or I have another issue somewhere. The device screen shows I’m subscribed to 52k tags, but the tag provider screen says 102k… So it sounds like I’ve got some configurations to update.

We were at 2000, but I’ve updated to 4000 to see if that helps as well.


So updating the CIP size to 4000 seems to have had some interesting results… and I’m not sure if this means I’m in a better spot or a worse one.

The request count has now dropped to 368 instead of 728, but our throughput dropped down to ~77 and our mean response time is 52ms, with an even higher load factor of 460% now.

Which numbers should I be focusing on to help reduce the response time? We did seem to reduce the number of requests, but I would think a bigger number is better for throughput and a lower number is better for response time.
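One hedged way to read those numbers, treating the cycle as if requests were fully serialized (the real driver pipelines, so take this as a rough comparison only):

```python
# Back-of-envelope from the figures in this thread: request count halved,
# but per-request cost roughly doubled, so the total service time per poll
# cycle barely moved. That suggests the PLC's comms capacity, not the
# request count, is the limit.
before_ms = 728 * 25   # 8.1.4: 728 requests x ~25 ms mean response
after_ms = 368 * 52    # after raising the CIP connection size to 4000
print(before_ms, after_ms)  # 18200 19136
```

If the total work per cycle is pinned near the PLC’s ceiling either way, the levers left are the ones mentioned upthread: fewer subscribed tags (packing, single tag group) or a slower poll rate.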