Ignition Freezing When Accessing Gateway / OPC UA Device Config – All Vision Clients Fault

Hello all,

I’m running into a strange and repeatable issue on one of our Ignition systems, and I’m not sure where to start looking.

Any time I open the Gateway and go to Config → OPC UA → Devices to add or modify a device, the entire Ignition system locks up. All Vision clients freeze, all templates go into fault, and the only way to recover is to stop the Ignition service and restart it. For instance, I modified the port number of an EFM ABB TotalFlow device and clicked Save; the save never completed and the page just froze.

This also happened yesterday when I first logged into the Gateway and downloaded a backup (which I routinely do on all our installations). As soon as the backup process started, all screens went red and faulted.

This is a new build, and we’re in the final stages of commissioning.


Recent Changes (Past Two Weeks)

  • Created a view-only project

    • Exported the live project, imported it under a new name, removed editing permissions, etc.
    • This is to allow Corporate to view the operation without any ability to control equipment.
  • Added the EFM ABB TotalFlow driver

    • Preparing to bring new field flow meters online.

Troubleshooting Performed

I validated this Gateway against another nearly identical system we built in September (same version, same tag structure, MQTT setup, EFM config, etc.). Everything matches.

I’ve also tried:

  • Restarting the server

  • Stopping/restarting the Ignition service using the included batch files

  • Uninstalling and reinstalling the EFM ABB TotalFlow module

  • Removing the view-only project (no change), then adding it back

  • Confirmed the server is not under load

    • Current project has ~38,500 tags

    • CPU, RAM, and disk I/O are all very low

Nothing I’ve done has prevented the freeze when accessing OPC UA device configuration or performing a Gateway backup.


Looking for Suggestions

At this point I’m not sure if I’m chasing a module issue, a Gateway memory/thread issue, or something corrupted in the project itself. Does anyone have insight on what could cause the Gateway to hang simply by accessing OPC UA device settings or creating a backup?

Any guidance or places to look would be appreciated.

Version: 8.1.48 (b2025042910)
EFM ABB Totalflow Driver: 4.0.30 (b2025062418)
MQTT Transmission: 4.0.30 (b2025062418)
Local MySQL Historian

Thank you for any assistance, recommendations and suggestions.

Jason

Look at the wrapper.log file.
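
When a driver is stuck retrying, the log tends to fill with the same message over and over. Here is a minimal sketch for spotting that, assuming you point it at your own copy of wrapper.log (the path in the usage note is hypothetical and depends on your install). It replaces every run of digits with "#" so lines that differ only in timestamps or counters collapse together; `top_repeated` is just an illustrative helper, not anything shipped with Ignition.

```python
import re
from collections import Counter


def top_repeated(lines, n=5):
    """Return the n most frequent log messages, ignoring digits.

    Replacing digit runs with '#' makes lines that differ only by
    timestamp, port, or counter values count as the same message.
    """
    counts = Counter(
        re.sub(r"\d+", "#", line.strip())
        for line in lines
        if line.strip()
    )
    return counts.most_common(n)
```

Usage would be something like `top_repeated(open("wrapper.log", errors="replace"))`; a message that repeats thousands of times right before the freeze is a good place to start reading.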

IIRC, 8.1.48 may have introduced some additional synchronization in the device manager. I've seen this exacerbate what previously would have been nearly-inconsequential driver misbehavior into this deadlocked state.

If you can get a thread dump or three, a minute or so apart, while in this state, that may shed some light on the issue. I think the Web UI may work in this state, as long as it isn't hitting anything that depends on the device manager. Not sure if you would be able to get to the diagnostics page to do a thread dump or not.

If you'd rather not upload them to the forum, let me know, and I can get you a dropbox link.
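
If you want a first pass over the dumps yourself, the idea is to look for threads that sit in the same state at the same top stack frame in every dump. Below is a rough sketch of that comparison. It assumes plain-text dumps in the usual jstack-style layout (a quoted thread name, a `java.lang.Thread.State:` line, then `at ...` frames); a dump exported from the Gateway may be formatted differently, so treat the parsing as an assumption. `parse_dump` and `stuck_threads` are hypothetical helper names.

```python
import re

HEADER = re.compile(r'^"(?P<name>[^"]+)"')
STATE = re.compile(r"java\.lang\.Thread\.State: (?P<state>\w+)")
FRAME = re.compile(r"^\s+at (?P<frame>.+?)\s*$")


def parse_dump(text):
    """Map thread name -> {'state': ..., 'top': first stack frame}."""
    threads = {}
    name = None
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            name = m.group("name")
            threads[name] = {"state": None, "top": None}
            continue
        if name is None:
            continue  # preamble before the first thread header
        m = STATE.search(line)
        if m:
            threads[name]["state"] = m.group("state")
            continue
        m = FRAME.match(line)
        if m and threads[name]["top"] is None:
            threads[name]["top"] = m.group("frame")
    return threads


def stuck_threads(dumps, states=("BLOCKED",)):
    """Threads in one of `states` with an identical top frame in every dump."""
    parsed = [parse_dump(d) for d in dumps]
    if not parsed:
        return {}
    stuck = {}
    for name, info in parsed[0].items():
        if info["state"] not in states:
            continue
        if all(p.get(name) == info for p in parsed[1:]):
            stuck[name] = info["top"]
    return stuck
```

Passing `states=("BLOCKED", "RUNNABLE")` also catches a thread spinning in a loop at the same frame, though a thread that is legitimately busy will show up too, so read the full stack before drawing conclusions.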

I had this happen to me once before; specifically, the Totalflow module was locking up the system. It got stuck in an infinite loop in the AarPoller thread, which pegged one of the CPUs at max, and the logs just kept saying timeout and retrying endlessly.

The root cause in my case was that one of the ABB RTUs had an overflow of alarms; the techs had previously entered a wrong parameter that sent the volume total to something silly like 10^35. The way the RTU works, when the totalizer overflows it doesn’t reset to zero; it simply subtracts a set value, like a million, from the totalizer, generating an alarm in the alarm table every time. Eventually the techs fixed the issue and reset the volume totalizer, but when the Totalflow driver connected and tried to poll the alarms, it crashed the driver and caused an infinite loop.
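
To give a feel for the scale, here is some back-of-the-envelope arithmetic using the rough numbers above (a total around 10^35, a subtraction of a million per overflow). The rollover limit is my assumption purely for illustration, and `alarms_to_drain` is a made-up helper; it only counts how many subtract-and-alarm steps it would take to get the total back under the limit.

```python
def alarms_to_drain(total, limit, step):
    """Count subtract-and-alarm steps until total drops below limit.

    Models: while total >= limit, subtract step and raise one alarm.
    Uses integer arithmetic so huge totals stay exact.
    """
    if total < limit:
        return 0
    return (total - limit) // step + 1


# Illustrative values only: 10**35 and 10**6 come from the rough
# figures in this post; the limit of 10**6 is an assumption.
steps = alarms_to_drain(10**35, 10**6, 10**6)
```

With those values `steps` comes out to 10**29. Any real alarm table is finite, so the point is only that it would be completely full of overflow alarms long before the total was corrected.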

No idea if it’s similar, but I’d suggest disabling reading of all the alarms/events/records first and seeing if it stabilizes, then enabling them one at a time to see when the problem starts.

@Cody_Morgan in my situation I could get to anywhere in the gateway config page, and all the diagnostics worked. It was only specifically the OPC UA devices page that would not load. But again, not sure if it’s the same issue as OP.

It sounds like it could be similar.

"It was only specifically the OPC UA devices page that would not load."

That's good to know. My head has been in the 8.3 space lately, and I want to say there may be other pages that end up requesting info through the device manager. So, if it's locked up, those pages end up in a bad state. I might be misremembering some details, though. My focus isn't usually the Web UI.

With the way the 8.1 backend worked (SSR via Wicket) combined with the single-threaded IDB, it's entirely possible that one subsystem causing issues with the IDB would affect other pages, though usually the hardcoded 30s timeout on SQLite acts as enough of a band-aid that you can still go in and fix things.
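
For anyone unfamiliar with how a SQLite lock timeout behaves, here is a generic illustration using Python's built-in sqlite3 module. This is not Ignition's code, and the short timeout simply stands in for the 30s mentioned above: one connection holds the write lock, and a second connection waits up to its timeout before giving up with an error instead of hanging forever. `write_while_locked` is a name invented for this sketch.

```python
import os
import sqlite3
import tempfile


def write_while_locked(timeout_s):
    """Attempt a write while another connection holds an exclusive lock.

    Returns "written" on success, or the error text if the waiter
    gives up after timeout_s seconds.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # isolation_level=None: we manage transactions explicitly.
        holder = sqlite3.connect(path, isolation_level=None)
        waiter = sqlite3.connect(path, timeout=timeout_s, isolation_level=None)
        try:
            holder.execute("CREATE TABLE t (x INTEGER)")
            holder.execute("BEGIN EXCLUSIVE")  # take and keep the lock
            try:
                waiter.execute("INSERT INTO t VALUES (1)")
                return "written"
            except sqlite3.OperationalError as exc:
                return str(exc)
        finally:
            holder.close()
            waiter.close()
```

Calling `write_while_locked(0.2)` returns the lock error after roughly 0.2 seconds. The second caller is delayed, not deadlocked, which is why a timeout like this can leave the rest of a UI slow but usable while one subsystem misbehaves.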

Thread dumps would be the definitive way to diagnose, for sure.
