CPU and RAM problems: Clock events and Garbage Collector configuration

Hello Ignition community,

I am experiencing problems related to high CPU and RAM usage on my SCADA system. I have observed SCADA clock events that seem to be linked to memory release by the Garbage Collector. Despite having the Garbage Collector configured in the ignition.conf file, I am still facing these issues.

Attached you will find two images illustrating the problems: the first one shows the system performance indicators, highlighting a CPU usage of 77% and a memory usage exceeding 50,000 MB. The second image is an excerpt from my ignition.conf file, where I have configured parameters such as initial and maximum heap size, and also specified the use of the G1 Garbage Collector with a maximum pause time of 100 milliseconds.

My question to the community is this: should I consider removing, modifying, or adding any parameters in the ignition.conf configuration to improve memory management and overall system performance? Are there any best practices or settings I should consider to alleviate these performance issues?

I would greatly appreciate your advice and suggestions based on your experience.

Not really enough information. You might have a memory leak. But my first impression is that you've given Ignition too much memory, so when it is finally pressed into doing a complete garbage collection, the resulting GC pause is long and unavoidable.

You should add GC logging to your ignition.conf.
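As a rough sketch of what that could look like, assuming the gateway runs on a Java 9+ runtime (where the unified -Xlog option is available): the index 5 and the log path below are placeholders, not values from the posted config. Use the next unused wrapper.java.additional.N in your own file, and a path that exists relative to the wrapper's working directory.

```
# Hypothetical index and path: adjust both to fit your ignition.conf
wrapper.java.additional.5=-Xlog:gc*:file=logs/gc.log:time,uptime,level,tags:filecount=5,filesize=10m
```

The resulting log shows each collection with its pause time, which makes it possible to tell whether the clock events line up with full collections.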

What specific version of Ignition is this?
Is this running on a Virtual Machine?
What else is running on the machine?

A couple other notes:

  • Your ignition.conf initmemory and maxmemory should be the same on a production system. Have Ignition claim its complete allowance immediately so the OS won't have to clear buffers or other time-consuming cleanup later when Ignition asks for more RAM.

  • Your persistent high CPU load suggests you may not have enough CPU cores for your workload. Ignition is highly multi-threaded and very latency sensitive, and some mostly-idle cores are required for best performance. (And for fast GC.) That chart might be OK if you have ~32 cores. Otherwise, consider adding hardware, or splitting to a multi-gateway architecture.
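For the first note, a quick way to check a config file is a small script like the one below. This is only a sketch: it assumes the heap sizes are set through the wrapper.java.initmemory and wrapper.java.maxmemory properties (in MB), and the sample text and function names are made up for illustration.

```python
import re

def heap_settings(conf_text):
    """Return (initmemory, maxmemory) in MB from wrapper-style config text."""
    values = {}
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines
        if not line or line.startswith("#"):
            continue
        m = re.match(r"wrapper\.java\.(initmemory|maxmemory)\s*=\s*(\d+)", line)
        if m:
            values[m.group(1)] = int(m.group(2))
    return values.get("initmemory"), values.get("maxmemory")

def heap_is_pinned(conf_text):
    """True when both values are present and equal."""
    init, mx = heap_settings(conf_text)
    return init is not None and init == mx

# Hypothetical excerpt, not the poster's actual file
sample = """
# Initial and maximum Java heap size (in MB)
wrapper.java.initmemory=4096
wrapper.java.maxmemory=8192
"""

print(heap_settings(sample), heap_is_pinned(sample))
```

With the sample above, the check reports that the heap is not pinned, which is the situation the note recommends avoiding in production.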

The version of Ignition being used is 8.1.22.

Regarding the operating environment, Ignition is running on both Windows and Linux virtual machines.

As for the other processes, could you please specify what details you need? Are you referring to other Ignition modules, different applications, or system services running concurrently on the machine?

We continue to investigate the problem and believe it could be due to a Memory Leak caused by a script that we have already disabled. After restarting Ignition, only one user is running tests on Vision and, so far, we have observed an improvement in system behavior. This leads us to believe that the problem could be related to memory management when multiple users are connected simultaneously.

There is a possibility that we are facing a specific problem with Vision users. We made some changes after 13:30 and are keeping an eye on how these affect the overall system performance.

Vision is generally a light load on a gateway, for most tasks, since all of its user-interface bindings and scripts run on the client machine, not in the gateway itself. Vision gateway load is almost entirely due to tag binding delivery and running DB queries on behalf of the Vision clients. (Charting workloads, especially.)

Which means you should be carefully investigating all of your gateway scripting. Look especially at (mis-)uses of system.util.invokeAsynchronous and system.util.getGlobals as likely sources of memory leaks.
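To illustrate the kind of leak meant here, the sketch below uses a plain Python dict as a stand-in for the dictionary returned by system.util.getGlobals(). It is not Ignition code, and the key names and the limit of 1000 are arbitrary. The point is that anything appended to a long-lived dictionary on every script execution stays reachable forever unless the script caps it.

```python
from collections import deque

# Stand-in for the long-lived dict returned by system.util.getGlobals()
fake_globals = {}

def record_leaky(store, sample):
    # Grows without bound: every call keeps one more object alive
    store.setdefault("history", []).append(sample)

def record_bounded(store, sample, limit=1000):
    # A deque with maxlen discards the oldest entry once the limit is reached
    buf = store.get("history_bounded")
    if buf is None:
        buf = store["history_bounded"] = deque(maxlen=limit)
    buf.append(sample)

# Simulate 5000 executions of a timer or tag-change script
for i in range(5000):
    record_leaky(fake_globals, i)
    record_bounded(fake_globals, i)

print(len(fake_globals["history"]), len(fake_globals["history_bounded"]))
```

The unbounded list keeps all 5000 samples while the bounded buffer holds only the most recent 1000. Threads started in the background that never exit have the same effect, since everything they reference stays alive with them.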

IA generally recommends against the use of virtual machines in production when IA native drivers are involved, as hypervisors are notorious about stealing Ignition's idle CPU time to burn on other vCPUs. You have to configure your hypervisor to not do this. (No CPU overcommit on the entire physical hypervisor running your Ignition VMs, at the very least.)
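On the Linux VMs, one way to get a hint of this is the steal counter on the aggregate cpu line of /proc/stat. The sketch below computes the steal share between two readings of that line. Treat it as an assumption-laden illustration: not every hypervisor reports steal time to the guest, so a zero here does not prove there is no overcommit, and the sample lines are invented.

```python
def cpu_steal_fraction(before, after):
    """Share of CPU time stolen between two readings of the 'cpu' line of /proc/stat."""
    def fields(line):
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        return [int(x) for x in parts[1:]]

    b, a = fields(before), fields(after)
    deltas = [y - x for x, y in zip(b, a)]
    # First eight counters: user nice system idle iowait irq softirq steal.
    # guest and guest_nice are already counted inside user and nice.
    total = sum(deltas[:8])
    return float(deltas[7]) / total if total else 0.0

# Invented readings taken some seconds apart
before = "cpu  1000 0 500 8000 100 0 50 50 0 0"
after = "cpu  1600 0 700 8900 120 0 80 150 0 0"

print(round(cpu_steal_fraction(before, after), 4))
```

A steal share that stays well above zero while the gateway is busy would be consistent with the hypervisor handing Ignition's CPU time to other guests.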

Efficient use of resources in an Ignition environment is crucial, especially for systems running in virtual machines. It is correct that most of the user-interface work in Vision is handled by the client and should not impose a heavy load on the gateway. However, database queries and tag binding delivery, especially for intensive charting, can significantly increase the load.

We understand the importance of reviewing our scripts, particularly those that use system.tag.writeBlocking(). At this time, we are conducting numerous tests and it would be difficult to make significant changes to the scripts without affecting these tests.

We will consider a detailed review of our scripts and possible optimizations at a more appropriate time, when it will not interfere with current operations and testing.

Thanks again for your help, Phil.

Yes, that does load down the database and, if you are using the tag historian, adds substantial load to the gateway. If you are using wide tables with hundreds of thousands of rows or more, there is a 3rd-party accelerator available:

Best used with EasyNoteChart...

{ /shameless plug }
