High CPU Usage - Diagnosing

Consider including the -Xloggc:… and -XX:+PrintGCDetails options.

Got it, this is what it looks like now:

wrapper.java.additional.1=-XX:+UseG1GC
wrapper.java.additional.2=-XX:MaxGCPauseMillis=100
wrapper.java.additional.3=-Ddata.dir=data
wrapper.java.additional.4=-Dorg.apache.catalina.loader.WebappClassLoader.ENABLE_CLEAR_REFERENCES=false
wrapper.java.additional.3=-Xloggc:…/logs/javagc-%WRAPPER_TIME_YYYYMMDDHHIISS%.log
wrapper.java.additional.5=-XX:+PrintGCDetails
wrapper.java.additional.6=-XX:+PrintGCTimeStamps
wrapper.java.additional.7=-XX:+PrintGCDateStamps

The “additional” numbers must be unique. After cutting and pasting the lines, renumber the ones that aren’t commented out.
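
For example, since "additional.3" appears twice above, the -Xloggc line and the ones after it would be renumbered so every index is unique:

wrapper.java.additional.5=-Xloggc:…/logs/javagc-%WRAPPER_TIME_YYYYMMDDHHIISS%.log
wrapper.java.additional.6=-XX:+PrintGCDetails
wrapper.java.additional.7=-XX:+PrintGCTimeStamps
wrapper.java.additional.8=-XX:+PrintGCDateStamps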

Boy am I dense today! Thanks, corrected it.
A quick update: our ESX engineers moved 6 (out of 9 total) other VMs that were on our same host to a different one, and this happened:

We will have to define some parameters for them to isolate our VMs or guarantee resources. Is there a set of best practices for running Ignition in a virtualized environment?

Thanks!
Oscar.

An update on this and seeking some more help:

We added two vCPUs to our VM yesterday and restarted the Gateway after updating ignition.conf to use the new Garbage Collector exclusively.

Here’s what our ignition.conf looks like:
wrapper.java.additional.1=-XX:+UseG1GC
wrapper.java.additional.2=-XX:MaxGCPauseMillis=100
wrapper.java.additional.3=-Ddata.dir=data
wrapper.java.additional.4=-Dorg.apache.catalina.loader.WebappClassLoader.ENABLE_CLEAR_REFERENCES=false
wrapper.java.additional.5=-Xloggc:…/logs/javagc-%WRAPPER_TIME_YYYYMMDDHHIISS%.log
wrapper.java.additional.6=-XX:+PrintGCDetails
wrapper.java.additional.7=-XX:+PrintGCTimeStamps
wrapper.java.additional.8=-XX:+PrintGCDateStamps

After adding the 2 vCPUs (6 vCPUs total), our CPU usage dropped considerably:

However, our memory profile is looking a bit odd, and we had 2 crashes overnight (we had to restart the gateway).
Our Memory profile before the GC changes:

and after:

We did not get a chance to grab a Thread Dump, but the logs indicate issues connecting to devices (OPCUA) and memory full events for alarms.
It also looks like this line:
wrapper.java.additional.5=-Xloggc:…/logs/javagc-%WRAPPER_TIME_YYYYMMDDHHIISS%.log
is not working, as I don't see any Java GC logs in the logs folder. Could this be causing some overhead?

Thanks,

Oscar.

So using G1GC with a pause target really smoothed out your memory profile. It's not clear to me why you have no GC log file. There's probably an error in the wrapper log at startup indicating the problem with that parameter. Consider using a full path to the log file.
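For example (a sketch only; the install path here is hypothetical, so substitute wherever Ignition actually lives on your server):

# Hypothetical absolute path; replace with your real install directory
wrapper.java.additional.5=-Xloggc:C:/Program Files/Inductive Automation/Ignition/logs/javagc-%WRAPPER_TIME_YYYYMMDDHHIISS%.log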
The memory full indication suggests that something is happening that spikes memory usage and 8GB of RAM isn't enough to handle it. Whether it is a bug causing a memory leak or a legitimate need for lots of memory (I'm looking at you, reporting!), I can't tell. There are probably clues in your wrapper log. Can you give the VM a bunch more memory (increasing the max in ignition.conf accordingly) to see if you can ride through the spike? If so, and the profile returns to the vicinity of 6GB afterwards, then I would suggest it is a legitimate load.
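
As a sketch, the heap limits live in the same ignition.conf as the flags above (values are in MB; the 12288 below is just an illustrative bump from your current 8GB, not a recommendation):

# Initial and maximum JVM heap sizes, in MB (illustrative values only)
wrapper.java.initmemory=2048
wrapper.java.maxmemory=12288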

Hi,
I will try that; I’ll report back with the results of increasing the RAM.
One of the common errors we saw in the logs was around alarm journaling; we disabled it around 9:30 AM this morning and that helped the load. This is likely not the culprit, but most definitely a contributor.

You are spot on about the GC log file… I am trying backslashes for the path next (…\logs\javagc…). This is the warning in the wrapper log:
INFO | jvm 1 | 2018/12/13 12:27:56 | Java HotSpot(TM) 64-Bit Server VM warning: Cannot open file …/logs/javagc-20181213122746.log due to No such file or directory

Oscar.