Dumping gateway threads

I know this is a little unrelated, but I have a problem with my gateway overusing CPU and memory. This has been happening ever since we updated to 8.0 (we’ve been using Ignition since 7.5-ish). See below.

I’m trying to track down what these scripts are. The way I planned to do it is to use the Thread ID and use system.util.threadDump() to then get insight into what these scripts are. However, when I used @nminchin’s function to dump threads, I didn’t get any of the IDs to match (ie. thread 5807 was nowhere to be found). Does this have to do with the fact that I ran the thread dump from my local computer’s Script Console?

How would I do a thread dump to investigate what those threads are?

For reference, here is the script I’m talking about:

def saveThreadDump():
	'''
	Description:
		Takes a thread dump and records the current CPU, memory, and disk utilisation at the top of the file. 
		This method is far more useful that using the webpage thread download button as it contains CPU usage per thread.
		File is stored local to the client that issues the function call.
	'''
	now_formatted = system.date.format(system.date.now(), 'yyyyMMdd_HHmmss')
	filename = 'IgnitionThreadDump_AU6_{}.txt'.format(now_formatted)
	
	fileFolderPath = 'C:/Ignition Reports/Thread Dumps/'
	filePath = '{}{}'.format(fileFolderPath, filename)
	
	dump = system.util.threadDump()
	# collect some basic overall performance stats and record them at the top of the file
	stats = system.tag.readBlocking(['[System]Gateway/Performance/CPU Usage', '[System]Gateway/Performance/Disk Utilization', '[System]Gateway/Performance/Memory Utilization'])
	cpu_util = stats[0].value*100
	disk_util = stats[1].value*100
	mem_util = stats[2].value*100
	
	dump = 'CPU Utilisation: {:.2f}%\r\nMemory Utilisation: {:.2f}%\r\nDisk Utilisation: {:.2f}%\r\n{}'.format(cpu_util, mem_util, disk_util, dump)
	
	system.file.writeFile(filePath, dump)
	
	return filePath

Here is what our gateway looks like on a normal day. The gateway will work fine for the most part until CPU usage eventually ramps up to 99% or something close and the gateway server locks up. After a reboot, the server is functional until the next time it happens, which is about once a week.

Yes. You need to run it in gateway scope. The simplest way is to call it from a gateway message handler. Then you can send a message to that handler from the designer script console.

If you’re on a recent-ish 8.1.X version, system.util.threadDump will be a JSON-formatted dump that’s friendlier to parse (e.g., shameless plug, via Kindling).

We’re on 8.1.14, and system.util.threadDump() did return a pretty nicely formatted json.

Here is a condensed version with the threads that are identified in the long running scripts:

{
  "name": "perspective-worker-735",
  "id": 5807,
  "state": "RUNNABLE",
  "daemon": true,
  "system": "perspective",
  "scope": "Gateway",
  "cpuUsage": 4.4924031589958595,
  "lockedSynchronizers": [
	"java.util.concurrent.ThreadPoolExecutor$Worker@664d3bf9"
  ],
  "stacktrace": [
	"app//com.inductiveautomation.ignition.common.gson.internal.LinkedTreeMap.find(LinkedTreeMap.java:150)",
	"app//com.inductiveautomation.ignition.common.gson.internal.LinkedTreeMap.put(LinkedTreeMap.java:94)",
	"app//com.inductiveautomation.ignition.common.gson.JsonObject.add(JsonObject.java:64)",
	"app//com.inductiveautomation.ignition.common.script.adapters.PyJsonObjectAdapter.__setitem__(PyJsonObjectAdapter.java:135)",
	"org.python.pycode._pyx1400.transform$1(\u003ctransform\u003e:16)",
	"org.python.pycode._pyx1400.call_function(\u003ctransform\u003e)",
	"app//org.python.core.PyTableCode.call(PyTableCode.java:173)",
	"app//org.python.core.PyBaseCode.call(PyBaseCode.java:306)",
	"app//org.python.core.PyFunction.function___call__(PyFunction.java:474)",
	"app//org.python.core.PyFunction.__call__(PyFunction.java:469)",
	"app//org.python.core.PyFunction.__call__(PyFunction.java:464)",
	"app//com.inductiveautomation.ignition.common.script.ScriptManager.runFunction(ScriptManager.java:831)",
	"app//com.inductiveautomation.ignition.common.script.ScriptManager.runFunction(ScriptManager.java:813)",
	"app//com.inductiveautomation.ignition.gateway.project.ProjectScriptLifecycle$TrackingProjectScriptManager.runFunction(ProjectScriptLifecycle.java:806)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.script.ScriptTransform.runFunction(ScriptTransform.java:113)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.script.ScriptTransform.synchronousTransformInternal(ScriptTransform.java:142)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.AbstractSynchronousTransform.transform(AbstractSynchronousTransform.java:30)",
	"com.inductiveautomation.perspective.gateway.model.AbstractBindingHarness$TransformSequencer.transform(AbstractBindingHarness.java:259)",
	"com.inductiveautomation.perspective.gateway.model.AbstractBindingHarness$TransformSequencer.run(AbstractBindingHarness.java:272)",
	"java.base@11.0.13/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.FutureTask.run(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)",
	"com.inductiveautomation.perspective.gateway.threading.BlockingWork$BlockingWorkRunnable.run(BlockingWork.java:42)",
	"java.base@11.0.13/java.lang.Thread.run(Unknown Source)"
  ]
},
{
  "name": "perspective-worker-750",
  "id": 5982,
  "state": "RUNNABLE",
  "daemon": true,
  "system": "perspective",
  "scope": "Gateway",
  "cpuUsage": 6.2657201954415935,
  "lockedSynchronizers": [
	"java.util.concurrent.ThreadPoolExecutor$Worker@33347937"
  ],
  "stacktrace": [
	"app//com.inductiveautomation.ignition.common.gson.internal.LinkedTreeMap.find(LinkedTreeMap.java:150)",
	"app//com.inductiveautomation.ignition.common.gson.internal.LinkedTreeMap.put(LinkedTreeMap.java:94)",
	"app//com.inductiveautomation.ignition.common.gson.JsonObject.add(JsonObject.java:64)",
	"app//com.inductiveautomation.ignition.common.script.adapters.PyJsonObjectAdapter.__setitem__(PyJsonObjectAdapter.java:135)",
	"org.python.pycode._pyx1400.transform$1(\u003ctransform\u003e:16)",
	"org.python.pycode._pyx1400.call_function(\u003ctransform\u003e)",
	"app//org.python.core.PyTableCode.call(PyTableCode.java:173)",
	"app//org.python.core.PyBaseCode.call(PyBaseCode.java:306)",
	"app//org.python.core.PyFunction.function___call__(PyFunction.java:474)",
	"app//org.python.core.PyFunction.__call__(PyFunction.java:469)",
	"app//org.python.core.PyFunction.__call__(PyFunction.java:464)",
	"app//com.inductiveautomation.ignition.common.script.ScriptManager.runFunction(ScriptManager.java:831)",
	"app//com.inductiveautomation.ignition.common.script.ScriptManager.runFunction(ScriptManager.java:813)",
	"app//com.inductiveautomation.ignition.gateway.project.ProjectScriptLifecycle$TrackingProjectScriptManager.runFunction(ProjectScriptLifecycle.java:806)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.script.ScriptTransform.runFunction(ScriptTransform.java:113)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.script.ScriptTransform.synchronousTransformInternal(ScriptTransform.java:142)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.AbstractSynchronousTransform.transform(AbstractSynchronousTransform.java:30)",
	"com.inductiveautomation.perspective.gateway.model.AbstractBindingHarness$TransformSequencer.transform(AbstractBindingHarness.java:259)",
	"com.inductiveautomation.perspective.gateway.model.AbstractBindingHarness$TransformSequencer.run(AbstractBindingHarness.java:272)",
	"java.base@11.0.13/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.FutureTask.run(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)",
	"com.inductiveautomation.perspective.gateway.threading.BlockingWork$BlockingWorkRunnable.run(BlockingWork.java:42)",
	"java.base@11.0.13/java.lang.Thread.run(Unknown Source)"
  ]
},
{
  "name": "perspective-worker-5259",
  "id": 41148,
  "state": "RUNNABLE",
  "daemon": true,
  "system": "perspective",
  "scope": "Gateway",
  "cpuUsage": 4.019518615943663,
  "lockedSynchronizers": [
	"java.util.concurrent.ThreadPoolExecutor$Worker@cf83248"
  ],
  "stacktrace": [
	"app//com.inductiveautomation.ignition.common.gson.internal.LinkedTreeMap.find(LinkedTreeMap.java:150)",
	"app//com.inductiveautomation.ignition.common.gson.internal.LinkedTreeMap.put(LinkedTreeMap.java:94)",
	"app//com.inductiveautomation.ignition.common.gson.JsonObject.add(JsonObject.java:64)",
	"app//com.inductiveautomation.ignition.common.script.adapters.PyJsonObjectAdapter.__setitem__(PyJsonObjectAdapter.java:135)",
	"org.python.pycode._pyx8985.transform$1(\u003ctransform\u003e:16)",
	"org.python.pycode._pyx8985.call_function(\u003ctransform\u003e)",
	"app//org.python.core.PyTableCode.call(PyTableCode.java:173)",
	"app//org.python.core.PyBaseCode.call(PyBaseCode.java:306)",
	"app//org.python.core.PyFunction.function___call__(PyFunction.java:474)",
	"app//org.python.core.PyFunction.__call__(PyFunction.java:469)",
	"app//org.python.core.PyFunction.__call__(PyFunction.java:464)",
	"app//com.inductiveautomation.ignition.common.script.ScriptManager.runFunction(ScriptManager.java:831)",
	"app//com.inductiveautomation.ignition.common.script.ScriptManager.runFunction(ScriptManager.java:813)",
	"app//com.inductiveautomation.ignition.gateway.project.ProjectScriptLifecycle$TrackingProjectScriptManager.runFunction(ProjectScriptLifecycle.java:806)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.script.ScriptTransform.runFunction(ScriptTransform.java:113)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.script.ScriptTransform.synchronousTransformInternal(ScriptTransform.java:142)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.AbstractSynchronousTransform.transform(AbstractSynchronousTransform.java:30)",
	"com.inductiveautomation.perspective.gateway.model.AbstractBindingHarness$TransformSequencer.transform(AbstractBindingHarness.java:259)",
	"com.inductiveautomation.perspective.gateway.model.AbstractBindingHarness$TransformSequencer.run(AbstractBindingHarness.java:272)",
	"java.base@11.0.13/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.FutureTask.run(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)",
	"com.inductiveautomation.perspective.gateway.threading.BlockingWork$BlockingWorkRunnable.run(BlockingWork.java:42)",
	"java.base@11.0.13/java.lang.Thread.run(Unknown Source)"
  ]
},
{
  "name": "perspective-worker-5273",
  "id": 41388,
  "state": "RUNNABLE",
  "daemon": true,
  "system": "perspective",
  "scope": "Gateway",
  "cpuUsage": 5.319951109337202,
  "lockedSynchronizers": [
	"java.util.concurrent.ThreadPoolExecutor$Worker@37433ac"
  ],
  "stacktrace": [
	"app//com.inductiveautomation.ignition.common.gson.internal.LinkedTreeMap.find(LinkedTreeMap.java:150)",
	"app//com.inductiveautomation.ignition.common.gson.internal.LinkedTreeMap.put(LinkedTreeMap.java:94)",
	"app//com.inductiveautomation.ignition.common.gson.JsonObject.add(JsonObject.java:64)",
	"app//com.inductiveautomation.ignition.common.script.adapters.PyJsonObjectAdapter.__setitem__(PyJsonObjectAdapter.java:135)",
	"org.python.pycode._pyx8985.transform$1(\u003ctransform\u003e:16)",
	"org.python.pycode._pyx8985.call_function(\u003ctransform\u003e)",
	"app//org.python.core.PyTableCode.call(PyTableCode.java:173)",
	"app//org.python.core.PyBaseCode.call(PyBaseCode.java:306)",
	"app//org.python.core.PyFunction.function___call__(PyFunction.java:474)",
	"app//org.python.core.PyFunction.__call__(PyFunction.java:469)",
	"app//org.python.core.PyFunction.__call__(PyFunction.java:464)",
	"app//com.inductiveautomation.ignition.common.script.ScriptManager.runFunction(ScriptManager.java:831)",
	"app//com.inductiveautomation.ignition.common.script.ScriptManager.runFunction(ScriptManager.java:813)",
	"app//com.inductiveautomation.ignition.gateway.project.ProjectScriptLifecycle$TrackingProjectScriptManager.runFunction(ProjectScriptLifecycle.java:806)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.script.ScriptTransform.runFunction(ScriptTransform.java:113)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.script.ScriptTransform.synchronousTransformInternal(ScriptTransform.java:142)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.AbstractSynchronousTransform.transform(AbstractSynchronousTransform.java:30)",
	"com.inductiveautomation.perspective.gateway.model.AbstractBindingHarness$TransformSequencer.transform(AbstractBindingHarness.java:259)",
	"com.inductiveautomation.perspective.gateway.model.AbstractBindingHarness$TransformSequencer.run(AbstractBindingHarness.java:272)",
	"java.base@11.0.13/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.FutureTask.run(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)",
	"com.inductiveautomation.perspective.gateway.threading.BlockingWork$BlockingWorkRunnable.run(BlockingWork.java:42)",
	"java.base@11.0.13/java.lang.Thread.run(Unknown Source)"
  ]
},
{
  "name": "perspective-worker-10048",
  "id": 57827,
  "state": "RUNNABLE",
  "daemon": true,
  "system": "perspective",
  "scope": "Gateway",
  "cpuUsage": 4.965287702048054,
  "lockedSynchronizers": [
	"java.util.concurrent.ThreadPoolExecutor$Worker@60a98039"
  ],
  "stacktrace": [
	"app//com.inductiveautomation.ignition.common.gson.internal.LinkedTreeMap.rebalance(LinkedTreeMap.java:311)",
	"app//com.inductiveautomation.ignition.common.gson.internal.LinkedTreeMap.find(LinkedTreeMap.java:175)",
	"app//com.inductiveautomation.ignition.common.gson.internal.LinkedTreeMap.put(LinkedTreeMap.java:94)",
	"app//com.inductiveautomation.ignition.common.gson.JsonObject.add(JsonObject.java:64)",
	"app//com.inductiveautomation.ignition.common.script.adapters.PyJsonObjectAdapter.__setitem__(PyJsonObjectAdapter.java:135)",
	"org.python.pycode._pyx14217.transform$1(\u003ctransform\u003e:16)",
	"org.python.pycode._pyx14217.call_function(\u003ctransform\u003e)",
	"app//org.python.core.PyTableCode.call(PyTableCode.java:173)",
	"app//org.python.core.PyBaseCode.call(PyBaseCode.java:306)",
	"app//org.python.core.PyFunction.function___call__(PyFunction.java:474)",
	"app//org.python.core.PyFunction.__call__(PyFunction.java:469)",
	"app//org.python.core.PyFunction.__call__(PyFunction.java:464)",
	"app//com.inductiveautomation.ignition.common.script.ScriptManager.runFunction(ScriptManager.java:831)",
	"app//com.inductiveautomation.ignition.common.script.ScriptManager.runFunction(ScriptManager.java:813)",
	"app//com.inductiveautomation.ignition.gateway.project.ProjectScriptLifecycle$TrackingProjectScriptManager.runFunction(ProjectScriptLifecycle.java:806)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.script.ScriptTransform.runFunction(ScriptTransform.java:113)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.script.ScriptTransform.synchronousTransformInternal(ScriptTransform.java:142)",
	"com.inductiveautomation.perspective.gateway.binding.transforms.AbstractSynchronousTransform.transform(AbstractSynchronousTransform.java:30)",
	"com.inductiveautomation.perspective.gateway.model.AbstractBindingHarness$TransformSequencer.transform(AbstractBindingHarness.java:259)",
	"com.inductiveautomation.perspective.gateway.model.AbstractBindingHarness$TransformSequencer.run(AbstractBindingHarness.java:272)",
	"java.base@11.0.13/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.FutureTask.run(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)",
	"java.base@11.0.13/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)",
	"com.inductiveautomation.perspective.gateway.threading.BlockingWork$BlockingWorkRunnable.run(BlockingWork.java:42)",
	"java.base@11.0.13/java.lang.Thread.run(Unknown Source)"
  ]
}

I see the word “transform” in the stack trace of every thread here, so I’m suspecting these are coming from script transforms, which is very widely used in our organization.

Is there a way to see which script transforms are causing this? Any information would be helpful at this point.

Thank you all in advance!

Unfortunately, while those do identify exactly what’s consuming CPU, they don’t offer much in the way of mitigation.
In all of those threads, there’s a script transform running - but the part that’s held up in that code is the script transform updating a JSON structure. The last thread is interesting; it’s held up in rebalance - are you returning very large chunks of data from any transforms, e.g. for a table’s data property?

Yes, I know we have transforms on large datasets happening on some bindings. The two scenarios that require (afaik) a transform are styling individual cells and merging data from two sources (two databases from different systems into one table).

I’m still having issues with high CPU usage on our server. It seems like it’s only getting worse. We end up rebooting the Ignition server about twice a week, with an hour downtime each time. I added the system CPU and memory usage tags to Historian so I can see exactly when the usage goes up. I’m gathering information still, but I did make a very interesting observation:

image

I noticed that the usage only goes up and it happens around the times when new sessions are launched.

2:30 pm is shift change
4 am employees show up
5 am different employees show up

The shifts don’t really even matter - I think what is more important is the usage going up when new sessions are launched, but never coming back down. Are sessions released from the gateway automatically once the session ends (web browser closed, reboot, etc) and is there any setting that would mess this up?

You should get in contact with support, if you haven’t already.

It definitely seems correlated, which implies either something is misconfigured (and thus consuming excessive CPU) or your totally legitimate configuration is exposing a performance pitfall in our code. Either are very possible, and support should be able to help isolate; plus, they can collect more information (and dedicate more time) than I can.

1 Like

I have a few cases open… My coworkers and I have been pretty passive in getting this resolved, but I’m going to focus on it more seriously, so I’ll be in touch with support.

Thanks for your help, @PGriffith! If we get this resolved, I’ll update this thread so others can find this information.

1 Like

An hour of downtime? How long does your gateway take to come back up? We’re seeing some issues on a customer’s site with a large system and rebooting takes <5mins or so