ExecutionManager threading issue?

Hello everyone,

My module has several execution scripts that make a series of REST calls. I have noticed that when some of these REST calls repeatedly fail (return a bad response), Ignition's system resources become overloaded, sometimes to the point of a full crash. I am trying to troubleshoot what may be at the root of this problem. Here is some context:

  1. Here is a list of my execution scripts that are using the shared execution engine:

  2. Here is a sample section of code with how I register these scripts:

public void register() {
    if (parentRecord.getEnabled().equals(true)) {
        log.info("Launching Agent: {}", execId);
        context.getExecutionManager().register(execId, "ClearAllInterlocks", new ClearAllInterlocks(context, record, execId, parentRecord, httpPool), 500);
        context.getExecutionManager().register(execId, "CreateMission", new CreateMission(context, record, execId, parentRecord, httpPool), 500);
    }
}

  3. One thing in particular that I have noticed is a steadily growing number of Timed Waiting threads in the system. I cannot determine from the Ignition Gateway which processes are creating these threads or what they belong to, but I suspect they are related to the problem. Here is a snapshot of the system performance when experiencing some of these issues:

My main suspicion at the moment is that the execution scripts are being registered with a Fixed Rate instead of a Fixed Delay. When the REST calls fail, each execution takes longer than the rate, which I think causes a large buildup of Timed Waiting threads. I don't have much visibility into how the ExecutionManager or its threading works, so does this theory make any sense? If not, any other ideas? If so, would the correct solution be to switch from

context.getExecutionManager().register(execId, "ClearAllInterlocks", new ClearAllInterlocks(context, record, execId, parentRecord, httpPool), 500);

to something like

context.getExecutionManager().scheduleWithFixedDelay(new ClearAllInterlocks(context, record, execId, parentRecord, httpPool), 500, 500, TimeUnit.MILLISECONDS);

Thanks in advance for the help!

Take a thread dump from Status > Diagnostics > Threads. You'll see all the threads there.

ExecutionManagers use a fixed size thread pool. I'm not sure exactly what you're trying to troubleshoot.

Hi Kevin,

The Timed Waiting threads seem to be piling up and increasing usage of system resources, which has led to the whole Ignition system crashing. I'm trying to troubleshoot why this is happening.

I was wondering if it is because I am registering my execution scripts using a fixed rate instead of fixed delay. My theory was that the threads are piling up due to the execution lasting longer than the rate which would add new Timed Waiting threads, but I'm not sure how these work. I would like to use the functionality of a fixed delay such that the next thread will execute at the delayed time after the first finishes. I can't tell if I'm using a fixed rate or fixed delay as the Javadocs on these methods aren't particularly clear, so I was hoping for some clarity there.

Additionally, I'm not seeing any threads populate in the table, so I'm unable to take a thread dump. I waited for ~15 mins with nothing being added:

Thanks again!

Registering tasks with an ExecutionManager should not cause threads to be created or pile up.

You may need to use the jstack utility included with the JDK to get the thread dump instead.
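If jstack is awkward to run, a rough alternative is to summarize the live threads from code running inside the same JVM. The sketch below uses only the JDK (`Thread.getAllStackTraces()`); the class name `ThreadSummary` and the name-grouping rule are made up for illustration and are not part of the Ignition API. Grouping by name prefix and state can hint at which pool the Timed Waiting threads belong to.

```java
import java.util.Map;
import java.util.TreeMap;

public class ThreadSummary {

    /**
     * Strips a trailing number (and the separator before it) so that
     * "pool-3-thread-12" and "pool-3-thread-7" land in the same group.
     */
    public static String groupName(String threadName) {
        return threadName.replaceAll("[-_ #]*\\d+$", "");
    }

    /** Counts live threads in this JVM, keyed by "namePrefix [STATE]". */
    public static Map<String, Integer> countByGroupAndState() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            String key = groupName(t.getName()) + " [" + t.getState() + "]";
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Print one line per group: count, then "namePrefix [STATE]".
        countByGroupAndState().forEach((key, count) ->
                System.out.println(count + "\t" + key));
    }
}
```

Run standalone, this only sees its own JVM's threads. To be useful here, `countByGroupAndState()` would have to be called from code already running in the Gateway (for example, logged periodically from the module), so that the groups whose counts keep growing stand out.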

Hi Kevin,

I will look into jstack. Do you know what may cause a dramatic increase in Timed Waiting threads?

Also, can you help identify the proper methods to use to register with Fixed Delay vs. methods for registering with Fixed Rate? The Javadoc has some confusing language for its descriptions:

Timed waiting threads are just threads that aren't doing anything. ExecutionManagers have fixed size thread pools, so unless you're creating new managers they aren't the reason your thread count is growing.

register uses fixed delay, registerAtFixedRate uses fixed rate. These are not likely relevant to whatever you're troubleshooting.
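
To illustrate the difference between the two modes, here is a minimal sketch using the plain JDK `ScheduledExecutorService`, not Ignition's ExecutionManager. It assumes the ExecutionManager behaves analogously, which this thread does not confirm beyond the fixed size pool. The class name `RateVsDelayDemo` and the 50 ms / 100 ms timings are arbitrary. A task that runs longer than its period is scheduled both ways on a one-thread pool. In the JDK, a slow fixed-rate task starts late rather than running concurrently, so neither mode adds threads.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class RateVsDelayDemo {
    public static final long PERIOD_MS = 50;   // requested rate or delay
    public static final long WORK_MS = 100;    // simulated slow REST call

    public static final class Result {
        public final List<Long> startTimesMs = new CopyOnWriteArrayList<>();
        public final Set<String> threadNames = ConcurrentHashMap.newKeySet();

        /** Smallest gap between the starts of two consecutive runs. */
        public long minGapMs() {
            long min = Long.MAX_VALUE;
            for (int i = 1; i < startTimesMs.size(); i++) {
                min = Math.min(min, startTimesMs.get(i) - startTimesMs.get(i - 1));
            }
            return min;
        }
    }

    public static Result run(boolean fixedRate) throws InterruptedException {
        Result result = new Result();
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
        Runnable slowTask = () -> {
            result.startTimesMs.add(System.nanoTime() / 1_000_000);
            result.threadNames.add(Thread.currentThread().getName());
            try {
                Thread.sleep(WORK_MS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // pool is shutting down
            }
        };
        if (fixedRate) {
            // Period is measured from start to start; overruns delay the next start.
            pool.scheduleAtFixedRate(slowTask, 0, PERIOD_MS, TimeUnit.MILLISECONDS);
        } else {
            // Delay is measured from the end of one run to the start of the next.
            pool.scheduleWithFixedDelay(slowTask, 0, PERIOD_MS, TimeUnit.MILLISECONDS);
        }
        Thread.sleep(1000);
        pool.shutdownNow();
        pool.awaitTermination(1, TimeUnit.SECONDS);
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        Result rate = run(true);
        Result delay = run(false);
        System.out.println("fixed rate:  runs=" + rate.startTimesMs.size()
                + " minGapMs=" + rate.minGapMs() + " threads=" + rate.threadNames);
        System.out.println("fixed delay: runs=" + delay.startTimesMs.size()
                + " minGapMs=" + delay.minGapMs() + " threads=" + delay.threadNames);
    }
}
```

With fixed rate the slow runs end up back to back, roughly one task duration apart. With fixed delay each run starts about one task duration plus the delay after the previous start. In both cases only one pool thread ever runs the task, which fits the point above that the scheduling mode is unlikely to explain a growing thread count.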