Delays in a timed gateway script

compuzak · November 10, 2022, 7:32pm

I am experiencing delays in my gateway timed scripts. I have a test script that just prints a test message every 3 seconds. Some messages come out 4 seconds after the previous print (I guess I can live with that), but every minute or so the print comes 10-18 seconds after the last print (delayed 7 to 15 seconds). Does anyone know what could cause this? I will reboot our Ignition server over the weekend when our shed is down, but I really can't reboot now without affecting our production lines. All scripts are delayed, not just one script. Also, the test script ONLY contains the line 'print "Test Script"'.

I checked all the Ignition performance stats (CPU, memory, gateway timed scripts, and gateway tag change scripts), and the scripts all complete in the expected amount of time (in ms, not seconds).

Anyone have any ideas on what could be causing this problem?

Here is some sample output of the test script:
INFO | jvm 2 | 2022/11/10 11:36:59 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:02 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:06 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:09 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:24 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:27 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:30 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:33 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:36 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:39 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:42 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:46 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:49 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:52 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:56 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:59 | Test Script
INFO | jvm 2 | 2022/11/10 11:38:02 | Test Script
INFO | jvm 2 | 2022/11/10 11:38:06 | Test Script
INFO | jvm 2 | 2022/11/10 11:38:09 | Test Script
INFO | jvm 2 | 2022/11/10 11:38:24 | Test Script
INFO | jvm 2 | 2022/11/10 11:38:27 | Test Script
INFO | jvm 2 | 2022/11/10 11:38:30 | Test Script
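
The gaps in output like this can be measured with a short script instead of by eye. This is a minimal sketch in plain Python, run outside Ignition; the function name find_gaps and the 5 second threshold are assumptions chosen for illustration, and the sample lines are copied from the output above.

```python
from datetime import datetime

# A few lines copied from the sample output above.
LOG = """\
INFO | jvm 2 | 2022/11/10 11:37:06 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:09 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:24 | Test Script
INFO | jvm 2 | 2022/11/10 11:37:27 | Test Script
"""

def find_gaps(log_text, threshold_s=5):
    """Return (previous, current, gap_seconds) for every gap above threshold_s."""
    stamps = []
    for line in log_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) >= 4:
            # The third field holds the wrapper timestamp.
            stamps.append(datetime.strptime(fields[2], "%Y/%m/%d %H:%M:%S"))
    gaps = []
    for prev, cur in zip(stamps, stamps[1:]):
        delta = (cur - prev).total_seconds()
        if delta > threshold_s:
            gaps.append((prev, cur, delta))
    return gaps

for prev, cur, delta in find_gaps(LOG):
    print("%s -> %s: %d s" % (prev.time(), cur.time(), delta))
```

Running it over a full wrapper log would show whether the long gaps recur at a regular interval, which can help narrow down what is holding the shared threads.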

andrews · November 10, 2022, 7:50pm

Is your Delay Type Fixed Delay or Fixed Rate?

If it is Fixed Delay, you should change it to Fixed Rate, and that might resolve it.

compuzak · November 10, 2022, 8:34pm

This is Ignition 7.9; we are upgrading in January. I had the Fixed Delay and Shared options selected.

I tried fixed rate and it still has the long delay.

I tried dedicated and it worked, but I have thousands of scripts, so I am sure I can't make them all dedicated. I also have a lot of threading in scripts that handle communication to scales and they are experiencing the delay as well. This just started recently, so I have been analyzing what has changed since then, but can't find the cause.

Anyone know what could cause this delay? Is there a way to tell how many threads are running and how many are available?

pturmel · November 11, 2022, 1:26am

By threading, do you mean loops that run until a condition is true, with or without sleeping? (Waiting for input counts as sleeping.) If so, don't do that. Each case ties up one of the threads from the shared thread pool, delaying other events.

Every single one of those needs a dedicated thread, or a carefully-managed long-lived asynchronous thread.
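
The effect can be reproduced outside Ignition with any fixed-size thread pool. The following is a minimal sketch in plain Python 3, not Ignition's actual scheduler; the pool sizes, the 0.3 second block time, and all function names are assumptions for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_task(seconds):
    # Stands in for an event script that sleeps or waits on a socket.
    time.sleep(seconds)

def quick_task(submitted_at):
    # Returns how long the task sat in the queue before a worker picked it up.
    return time.monotonic() - submitted_at

def queue_delay(pool_size, blockers, block_s=0.3):
    """Submit `blockers` blocking tasks, then measure the delay of one quick task."""
    pool = ThreadPoolExecutor(max_workers=pool_size)
    try:
        for _ in range(blockers):
            pool.submit(blocking_task, block_s)
        return pool.submit(quick_task, time.monotonic()).result()
    finally:
        pool.shutdown()

print("pool of 2, 2 blockers: %.2f s" % queue_delay(2, 2))
print("pool of 3, 2 blockers: %.2f s" % queue_delay(3, 2))
```

When every worker is occupied by a blocker, the quick task waits for roughly the full block time even though it needs almost no CPU itself. With one spare worker it starts almost immediately.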

compuzak · November 11, 2022, 1:28pm

Yes, they are threads that continuously run and collect TCP/IP data from a scale. There is a delay between each weight (about 1 per second). The overhead in making the connection to the scale and setting all the parameters (multiple seconds) requires me to leave the connection open and get the scale weights each second. I can't have a multi-second gap between my weight values; I have to react instantly. Some of our scales run with a 200 ms response time, and we handle all of those in a C# program, not Ignition. During our cherry season (April through June) we have 64 scales connected, and for years it has never caused any long delays. During our current walnut season we only have 11 scales connected and the 64 cherry season scales are NOT connected. If the scales were set up as Modbus, I know I could define a device much like a PLC (we have a few legacy scales set up that way). But these scales are set up for TCP/IP, and the Modbus replacement card would be a huge expense.

Sounds like I need a carefully-managed long-lived asynchronous thread. Currently, for each scale I have an asynchronous thread that starts around 5am and turns off around 11pm each day. I don't carefully manage it, but it is turned off and on via a unique run status tag for each scale. I turn it off each night as the scale routines have experienced problems during the network backup, so I just turn the scales off when they are done packing and turn them back on before they start.

Is there any way to list these threads via Ignition or do you have to save the process id as the threads are started? Would be nice to run a command to list all open threads and see what their usage statistics are. I must have something running that is causing issues (although I can't see any performance issues in the high level thread screens available in the gateway).

pturmel · November 11, 2022, 1:40pm

The gateway status pages have a thread monitor. You will need to name your threads, which current Ignition can do (with an option to system.util.invokeAsynchronous). On older Ignition you can create threads with java.lang.Thread with names.
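
As a rough illustration of why names matter, here is a sketch using the standard Python threading module as a stand-in for java.lang.Thread. The thread name "scale-reader-01", the helper threads_named, and the placeholder loop are all hypothetical.

```python
import threading

def scale_reader(stop_event):
    # Placeholder loop; a real reader would poll its socket here.
    while not stop_event.is_set():
        stop_event.wait(0.05)

stop = threading.Event()
worker = threading.Thread(target=scale_reader, args=(stop,), name="scale-reader-01")
worker.daemon = True
worker.start()

def threads_named(prefix):
    """Names of live threads whose name starts with prefix."""
    return sorted(t.name for t in threading.enumerate() if t.name.startswith(prefix))

print(threads_named("scale-reader-"))

# Ask the thread to finish and wait for it.
stop.set()
worker.join()
```

With a consistent prefix, the threads you started yourself are easy to pick out of a long thread list or a thread dump.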

The most important part is that you can't kill threads. They have to check something regularly to know to kill themselves. The best tool is Java's interrupt infrastructure, which is easy to check.
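
A minimal sketch of that pattern follows, written with Python's threading.Event as an analog of the Java interrupt flag (an assumption; in Jython the thread could check the Java interrupt status instead). The class ScaleWorker and its attributes are hypothetical placeholders.

```python
import threading

class ScaleWorker(threading.Thread):
    """Hypothetical long-lived reader that stops itself when asked."""

    def __init__(self, name, poll_s=0.05):
        threading.Thread.__init__(self, name=name)
        self.daemon = True
        self._stop_requested = threading.Event()
        self.poll_s = poll_s
        self.readings = 0
        self.cleaned_up = False

    def run(self):
        try:
            while not self._stop_requested.is_set():
                self.readings += 1  # placeholder for one socket read
                # wait() replaces sleep(): it returns early once stop() is called.
                self._stop_requested.wait(self.poll_s)
        finally:
            self.cleaned_up = True  # placeholder for closing the socket

    def stop(self, timeout=1.0):
        """Request shutdown; return True if the thread ended within timeout."""
        self._stop_requested.set()
        self.join(timeout)
        return not self.is_alive()

worker = ScaleWorker("scale-01")
worker.start()
print("stopped cleanly:", worker.stop())
```

The point of the design is that the waiting itself is interruptible, so the thread notices the stop request within one poll interval instead of finishing a long sleep first.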

You also have to ensure you use system.util.getGlobals() to hold a reference you can get to, as project saves will destroy your access to old script-module variables without destroying the background threads using them (big memory leak).
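
One way to picture this is a small registry that stops the previous thread before starting its replacement. This is a sketch only: the GLOBALS dictionary is a stand-in for the dictionary returned by system.util.getGlobals(), which is not available outside Ignition, and start_worker and reader are hypothetical names.

```python
import threading

# Stand-in for system.util.getGlobals(); survives while script modules are reloaded.
GLOBALS = {}

def start_worker(key, target):
    """Start target(stop_event) under key, stopping any previous instance first."""
    old = GLOBALS.get(key)
    if old is not None:
        old_thread, old_stop = old
        old_stop.set()        # ask the stale thread to exit
        old_thread.join(2.0)  # give it a moment to release its socket
    stop = threading.Event()
    thread = threading.Thread(target=target, args=(stop,), name=key)
    thread.daemon = True
    GLOBALS[key] = (thread, stop)
    thread.start()
    return thread

def reader(stop):
    # Placeholder for the scale polling loop.
    while not stop.is_set():
        stop.wait(0.05)

first = start_worker("scale-01", reader)
second = start_worker("scale-01", reader)  # simulates startup code re-running after a save
print(first.is_alive(), second.is_alive())
```

Because the reference lives outside the script module, the second call can still find and stop the first thread, so two readers never contend for the same connection.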

Search this forum for various combinations of "background thread", "thread life cycle", "thread memory leak", and "getGlobals".

compuzak · November 11, 2022, 1:46pm

That is how I control them via a tag. Once the tag is set to zero, the thread exits gracefully with the sys.exit(0) command. I never kill the threads or even manage them at all right now. We are only on 7.9 right now, will be upgrading soon.

Also, I turned off the scales and restarted the Ignition gateway, and I am still experiencing the delays. So these scale routines are not the culprit, as they have zero manually started asynchronous threads running. These scale routines are the only routines that start asynchronous threads directly in the code.

pturmel · November 11, 2022, 1:51pm

You probably have stuck threads. Reboot your gateway or restart the Ignition service.

Editing the project that starts those threads will start another batch of threads, which will often then get stuck contending with the still-running older threads on sockets or ports, and tie up the event thread pool.

You probably have other sleep() operations scattered through your events. Anyone who uses sleep() or input waits in one place will use them elsewhere.

Kevin.Herron · November 11, 2022, 1:53pm

A few thread dumps may help. But it sounds like the problem is that you have "thousands" of timer scripts, all of which are trying to run on the same shared timer, and they take too long for that.

pturmel · November 11, 2022, 1:55pm

Keep in mind that events really need to run to completion quickly whenever called. In as few milliseconds as possible. I consider any event that runs (including waits) more than 100ms to be pathological. Only dedicated background threads should ever run longer than that.
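
One way to find offenders is to wrap event handlers in a timing helper and report anything over that limit. This is a minimal sketch in plain Python 3; the helper timed_call and the constant SLOW_MS are hypothetical, with the 100 ms threshold taken from the paragraph above.

```python
import time

SLOW_MS = 100  # threshold taken from the paragraph above

def timed_call(handler, *args):
    """Run handler(*args) and return (result, elapsed_ms, is_slow)."""
    start = time.monotonic()
    result = handler(*args)
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return result, elapsed_ms, elapsed_ms > SLOW_MS

# An event body that sleeps is flagged; one that returns at once is not.
print(timed_call(time.sleep, 0.15)[2])
print(timed_call(lambda: 1)[2])
```

Note that the elapsed time includes any waiting inside the handler, which is what matters for a shared pool.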

compuzak · November 11, 2022, 1:55pm

I restarted the gateway and rebooted the Ignition VM server.

compuzak · November 11, 2022, 2:08pm

Kevin,

Thousands of scripts is an overstatement. But I do have at least 1000 scripts or so that are running during our busy season, and at least 500 that are running or could potentially run (via timed or tag change gateway scripts). The strange thing is that the gateway script screen (both timed and tag change) does not show anywhere close to the number of scripts I have. It lists some of them, but doesn't list a lot of them. For example, the test script I just started a couple of days ago is not listed, and it is running as we speak.

How do I get these thread dumps that you recommend?

Kevin.Herron · November 11, 2022, 2:12pm

You can download them from the diagnostics page in the gateway: Diagnostics - Ignition User Manual 7.9 - Ignition Documentation

compuzak · November 11, 2022, 2:14pm

pturmel, I do have a number of scripts that do something, then wait for things to happen, then do something else. I will have to redesign these scripts to exit and start back up via a tag change script. That is probably the root cause of these delays.
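
A rough sketch of what such a redesign could look like: the script keeps a small piece of state and is called on every timer or tag change event, returning immediately each time instead of waiting. Everything here (the class Sequence, its state names, and the step placeholders) is hypothetical, and in Ignition the state would have to live somewhere persistent, such as tags.

```python
class Sequence(object):
    """Hypothetical two-step job: do step 1, then do step 2 after a wait."""

    def __init__(self, wait_s):
        self.state = "IDLE"
        self.wait_s = wait_s
        self.deadline = None
        self.log = []

    def tick(self, now):
        # Called from a timer or tag change event; it never sleeps.
        if self.state == "IDLE":
            self.log.append("step 1")
            self.deadline = now + self.wait_s
            self.state = "WAITING"
        elif self.state == "WAITING" and now >= self.deadline:
            self.log.append("step 2")
            self.state = "DONE"
        return self.state

seq = Sequence(wait_s=5)
for now in (0, 3, 6):  # simulated event times in seconds
    print(now, seq.tick(now))
```

Each call does at most one small piece of work, so the shared thread is released between steps rather than being held for the whole wait.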

It is just strange that the majority of these systems have been running for years without any long delays. We have always experienced a delay of about 1 second on these timed scripts, but never 10-18 second delays. I only identified it now and can't say for sure how long it has been happening, but it definitely didn't happen during our busy season or the entire system would have failed; we have many routines that would fail with even a 2 second delay.

Thank you both for all your insight!

compuzak · November 11, 2022, 2:21pm

Thanks, I was looking at the gateway scripts area and missed the actual threads section. I will dig into this and see if I can find things. It looks like only dedicated scripts show up by name; the pooled scripts are combined, as would be expected. I will research and see what I can find. Thanks again.