[IGN-6503] Project Library Script Initialization Races

Are you annoyed by pseudo-random breakage in Ignition project library scripting? Odd top-level constants being garbled, or operations in the top level that have side effects running twice? Or more?

Not happening much, but always around project edits, and sometimes even on gateway restart?

Yeah, me too. :frowning_face:

Poking around under the hood back in the spring, for a module development task, I ran across some problematic IA bytecode and pinged some people. But it went nowhere, as I could not, at that time, make a reliable reproducer.

But I dreamed some code this week, thanks to an exotic consulting task, that led to a reproducer in a nice, self-contained project. On decent multi-core hardware, the reproducer runs its script init in parallel, near-simultaneously, in five threads. On startup and on every project edit.

Import that into any gateway and immediately see the race in the logs.

I'm sure IA will fix this, but there are many, I suspect, who would benefit from a workaround in the meantime.

If you try the above project and it shows the problem, replace the script module's import section with this alternate:

# Catch racing initialization threads (See IA support ticket #131484)
import sys, threading
globals().setdefault('initLock', threading.Lock())
if not initLock.acquire(False):
	# not first
	with initLock:
		pass
	sys.exit()
# One more leaker possibility.  No need to wait on this one.
# The key checked can be any of the top level variables instantiated below.
if 'logger' in globals():
	initLock.release()
	sys.exit()

from java.util.concurrent.atomic import AtomicInteger

And add this at the bottom of that script:

# The following must remain at the end.  There *must* be error checking
# above to ensure this executes.
initLock.release()

And you should see one "Initializing" report per project edit thereafter.

Tweak to suit.
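For anyone who wants to see the guard pattern run outside Ignition, here is a minimal simulation in plain Python. It is a sketch under assumptions, not the code from the reproducer project: `run_init`, `ns`, `init_runs`, and `shared` are hypothetical names, a function stands in for the script module's top level, a dict stands in for `globals()`, and `return` stands in for `sys.exit()`. It also wraps the body in `try`/`finally`, which is one way to get the "must execute" guarantee for the final `release()`.

```python
import threading
import time

init_runs = []  # one entry per execution of the protected body


def run_init(ns):
    """Stand-in for a script module's top level; ns plays the role of globals()."""
    lock = ns.setdefault('initLock', threading.Lock())
    if not lock.acquire(False):
        # Not first: wait for the winner to finish, then skip the body.
        with lock:
            pass
        return  # the forum code calls sys.exit() at this point
    try:
        if 'logger' in ns:
            # Got the lock only after an earlier pass had already completed.
            return
        init_runs.append(threading.current_thread().name)
        time.sleep(0.05)        # pretend the real initialization is slow
        ns['logger'] = 'ready'  # any name the body defines can serve as the marker
    finally:
        # Runs on success, on early return, and on exceptions alike.
        lock.release()


shared = {}
threads = [threading.Thread(target=run_init, args=(shared,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(init_runs))
```

Five threads enter `run_init` together, but the body between the marker check and the `finally` runs only once; the lock is free again afterwards regardless of how the body exits.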

I will report when I'm told this is fixed.

{ You shouldn't expect any proper external Python IDE to like that work-around, due to its use of variable names that have never received an assignment. :man_shrugging: }

I should mention that the work-around shown above should NOT be placed in all of your project library scripts. It solves the multiple execution problem but at the risk of AB-BA deadlock across interdependent scripts. Some thoughts:

  • Definitions of simple functions and simple assignments to top-level constants do not need this fix. While multiple executions will repeatedly replace top level objects, the replacements are identical to the first pass--no harm, no foul.

  • Definitions of top-level objects to be used as deletable/clearable caches are also effectively unharmed by multiple initialization, though some cache entries established early (while racing) might be blown away.

  • Definitions of utility classes that have short lifetimes are similarly unaffected. That early instances might not have the same type as later instances (because the class was redefined multiple times) is meaningless for such cases--the instances don't live long enough for that to matter, and the implementations are identical.

  • Establishing top-level dictionaries with my system.util.globalVarMap() or the native system.util.getGlobals() to obtain JVM-lifetime persistent storage is safe. Values placed in such dictionaries with .setdefault() are also unaffected by repeat initialization.
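As a rough illustration of that last point, the sketch below uses a plain dict as a stand-in for the persistent dictionary (inside Ignition it would come from `system.util.getGlobals()`). The names `persistent`, `init_script`, `counters`, and `starts` are hypothetical.

```python
persistent = {}  # stand-in for the JVM-lifetime dictionary


def init_script(store):
    """Simulates one pass through a script's top-level initialization."""
    # setdefault() only stores the value when the key is missing, so a
    # repeat pass gets the existing object back instead of replacing it.
    counters = store.setdefault('counters', {})
    counters.setdefault('starts', 0)
    return counters


first = init_script(persistent)
first['starts'] += 1              # state accumulated after the first pass
second = init_script(persistent)  # a racing or repeated initialization
print(second is first, second['starts'])
```

The second pass returns the same dictionary object with its accumulated state intact, which is why this style of initialization tolerates running more than once.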

Scenarios that DO need this fix:

  • Jython classes that establish singleton instances during initialization. Multiple singletons will be created, and those grabbed early will not have the same state as the last one.

  • Top level constants that are assembled in multiple steps, using += or list.append() operations, will be corrupted in odd ways by the parallel execution. (Encountered this a lot some years ago--I had to stop doing that.)

  • Top level objects that form the root of linked object relationships are vulnerable. Early linkages may be chopped off by the repeat initialization and those linked objects' memory leaked.

  • Scripts that launch asynchronous threads during initialization will launch multiples, with potentially long-lived clashing behavior.

  • Top level constants initialized from database queries, web API requests, or other time-consuming operations (particularly when calling a function just defined locally) will race and race and race the whole time the first execution runs, and the last execution may leave the constant with errors if any resource has rate limits.

The fix I recommended to IA in my bug report does not have the AB-BA deadlock risk, so hopefully this all becomes moot.

So the workaround does not always work?
I think this is quite important to get right for Ignition.

The work-around always works if it runs to completion (the .release()). There are situations where that could be blocked or broken, most notably AB-BA deadlocks.
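To make the AB-BA case concrete, here is a small plain-Python demonstration. It is a sketch under assumptions: two hypothetical scripts `A` and `B` each hold their own init lock and, partway through initialization, need the other script's lock (as would happen when each script references the other at top level). The probe uses a non-blocking `acquire(False)` so the demo finishes instead of hanging; all names are illustrative.

```python
import threading

lock_a, lock_b = threading.Lock(), threading.Lock()
held = {'A': threading.Event(), 'B': threading.Event()}
tried = {'A': threading.Event(), 'B': threading.Event()}
could_proceed = {}


def init_script(name, own_lock, other_name, other_lock):
    with own_lock:                       # this script's init guard
        held[name].set()
        held[other_name].wait(5)         # the other script's init is now in flight too
        got = other_lock.acquire(False)  # a blocking acquire here would hang both threads
        if got:
            other_lock.release()
        could_proceed[name] = got
        tried[name].set()
        tried[other_name].wait(5)        # keep holding until the other side has probed


t1 = threading.Thread(target=init_script, args=('A', lock_a, 'B', lock_b))
t2 = threading.Thread(target=init_script, args=('B', lock_b, 'A', lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
print(could_proceed)
```

Neither thread can take the other's lock while holding its own. With the blocking waits used in the work-around, neither `.release()` would ever be reached.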