Using system.util.getGlobals with classes

Hey all,
We've got a system that takes in orders and breaks them into sub-tasks, which are then carried out by our machinery.

Some of our developers want to use an API to send the running status of these jobs and their sub-tasks to a remote system.

To store all of the sub-task statuses and details for later transmission, I'll need some way to append to a shared object from several different scripts.

My plan currently is to use one of two different options:

  • Use document-type memory tags: I'll create a new memory tag for each job, then delete it once the status is sent to the backend and it responds with an okay message. I'm personally not a fan of this one, as there's a risk of accumulating many, many tags, and reading and writing from many scripts could cause issues.

  • Use a class instance that returns the data I want and has functions to append sub-task statuses; this would be stored via system.util.getGlobals so it can be updated from any of our various scripts across the system.

I'm personally leaning toward option 2, but how would I use system.util.getGlobals() to declare my class instance, and how can I then call it elsewhere and/or drop the instance when the job is completed?

If I understand correctly, you don't want to store the class instances themselves in the globals dictionary because of the high risk of nasty memory leaks. You might want to store just data in the globals, in the form of native objects like dictionaries, and keep the logic separate.
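As a minimal sketch of that advice (the `get_job_state` helper and the job-dict layout are hypothetical, and the `_globals` dict stands in for what `system.util.getGlobals()` would return inside an Ignition gateway script):

```python
# Stand-in for system.util.getGlobals(); in Ignition, replace this with
# the real call.  Only native dicts/lists/strings/numbers are stored.
_globals = {}

def get_job_state(job_name):
    # setdefault returns the existing state dict if present, otherwise
    # inserts and returns a fresh one, so every caller shares one dict.
    return _globals.setdefault(str(job_name), {"status": "new", "sub_tasks": {}})
```

Because only a native dict lives in the persistent store, a scripting restart replaces the helper functions but not the data.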

The following links have some info on how to use the globals dictionary:


For what Hayden describes as the use case:


Don't put user-defined class instances into the system.util.getGlobals() dictionary. Only java or jython native object types are safe in persistent contexts. (The exception is if you have very robust code that replaces every single one of those instances with new ones (running new code) every time the script environment restarts.) Be aware that Ignition is fundamentally multi-threaded and careful synchronization is always required.

I've found that the best combination is to design classes that wrap around a standard python dictionary, which is safe to place in the getGlobals() dictionary, containing the class's state. And any nested classes would have to be designed to wrap inner dictionaries in the same way. Such dictionaries would typically need to contain a jython lock object to be used with critical sections in the class methods. While jython's base objects are thread-safe (they are implemented with java's java.util.concurrent package), that doesn't make user algorithms thread-safe.
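A minimal stand-alone sketch of that pattern (illustrative names only, nothing here is Ignition API): the shared state is a plain dict carrying its own lock, and any read-modify-write sequence runs inside a critical section:

```python
from threading import RLock, Thread

# Plain dict as the shared state; this shape is safe to keep in the
# getGlobals() dictionary.  The lock travels with the state so every
# context uses the same one.
shared = {"count": 0, "lock": RLock()}

def bump(n):
    for _ in range(n):
        # Incrementing is a read-modify-write, so it needs the lock even
        # though the dict's individual operations are thread-safe.
        with shared["lock"]:
            shared["count"] += 1

threads = [Thread(target=bump, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the `with shared["lock"]:` line, concurrent increments could interleave and lose updates, which is exactly the "thread-safe objects don't make algorithms thread-safe" point above.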


So if the memory-leak risk is that bad, I'm wondering whether there's any advantage to using a class or Java function over making a new Python script with module-level variables and functions that read and write an object stored in getGlobals().

E.g.

def makeObject(job_name):
     instance = system.util.getGlobals().setdefault(str(job_name),{myObjStruct})

def appendSubTask(job_name, sub_task_sts):
     system.util.getGlobals()[str(job_name)]['sub_tasks'] = sub_task_sts

def delObject(.......

def transmitObject(.......

I guess this would still return a similar result without the risky memory issues of using Python classes? Or are these thread-safe Jython functions a superior way of doing it? My concern is that if we have many concurrent jobs, I don't want to lose the state of my jobs.

You don't need system.utils.globals to be able to access things from various places in your system.

Maybe there's another way to do it; I just want a single persistent space to alter and read my values across the gateway and Perspective/Vision that isn't bound by threading issues in the tag provider.

Persistent and across the gateway are important concepts here.

globals might be the tool for the job then. But do you need custom objects? Maybe a list of dicts could be enough.


Currently the API we are posting to has a JSON object structure our backend team has developed; we'll need the data to be sent in that format. Theoretically we could rebuild the structure each time we want to send it off, but if I can build an object struct that is persistent, that would be better.

Hence I wanted to build a persistent object, so I could append sub-task statuses to it and send that object when requested, then drop the object when the whole job is completed.

Globals could be possible, so:

global Job_Object
Job_Object = {object_struct}

then calling Job_List.Job_Object in scripts or buttons. But will this global remain persistent over some period of time? And how can I create new instances, as ideally we're handling multiple orders at any given time?

But I'll want a new object structure for each job. I could append these to a list, but I'm wondering which approach is safer.

Sorry, I meant system.util.globals, not just global.

Can't your jobs and sub-tasks be represented by simple dicts? Those could be stored safely in system.util.globals.

That said, I'm not sure that the globals dict is accessible through the whole gateway; I believe it's project specific.

Oh sorry, I think I misunderstood and thought you meant globals is an alternative to system.util.getGlobals

My object will essentially be a JSON object: a complicated Python dict, where some values are themselves dicts, that I convert to a string before transmission.

E.g.

instance = {
    "job_name": foo,
    "job_type": bar,
    "start_time": 123,
    "task_status": {"task one": {...}}
}

etc. etc.
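Serializing such a nested dict for the POST body is straightforward with the standard `json` module (the values below are placeholders; Ignition also provides system.util.jsonEncode, but plain `json` works in Jython 2.7):

```python
import json

# Placeholder job structure in the shape sketched above.
instance = {
    "job_name": "foo",
    "job_type": "bar",
    "start_time": 123,
    "task_status": {"task one": {"state": "running"}},
}

payload = json.dumps(instance)   # string to transmit
restored = json.loads(payload)   # round-trips back to a dict
```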

At first I thought storing a whole class would be a more concise way of storing it in the globals list.

But if storing this dictionary/object instead is better, I'll do that. I didn't know about globals not transferring over to other projects; maybe it will be okay if it's created in my Gateway Scripting Project, but I'll test that out. I think in our current system this won't be an issue, as only this project is using these particular scripts, but I'll look into it.

If you put it there, it will propagate to any leaf projects that inherit it. Probably not the best.

Out of (a possibly morbid) curiosity, how much of a performance hit would you really take by using a more DB-centric method?


Perhaps I am just missing something, but why not just use a document tag to store this JSON structure, then it is both persistent and Gateway global?

Or do the class definition in the Library and create a new instance of it at each use.

Are you meaning that this needs to be a Singleton object?


Essentially we are only using this to send the data to a backend server. Each job our machinery conducts will create a new JSON object to send off; these can then be read and written to from different parts of our project (mostly gateway-scoped scripts or HMI), and our backend system can send GET requests at any time to request the current state of a given task.

I vaguely remember another thread on the forum talking about how appending to tags can cause threading issues, as we can't finely control when the tags update, especially if we write during an update or two scripts write at the same time. I found this when I tried to use system.util.getGlobals for an alarm-sending queue.

Maybe best to keep contained to its own project then, especially as it's not required anywhere else.

Database-wise it could be okay, I guess, though I'm not sure how I'd create the table structure that would be needed if jobs reference many sub-tasks. I guess I would need separate tables for sub-tasks and job 'metadata', and many joins between various tables to do it.

If I was creating a vast system I'd probably spec some kind of MongoDB, but we're only expecting at most a couple dozen tasks at any given time.

I don't understand this. A document tag would be a memory tag, and it would update when and only when you write to it. This doesn't, to the best of my knowledge, fall into any thread un-safe territory.

However, if it is only needed for this project, and it doesn't need to be a Singleton (1 and only one instance of the class), then I would just put the class definition in the script library, and create a new instance when needed. Is there any reason this doesn't meet your needs that I am missing? Is the concern here that it could be possible to have a scripting restart between instance creation and when all of the different attributes are set? (Sorry if you feel that this has already been answered).

You could store the JSON structure in the DB, as opposed to creating a table schema.


So the thread I had found before was this:

Maybe this would only be relevant in a queue-like fashion, where order really matters, as that was what I was developing when I found it.
My worry is that because the object will be appended to often and read at any given time, it may have similar constraints to a queue. But again, maybe this isn't actually a problem.

My other concern is that if these sub-tasks are rapidly appended to, say in a single for loop, and it's being called from another script on a different thread, how will it handle that? Would I call system.tag.writeBlocking on each iteration of that for loop? Or if I write my changes right at the end of the loop, will another script fail to write to it because its sub-task 'doesn't exist' yet?

These are probably one-in-a-million chances, hence why I'm just seeing what others in a similar situation do before I start implementing it. TBH, the tag option isn't a bad idea; probably easier to implement in some ways.

Probably easier to maintain as well, especially if it ends up being someone other than you who has to go in to make a modification and isn't aware of all these caveats and gotchas regarding this methodology. Personally I use database tables all the time to function as queues. Gives me a built in audit trail and survives any gateway reboots.

Seems like a table "mainJobs" and a table "subJobs" with a "parentMainJobId" and perhaps an "ordering" column for the order of sub-tasks would suffice, imo.
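A sketch of that two-table layout, using an in-memory SQLite database purely for illustration (table and column names are the ones suggested above; in Ignition you would run the DDL and queries against your gateway database connection with the system.db functions instead):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mainJobs (
        id INTEGER PRIMARY KEY,
        job_name TEXT NOT NULL,
        job_type TEXT,
        start_time INTEGER
    );
    CREATE TABLE subJobs (
        id INTEGER PRIMARY KEY,
        parentMainJobId INTEGER NOT NULL REFERENCES mainJobs(id),
        ordering INTEGER NOT NULL,  -- position of the sub-task in the job
        task_name TEXT NOT NULL,
        status TEXT
    );
""")
conn.execute("INSERT INTO mainJobs (id, job_name) VALUES (1, 'foo')")
conn.execute(
    "INSERT INTO subJobs (parentMainJobId, ordering, task_name, status)"
    " VALUES (1, 1, 'task one', 'running')"
)
# A single join recovers one job's sub-tasks in order.
rows = conn.execute(
    "SELECT s.task_name, s.status FROM subJobs s"
    " JOIN mainJobs m ON s.parentMainJobId = m.id"
    " WHERE m.job_name = 'foo' ORDER BY s.ordering"
).fetchall()
```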


I think you're probably right. This was only supposed to be for sending off to the backend system.

But keeping it for long-term auditing/historical storage on top of that may have some good benefits as well.

Would also be good for redundant hardware to maintain, if we decide that we want that later.

If you are trying to create a queue where the lifetime of the object needs to potentially survive a scripting restart, then yes this can be an issue, but I'm not getting that from what you have said so far.

The tag system is multithreaded, but that doesn't mean that your script is being called from multiple threads. If it is, because of how you are calling the script, then you will need to implement some type of locking to ensure that whatever you use to store the values can only be accessed by one thing at a time.

This sounds very complex, the DB approach sounds more and more appealing to me.

I recommend never doing tag writes or reads within a for loop; it is a huge performance hit. Instead, always construct the object(s) and do all writes afterwards. My general rule of thumb is that any interaction with the tag system should have at most one call to read values and one call to write values. There are exceptions to this (for writing, anyway; I have not run into one for reading), but they are few.
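To illustrate the rule of thumb, here is a toy sketch in which `writeBlocking` is a stub that only records batched calls (the real API is system.tag.writeBlocking(paths, values); the tag paths below are made up):

```python
# Stub standing in for system.tag.writeBlocking so the pattern runs
# outside Ignition; it records each batched call it receives.
calls = []
def writeBlocking(paths, values):
    calls.append((list(paths), list(values)))

statuses = ["done", "running", "queued"]

# Anti-pattern: calling writeBlocking once per loop iteration would make
# len(statuses) round trips to the tag system.
# Preferred: build both lists first, then make a single batched call.
paths = ["[default]Jobs/foo/task%d" % i for i in range(len(statuses))]
writeBlocking(paths, statuses)
```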


So, I was nerd-sniped pretty intensely by this yesterday, as the nesting of persistent dictionaries has some performance considerations that suggest some instance caching is appropriate.

This is utterly untested, but should give you a framework to use if you decide you really don't want the latency of database storage (and I wouldn't use document tags):

from threading import RLock

_g = system.util.getGlobals()

# This class requires a dictionary in its constructor to
# carry state.  It delegates attribute assignment and retrieval
# to corresponding keys in the state dictionary.  For compatibility
# with persistent object storage (like _g above), it is vital that
# only native Jython or Java object instances be stored in
# the state dictionary.
#
# Attributes that begin with an underscore are not diverted
# to the state dictionary.
#
class DictWrapper(object):
	def __init__(self, persistenceDictionary):
		self._myState = persistenceDictionary

	# Note that __getattr__ is only called when the required name
	# doesn't exist *as an attribute*.
	def __getattr__(self, attr):
		try:
			return self._myState[attr]
		except KeyError:
			raise AttributeError(attr)

	# Note that __setattr__ is *always* called for attribute
	# assignment and must explicitly defer to the superclass
	# for attributes that must not be diverted.
	def __setattr__(self, attr, val):
		if attr.startswith('_'):
			super(DictWrapper, self).__setattr__(attr, val)
		else:
			self._myState[attr] = val

# Implement a job-with-nested-tasks class architecture using persistent
# object storage.  Jobs and tasks within jobs are *named* for maximum
# concurrency.  Each instance is initialized with .deleted set False
# so that deletions in parallel contexts are noticeable.  (Any instance
# that sets .deleted is expected to then delete from the persistent
# dictionary.  Parallel instances will notice the deleted status upon
# cache lookup and prune at that point.)

jobStates = _g.setdefault('JobStatesByName', {})

# A job instance wraps a state dictionary that is maintained within
# _g so that all jython contexts will share state.  The class itself
# maintains a cache of all wrapped instances.
#
# Generally, job instance algorithms would use `with self.lock:` for
# critical sections where thread safety matters.
class JobTracker(DictWrapper):
	_trackers = {}
	_cacheLock = RLock()

	# Users should never call this.  Unconditional instance creation
	# must use the static method .lookupOrMake() to avoid duplicate
	# cache entries, and to prune deletions.  Retrieve existing
	# instances with .lookup() to ensure deletions are pruned.
	def __init__(self, job_name):
		with JobTracker._cacheLock:
			super(JobTracker, self).__init__(jobStates.setdefault(job_name, {}))
			self._myState.setdefault('tasks', {})
			self._myState.setdefault('lock', RLock())
			self._myState.setdefault('deleted', False)
			self._tasksCache = {}
			# Make this instance's name effectively constant.  Attribute
			# writes to job_name will be diverted to _myState, but attribute
			# reads will get this value.
			self.__dict__['job_name'] = job_name
			JobTracker._trackers[job_name] = self

	# An instance's delete method handles the persistence details.
	def delete(self):
		with self.lock:
			self.deleted = True
			with JobTracker._cacheLock:
				jobStates.pop(self.job_name, None)
				JobTracker._trackers.pop(self.job_name, None)

	@staticmethod
	def _cachedJob(job_name):
		# Fast path is unlocked.
		job = JobTracker._trackers[job_name]
		if job.deleted:
			with JobTracker._cacheLock:
				# Repeat the lookup under the lock
				job = JobTracker._trackers[job_name]
				if job.deleted:
					JobTracker._trackers.pop(job_name, None)
					raise KeyError(job_name)
		return job

	@staticmethod
	def lookup(job_name):
		# Fast path is unlocked.
		try:
			return JobTracker._cachedJob(job_name)
		except KeyError:
			# Try again under the cache lock
			with JobTracker._cacheLock:
				try:
					return JobTracker._cachedJob(job_name)
				except KeyError:
					# This will give another key error if it really doesn't exist
					jobPersistence = jobStates[job_name]
					# If no error, instantiate.
					return JobTracker(job_name)

	@staticmethod
	def lookupOrMake(job_name):
		# Fast path is unlocked.
		try:
			return JobTracker._cachedJob(job_name)
		except KeyError:
			# Try again under the cache lock
			with JobTracker._cacheLock:
				try:
					return JobTracker._cachedJob(job_name)
				except KeyError:
					# Unconditionally make an instance.
					return JobTracker(job_name)

	# Manage a cache of nested tasks' instances, similar to the cache of
	# job instances.
	def _cachedTask(self, task_name):
		# Fast path is unlocked.
		task = self._tasksCache[task_name]
		if task.deleted:
			with self.lock:
				# Repeat the lookup under the lock
				task = self._tasksCache[task_name]
				if task.deleted:
					self._tasksCache.pop(task_name, None)
					raise KeyError(task_name)
		return task

	def getTask(self, task_name):
		# Fast path is unlocked.
		try:
			return self._cachedTask(task_name)
		except KeyError:
			# Try again under the job lock
			with self.lock:
				try:
					return self._cachedTask(task_name)
				except KeyError:
					# This will give a final key error if it really doesn't exist
					taskPersistence = self.tasks[task_name]
					# Wrap it in the Task class
					return TaskTracker(self.job_name, task_name)

	def getOrMakeTask(self, task_name):
		# Fast path is unlocked.
		try:
			return self._cachedTask(task_name)
		except KeyError:
			# Try again under the job lock
			with self.lock:
				try:
					return self._cachedTask(task_name)
				except KeyError:
					# Unconditionally make a Task instance
					return TaskTracker(self.job_name, task_name)

	# Provide a helper for iteration over a job's tasks that
	# supplies the task instances instead of the task's persistence
	# dictionary, pruning automatically.
	@property
	def _tasks(self):
		pruneKeys = set(self._tasksCache.keys())
		for k, v in self.tasks.items():
			pruneKeys.discard(k)
			try:
				yield self.getTask(k)
			except KeyError:
				# Only here when a task is pruned concurrently with
				# this generator loop.
				pass
		for k in pruneKeys:
			try:
				self._cachedTask(k)
			except KeyError:
				pass

# A task instance wraps a state dictionary that is maintained within
# the job's state dictionary so that all jython contexts will share state.
# The job instance maintains a cache of all wrapped task instances.
#
# Task lookup and creation must be performed by the job to maintain
# the cache and perform deletion pruning.
#
# Generally, task instance algorithms would use `with self.lock:` for
# critical sections where thread safety matters.
class TaskTracker(DictWrapper):
	def __init__(self, job_name, task_name):
		# The following will fail if the job doesn't exist.
		self._job = JobTracker.lookup(job_name)
		with self._job.lock:
			super(TaskTracker, self).__init__(self._job.tasks.setdefault(task_name, {}))
			self._myState.setdefault('deleted', False)
			self._myState.setdefault('lock', RLock())
			# Make this instance's name effectively constant.  Attribute
			# writes to this will be diverted to _myState, but attribute
			# reads will get this value.
			self.__dict__['task_name'] = task_name
			self._job._tasksCache[task_name] = self

	def delete(self):
		with self.lock:
			self.deleted = True
			with self._job.lock:
				self._job.tasks.pop(self.task_name, None)
				self._job._tasksCache.pop(self.task_name, None)