I tried JProfiler, and I did end up identifying the problem and finding a solution.
Running JProfiler on the server where Ignition is running and hammering the save button in a designer, I found that a large number of objects were being created and never released. Looking in Heap Walker > Biggest Objects, I could see all the memory was being allocated to
org.python.core.PySystemState$PySystemStateCloser. From there, my best guess was that the caching function was somehow preventing cached data from being cleaned up properly. Turning off caching so the function results were simply returned every time, restarting the Ignition server, and then repeating the same process as before bore this theory out.
My implementation of the LRU cache is nearly identical to functools.lru_cache in the Python 3 standard library. In that design, and in pretty much every other Python design built on a doubly-linked list, the nodes do reference one another and do create circular references. I believe this is not a problem in CPython, partly because its cycle-detecting garbage collector can reclaim such reference cycles, and partly because the LRU cache dumps old unused cached values, so the cache never grows to anything significant.
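To make the circular-reference problem concrete, here is a minimal sketch of the typical doubly-linked-list LRU cache design (this is an illustration in the spirit of functools.lru_cache, not my exact code). Note how `prev`/`next` point directly at neighbouring nodes, so every pair of adjacent nodes forms a reference cycle:

```python
class Node(object):
    """Doubly-linked list node; prev/next hold direct references to
    neighbouring nodes, which is where the reference cycles come from."""
    __slots__ = ("key", "value", "prev", "next")

def lru_cache(maxsize=128):
    """Sketch of a lru_cache-style decorator (positional args only)."""
    def decorator(func):
        cache = {}                    # key -> Node
        head = Node()                 # sentinel; head.next is most recent
        head.prev = head.next = head  # circular by construction

        def unlink(node):
            node.prev.next = node.next
            node.next.prev = node.prev

        def push_front(node):
            node.prev, node.next = head, head.next
            head.next.prev = node
            head.next = node

        def wrapper(*args):
            key = args                # the raw params are the key
            node = cache.get(key)
            if node is not None:      # hit: move to the front
                unlink(node)
                push_front(node)
                return node.value
            node = Node()
            node.key, node.value = key, func(*args)
            cache[key] = node
            push_front(node)
            if len(cache) > maxsize:  # evict least recently used
                lru = head.prev
                unlink(lru)
                del cache[lru.key]
            return node.value
        return wrapper
    return decorator
```

CPython's garbage collector will eventually reclaim these cycles; the Jython interpreter instances inside Ignition evidently did not.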
Normally in CPython you would just kill the interpreter process and that would be that. On the Ignition server, there were many instances of
PySystemStateCloser, each consuming ~6 MB. After the fix to the caching function, I am down to just one. It appears that whatever Ignition/Jython does when restarting the gateway interpreter following a project save cannot handle this circular reference problem, so the memory remains allocated until Ignition is restarted.
To actually fix the caching function, I did the following:
- Store the hash of the function parameters as the key instead of the parameters themselves. This fixed a problem when the decorator was used on class methods: self is passed as a parameter, and the class references the decorator, so you end up with a circular reference.
- Use weakref.ref to create "links" between the nodes, removing the circular reference between the nodes themselves. The weakrefs were still a problem, though, and still stuck around forever; it was just less of a problem, so at first it appeared to have worked. Changing to storing hash values in each node fixed that problem; it just requires a dict lookup to get the adjacent nodes.