Stopping File Leak on Gateway Script Restarts from TCP Socket Connection script

I have a device that I need to connect to over a TCP socket, register what types of messages I want to receive (by sending a TCP packet for registration) and then store messages received from it in a DB. I have a main() that tries to connect, register, and read from the tcp socket connection in a loop. Main is not directly called but rather invokedAsync by threadManage(). The threadManage() script is run on Gateway Startup Event. Previously run threads are stored in system.util.getGlobals() (I have also tried pturmel’s LifeCycle module). When threadManage() is called it interrupts and joins the previous thread before invokeAsync(main). The thread in invokeAsync is stored in getGlobals for the next time.

The issue I’m running into is that every time the gateway scripts restart due to a save/edit the is an open file/socket leak (We are running on Linux so open files and sockets are one in the same). I can’t figure out what I am doing wrong that is handling closing previous threads or sockets. So I’m looking for insight into how my closing of the threads/socket is still leaving a file leak.

def main(host_ip, host_port):
	"""
	Loop forever. Try to connect, subscribe, and read.
	If something fails, wait a short time and try again.
	:param host_ip: The IP address of the host to connect to
	:type host_ip: str
	:param host_port: The port to connect to on the host
	:type host_port: int
	"""
	sock = None
	import java.lang.Thread as thr
	import java.lang.Thread.State as ts
	while True:
		try:
			#g = system.util.persistent('threading') # grab globally persistent dictionary
			g = system.util.getGlobals()

			if 'sock' in g:
				try:
					g['sock'].shutdown(socket.SHUT_RDWR)
					g['sock'].close()				

				except:
					pass
			g['sock'] = None
			
			#g = system.util.persistent('threading')
			g = system.util.getGlobals()
			
			if  g['thread'].getState() == ts.TERMINATED or  g['thread'].interrupted():
				print 'here'
				return
			if  g['thread'].id != thr.currentThread().id:
				return
			
			g['sock'] = try_to_connect(host_ip, host_port)
			
			
			print g['sock']
			if g['sock']:
				subscribed_ok = subscribe()
		
				if subscribed_ok:
					read_until_failure()
		
				g['sock'].close()
			else:
				print 'sleeping'
				time.sleep(10.0)
		except Exception as e:
			print 'exception thrown'
			print e.message
		finally: 
			#g = system.util.persistent('threading') # grab globally persistent dictionary
			g = system.util.getGlobals()
			if g['sock']:
				g['sock'].shutdown(socket.SHUT_RDWR)
				g['sock'].close()
			g['sock'] = None # to ensure garbage clean up
			

def ThreadManage():
	"""
	Function to create a socket thread and interrupts previous thread before starting its own.
	"""
	# imports
	import time
	import system
	import java.lang.Thread as thr
	import java.lang.Thread.State as ts
	from java.util.concurrent import Semaphore
	from java.util.concurrent import TimeUnit
	
	system.tag.writeBlocking(['[default]Test/NML/timerScriptTime'],[system.date.now()])
	
	#g = system.util.persistent('threading') # grab globally persistent dictionary
	g = system.util.getGlobals()
	
	if not 'lock' in g: 
		g['lock'] = Semaphore(1) # create Semaphore lock if one does not already exist
	
	#lock = g['lock']
	acquired = g['lock'].tryAcquire(5,TimeUnit.SECONDS) # Try to acquire the lock for 5 seconds
	
	if not acquired:
		return # end function if lock was not acquired
		
	try:
		if 'thread' in g: # if thread exists interrupt and join
			g['thread'].interrupt()
			g['thread'].join()
			system.tag.writeBlocking(['[default]Test/NML/globalsIsNone'], [False])
		else:
			system.tag.writeBlocking(['[default]Test/NML/globalsIsNone'], [True])
		g['thread'] = system.util.invokeAsynchronous(main,[HOST_IP, HOST_PORT]) # spin up new socket thread and store thread in global dict
		return
	except:
		pass
		
	finally:
		g['lock'].release() # always release the lock

system.util.getGlobals() lost its persistence across script restarts somewhere early in the v7.9.x range, and not fixed until 8.1.

See this topic:

Note: I am not supporting v8.0 as it is not receiving any more updates.

Also, system.util.persistent() doesn't exist without the module linked above.

@pturmel Sorry I should have included my Ignition version. The gateway is running v8.1.7 and the LifeCycle module I tried was the v8.0 one. I’ve confirmed that on gateway script restarts I find the key ‘thread’ and not just an empty dictionary.

Does anything immediately look wrong with my implementation of interrupting the old thread and waiting for it to end? And same thing for my clean up of the previous socket? I took inspiration for my own implementation from a few forum posts including this one, but still have an issue with file leaks. Managing multiple asynchronous threads - #2 by pturmel

I didn’t closely review your code. I would only note that I avoid locking. I recommend relying on jython’s dictionary being an implementation of a ConcurrentMap. The .setdefault() method is particularly helpful. Or for more picky cases, use a java ConcurrentHashMap directly within a persistent dictionary. Then you can use the fact that a java Map's .put() method returns the object displaced in the map, if any. That doesn’t need locking.

Meanwhile, testing code like this generally requires lots of logging, so you can see threads starting and retiring, sockets opening and closing, etc. All with their java hash ids, preferably. Then you can see for sure that your thread and socket life cycles are doing what you intend.

I still don’t have a fix/solution for my problem. But after digging into what lines of code I could change or comment I’ve discovered that I think the cause of the file leak is from import socket I have made a new function in a global script that is simply:

def runTest():
	import socket
	return

When runTest() is run on a gateway startup event script and the gateway scripts restart, the total number of open files found from the command lsof -p 23215 | wc -l increases by what seems to be exactly 30. However, if I just comment out the import socket then the file leak stops happening on gateway scripts restarts. I don’t have any idea what could be causing the file leak from the import of socket.py even after looking at the socket.py file. I also don’t know what I can use as a solution here.

Don’t import within the function. Import at the top level of a project script module.

Also, you should consider re-implementing with java’s network classes. Lots of grief here with jython sockets.

I see the same issue here whether the import is in the function or at the top of the file

import socket

def runTest():
	
	return

I have figured out a work around which is storing the socket module object in the globals dictionary directly. I really don’t like this as I feel a module import should work/clean up without any file leaks, but storing it in globals means the file leak only happens once as long as the socket module stays stored in the globals dictionary. The code as written below does not have a file leak on gateway scripting restarts for me, but commenting socket = g['socketModule'] and uncommenting import socket creates a file leak on every gateway scripting restart

import importlib

def runTest():

	g = system.util.getGlobals()
	if 'socketModule' not in g:
		g['socketModule'] = importlib.import_module('socket')
		
	socket = g['socketModule']
	#import socket
	
	sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
	sock.close()
	return

I’m going to write up a ticket for this. I think a fix on our side may be possible.

Also, @nicoli.liedtke, I suspect storing that reference to the socket module in the globals dictionary is a bad idea and probably a memory leak or something.

Yeah. Uuuuuuuugly.

Let me repeat: