I created my own Python class and added an instance of it to the script module where it is defined. The class acts as a local cache in a client for some components: instead of fetching data across the network, they get it from this class instance. When a window is loaded, each component queries the cache for some info, and if the data is not available locally, the class instance fetches it from the network source. Several components will access this cache concurrently, since the components do this in a call to invokeAsynchronous(). I’m worried about whether this is thread safe. Since each component interacts with the class instance in its own thread, it’s possible (I presume) that two of them could try to access the same data at the same time and conflict. I’ve seen other forum topics that discuss thread safety, but it’s still unclear to me what makes something thread safe or not. How can I make my Python class thread safe? Or can I create another thread and queue access to this object in a single thread instead of each component doing so independently?
Thread safety is a huge topic with no single solution. For a cache, I would recommend using a dictionary and a python lock. Something like this:
import threading

# Simple concurrent cache. Fast for existing entries. New entries serialize on the lock.
myCache = {}
myLock = threading.Lock()

def cacheGet(key, constructorFunction):
    cached = myCache.get(key)
    if cached is None:
        myLock.acquire()
        try:
            cached = myCache.get(key)
            if cached is None:
                cached = constructorFunction(key)
                myCache[key] = cached
        finally:
            myLock.release()
    return cached
Note that there’s no need for a custom class to manage this cache. Just define a function that takes a key and delivers the data to cache. This cacheGet() function is safe to call from many background threads. They won’t block on getting any existing entry, even if another thread is working on a different entry. If two threads try to create an entry for the same key at the same time, the second thread will have to wait on the lock, and will bail out when it gets the lock because the first thread will have set that key in the cache.
The try-finally construct makes sure your lock won’t get stuck if there’s an error in your constructor function.
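To see the behavior described above in action, here is a rough sketch that calls cacheGet() from several threads at once. The cacheGet() function is repeated so the snippet runs on its own. The names slowFetch, fetchCalls, and worker are made up for this illustration, and slowFetch is only a stand-in for whatever network fetch you actually use.

```python
import threading
import time

myCache = {}
myLock = threading.Lock()

def cacheGet(key, constructorFunction):
    cached = myCache.get(key)
    if cached is None:
        myLock.acquire()
        try:
            # Re-check under the lock: another thread may have filled the entry
            cached = myCache.get(key)
            if cached is None:
                cached = constructorFunction(key)
                myCache[key] = cached
        finally:
            myLock.release()
    return cached

fetchCalls = []  # hypothetical bookkeeping: records each real fetch

def slowFetch(key):
    # Hypothetical stand-in for the network fetch
    fetchCalls.append(key)
    time.sleep(0.05)
    return "value for %s" % key

results = []

def worker():
    results.append(cacheGet("tagA", slowFetch))

# Eight threads ask for the same key at roughly the same time
threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the threads finish, fetchCalls should hold a single entry and every thread should have received the same cached value, because the late arrivals wait on the lock and then find the key already set.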
Jython’s built-in data structures (dictionaries, lists, and sets) are thread safe. This means the data structures themselves will not be corrupted, but race conditions are still possible.
More information about this topic at this link: http://www.jython.org/jythonbook/en/1.0/Concurrency.html
Indeed, the purpose of the python lock in my example is to prevent races between two threads trying to construct the value for a particular key, not to protect the dictionary itself.
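A small sketch of that distinction, assuming nothing beyond the standard threading module: each individual dictionary operation is safe, but a read-modify-write made of two operations is not, so updates can be lost unless your own lock covers the whole step. The names counts, countsLock, unsafeIncrement, safeIncrement, and runThreads are invented for this example.

```python
import threading

counts = {}
countsLock = threading.Lock()

def unsafeIncrement(key):
    # Two separate dict operations: a get, then a set.
    # Two threads can read the same old value and one update is lost.
    counts[key] = counts.get(key, 0) + 1

def safeIncrement(key):
    # The lock covers the whole read-modify-write step
    countsLock.acquire()
    try:
        counts[key] = counts.get(key, 0) + 1
    finally:
        countsLock.release()

def runThreads(target, key, threadCount=8, perThread=1000):
    def loop():
        for _ in range(perThread):
            target(key)
    threads = [threading.Thread(target=loop) for _ in range(threadCount)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts[key]

safeTotal = runThreads(safeIncrement, "safe")
unsafeTotal = runThreads(unsafeIncrement, "unsafe")
```

The locked version always ends at threadCount * perThread. The unlocked version may come up short on some runs and not on others, which is what makes this kind of race hard to spot. Either way, the dictionary itself stays intact.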
So the Jython data structures essentially have a built-in ‘lock’?
Yes, but that built-in lock only protects the data structures themselves from simultaneous access. It won’t protect your algorithm.
Python data structures normally need to be thread safe by the spec (though deadlocks cannot be avoided on thread-safe objects without knowing the calling code, as Phil said). Jython follows the spec pretty well. The main problems with thread safety happen when you import Java classes in Jython (like the Java ArrayList, or any Java method that uses ArrayList internally). These aren't thread safe, and accessing them from different threads may even corrupt the entire structure.
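One possible way to share such an object between threads is to route every call through a single lock. This is only a sketch: LockedCollection and its call() method are hypothetical helpers, not part of Jython or Java, and a plain Python list stands in for something like a Java ArrayList so the snippet runs anywhere.

```python
import threading

class LockedCollection(object):
    """Hypothetical wrapper: every call on the wrapped object goes through one lock."""

    def __init__(self, wrapped):
        self._wrapped = wrapped
        self._lock = threading.Lock()

    def call(self, methodName, *args):
        # Look up and invoke the method while holding the lock
        self._lock.acquire()
        try:
            return getattr(self._wrapped, methodName)(*args)
        finally:
            self._lock.release()

# A plain Python list stands in for a non-thread-safe Java collection
shared = LockedCollection([])

def producer(start):
    for i in range(start, start + 100):
        shared.call("append", i)  # this would be "add" on a Java list

threads = [threading.Thread(target=producer, args=(n * 100,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = shared.call("__len__")
```

This only makes each individual call atomic. A sequence of calls (check the size, then add) still needs one lock held around the whole sequence, the same as with the dictionary cache.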