Read UDT Instance vs Read Members

I have a UDT definition with 19 members, and I need to read 3 of them. I'm wondering whether it's more efficient to read the 3 I need, or whether I should read the entire UDT and extract the values from the dictionary.

Or, what's best practice?

The code is a bit simpler if I extract from the dictionary, because I don't have to build the individual member paths and then extract the values in parallel.

Are you reading a LOT of those?

If not, don't worry about performance; do it in the simplest possible way.

I'll end up reading all instances of the UDT that exist on the gateway.

I'm not sure it really matters all that much, unless you're doing it a number of times, in which case fewer reads will always be more performant.

That said, this is pretty much a one-liner, so I'm not sure what simpler looks like.

Either way, it's one call to system.tag.readBlocking()
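For example, both of these are a single call (a minimal sketch, with a hypothetical instance path and hypothetical member names, relying on the instance read coming back as a dictionary as you described):

# Option 1: read the whole instance and pull members out of the dictionary
equip = system.tag.readBlocking(['[default]Equipment/Conveyor1'])[0]	# hypothetical instance path
number = equip.value['eqNumber']

# Option 2: read only the members you need
qvs = system.tag.readBlocking([
	'[default]Equipment/Conveyor1/eqNumber',
	'[default]Equipment/Conveyor1/eqType',
	'[default]Equipment/Conveyor1/eqName',
])
number, eqType, eqName = [qv.value for qv in qvs]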

The code is much simpler and more readable when reading the entire UDT:

headers = ['Number', 'Type', 'Name']
paths = getUDTtagPathList('Grain_Equipment')

# One read of every UDT instance; each value comes back as a dictionary keyed by member name
equipment = system.tag.readBlocking(paths)

data = []
for equip in equipment:
	data.append([equip.value['eqNumber'], equip.value['eqType'], equip.value['eqName']])

return system.dataset.toDataSet(headers, data)

instead of

headers = ['Number', 'Type', 'Name']
paths = getUDTtagPathList('Grain_Equipment')

# Build one member path per header for every instance
tagsToRead = [str(path) + '/' + suffix for path in paths for suffix in headers]

tagsVals = [tag.value for tag in system.tag.readBlocking(tagsToRead)]

# Regroup the flat value list into rows, one row per instance
data = [tagsVals[i:i + len(headers)] for i in range(0, len(tagsVals), len(headers))]

return system.dataset.toDataSet(headers, data)

I'm not sure what the comprehension for data is trying to achieve; you're basically going a long way around stepping through each value in tagsVals, which you already know will be in groups of the length of headers.

Either way, the dictionary script:

headers = ['Number', 'Type', 'Name']
paths = getUDTtagPathList('Grain_Equipment')

equipment = system.tag.readBlocking(paths)

data = [[equip.value['eqNumber'], equip.value['eqType'], equip.value['eqName']] for equip in equipment]

return system.dataset.toDataSet(headers, data)

For the second one, I would throw this function into your utilities library because it is super handy for things just like this:

def chunker(seq, size):
    # Yield successive size-length slices of seq (lazy, via a generator expression)
    return (seq[pos:pos + size] for pos in xrange(0, len(seq), size))
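For example (illustrative), it just yields successive slices:

list(chunker([1, 2, 3, 4, 5, 6], 3))    # -> [[1, 2, 3], [4, 5, 6]]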

Then your second script can be this:

EDIT: replaced zip with product; as @pascal.fragnoud correctly pointed out, zip is not what we want in this instance.

from itertools import product

headers = ['Number', 'Type', 'Name']
# Every (instance, member) combination, in instance-major order
paths = ['{}/{}'.format(path, suffix) for path, suffix in product(getUDTtagPathList('Grain_Equipment'), headers)]

data = [[qv.value for qv in chunk] for chunk in chunker(system.tag.readBlocking(paths), len(headers))]

return system.dataset.toDataSet(headers, data)

You may feel that is less readable (particularly for the uninitiated), and that's fair. To me the difference is negligible, but as the UDT list grows the second will be more performant. The only way to be certain is to measure the performance on your system. Either one is a perfectly acceptable way to do what you're doing.
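If you do want to measure it on your gateway, here's a quick-and-dirty timing sketch (readWholeInstances and readIndividualMembers are just placeholders for the two approaches above):

from java.lang import System

start = System.nanoTime()
readWholeInstances()	# placeholder: approach 1
print 'instances:', (System.nanoTime() - start) / 1e6, 'ms'

start = System.nanoTime()
readIndividualMembers()	# placeholder: approach 2
print 'members:', (System.nanoTime() - start) / 1e6, 'ms'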

2 Likes

That won't work: zip is only going to match each UDT with one suffix.

Itertools provides a simple way to do this (as it often does):

from itertools import product

udts = ['foo', 'bar', 'pox']
headers = ['number', 'type', 'name']
paths = ["{}/{}".format(path, suffix) for path, suffix in product(udts, headers)]
3 Likes

The data comprehension in the second one was there because the values would be in the wrong order if stepped through one by one, given the way I had constructed the member paths.

Thanks for pointing out the comprehension I missed for the first method.

I'm going to stick with the first method for now. The nice thing about modularity and scripting library is that it's easy to update in the future if it becomes a problem.

Is there any reason not to just use a double for in the comprehension instead of importing from itertools?

paths = ["{}/{}".format(path, suffix) for path in udts for suffix in headers]

It seems to produce the same list, but I'm not sure if there's any performance advantage in using product

1 Like

You're right. I was thinking product, and used zip.

Ughhh.

The double for loop might actually be faster, but it's about conveying intent.
When I read product, I know what's coming out.
It's particularly true when you have more than two parameters. A double for loop is simple enough that it might not warrant importing itertools, but try writing the loop equivalent of this:

things = product("abcdef", (1, 2, 3, 4, 5), ['foo', 'bar', 'pox', 'wuz'], {'e', 'w', 'q', 'a'})

Things will get ugly real quick.
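For comparison, a hand-rolled equivalent of that 4-way product (just a sketch that builds the same tuples eagerly):

things = []
for a in "abcdef":
	for b in (1, 2, 3, 4, 5):
		for c in ['foo', 'bar', 'pox', 'wuz']:
			for d in {'e', 'w', 'q', 'a'}:
				things.append((a, b, c, d))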

4 Likes

readBlocking is guaranteed to return the values in the same order as the paths list that is passed in, and because lists are ordered, that order will not change so long as the list is not intentionally modified.
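That's what makes it safe to pair paths back up with their results, e.g. (sketch):

results = system.tag.readBlocking(paths)
for path, qv in zip(paths, results):
	print path, qv.value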

Another tip: if you do need a range function, get in the habit of using xrange(), as it produces values lazily instead of creating a full list, which is also good for performance.
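For example (Jython/Python 2):

total = 0
for i in xrange(1000000):	# no million-element list is ever built
	total += i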

3 Likes

Well, when I printed the list I generated, it showed Number, Number, Number, Type, Type, Type, Name, Name, Name.

So, yes, I could have adjusted the list prior to the read to get the order I wanted, but I didn't. It's moot now anyway. (Edit: I was mistaken; it was in the correct order.)

We digress...

3 Likes