Set entire column to value with button

dkhayes117 · September 4, 2018, 7:50pm

I’m trying to set an entire column of values equal to 0. I’ve tried about 50 different ways, and keep failing miserably. I’ve found several snippets that are close to what I want, but I can’t get any to work.

table = event.source.parent.getComponent('poGroup').getComponent('Table3').data
pydata = system.dataset.toPyDataSet(table)

for row in pydata:
	row['QUANTITY'] = 0

table.data = pydata

mrogers · September 4, 2018, 7:58pm

Almost there - try this

table = event.source.parent.getComponent('poGroup').getComponent('Table3').data
pydata = system.dataset.toPyDataSet(table)
#this gets a list of column headers from the dataset
headers = system.dataset.getColumnHeaders(table)
for row in pydata:
	row['QUANTITY'] = 0
#You changed it to a pydataset, but you need to get back to a regular dataset
table.data = system.dataset.toDataSet(headers, pydata)

dkhayes117 · September 4, 2018, 8:04pm

This looks better than what I had, but it says that object does not support item assignment, referring to row[‘QUANTITY’] = 0.

PGriffith · September 4, 2018, 8:05pm

Try a different approach, using standard datasets throughout:

colName = 'QUANTITY'
table = event.source.parent.getComponent('poGroup').getComponent('Table3')
ds = table.data
colIndex = ds.getColumnIndex(colName)
colType = ds.getColumnType(colIndex)
ds = system.dataset.filterColumns(ds, [i for i in range(ds.columnCount) if i != colIndex])
ds = system.dataset.addColumn(ds, colIndex, [0] * ds.rowCount, colName, colType)

table.data = ds

mrogers · September 4, 2018, 8:08pm

I was going to suggest using system.dataset.setValue iterating through rows, but this looks like a more straightforward way.

dkhayes117 · September 4, 2018, 8:08pm

I got this with your code

TypeError: can't put element of type class java.lang.Integerin column of type class java.lang.Double

PGriffith · September 4, 2018, 8:46pm

I don't see how - filterColumn will remove the type completely, and the newly generated row should work just fine. I did some testing and wasn't able to reproduce that error.

I was curious about this, so I did some benchmarking, and (not really surprisingly) setting the values-per-row is incredibly slower - many orders of magnitude as the row count increases:

Running 'zeroColumnAtOnce' on a dataset with 100 rows 10000 times took 204ms; average: 0.0204ms
Running 'zeroColumnIteratively' on a dataset with 100 rows 10000 times took 5133ms; average: 0.5133ms
Running 'zeroColumnAtOnce' on a dataset with 1000 rows 1000 times took 98ms; average: 0.098ms
Running 'zeroColumnIteratively' on a dataset with 1000 rows 1000 times took 39854ms; average: 39.854ms

And the code I used, including a slightly cleaned up version of what I originally posted:

from java.lang import Double, Float, Long

def zeroColumnAtOnce(ds, colName):
	colIndex = ds.getColumnIndex(colName)
	colType = ds.getColumnType(colIndex)
	ds = system.dataset.filterColumns(ds, [i for i in range(ds.columnCount) if i != colIndex])
	ds = system.dataset.addColumn(ds, colIndex, [0] * ds.rowCount, colName, colType)
	
	return ds
	
def zeroColumnIteratively(ds, colName):
	copy = ds
	for row in range(ds.rowCount):
		copy = system.dataset.setValue(copy, row, colName, 0)
	return copy
	
def generateSample(rows=1000, cols=5):
	sampleDataHeaders = ["Column %s" % (col + 1) for col in range(cols)]
	sampleData = [[i, Float(i), Double(i), Long(i), str(i)] for i in range(rows)]
	return system.dataset.toDataSet(sampleDataHeaders, sampleData)

def time(function, runs, **kwargs):
	start = system.date.now()
	data = generateSample(**kwargs)
	for run in range(runs):
		function(data, "Column 3")
	end = system.date.now()
	elapsed = system.date.millisBetween(start, end)
	avg =  elapsed / float(runs)
	rows = data.rowCount
	print "Running '%s' on a dataset with %s rows %s times took %sms; average: %sms" % (function.__name__, rows, runs, elapsed, avg)
	
time(zeroColumnAtOnce, 10000, rows=100)
time(zeroColumnIteratively, 10000, rows=100)

KathyApplebaum · September 4, 2018, 9:13pm

To expand on why setting one value at a time is slower, remember that datasets are immutable. That means each time you call system.dataset.setValue(), all the data is copied to a whole new dataset. So changing a single value in each of 1000 rows means each value in the entire dataset is read and written to a new dataset 1000 times. If you do filterColumns and then addColumn, each value is read twice and written twice (once for each operation).

Always prefer the bulk operations on datasets when you can.

pturmel · September 4, 2018, 9:50pm

To keep the overhead to the barest minimum, use DatasetBuilder and do everything in one pass.

dkhayes117 · September 5, 2018, 12:49pm

I would never call myself an actual programmer, I get by with what I know and that’s it. I still don’t have any working code. I don’t really care about efficiency, I’m looking for simple code. I will not be using this over a 50 row count.
753rd try doesn’t error out, but nothing happens either:

table = event.source.parent.getComponent('poGroup').getComponent('Table3').data

for i in range(table.rowCount):
	newData = system.dataset.setValue(table, i, 'QUANTITY', 0)
table = newData

pturmel · September 5, 2018, 1:06pm

That's because your final assignment is to a local variable, instead of the data property of Table3. You are also overwriting newData in every iteration, so the final assignment will only reflect a change to the last row.

MMaynard · September 5, 2018, 1:20pm

You need to wrap the 0 in quotes to use this.

I tested it with a standard power table with random data on the string column.

dkhayes117 · September 5, 2018, 1:28pm

I appreciate the replies, I got it working finally with a little nudge from @pturmel. I’m sure that @PGriffith’s solution is more efficient then what I came up with, but it works and I’m using it on small datasets so if the shoe fits!

table = event.source.parent.getComponent('poGroup').getComponent('Table3') idx = table.data.rowCount for i in range(idx): newData = system.dataset.setValue(table.data, i, 'QUANTITY', 0) table.data = newData

pturmel · September 5, 2018, 1:54pm

Ugh! No. Don’t assign to table.data within the loop. That’ll generate a propertyChange event for every row you change. Use a local dataset variable instead of table.data within the loop (preloading with table.data before the loop). Then assign back to table.data after the loop, yielding a single change event.

dkhayes117 · September 5, 2018, 2:10pm

How’s this

table = event.source.parent.getComponent('poGroup').getComponent('Table3') copy = table.data idx = table.data.rowCount for i in range(idx): newData = system.dataset.setValue(copy, i, 'QUANTITY', 0) copy = newData table.data = copy

pturmel · September 5, 2018, 2:14pm

You’re an actual programmer now, @dkhayes117 !

ctonge · September 7, 2018, 1:16pm

Just going to put a plug in for the Perfect Abstractions mutable pydataset here…check it out in their Power Scripting module. For operations like this it’s much more intuitive. Not sure how it stacks up in performance tests vs system.dataset operations, but I’d have to guess it’s more efficient.

pturmel · September 7, 2018, 1:24pm

Dangerous. All other datasets are immutable (after finalization for a few specialized ones in gateway context), and deliberately so. Immutability is crucial to generating propertyChange events within the Java Swing ecosystem. The presence of this datatype in the Power Scripting module prevents me from ever recommending it, despite its other useful features.

ctonge · September 20, 2018, 7:59pm

If you can expand on what you mean by dangerous I’m all ears. Converting a table component’s dataset to a pa mutable dataset, and manipulating it’s data, then converting back to an ignition dataset for your table component is a bad thing? Keep in mind that you are levels above me in your knowledge here, I’m a converted SCADA guy, not a software developer.

pturmel · September 21, 2018, 2:45am

No, that part is fine. It's the assigning it to the table component and mutating in place (because it's mutable) that is bad.