How to get the number of rows in a table?

I am using a query binding, and attempting to use a Script transform to get the number of rows in the dataset, because I did not find a prop on the table that would allow me to get the number of rows.

In the Script console, I can make a dictionary like this:

d = {"name": ["one", "two","three","four"], 
	"number": [1,2,3,4]}

Then print(len(d['name'])) and get 4.

But, when trying to do the same in a Script transform, I get messages like the index must be an integer, or something about not being able to be coerced into a PyDataSet.

I've tried both toDataSet and toPyDataSet, to no avail, with expressions such as len(dataset[0]['name']). As far as I can tell, len(dataset[0]) evaluates to the number of columns (I think).

What am I missing? Thanks!

Edit:
And this:

def transform(self, value, quality, timestamp):
	set1 = len(value)
	return set1

Yields: basic streaming dataset has no len().

BUT, this seems to work:

def transform(self, value, quality, timestamp):
	set1 = system.dataset.toPyDataSet(value)
	set2 = len(set1)
	return set2

You can also do this:

headers = ['test']
data = [[1],[2]]
ds = system.dataset.toDataSet(headers, data)
print ds.getRowCount()
print ds.rowCount
pyDs = system.dataset.toPyDataSet(ds)
print pyDs.getRowCount()
print pyDs.rowCount

Result:

>>>
2
2
2
2

You're working with datasets directly, not Python data structures, so the Python builtin len has no idea how to work with them. Wrapping with a PyDataSet is fine; there's no significant performance overhead, but you can also ask the dataset directly for its row count, as @dkhayes117 suggested.
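To make the point about len concrete, here is a minimal sketch in plain Python. FakeDataset and FakePyDataSet are hypothetical stand-ins written for this illustration, not Ignition's actual classes; they only mimic the one behavior discussed here (the raw dataset has no __len__, the wrapper adds one).

```python
class FakeDataset(object):
    """Hypothetical stand-in for a Java dataset: knows its row count, defines no __len__."""

    def __init__(self, rows):
        self._rows = rows
        self.rowCount = len(rows)

    def getRowCount(self):
        return self.rowCount


class FakePyDataSet(object):
    """Hypothetical stand-in for the wrapper: adds __len__ on top of the dataset."""

    def __init__(self, dataset):
        self._dataset = dataset

    def __len__(self):
        return self._dataset.getRowCount()


ds = FakeDataset([[1], [2]])

# len() needs __len__, which the raw stand-in does not provide.
try:
    len(ds)
    len_error = None
except TypeError as e:
    len_error = str(e)

wrapped_count = len(FakePyDataSet(ds))  # works through the wrapper
direct_count = ds.rowCount              # works without any wrapper

print(len_error)
print(wrapped_count)
print(direct_count)
```

Both routes give the same count; the wrapper is only needed if you specifically want to call len().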

Correct me if I'm wrong, but you should use ds.rowCount over ds.getRowCount() because "something Jython something" makes it more efficient.

Correct

What's the difference here?

.rowCount is a bean property with a value that is directly accessible, whereas getRowCount() is a component method that has to get the value and return it. You could view the method as a middleman between you and the property value.

In practice, it's probably impossible to visually observe the difference in efficiency between using the property and using the method, and I've never done any kind of performance testing to confirm the consensus that bean properties are better, but intuitively, it's easy to imagine why one would be more efficient than the other.


Fewer instructions in the bytecode for the Jython interpreter.
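As a rough analogy only: the standard dis module in CPython 3 shows that a plain attribute read compiles to fewer instructions than an attribute lookup followed by a call. This does not inspect what Jython generates inside Ignition, so treat it as an illustration of the idea rather than a measurement. The function names via_property and via_method are made up for this sketch.

```python
import dis


def via_property(ds):
    # Attribute read only.
    return ds.rowCount


def via_method(ds):
    # Attribute lookup plus a call.
    return ds.getRowCount()


# Count the CPython bytecode instructions in each function body.
property_ops = len(list(dis.get_instructions(via_property)))
method_ops = len(list(dis.get_instructions(via_method)))

print(property_ops, method_ops)
```

The exact counts depend on the CPython version, but the method form should come out longer because of the extra call instruction.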

Directly, meaning:

query = "myQuery"
ds = system.db.runNamedQuery(query)
print(ds.rowCount)

And this does, indeed, work, without converting to any other dataset type.
Thank you, @dkhayes117 and @PGriffith
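For the original Script transform question, the same idea might look like the sketch below. It assumes value arrives as a dataset (as the "basic streaming dataset" error earlier in the thread suggests). StubDataset is a hypothetical stand-in so the function can be tried outside Ignition.

```python
def transform(self, value, quality, timestamp):
    # Assumption: value is the dataset from the query binding,
    # so the row count can be read from it directly.
    return value.rowCount


# Quick check outside Ignition with a hypothetical stand-in object.
class StubDataset(object):
    rowCount = 4


row_count = transform(None, StubDataset(), None, None)
print(row_count)
```

Inside the transform editor only the function body is needed; the stub and the call below it are just for trying it out.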
