Combining rows in a dataset when column names are equal

accyroy · July 13, 2017, 12:32pm

I have a dataset like this:

A	B	B	B	C	D	E	E
2	4	3	4	7	4	5	5
3	2	3	3	2	5	3	1

How can I easily combine the columns that share the same name, adding their values together row by row? From the above dataset I would like this result:

A	B	C	D	E
2	11	7	4	10
3	8	2	5	4

Thanks

paul-griffith · July 13, 2017, 6:08pm

Well, this was fun. I make no guarantees of its accuracy, and good luck to you if the columns aren’t nicely in order as they are in the example. But this should give you something to go from:

ds = system.dataset.toDataSet(
	['A', 'B', 'B', 'C', 'D', 'D', 'D'],
	[
		[1, 2, 2, 3, 4, 4, 4],
		[1, 4, 4, 3, 0, 0, 0]
	])

columns = system.dataset.getColumnHeaders(ds)

# Keep one list of per-row totals for each distinct column name
data = {}
for row in range(ds.rowCount):
	for col in range(ds.columnCount):
		value = ds.getValueAt(row, col)
		sums = data.setdefault(columns[col], [])
		if len(sums) > row:
			# This column name already appeared in this row: add to its total
			sums[row] += value
		else:
			# First occurrence of this column name in this row
			sums.append(value)

# Output columns are sorted by name, not kept in their original order
headers = sorted(data.keys())

# Build one output row per input row, in header order
d = [[data[h][row] for h in headers] for row in range(ds.rowCount)]

print headers
print d

Input:

['A', 'B', 'B', 'C', 'D', 'D', 'D']
[1, 2, 2, 3, 4, 4, 4]
[1, 4, 4, 3, 0, 0, 0]

Output:

>>> 
[u'A', u'B', u'C', u'D']
[[1, 4, 3, 12], [1, 8, 3, 0]]
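
For reference, the same idea can be sketched in plain Python without the Ignition dataset API. This is only a minimal sketch and has not been tested inside Ignition: `combine_columns` is a hypothetical helper name, and it assumes the headers and rows have already been pulled out of the dataset as ordinary lists. It keeps the columns in first-seen order rather than sorting them, and it does not matter whether the duplicate columns sit next to each other. The sample data is the dataset from the original question.

```python
def combine_columns(headers, rows):
    """Sum same-named columns, keeping first-seen column order."""
    order = []       # unique column names, in order of first appearance
    position = {}    # column name -> index in the combined row
    for name in headers:
        if name not in position:
            position[name] = len(order)
            order.append(name)

    combined = []
    for row in rows:
        totals = [0] * len(order)
        for name, value in zip(headers, row):
            # Add each value to the slot belonging to its column name
            totals[position[name]] += value
        combined.append(totals)
    return order, combined


# Data from the original question
headers = ['A', 'B', 'B', 'B', 'C', 'D', 'E', 'E']
rows = [
    [2, 4, 3, 4, 7, 4, 5, 5],
    [3, 2, 3, 3, 2, 5, 3, 1],
]
new_headers, new_rows = combine_columns(headers, rows)
# new_headers == ['A', 'B', 'C', 'D', 'E']
# new_rows == [[2, 11, 7, 4, 10], [3, 8, 2, 5, 4]]
```

In Ignition, the two results could presumably be handed to system.dataset.toDataSet, the same call used at the top of the script, to get a dataset back.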

accyroy · July 14, 2017, 8:24am

Brilliant Paul, that works perfectly. I was working along the same lines as you but my brain was telling me there has to be an easier way that I didn’t know about.

+1 Fat Rabbit Pint token to be redeemed in September.