Move row values to column in dataset

prasath.t · September 26, 2022, 12:32pm

Hi, this is the code i am using. but i am getting 2R * 1C… But i want 1R * 2C. How to achieve this?
what change i want to do in my script. Please let me know

filteredData = system.dataset.filterColumns(ds11, eastWest)
print filteredData 




colNames = list(filteredData .getColumnNames())
dataOut = []
s = []
for colNum in range(filteredData .getColumnCount()):

	colValues = filteredData.getColumnAsList(colNum)
	
	if all(value is None for value in colValues):
		dataOut.append(0)
	else:
		
		print colValues
		#print '...sdds...'
		dataOut.append(sum(filter(None, colValues)))
		s.append(len(filter(None, colValues)))
		
print dataOut
print s

z = [m/n for m, n in zip(dataOut, s)]
ds = system.dataset.toDataSet(['Column Name'], [[x] for x in z])
print ds
print ds.getValueAt(1,0)

output
[15, 13]
Dataset [2R ⅹ 1C]

what i need is 1R * 2C

dkhayes117 · September 26, 2022, 12:44pm

Can you post sample data for filteredData and what the final output should look like (IE what you are trying to do)?

prasath.t · September 26, 2022, 1:08pm

Nothing i am getting 3 columns. I am filtering t stamp from dataset

So i am geting 2R *1C

I am running named query for a particular time and geting values… i adding the row values of each column and removing the null and taking Mean of each column sum value and want to show in single row

I want [15, 13] this list to show in dataset like this

Column1 column2
15. 13

djs · September 26, 2022, 1:25pm

When you use system.dataset.toDataset(), you are providing a single columnName and then a list of lists for the row data. The function then understandably takes what you provided and returns a dataset with 1 column and as many rows as there were lists in the second argument.

What you want to provide to that function is as many column names as you had rows in your filteredData, and then a list with only 1 list in it, but that inner list having as many items as you had rows in your filteredData.

Example:

headers = ["col1", "col2", "col3"]
data = [[1,2,3]]
system.dataset.toDataSet(headers, data)

pascal.fragnoud · September 26, 2022, 2:29pm

I see several things that look rather sketchy in this script.
What happens if all the values in a column are 0 ?
You append 0 to dataOut, but nothing to s. Which is understandable, since you then divide by values in s.
But you’re also zipping dataOut and s, which won’t have the same number of values. which means your divisions are using values that were not initially aligned. And some values in dataOut are left out. Is this really what you want ?

What is eastWest ? Do you know how many values are in it ?
What is the whole thing supposed to do ? As it is, it looks like maintenance nightmare: it’s pretty much impossible to understand what you’re trying to do by reading the code.
What does dataOut represent ? What does s represent ? What is z ? Names matter !

If you can explain what it is that script is supposed to do, what’s its input and what’s the expected output, I’ll help with writing something more… maintainable. And that does what it’s supposed to.

edit: A few examples of what could be reworked to make things cleaner
You’re filtering your initial dataset with eastWest, then iterating through the new dataset and picking columns one at a time.
Why not just iterate through eastWest and pick the corresponding columns ?

# if eastWest is a list of column names
for colName in eastWest:
	colIndex = ds11.getColumnIndex(colName)
	colValues = filteredData.getColumnAsList(colNum)

# if eastWest is a list of column indices
for colIndex in eastWest:
	colValues = filteredData.getColumnAsList(colNum)

You could also replace this part:

if all(value is None for value in colValues):
	dataOut.append(0)
else:
	dataOut.append(sum(filter(None, colValues)))
	s.append(len(filter(None, colValues)))

by

values = filter(None, ColValues)
valuesLen = len(values)
dataOut.append(sum(values))
if valuesLen:
	s.append(valuesLen)

assuming you do want to not append anything when the len is 0.
It doesn’t reduce the number of lines of codes, but it reduces clutter, and avoids repeating some operations.

prasath.t · September 26, 2022, 3:13pm

ds11 = system.tag.queryTagHistory(paths=[

"[MQTT Engine]Edge Nodes/VaujoursV5/Line/Takeoff/Takeoff_/DET/DET 1/Production Data/H01","[MQTT Engine]Edge Nodes/VaujoursV5/Line/Takeoff/Takeoff_/DET/DET 1/Production Data/H02"
], 
			startDate='2022-08-31 22:00:00', endDate='2022-08-31 22:00:10',returnSize=-1,  aggregationMode="MinMax", returnFormat='Wide',noInterpolation = False)

get out from query 40R*3C

i filtering the t stamp column by using filter data
eastWest = ['H01',H02'] - column i want to use

column i am filtering

dataOut is used for sum the each column values

s is used get the each column count and i am dividing to get the mean of each column

i am removing the null values i am getting null values from system.tag.query that's why

prasath.t · September 26, 2022, 3:17pm

values = filter(None, ColValues)
valuesLen = len(values)
dataOut.append(sum(values))
if valuesLen:
	s.append(valuesLen)

i say your script after the sum the values. i want to take mean for each value.. that is missing in your scirpt

i want to add all values in each column and take mean for that value that's it

finally want to pass that list to a new dataset.. each value in separate column
if i pass list directly to a new dataset i am getting 2R *1C - output

but i want 1R * 2C output..

i don't know how to do this

if you help me to optimise the script. it will useful for me

I hope now you understand what i am trying to do?

JordanCClark · September 26, 2022, 4:12pm

You only need to iterate through the columns once when you’re constructing the new dataset.
(Note, I changed your filteredDataset line to use the sample data.

sampleHeaders = ['date','H01','H02']
sampleData = [
              ['2021-11-23 00:00:00',   None, 40],
              ['2021-11-23 00:00:00',   24,   90],
              ['2021-11-23 00:00:00',   89, None]
			 ]
dataIn = system.dataset.toDataSet(sampleHeaders, sampleData)

#filteredData = system.dataset.filterColumns(ds11, eastWest)
filteredData = system.dataset.filterColumns(dataIn, ['H01','H02'])

colNames = list(filteredData.getColumnNames())
dataOut = []

for i in range(len(colNames)):
	# Get valid values
	colValues = filter(None, filteredData.getColumnAsList(i))
	
	if len(colValues) == 0:
		dataOut.append(0)
	else:
		# Get mean of valid values and append to the row.
		dataOut.append(system.math.mean(colValues))
# Since we are only constructing one row, and system.dataset.toDataSet() requires a list of lists,
# we'll need to wrap dataOut in a list when we make the dataset
ds = system.dataset.toDataSet(colNames, [dataOut])

pascal.fragnoud · September 26, 2022, 5:31pm

Because it's not a script, it's a snippet. It's not meant to be copy/pasted and be functional immediately.

If you want a script...

def mean(values):
	values = filter(None, values)
	if values:
		return float(sum(values)) / len(values)
	return None

# if eastWest is a list of column indices
header = [data.getColumnName(i) for i in eastWest]
means = filter(None, (mean(data.getColumnAsList(i)) for i in eastWest))
ds = system.dataset.toDataSet(header, [means])

# if eastWest is a list of column names
eastWest = ["Bar", "Baz"]
indices = [data.getColumnIndex(name) for name in eastWest]
means = filter(None, (mean(data.getColumnAsList(i)) for i in indices))
ds = system.dataset.toDataSet(eastWest, [means])

Pick the version you need depending on what eastWest is for you (I did ask earlier, but you didn't answer the question).

If you'd rather have a version using system.math.mean, replace the function call and filter with number == number. system.math.mean return NaN for empty lists, and NaN != Nan.

indices = [data.getColumnIndex(name) for name in eastWest]
means = [system.math.mean(filter(None, data.getColumnAsList(i))) for i in indices]
means = [m for m in means if m == m]
ds = system.dataset.toDataSet(eastWest, [means])