Hi, this is the code i am using. but i am getting 2R * 1C… But i want 1R * 2C. How to achieve this?
what change i want to do in my script. Please let me know
filteredData = system.dataset.filterColumns(ds11, eastWest)
print filteredData
colNames = list(filteredData .getColumnNames())
dataOut = []
s = []
for colNum in range(filteredData .getColumnCount()):
colValues = filteredData.getColumnAsList(colNum)
if all(value is None for value in colValues):
dataOut.append(0)
else:
print colValues
#print '...sdds...'
dataOut.append(sum(filter(None, colValues)))
s.append(len(filter(None, colValues)))
print dataOut
print s
z = [m/n for m, n in zip(dataOut, s)]
ds = system.dataset.toDataSet(['Column Name'], [[x] for x in z])
print ds
print ds.getValueAt(1,0)
Nothing i am getting 3 columns. I am filtering t stamp from dataset
So i am geting 2R *1C
I am running named query for a particular time and geting values… i adding the row values of each column and removing the null and taking Mean of each column sum value and want to show in single row
I want [15, 13] this list to show in dataset like this
When you use system.dataset.toDataset(), you are providing a single columnName and then a list of lists for the row data. The function then understandably takes what you provided and returns a dataset with 1 column and as many rows as there were lists in the second argument.
What you want to provide to that function is as many column names as you had rows in your filteredData, and then a list with only 1 list in it, but that inner list having as many items as you had rows in your filteredData.
Example:
headers = ["col1", "col2", "col3"]
data = [[1,2,3]]
system.dataset.toDataSet(headers, data)
I see several things that look rather sketchy in this script.
What happens if all the values in a column are 0 ?
You append 0 to dataOut, but nothing to s. Which is understandable, since you then divide by values in s.
But you’re also zipping dataOut and s, which won’t have the same number of values. which means your divisions are using values that were not initially aligned. And some values in dataOut are left out. Is this really what you want ?
What is eastWest ? Do you know how many values are in it ?
What is the whole thing supposed to do ? As it is, it looks like maintenance nightmare: it’s pretty much impossible to understand what you’re trying to do by reading the code.
What does dataOut represent ? What does s represent ? What is z ? Names matter !
If you can explain what it is that script is supposed to do, what’s its input and what’s the expected output, I’ll help with writing something more… maintainable. And that does what it’s supposed to.
edit: A few examples of what could be reworked to make things cleaner
You’re filtering your initial dataset with eastWest, then iterating through the new dataset and picking columns one at a time.
Why not just iterate through eastWest and pick the corresponding columns ?
# if eastWest is a list of column names
for colName in eastWest:
colIndex = ds11.getColumnIndex(colName)
colValues = filteredData.getColumnAsList(colNum)
# if eastWest is a list of column indices
for colIndex in eastWest:
colValues = filteredData.getColumnAsList(colNum)
You could also replace this part:
if all(value is None for value in colValues):
dataOut.append(0)
else:
dataOut.append(sum(filter(None, colValues)))
s.append(len(filter(None, colValues)))
assuming you do want to not append anything when the len is 0.
It doesn’t reduce the number of lines of codes, but it reduces clutter, and avoids repeating some operations.
You only need to iterate through the columns once when you’re constructing the new dataset.
(Note, I changed your filteredDataset line to use the sample data.
sampleHeaders = ['date','H01','H02']
sampleData = [
['2021-11-23 00:00:00', None, 40],
['2021-11-23 00:00:00', 24, 90],
['2021-11-23 00:00:00', 89, None]
]
dataIn = system.dataset.toDataSet(sampleHeaders, sampleData)
#filteredData = system.dataset.filterColumns(ds11, eastWest)
filteredData = system.dataset.filterColumns(dataIn, ['H01','H02'])
colNames = list(filteredData.getColumnNames())
dataOut = []
for i in range(len(colNames)):
# Get valid values
colValues = filter(None, filteredData.getColumnAsList(i))
if len(colValues) == 0:
dataOut.append(0)
else:
# Get mean of valid values and append to the row.
dataOut.append(system.math.mean(colValues))
# Since we are only constructing one row, and system.dataset.toDataSet() requires a list of lists,
# we'll need to wrap dataOut in a list when we make the dataset
ds = system.dataset.toDataSet(colNames, [dataOut])
Because it's not a script, it's a snippet. It's not meant to be copy/pasted and be functional immediately.
If you want a script...
def mean(values):
values = filter(None, values)
if values:
return float(sum(values)) / len(values)
return None
# if eastWest is a list of column indices
header = [data.getColumnName(i) for i in eastWest]
means = filter(None, (mean(data.getColumnAsList(i)) for i in eastWest))
ds = system.dataset.toDataSet(header, [means])
# if eastWest is a list of column names
eastWest = ["Bar", "Baz"]
indices = [data.getColumnIndex(name) for name in eastWest]
means = filter(None, (mean(data.getColumnAsList(i)) for i in indices))
ds = system.dataset.toDataSet(eastWest, [means])
Pick the version you need depending on what eastWest is for you (I did ask earlier, but you didn't answer the question).
If you'd rather have a version using system.math.mean, replace the function call and filter with number == number. system.math.mean return NaN for empty lists, and NaN != Nan.
indices = [data.getColumnIndex(name) for name in eastWest]
means = [system.math.mean(filter(None, data.getColumnAsList(i))) for i in indices]
means = [m for m in means if m == m]
ds = system.dataset.toDataSet(eastWest, [means])