Get Difference value from dataset

prasath.t · January 25, 2023, 11:47am

Hi , I have a dataset with below mentioned columns

|t_stamp|running|starting|stop|stopping|
|---|---|---|---|---|
|12:00:00 AM|2026686.859|0|604281.2019|0|
|12:04:48 AM|2027128.76|0|604414.1515|0|
|12:09:36 AM|2027569.067|0|604545.232|0|
|12:14:24 AM|2028008.825|0|604675.712|0|
|12:19:12 AM|2028451.945|0|604806.4073|0|
|12:24:00 AM|2028898.808|0|604938.972|0|
|12:28:48 AM|2029341.828|0|605073.8736|0|
|12:33:36 AM|2029784.787|0|605209.7999|0|
|12:38:24 AM|2030227.914|0|605342.7043|0|

Image for your reference

I need to subtract the first row and second row and get the difference and store the value in new dataset with 2nd row timestamp column. Like wise i need to continue for 2nd row with 3rd row and upto last row

how to do this? can anyone please help me out

JordanCClark · January 25, 2023, 12:56pm

The use of PyDataSets makes it easier.

sampleHeaders = ['t_stamp','running','starting','stop','stopping']

sampleData = [['12:00:00 AM', 2026686.859, 0, 604281.2019,0],
        ['12:04:48 AM', 2027128.760, 0, 604414.1515,0],
        ['12:09:36 AM', 2027569.067, 0, 604545.232, 0],
        ['12:14:24 AM', 2028008.825, 0, 604675.712, 0],
        ['12:19:12 AM', 2028451.945, 0, 604806.4073, 0],
        ['12:24:00 AM', 2028898.808, 0, 604938.972, 0],
        ['12:28:48 AM', 2029341.828, 0, 605073.8736, 0],
        ['12:33:36 AM', 2029784.787, 0, 605209.7999, 0],
        ['12:38:24 AM', 2030227.914, 0, 605342.7043, 0],
       ]

sampleDataset = system.dataset.toDataSet(sampleHeaders, sampleData)

############################################################################

# PyDataSets lets us easily slice and iterate in a pythonic way.
dataIn = system.dataset.toPyDataSet(sampleDataset)

# Initialize values for creating the output dataset
colNames = list(dataIn.getColumnNames())
data = []

# zip() lets us iterate over multiple objects at the same time. 
# We just happen to use the same PyDataSet starting at the next row.
for prevRow, row in zip(dataIn, dataIn[1:]):
	newRow = [row['t_stamp']]
	# iterate over the remaining columns to get differences.
	for col in colNames[1:]:
		newRow.append(row[col] - prevRow[col])
	data.append(newRow)

dataOut = system.dataset.toDataSet(colNames, data)

dataIn and dataOut:

row | t_stamp     | running     | starting | stop        | stopping
-------------------------------------------------------------------
0   | 12:00:00 AM | 2026686.859 | 0        | 604281.2019 | 0       
1   | 12:04:48 AM | 2027128.76  | 0        | 604414.1515 | 0       
2   | 12:09:36 AM | 2027569.067 | 0        | 604545.232  | 0       
3   | 12:14:24 AM | 2028008.825 | 0        | 604675.712  | 0       
4   | 12:19:12 AM | 2028451.945 | 0        | 604806.4073 | 0       
5   | 12:24:00 AM | 2028898.808 | 0        | 604938.972  | 0       
6   | 12:28:48 AM | 2029341.828 | 0        | 605073.8736 | 0       
7   | 12:33:36 AM | 2029784.787 | 0        | 605209.7999 | 0       
8   | 12:38:24 AM | 2030227.914 | 0        | 605342.7043 | 0       

row | t_stamp     | running | starting | stop     | stopping
------------------------------------------------------------
0   | 12:04:48 AM | 441.901 | 0        | 132.9496 | 0       
1   | 12:09:36 AM | 440.307 | 0        | 131.0805 | 0       
2   | 12:14:24 AM | 439.758 | 0        | 130.48   | 0       
3   | 12:19:12 AM | 443.12  | 0        | 130.6953 | 0       
4   | 12:24:00 AM | 446.863 | 0        | 132.5647 | 0       
5   | 12:28:48 AM | 443.02  | 0        | 134.9016 | 0       
6   | 12:33:36 AM | 442.959 | 0        | 135.9263 | 0       
7   | 12:38:24 AM | 443.127 | 0        | 132.9044 | 0

prasath.t · January 25, 2023, 1:20pm

One issue i am using system.tag.queryTagHistory to get the data from database

but for same minute i am getting many rows for different seconds.

I want only one row data for the same minute and do the difference calucaltion

Is there any possibility to handle that in your script?

JordanCClark · January 25, 2023, 1:23pm

Post some sample data (CSV is fine) and I'll see what can be done.

prasath.t · January 25, 2023, 1:26pm

Thanks i have attached it
sampleData.csv (10.9 KB)

JordanCClark · January 25, 2023, 7:24pm

When you use Excel to save your csv, please format the t_stamp column to have the date. Saves me having to add them...

dataIn = system.dataset.toPyDataSet(sampleData)
colNames = list(dataIn.getColumnNames())

# Sort the items into a dictionary. Key keys are the t_stamps truncated to the minute
dataDict = {}
for row in dataIn:
	# Get the timestamp value and set the seconds to 0
	t_stamp = row['t_stamp']
	t_stamp.setSeconds(0)
	
	# Check if the t_stamp exists. 
	if t_stamp not in dataDict.keys():
		# Make an entry if it doesn't
		dataDict[t_stamp] = {}
		# In the nested dictionary, add the column name and it's value-as-a-list
		for col in colNames[1:]: 
			dataDict[t_stamp][col] = [row[col]]
	else:
		# Otherwise, append the column values into their respective lists.
		for col in colNames[1:]:
			dataDict[t_stamp][col].append(row[col])

# Construct dataset
data = []
t_stampList = sorted(dataDict.keys())
for prevT_stamp, t_stamp in zip(t_stampList, t_stampList[1:]):
	newRow = [t_stamp]
	for col in colNames[1:]:
		newRow.append(max(dataDict[t_stamp][col]) - max(dataDict[prevT_stamp][col]))
	data.append(newRow)	

dataOut = system.dataset.toDataSet(colNames, data)

row | t_stamp                      | Running | starting | stop     | stopping
-----------------------------------------------------------------------------
0   | Wed Jan 25 18:19:00 EST 2023 | 123.5   | 0        | 548.4028 | 0       
1   | Wed Jan 25 18:20:00 EST 2023 | 124.25  | 0        | 207.2763 | 0       
2   | Wed Jan 25 18:21:00 EST 2023 | 124.25  | 0        | 0.0      | 0       
3   | Wed Jan 25 18:22:00 EST 2023 | 124.25  | 0        | 0.0      | 0       
4   | Wed Jan 25 18:23:00 EST 2023 | 124.75  | 0        | 0.0      | 0       
5   | Wed Jan 25 18:24:00 EST 2023 | 125.0   | 0        | 0.0      | 0       
6   | Wed Jan 25 18:25:00 EST 2023 | 124.5   | 0        | 0.0      | 0       
7   | Wed Jan 25 18:26:00 EST 2023 | 124.0   | 0        | 0.0      | 0       
8   | Wed Jan 25 18:27:00 EST 2023 | 123.0   | 0        | 0.0      | 0       
9   | Wed Jan 25 18:28:00 EST 2023 | 77.75   | 0        | 0.0      | 0

prasath.t · January 27, 2023, 5:09am

JordanCClark:

dataIn = system.dataset.toPyDataSet(sampleData)
colNames = list(dataIn.getColumnNames())

# Sort the items into a dictionary. Key keys are the t_stamps truncated to the minute
dataDict = {}
for row in dataIn:
	# Get the timestamp value and set the seconds to 0
	t_stamp = row['t_stamp']
	t_stamp.setSeconds(0)
	
	# Check if the t_stamp exists. 
	if t_stamp not in dataDict.keys():
		# Make an entry if it doesn't
		dataDict[t_stamp] = {}
		# In the nested dictionary, add the column name and it's value-as-a-list
		for col in colNames[1:]: 
			dataDict[t_stamp][col] = [row[col]]
	else:
		# Otherwise, append the column values into their respective lists.
		for col in colNames[1:]:
			dataDict[t_stamp][col].append(row[col])

# Construct dataset
data = []
t_stampList = sorted(dataDict.keys())
for prevT_stamp, t_stamp in zip(t_stampList, t_stampList[1:]):
	newRow = [t_stamp]
	for col in colNames[1:]:
		newRow.append(max(dataDict[t_stamp][col]) - max(dataDict[prevT_stamp][col]))
	data.append(newRow)	

dataOut = system.dataset.toDataSet(colNames, data)

Hi @JordanCClark

I have tested your code but i am getting wrong output. I don't know where i am making mistake (for same minutes i am getting difference value and its creating dataset)

I have attached project file also for your reference. can you please correct me where i am doing wrong

test_data.zip (29.0 KB)

pascal.fragnoud · January 27, 2023, 9:08am

Can you clarify this, maybe with an example ?

prasath.t · January 27, 2023, 10:31am

can you please check my test project i have uploaded.

there will be example dataset will be there

if i use system.tag.queryhistory its returning same minutes many rows with different seconds difference.
But i want to subtract one minute row with another minute row and get the difference value and store it in new dataset
like this

row | t_stamp     | running | starting | stop     | stopping
------------------------------------------------------------
0   | 12:04:48 AM | 441.901 | 0        | 132.9496 | 0       
1   | 12:09:36 AM | 440.307 | 0        | 131.0805 | 0       
2   | 12:14:24 AM | 439.758 | 0        | 130.48   | 0       
3   | 12:19:12 AM | 443.12  | 0        | 130.6953 | 0       
4   | 12:24:00 AM | 446.863 | 0        | 132.5647 | 0       
5   | 12:28:48 AM | 443.02  | 0        | 134.9016 | 0       
6   | 12:33:36 AM | 442.959 | 0        | 135.9263 | 0       
7   | 12:38:24 AM | 443.127 | 0        | 132.9044 | 0

pascal.fragnoud · January 27, 2023, 10:58am

I can't, I don't have access to an ignition installation at the moment.

JordanCClark · January 27, 2023, 12:32pm

Victor is on the right track methinks; however, I'll be held up in meetings and such for most of my day.

JordanCClark · January 29, 2023, 11:38pm

As @victordcq had mentioned java.util.date does not have a setMillisecond(). However, we do have access to Apache DateUtils.

from org.apache.commons.lang3.time import DateUtils

for row in dataIn:
	# Get the timestamp value and set the seconds to 0
	t_stamp = row['t_stamp']
	t_stamp = DateUtils.setSeconds(t_stamp, 0)
	t_stamp = DateUtils.setMilliseconds(t_stamp, 0)

victordcq · January 30, 2023, 7:33am

you read my deleted message?
i deleted it because the to string doesnt look to milis, but actually now im not sure if the to string even happend on the dicts.
i had no time to check so i just deleted it

JordanCClark · January 30, 2023, 9:27am

Yes, yes I did.

victordcq · January 30, 2023, 10:07am

xD if you react to it, next time you should also undelete it. so others can follow

JordanCClark · January 30, 2023, 12:59pm

You should probably just edit it to say 'Never mind...' lol