Sum of average value in the dataset

I want to sum a data set where inside the dataset, an average value of every 5 min. (I have a dataset that produces every second and I need to get the addition of the average value in every 5min interval)
How can I do this?
Note that: Dataset is timestamped and I am using perspective

Is the dataset ordered by timestamp?


You need to group by the "truncated" timestamp. I would use my groupBy() expression function (from my free Integration Toolkit), getting the truncated timestamps by converting to millis, taking the modulus of 300000, and subtracting those millis from the original timestamp.

Wrap the grouping with a sum(forEach(mean())). If you post the CSV of a sample dataset, I can work out the expression for you.

How can I find these functions? The usage of 'grounBy()' is not familiar to me.
I am trying to solve this through the script and writing two 'for' loops.

1 Like

This is the formula

I need to calculate the summation of every 15min averaged P_t.

This is the dataset and it has 50k entries.

Please paste the dataset as text (use the copy to clipboard icon in the dataset property editor). I'm not going to type all those in.

(And tell us the right answer, computed by hand if necessary, so we can check our code. That formula is not particularly clear.)

1 Like

Here attached the dataset
data.txt (40.3 KB)

It is a huge formula. I do it part by part. So simply, I need to get the summation of 15min averaged P(power) value.

Do you want the averages to start on the 5-minute boundaries or the 15-minute boundaries? (Also, please follow directions on creating CSVs--I can't just paste this into Ignition because it wasn't copied out of Ignition's dataset property editor.)

It is 15min

The grouping part of the expression would look like this:

	addMillis(it()['t_stamp'], -(toMillis(it()['t_stamp']) % 900000))

This will produce a list of 15-minute groups, in this shape:

	[2023-11-15 16:15:00.000, Dataset{nRx2C}],
	[2023-11-15 16:30:00.000, Dataset{nRx2C}],
	[2023-11-15 16:45:00.000, Dataset{nRx2C}],
	[2023-11-15 17:00:00.000, Dataset{nRx2C}],
	[2023-11-15 18:45:00.000, Dataset{nRx2C}]

Note that the grouping key is the beginning of the 15-minute period, and the accompanying dataset contains just the rows that fall within that period.

To get all of the averages added together, you nest that inside an iterator that will take the mean of each inner dataset, and present as a list to sum(), like so:

			addMillis(it()['t_stamp'], -(toMillis(it()['t_stamp']) % 900000))
		mean(it()[1], 'PoC_RealPowerTotal_kW')


The source dataset must be composed of complete sets of rows for each period. It is up to you to ensure the query binding that gets this data has a start timestamp that is similarly truncated to a 15-minute boundary, and is a multiple of fifteen minutes long. (Less one millisecond on the end timestamp, for ridiculous reasons, if coming from the tag historian.)

forEach(), groupBy(), and it() are functions from my Integration Toolkit. toMillis(), addMillis(), mean(), and sum() are native Ignition expression functions, in the "Aggregates" section.

First Phil's module comes highly recommended, and will be far more performant than doing this in a script, so if you can use his module, then you absolutely should.

That being said, you can achieve this in script as well. I will note that this is not exactly trivial python to the uninitiated, so it will quite possibly look very cryptic. If that is the case, then you will just have to take me at my word that this function indeed returns the sum of the averages of 15 min groups from a dataset.

You may also want to consider running this in an asynchronous thread since it could potentially take a fair amount of time to process (depending on the size of your dataset)

from itertools import count,groupby

#colToGroupBy should be the index of your t_stamp column
def dsGroup(ds,colToGroupBy,msTimeRange):
    c = count()

    def keyfunc(r):
        return[r][colToGroupBy],[r][colToGroupBy]) % msTimeRange)

    return (list(g) for _,g in groupby(ds,lambda x: keyfunc(next(c))))

ds = {} #or a function/value that returns your dataset

print sum(sum(kw for kw,_ in group) / len(group) for group in dsGroup(ds,1,900000))