Is there a way, or has anyone tried, to aggregate old data in the historian?
We are recording a lot of instruments at one-second intervals so we can review and diagnose processes when there is a problem (temperature, pressure, level, flow, etc.).
We want history to be kept for 2 years, but the high precision is only useful for the first month. We would like a way to aggregate instrument data in the historian to one data point per minute after one month.
Can you aggregate older historian data stored in the database?
The closest thing in Ignition is Pre-processed History Partitions:
But it is apparently the opposite of what you want: it uses more storage to gain better aggregate query performance, and none of the fine-grained data is discarded.
Of course you can, but there is no first-party (or third-party that I know of) method that will do it. You would have to script it yourself, either within Ignition (probably a Scheduled Gateway Task) or within whatever database management system you are using.
Not a significantly difficult task, but not exactly trivial either.
Much depends on exactly how you want to aggregate the data. You can use Ignition's existing functions to do the aggregation for you, assuming the options fit your needs. But removing the data you don't want from the existing data tables, and keeping/changing the data you do want, would be entirely up to you.
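For example, here's a minimal sketch of what the "use Ignition's existing functions" part could look like from a gateway scheduled script, using `system.tag.queryTagHistory` to produce one-minute averages. The tag path, datasource window, and what you do with the results are all placeholders:

```python
# Minimal sketch, assuming this runs in a Gateway scheduled script.
# Re-aggregates one day of data ending one month ago (window is arbitrary).
end = system.date.addMonths(system.date.now(), -1)
start = system.date.addDays(end, -1)

data = system.tag.queryTagHistory(
    paths=["[default]Plant/Line1/FlowMeter1"],  # hypothetical tag path
    startDate=start,
    endDate=end,
    intervalMinutes=1,           # one aggregated point per minute
    aggregationMode="Average",   # built-in historian aggregate
    returnFormat="Wide",
)

# 'data' is a Dataset: t_stamp in column 0, averaged value in column 1.
for row in range(data.getRowCount()):
    ts = data.getValueAt(row, 0)
    avg = data.getValueAt(row, 1)
    # ...write ts/avg to wherever your down-sampled history lives...
```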
I went through this exercise with a client. We explored two solutions:
1. Scheduled within the database (each month): Iterate over all tags within all tables 2+ months old, and delete every row whose timestamp is within x time of the previously kept timestamp. This would more closely resemble historical data that was polled more slowly. Optionally, iterate over all tables and increase the timeframe as the tables get older. (A rough sketch of this approach follows the conclusion below.)
2. Scheduled within Ignition (each month): Query for all tags with history configured, and return the relevant info about each tag (tag type, deadbands, compression used, etc.). Iterate over each tag, query for periods of historic data, and apply a new compression algorithm based on each tag type.
While #2 was the 'better solution' (my opinion), it became an endless 'what-if' conversation on how to handle various scenarios (special requirements on a specific tag, periods of bad quality, steady process variables (temperatures) vs sporadic (flow rates), etc.).
In the end, #1 was deemed acceptable, and easier to understand, implement, and maintain. It also happened to match the data collection of their other SCADA system (with 15-minute polling).
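For anyone curious, here's a rough sketch of what #1 could look like as a Jython script. It assumes Ignition's default partition scheme (`sqlt_data_<driver>_<year>_<month>` tables with `tagid`/`t_stamp` columns, where `t_stamp` is a millisecond epoch); the table and datasource names are placeholders, and you should verify everything against your own schema (and back up!) before running anything destructive:

```python
KEEP_MS = 60 * 1000                    # keep at most one sample per minute
table = "sqlt_data_1_2023_01"          # hypothetical partition, 2+ months old
db = "HistorianDB"                     # hypothetical datasource name

for row in system.db.runQuery("SELECT DISTINCT tagid FROM %s" % table, db):
    tagid = row[0]
    stamps = system.db.runPrepQuery(
        "SELECT t_stamp FROM %s WHERE tagid = ? ORDER BY t_stamp" % table,
        [tagid], db)
    lastKept = None
    doomed = []
    for s in stamps:
        t = s[0]
        if lastKept is None or t - lastKept >= KEEP_MS:
            lastKept = t               # this sample survives
        else:
            doomed.append(t)           # too close to the last kept sample
    # delete in modest batches so the table isn't locked for long
    for i in range(0, len(doomed), 500):
        batch = doomed[i:i+500]
        marks = ",".join(["?"] * len(batch))
        system.db.runPrepUpdate(
            "DELETE FROM %s WHERE tagid = ? AND t_stamp IN (%s)" % (table, marks),
            [tagid] + batch, db)
```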
I have considered doing this myself, but instead of deleting based on time, I was working with an average and max/min stored over a larger time base, so that you don't lose too much of the critical detail.
That is what I am considering: aggregating the average or min/max.
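A quick sketch of that idea, assuming you've already pulled the raw `(t_stamp, value)` pairs for one tag out of the partition table: it collapses them into one-minute buckets carrying avg/min/max, so short excursions survive the down-sampling instead of being averaged away.

```python
# Collapse raw (t_stamp_ms, value) samples into one-minute buckets
# carrying average, min, and max.
def aggregate_minutely(samples):
    buckets = {}
    for t, v in samples:
        minute = t - (t % 60000)  # floor timestamp to the minute
        b = buckets.setdefault(minute, [0.0, 0, v, v])  # [sum, count, min, max]
        b[0] += v
        b[1] += 1
        b[2] = min(b[2], v)
        b[3] = max(b[3], v)
    out = []
    for minute in sorted(buckets):
        total, count, lo, hi = buckets[minute]
        out.append((minute, total / count, lo, hi))
    return out
```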
I agree with @Chris_Bingham, who raised the issue of steady process variables vs sporadic ones. A flow meter is unused 95% of the time, but when it is used the data is critical.
One thing to improve is having better deadbands per specific variable; currently deadbands are defined per equipment type. But as you can expect, a level meter for a 30 ft tank vs a 3 ft tub should not share the same deadband; the same issue arises if the unit is "ft" vs "inches".