The application currently has about 8,000 tags and that number could increase significantly.
Using the system.tag.queryTagHistory() function, retrieving all available samples from just the previous minute takes approximately 15 seconds.
Everything currently runs on the same Windows Server VM. I've seen some mention that the MSSQL server should be on its own dedicated server. Would separating the Ignition Gateway onto another server make that much of a difference? Since they are VMs, would they need to be on different physical hosts to see a benefit?
A gateway timer triggers the script. I've done some testing breaking the work into four instances of the timer, each running the same script but querying only 2,000 tags. All four instances completed in about 5 seconds. Another test with eight instances, each querying 1,000 tags, finished in about 3 seconds.
I'm sure using multiple timers is bad for a few reasons I'm not aware of, but it was an easy way to get a dedicated thread for each instance of the script. It also means more timers need to be added manually as more tags are added.
I next plan to use system.util.invokeAsynchronous() to dynamically spawn the necessary number of script instances from a single timer, based on the total tag count.
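Something roughly like this is what I have in mind. It's just a sketch, untested; `tagPaths` and `processResults()` stand in for the list-building and handling code the script already has, and the chunk size is a guess to tune later:

```python
CHUNK = 1000  # tags per async worker -- tune based on testing

def makeWorker(paths):
    # Build a closure so each worker queries only its own chunk.
    def worker():
        end = system.date.now()
        start = system.date.addMinutes(end, -1)
        data = system.tag.queryTagHistory(
            paths=paths,
            startDate=start,
            endDate=end,
            returnFormat="Tall",  # raw rows, not interpolated columns
        )
        processResults(data)  # placeholder for the existing handling code
    return worker

# One timer fires this; the worker count scales with the tag count.
for i in range(0, len(tagPaths), CHUNK):
    system.util.invokeAsynchronous(makeWorker(tagPaths[i:i + CHUNK]))
```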
How large are your partition sizes?
That many tags will impact query times, depending on your storage settings, i.e. On Change, tag group, min/max time between records, etc.
We are down to 3-day partition sizes for the majority of our locations, and that is with an average of just over 4,000 tags per location.
This. They do not have to be physically separate, as long as the hypervisor is configured for low latency, and not overcommitted. See IA's guidance for VMs. (Typical hypervisors are not properly set up out of the box.)
The partition size is one month. As best I understand it, since we are only ever asking for values from the previous minute, the partition size or using a pre-processed partition will not make the queries faster. The tags are set to On Change, with a min time of one minute and a max of one hour.
We are sending the data to another database. They first asked us to delay the data by 10 minutes; now it's just the last minute, but if communications go down we also need to be able to send the data that was missed. There is also a requirement not to forward any bad data. On larger systems, only a subset of tags will be forwarded to the other database.
Note that if returnFormat is left at its default, system.tag.queryTagHistory() will not return your raw data, if that's what you're after. You will need to set it to "Tall". The default is "Wide", which gives you a column per tag path, so each row contains a single datetime and the interpolated value for each tag path at that time.
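A minimal sketch of what I mean (the path is made up):

```python
# Tall format: one row per stored sample (path, value, quality, timestamp)
# rather than one interpolated column per tag.
data = system.tag.queryTagHistory(
    paths=["[default]Area1/Pump1/Speed"],  # hypothetical tag path
    rangeMinutes=1,
    returnFormat="Tall",
)
for row in range(data.getRowCount()):
    print [data.getValueAt(row, col) for col in range(data.getColumnCount())]
```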
However, I would most likely look into other ways to do this at the database level instead, as I'm sure that would be far more efficient.
They are also customizing the descriptions and adding some info about each point's location in the system. This is currently based on the folder structure in Ignition.
A REST API is used to POST the data to an Azure database in JSON format. Along with the timestamp, value, and point name, there are four other key-values that are sent with each point.
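To give a sense of the shape, each point ends up something like this. All the key names, values, and the endpoint here are made up, not the real ones:

```python
# Illustrative payload only -- real key names, values, and URL differ.
payload = system.util.jsonEncode({
    "pointName": "AHU1_SupplyTemp",    # alias point name
    "timestamp": "2024-05-01T12:00:00Z",
    "value": 21.4,
    "site": "PlantA",      # the four extra key-values, derived from the
    "building": "B1",      # Ignition folder structure and the
    "system": "HVAC",      # customized descriptions
    "units": "degC",
})
system.net.httpPost("https://example.azurewebsites.net/api/points",
                    "application/json", payload)
```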
My understanding was that a history splitter, or just writing directly to the other database, would not work because of the format differences. We also needed to delay the data by a certain amount of time, block data that is marked as bad at the local BACnet device, and add an alias point name plus the other four key-values.
Ok, just checking. I thought maybe you were just pushing data to a second database in the same format, in which case the splitter would be easier, but since you're having to do some processing on it first, that won't work.
Not true. Table size will absolutely affect performance. Remember even though we call them tables and generally look at the data in some set order, that isn’t how the data is actually stored. The bigger the table, the more the DB has to search through to find all of the records that satisfy your query.
That said, I have 32k tags on 1 month partitions and my queries perform much better. (I also have a dedicated DB server)
Are you sure that it is the query and not other processing? You say that you've timed the queries, but you haven't shown the code for how you're doing that.
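For example, a bare timing harness like this would isolate the query itself from any downstream processing (a sketch; `tagPaths` stands in for your list):

```python
# Times just the history query, nothing else.
start = system.date.now()
data = system.tag.queryTagHistory(
    paths=tagPaths,
    rangeMinutes=1,
    returnFormat="Tall",
)
elapsed = system.date.millisBetween(start, system.date.now())
system.util.getLogger("HistoryTiming").info(
    "%d rows in %d ms" % (data.getRowCount(), elapsed))
```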
I've had great success with TimescaleDB recently. I have a mini Linux PC with 16 GB of RAM, 12 of which are dedicated to Timescale. I have very granular tick data for stocks, 16 years' worth, almost 1 trillion rows, and I can query a full day's worth of data in under a second. Performance blows me away tbh.
There are built-in replication techniques for SQL Server that keep remote databases and tables up to date, like transactional replication, that might be better suited for this (assuming your Azure DB is also SQL Server).
"As best I understand it, as we are only ever asking for values from the pervious minute," and need raw values.
I don't know what to tell you. I tested it and there was no performance improvement. After doing a little research, I believe this is because we are only asking for the previous minute and need to get raw values.
There's another group dealing with the Azure side of things and the setup was decided by others over a year ago now. I'll check with them and see, but I think it was the initial myriad of requirements that got us where we are now.
Another guy in the company who's done a lot more Ignition than I have has also mentioned TimescaleDB for Postgres. It doesn't feel like we're doing anything super fancy, so it's surprising that different databases can be so radically different in performance. Is it just something about the way the queries have to be structured for an MS SQL database that makes them super inefficient?
The free and open-source stuff just scares me a little, as I've seen projects lose support and just kind of disappear at times. I actually had an issue with Postgres and Windows security updates. Someone said, well of course, because you're running on Windows; you need to be running on another OS, etc. Leaving the Windows ecosystem is difficult for some in the company and for some of our customers.
The difference is that a historian is highly geared towards time-series data; relational databases are not.
TimescaleDB also offers a hosted solution, which is supported.