How to Improve MSSQL Historian Query Performance?

The application currently has about 8,000 tags and that number could increase significantly.

Using the system.tag.queryTagHistory() function, retrieving all available samples from just the previous minute takes approximately 15 seconds.
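For reference, the call is roughly along these lines; the tag paths shown are placeholders, and the real list has about 8,000 of them:

```python
# Roughly what runs each minute (tag paths are placeholders for the real ~8,000).
end = system.date.now()
start = system.date.addMinutes(end, -1)

paths = ["[default]Area1/Tag1", "[default]Area1/Tag2"]

results = system.tag.queryTagHistory(paths=paths, startDate=start, endDate=end)
```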

Everything currently runs in the same Windows Server VM. I've seen some mention that the MSSQL server should be on its own dedicated server. Would separating the Ignition/Gateway to another server make that much difference? Since they are VMs, would they need to be on different physical hosts to see a benefit?

A gateway timer triggers the script. I've done some testing splitting the work across four instances of the timer, each running the same script but querying only 2,000 tags. This resulted in all four instances completing in about 5 seconds. Another test with eight instances, each querying 1,000 tags, finished in about 3 seconds.

I'm sure using multiple timers is bad for a few reasons I'm not aware of, but it was an easy way to get dedicated threads for each instance of the script. It also means more timers need to be manually added as more tags are added.

I next plan on using the system.util.invokeAsynchronous() function to dynamically run the necessary number of script instances from one timer based on the total tag count.
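Something like this sketch is what I have in mind; CHUNK_SIZE, getHistorisedPaths() and queryAndForward() are placeholders for the real values and the existing script logic:

```python
# One timer: split the full path list into chunks and run each chunk
# on its own asynchronous thread.
CHUNK_SIZE = 1000  # placeholder chunk size

def makeWorker(pathChunk):
    def worker():
        queryAndForward(pathChunk)  # placeholder for the existing query/forwarding logic
    return worker

allPaths = getHistorisedPaths()  # placeholder: however the full tag list is built
for i in range(0, len(allPaths), CHUNK_SIZE):
    system.util.invokeAsynchronous(makeWorker(allPaths[i:i + CHUNK_SIZE]))
```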

Does this approach seem feasible?

How large are your partition sizes?
That many tags, depending on your storage settings (i.e. On Change, tag group, min time/max time between records, etc.), will impact the query times.

We are down to 3-day partition sizes for the majority of our locations, and that is with an average of just over 4,000 tags per location.

Oh, and yes, it is recommended to split the SQL instance onto its own dedicated VM to prevent the two from fighting for resources.

This. They do not have to be physically separate, as long as the hypervisor is configured for low latency, and not overcommitted. See IA's guidance for VMs. (Typical hypervisors are not properly set up out of the box.)

The partition size is one month. As best I understand it, since we are only ever asking for values from the previous minute, the partition size or using a pre-processed partition will not make the queries faster. The tags are set to On Change, with a min time of one minute and a max of one hour.

I'm curious why you seemingly need to query the history for every one of your historised tags?

We are sending the data to another database. They first asked us to delay the data by 10 minutes; now it's just the last minute, but if communications are down we also need to be able to send data that was missed. There is also a requirement to not forward on any bad data. On larger systems, only a subset of tags will be forwarded to the other database.

Note that if returnFormat is left at its default, system.tag.queryTagHistory() will not return your raw data, if that's what you're after. You will need to set it to "Tall". The default is "Wide", which gives you a column per tag path, so each row contains a single datetime and the interpolated value for each tag path at that time.
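For example, something along these lines (placeholder tag path) should come back as one row per sample with path, value, quality and timestamp columns:

```python
# returnFormat="Tall" returns one row per stored sample instead of one
# interpolated value column per tag path.
end = system.date.now()
start = system.date.addMinutes(end, -1)

data = system.tag.queryTagHistory(
    paths=["[default]Area1/Tag1"],  # placeholder path
    startDate=start,
    endDate=end,
    returnFormat="Tall"
)
```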

However, I would most likely be looking into other ways to do this at the database level instead, as I'm sure that would be far more efficient.

I do have the returnFormat set to Tall. I found earlier that Wide was causing issues getting the raw data.

Is the other database storing the data in the same format? Would a tag history splitter work better or be easier?

They are also customizing the descriptions and adding some info about the point's location in the system. This is currently based on the folder structure in Ignition.

A REST API is used to POST the data to an Azure database in JSON format. Along with the timestamp, value and point name, there are four other key-values sent with each point.
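As a rough illustration only, each sample ends up being posted something like this; the field names, the four extra key-values and the URL below are placeholders, not the real schema or endpoint:

```python
# Illustrative payload only: field names, extra key-values and URL are placeholders.
sample = {
    "pointName": "Area1/Tag1",
    "timestamp": "2024-01-01T00:00:00Z",
    "value": 42.0,
    "siteId": "SITE-01",   # the four extra key-values (placeholder names)
    "building": "B1",
    "floor": "2",
    "system": "HVAC"
}

payload = system.util.jsonEncode([sample])
system.net.httpPost("https://example.azurewebsites.net/api/points",  # placeholder URL
                    "application/json",
                    payload)
```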

My understanding was that using a history splitter, or just writing directly to the other database, would not work because the formats don't match. We also need to delay the data by a certain amount of time, block any data that is marked as bad at the local BACnet device, and add an alias point name plus the other four key-values.

Ok, just checking. I thought maybe you were just pushing data to a second database in the same format, in which case the splitter would be easier, but since you're having to do some processing on it first, that won't work.

Not true. Table size will absolutely affect performance. Remember, even though we call them tables and generally look at the data in some set order, that isn't how the data is actually stored. The bigger the table, the more the DB has to search through to find all of the records that satisfy your query.

That said, I have 32k tags on 1 month partitions and my queries perform much better. (I also have a dedicated DB server)

Are you sure that it is the query and not other processing? You say that you've timed the queries, but haven't shown the code for how you're doing that.
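Something along these lines would split the query time out from the rest of the script (assuming the same paths/start/end variables you already have; the logger name is arbitrary):

```python
# Log query time and processing time separately to see where the 15 seconds goes.
logger = system.util.getLogger("HistoryForwardTiming")  # arbitrary name

t0 = system.date.now()
results = system.tag.queryTagHistory(paths=paths, startDate=start, endDate=end,
                                     returnFormat="Tall")
t1 = system.date.now()
# ... existing processing / forwarding here ...
t2 = system.date.now()

logger.info("query: %d ms, processing: %d ms" % (
    system.date.millisBetween(t0, t1),
    system.date.millisBetween(t1, t2)))
```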

What about using a historian like TimescaleDB for Postgres (it's free)? This would be significantly faster for both ingesting and querying data.

Check the forum for configuration tips

Disclaimer: I've never personally tested timescaledb before, but I've read good things

I've had great success with TimescaleDB recently. I have a mini Linux PC with 16 GB of RAM, 12 of which are dedicated to Timescale. I have very granular tick data for stocks, 16 years' worth, almost 1 trillion rows, and I can query a full day's worth of data in under a second. The performance blows me away, tbh.

Edit: it's actually over 2 trillion rows.

Wow, that's impressive! I've been meaning to use it on a project but I forget about it every time I would have the chance...

There are built-in replication techniques for SQL Server, like transactional replication, that keep remote databases and tables up to date and might be better suited to this (assuming your Azure database is also SQL Server).