Part of the point of open source is to fix this problem, because if the original source stops publishing, the source code can be forked and maintained by anyone who needs it. Collaboratively, typically.
As for PostgreSQL, you should know that it is financially supported by a bunch of big companies that depend on it, like Google and Amazon. (Google's AlloyDB and Amazon's Redshift are both tweaked and rebranded derivatives of PostgreSQL.) PostgreSQL is probably the open source project least likely to be abandoned.
Any conflict between Windows and PostgreSQL I would lay at Microsoft's feet. MS has a long history of anti-competitive behavior, and PostgreSQL is a huge commercial threat to their DB platforms. (One of the many reasons I consider Windows on the production floor to be engineering malpractice.)
The Ignition system is not really doing anything other than historizing data, and the CPU idles at 3-4%. If I have the Ignition system overview screens and a few Perspective screens used for testing open, CPU usage is about 35%.
Once the system is set up, none of those screens need to be left open and no one will be running them remotely. No process screens or trends or multiple users to support.
When the POST script runs, the CPU usage spikes to 100% for a few seconds.
For this use case, would moving the MSSQL database to another VM make as much of a difference?
Hmm... "8.3 adds a built in historical logging time series database that can handle time series data much better than what is available in 8.1."
SCADA and database workloads have incompatible CPU and RAM utilization behavior. This may be tolerable in a training or test or development environment, but is not appropriate for production environments.
Move the DB to a different system. If moving to a different VM in the same hypervisor, make sure the hypervisor is set up for low latency, and is not overcommitted. Per IA's VM deployment guidance. (Commercial hypervisors are typically not correctly configured out of the box.)
11 million rows is not a lot for any database engine; 11 billion, then yes, it's understandable that it would be slow.
Here are some talking points.
Like Paul mentions, your DB should be on its own dedicated VM with adequate resources. You also get the convenience of being able to maintain and restart your DB without touching SCADA, and vice versa. There is no excuse to have your DB on the same box as SCADA unless you have a very small, simple system.
Partitioning is an excellent way to improve query performance. You could easily switch to weekly partitions or even set up a history splitter and feed a DB with daily partitions. This can drastically improve performance. If partitioning does not speed up your queries something else is wrong.
Setting up or tuning DB indexes can greatly improve query performance.
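To make the indexing point concrete, here is a minimal sketch using SQLite as a stand-in engine (the same idea applies to MSSQL or Postgres). The table and column names loosely mirror Ignition's `sqlth_*_data` history tables, but treat them as illustrative, not the exact schema:

```python
import sqlite3

# Toy stand-in for one Ignition history partition table; the real
# sqlth_*_data tables have similar columns (tagid, floatvalue, t_stamp).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sqlth_1_data (tagid INTEGER, floatvalue REAL, t_stamp INTEGER)")
con.executemany("INSERT INTO sqlth_1_data VALUES (?, ?, ?)",
                [(i % 100, i * 0.5, i) for i in range(10000)])

query = ("SELECT floatvalue FROM sqlth_1_data "
         "WHERE tagid = 42 AND t_stamp BETWEEN 1000 AND 2000")

# Without an index, the planner has to scan every row in the table.
before = con.execute("EXPLAIN QUERY PLAN " + query).fetchall()[0][-1]

# A composite index on (tagid, t_stamp) matches the WHERE clause exactly,
# so the planner can seek straight to the relevant rows.
con.execute("CREATE INDEX ix_tag_time ON sqlth_1_data (tagid, t_stamp)")
after = con.execute("EXPLAIN QUERY PLAN " + query).fetchall()[0][-1]

print(before)  # a full-table SCAN
print(after)   # a SEARCH using ix_tag_time
```

The same before/after comparison is worth doing on your real tables with your engine's plan viewer (e.g. MSSQL's "Include Actual Execution Plan") before and after adding an index.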
If you want to have only a single partition so that external apps can access your Ignition historical data, the option I would use is TimescaleDB on top of Postgres. Your data will be stored in a single table, but TimescaleDB essentially creates virtual partitions with indexing. This is configurable, so you can set the chunks to be minutes, hours, or days. By the way, Postgres has been around as long as MS SQL Server, it is wildly popular, and it runs on Windows just as reliably as anything else.
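The TimescaleDB side of that is only a couple of statements. A minimal sketch, assuming a target table named `tag_history` fed by a history splitter (table and column names are hypothetical, not Ignition's own schema); `chunk_time_interval` is what sets the virtual-partition size:

```python
# Sketch only: the DDL you would run against Postgres/TimescaleDB.
# Table and column names are hypothetical.
hypertable_ddl = """
CREATE EXTENSION IF NOT EXISTS timescaledb;

CREATE TABLE tag_history (
    tagid      INTEGER,
    floatvalue DOUBLE PRECISION,
    t_stamp    TIMESTAMPTZ NOT NULL
);

-- One logical table, chunked internally by time; adjust the interval
-- to minutes, hours, or days as needed.
SELECT create_hypertable('tag_history', 't_stamp',
                         chunk_time_interval => INTERVAL '1 day');
"""
print(hypertable_ddl)
```

External apps then query `tag_history` as one ordinary table while TimescaleDB prunes chunks behind the scenes.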
At some scale it is better to let Ignition do the SCADA part and let something else do the heavy history queries straight to the DB. If you have these queries in stored procedures the SQL query plans can make these many times faster than ad-hoc queries from Ignition.
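As a concrete sketch of that last point: the heavy query lives in the DB as a stored procedure, and Ignition (or the external app) just calls it. The procedure, table, and connection names below are hypothetical, and the Ignition-side call is shown commented out because it only runs on a gateway:

```python
# T-SQL sketch of a pre-compiled history query (names are hypothetical).
proc_sql = """
CREATE PROCEDURE dbo.GetTagHistory
    @StartStamp BIGINT,
    @EndStamp   BIGINT
AS
BEGIN
    SET NOCOUNT ON;
    -- The cached plan for this is reused on every call,
    -- unlike ad-hoc SQL sent from a client.
    SELECT tagid, floatvalue, t_stamp
    FROM sqlth_1_data
    WHERE t_stamp BETWEEN @StartStamp AND @EndStamp;
END
"""
print(proc_sql)

# From an Ignition gateway script, the call would look roughly like:
#   call = system.db.createSProcCall("dbo.GetTagHistory", "MyHistoryDB")
#   call.registerInParam("StartStamp", system.db.BIGINT, startMillis)
#   call.registerInParam("EndStamp", system.db.BIGINT, endMillis)
#   system.db.execSProcCall(call)
#   ds = call.getResultSet()
```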
Thanks for the post. I've seen these suggestions in a few different threads, but some seem not to apply and/or don't have an easy way to test.
Also just wondering which ones will help, or help the most. Is it the order you listed?
Move the MSSQL to another server.
Partitioning?
Pre-Processed partitions, if not looking for raw values or all values in the last minute.
DB Indexes.
Move to Timescale DB on top of Postgres. Probably higher on the list.
Stored procedures.
"11 million rows is not a lot for any database engine"
That is why I was not initially worried about everything being on the same server. I did not think we were doing anything that intense (no process screens, no remote users, etc.), and I thought getting the data queried from the same server would be a plus. But it seems that even if Ignition is only doing historical logging, you still want the database on a different server. If it makes the queries faster and the system more stable it's totally worth it, but it feels like overkill on a system that idles at 2-4% CPU unless a query is running.
By changing when and where some of the script runs, it went from almost a minute to query the data for the 8000 tags and POST it, to about 15 seconds. That 15 seconds is close to just running the query alone with a known list of tags, without getting the data ready to POST and so on. Then, without changing anything else, just splitting the query into four threads, the four queries finish in around 6-8 seconds. With one large query, all cores on the server jump to 100%; with the four queries in their own threads in Ignition, all cores also jump to 100%, but the same data is read in about half the time. So it feels like the setup, as is, could query the data faster.
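The four-thread split can be sketched like this. Here `query_history` is just a stand-in for the real per-chunk query (in Ignition it would be a `system.db` call, and on a gateway you would more likely use `system.util.invokeAsynchronous` than raw threads):

```python
import threading

# Stand-in for the real per-chunk query; in Ignition this would hit
# the history tables via system.db.runPrepQuery or similar.
def query_history(tag_chunk, results, idx):
    results[idx] = [(tag, tag * 0.5) for tag in tag_chunk]  # fake rows

tags = list(range(8000))                 # ~8000 tag ids, as in the post
n = 4
chunks = [tags[i::n] for i in range(n)]  # four roughly equal slices

results = [None] * n
threads = [threading.Thread(target=query_history, args=(chunks[i], results, i))
           for i in range(n)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Recombine the partial result sets into one list of rows.
rows = [row for part in results for row in part]
print(len(rows))  # 8000
```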
I did try changing the partition size from one month to an hour, and there was no change in the query time. Is there an issue with changing that setting back and forth? Maybe I need to reset/delete all the tables after making the change.
This isn't a good measure of load. We have a system with 500k tags in a split architecture with 95 Vision clients and 40 Perspective clients, with the front end sitting at 2.5% CPU.
All clients active, everything all on one server? I also look at CPU, RAM, disk activity, network activity, errors, stability, responsiveness, and so on. It's just how I felt at the time: no big red flags, and nothing to compare the query times to.
Correct. Ignition is latency sensitive, and needs nearly idle cores (some) to work well. When running in the same system, DBs steal all the idle cores, crushing Ignition. That's what I meant by incompatible workloads.
Just to throw in my experience. We compared side-by-side MSSQL and TimescaleDB and they had the same performance. We followed the recommendations from IA when setting up TimescaleDB. We spent about 3 days tinkering with settings and even reimported all of the data into TimescaleDB using a different tick setting. All that to say it seems like TimescaleDB takes quite a bit of expertise and tuning to make it perform the way @bkarabinchak.psi is talking about.
That's interesting. When I installed TimescaleDB on my headless Ubuntu server, it asked me about setup and auto-suggested using 75% of the available RAM, etc. It was a very easy setup once I got to that part. I just accepted all the suggestions.
You do have to make sure you use hypertables, IIRC, to take advantage of the point of TimescaleDB.
Edit: Now I recall, I used timescaledb-tune to optimize the hardware config.