Large MSSQL Tag History and Replication

hmiCanuck · February 10, 2025, 3:46pm

We have a 8.1 Vision setup with MSSQL running on the same machine. Our historical data collection is ~200GB / month. Yes, I know that's ridiculous, and it's actually been reduced from its previous 2TB / month.

I've setup snapshot DB replication with the idea that the master could remain "lean" (maybe 1 year of data) and the subscriber would store all history. It's not pretty and I thought I'd ask this to see if others were having similar pain.

I can't use the Tag History Splitter instead because the backup SQL server isn't always connected. Ignition's Store & Forward won't support the potential quarantine size I'd need.
I can't use history table partitioning because of the replication - it won't automatically pick up the new monthly tables.
Pruning will become a large pain because if I completely delete the master's history DB, the tag IDs will likely change. And if I don't then "shrinking" a table with replication enabled is also ugly.
Long term storage becomes a large pain with a single table growing to say 20TB in size.

Anyone else trying to store AWS sized historical data?

pturmel · February 10, 2025, 4:55pm

Don't do this. Especially with such a large workload as yours. Ignition's CPU usage pattern and general database CPU usage patterns do not play well together. Don't put them on the same machine.

Don't use MS SQL Server. Other, more competent, brands (like PostgreSQL) will happily replicate new tables, too. So you can (and you should) use Ignition's partitioning. Or you can use a DB like Timescale (which is actually enhanced PostgreSQL) to handle the very large tables. (It'll compress them, too.)

Your problems revolve around using MS SQL Server. Don't. Really.

hmiCanuck · February 10, 2025, 5:30pm

While I completely agree with you, unfortunately I don't have a choice with either.

I inherited the code from a previously developer. The code has already been "certified" for use in our environment and switching DBs or moving the DB server would literally set the project back about 6 months.

pturmel · February 10, 2025, 6:02pm

You are going to spend more time trying to solve this unsolvable problem. Tell your client/bosses that you need to rip the bandage off.

hunterdg · February 11, 2025, 7:13am

I believe I’ve ~~solved~~ worked around this headache with a combination of snapshot replication for the keyless tables and transactional replication (which picks up new articles if I recall correctly*) for the rest of the tables.

EDIT: I’m certain it was syncing new tables when we switched to weekly partitions for a test, but everything I can find with google/AI says that transactional replication does not replicate new tables.. so maybe I scripted/scheduled the article addition as it presumably requires a new snapshot every time. (I’m not a DBA)

addional thoughts:

perhaps selecting “all” vs individual articles makes a difference when creating the transactional publication, though googling seems to refute that as well
for some reason the “Replicate Schema Changes” (true) option keeps coming to mind, though it doesn’t make sense - maybe that was for a metadata table I was also syncing

I believe I’ve also implemented transactional replication on keyless tables by creating a view on the publisher and replicating that to/as a table on the subscriber