We regularly get failed transactions on the sqlth_sce table and I am looking for a solution. The SCE table holds about 1,000 rows. We use some driven scan classes that execute at 100 ms, so I expect the table to grow slowly. Two gateways point to the same database/schema. Stale Data Detection is ON with a Multiplier of 10 on both. The OPC servers read about 30K tags total, most at 1-second or 30-second scan rates. Of these, about 7,000 tags are historized: 50% at 60 s, 25% at 10 s, 10% at 1 s, and the rest at varied rates, with only a few each at 500 ms and 100 ms. In desperation, I have periodically rebuilt the table with the minimum start time and maximum end time for each scan class, which of course destroys the history. Even with a minimum number of rows, there are query failures that SQL Server flags in the Query Store view.
The query that fails is:

UPDATE sqlth_sce SET "end_time"=@P0 WHERE "scid"=@P1 and "rate"=@P2 and "start_time"<=@P3 and "end_time">=@P4
I have tried creating indexes (clustered; non-clustered on scid, rate, and start_time with end_time included; all columns) and the total execution time increases, since the indexes must be updated as well, which increases the failure frequency. Currently the query executes successfully in ~0.4 ms on average, but failures take 3-10 seconds and happen a couple of times an hour with a clean table, increasing to several times an hour as the table grows past 2-3K rows. Keeping the row count low helps but does not eliminate the problem. I suspect that the failed transactions trip stale detection and cause a new row to be created, but I don't know the inner workings to that level. The failure frequency clearly tracks the row count.
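For reference, this is the shape of the covering non-clustered index I tried, matching the UPDATE's WHERE clause. The index name is my own; end_time goes in INCLUDE rather than the key so that updating it does not reposition index rows, though the included column still has to be written on every UPDATE, which is where the extra maintenance cost came from:

```sql
-- Sketch only: covering index for the historian's UPDATE.
-- Key columns match the equality/range predicates (scid, rate, start_time);
-- end_time is INCLUDEd so the end_time >= @P4 filter needs no key lookup.
-- Included columns are cheaper to update than key columns, but they are
-- still written on every SET "end_time"=... statement.
CREATE NONCLUSTERED INDEX IX_sqlth_sce_scid_rate_start
    ON dbo.sqlth_sce (scid, rate, start_time)
    INCLUDE (end_time);
```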
Keeping the stale data detection history is a requirement, even if the older history is retained in a separate table to be accessed manually when needed. We need to be able to distinguish between data that simply didn't change enough to trip the deadband versus a scan class failure or the gateway being down. Moving the older history out of the Scan Class Executions table into a retention table would be acceptable.
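Something like the following is what I have in mind for the retention table. The archive table name and the 30-day cutoff are my own choices, and I am assuming the stock Ignition schema where sqlth_sce timestamps are millisecond-epoch bigints (verify against your install; DATEDIFF_BIG requires SQL Server 2016+):

```sql
-- Sketch of a periodic archival job, assuming a manually created
-- dbo.sqlth_sce_archive table with the same columns as sqlth_sce.
-- Moves scan class executions that ended more than 30 days ago, so
-- currently open rows (the ones the gateway keeps updating) stay put.
DECLARE @cutoff BIGINT =
    DATEDIFF_BIG(MILLISECOND, '1970-01-01', DATEADD(DAY, -30, SYSUTCDATETIME()));

BEGIN TRANSACTION;

INSERT INTO dbo.sqlth_sce_archive (scid, start_time, end_time, rate)
SELECT scid, start_time, end_time, rate
FROM dbo.sqlth_sce
WHERE end_time < @cutoff;

DELETE FROM dbo.sqlth_sce
WHERE end_time < @cutoff;

COMMIT;
```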
I have been reading about the MS SQL Server In-Memory OLTP capability, where the table is kept in memory. This strategy uses a memory-optimized table with row versioning to eliminate locking and latching on high-traffic tables. With DURABILITY = SCHEMA_AND_DATA the data is also persisted to disk asynchronously, so values survive reboots. Limitations include: the table's memory allocation is limited, so the history must be offloaded to a separate disk-based table periodically to stay within the allocated space; and an integer primary key must be added to the table, since durable memory-optimized tables require one.
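For anyone curious, the conversion would look roughly like this. This is a sketch, not a tested migration: I am assuming SQL Server 2016+, a database that already has a MEMORY_OPTIMIZED_DATA filegroup, the stock Ignition column types for sqlth_sce, and a bucket count sized for a few thousand rows. The identity primary key and the index names are additions of mine:

```sql
-- Sketch only: memory-optimized replacement for sqlth_sce.
-- Prerequisite (once per database):
--   ALTER DATABASE [IgnitionHistory]
--       ADD FILEGROUP imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;
CREATE TABLE dbo.sqlth_sce
(
    -- New surrogate key: durable memory-optimized tables require a PK.
    id         INT IDENTITY(1,1) NOT NULL
               PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 8192),
    scid       INT    NOT NULL,
    start_time BIGINT NULL,
    end_time   BIGINT NULL,
    rate       INT    NULL,
    -- Range index matching the gateway's UPDATE predicate.
    INDEX IX_scid_rate_start NONCLUSTERED (scid, rate, start_time)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
```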
Has anyone else done this? Are there additional downsides?
Thanks in advance.