Ignition History Import

Trent.Boudreaux · May 22, 2019, 3:41pm

I’m having some major issues importing a large amount of history and thought I might post my approach to see if anyone has some suggestions or alternative solutions.

I’m importing a year of history for about 300k+ tags which equates to 60+ million records.
My Ignition historian already has a few months of history, and is configured with default settings of 1 month partitions.

My initial approach was to read the data from the daily CSV files and run all the records through system.tag.storeTagHistory.
Since storeTagHistory allows to insert multiple history records at a time, I assumed going in that I’d have to test the optimal number of records to import per call, but that was just the beginning of my problems.
I eventually found that importing storeTagHistory with many records was more efficient that importing 1 record per call, so I set it to run the import for every 1k records.

My first issue is that the Store & Forward Buffer would fill up really fast and records would get dropped, so I increased the Disk Cache size by quite a bit to 100,000 and tried again.
This time, not only did the Disk Cache fill up again, but the Store & Forward process of writing to the database from the Disk Cache got very slow.
IA Support explained that there are issues with the HyperSQL database getting this large, so we lowered the max size to 25,000 records and added a sleep delay mid script to avoid filling it up.

Next problem was that when importing history, the realtime values would stop historizing. Which I figured out is because they too have to go through the same Store & Forward engine and will be placed in the back of the large Disk Cache queue.
So, the approach to this was to create a new Historian in my gateway dedicated for importing history.
This seemed to work well, but the throughput was still pretty limited by the single threaded nature of the Store & Forward Engine.
So, I’ve since added 3 more Historian’s in my gateway and my script now loops through to spread the load across all 4, storing 1k tags per connection, and sleeping every 4k tags for a second.

The next problem I ran into is that sometimes my records would get quarantined due to duplicate records.
Because of my previous partially successful attempts to import history, I had some of the records already imported, but couldn’t determine which were already imported and which had failed.
I was hesitant to just wipe the partitions due to the amount of time I had already put, and the amount of execution time on the gateway to get the records that I had so far.
At this point, working with support, I learned of the most frustrating behavior in this whole process.
Regardless of how storeTagHistory is called, Ignition batches multiple records to be inserted to the database per query for performance.
So, if any record in a batch is a duplicate, the entire batch will get quarantined.

My solution to this issue was to import all of my history to another database table, add a flag to indicate if it had previously been imported, and run a query to join my “history import staging” table with the history partitions and update the flag if it was already in the historian.
My script then imports records from the history import staging table that don’t have the imported flag set.
This seems to be mostly successful, but since that history import staging table is so large, all of the queries are very slow.

All-in-All this seems like a very complex solution for a problem that is fundamental to SCADA and I hope there is something I’m missing.

Thanks for your time.
-Trent

brandon1 · May 22, 2019, 4:54pm

You are importing this history from a different historian correct?
I have been told by Ignition that this process is slow. As in, it can take a week to move the data into Ignition’s tables. I think you are on the right track. The only other alternative I can think of is to write a SQL procedure that can take the data row by row and translate that into the rows you need for the ignition historian. So basically, since you have the tag ids that were generated by system.tag.storeTagHistory, you can let the procedure reference that and then create the monthly tables for you. Then you just insert that massive table into the database using the workbench query tool.

Kyle_Chase · May 23, 2019, 2:50pm

Hello Trent,
I feel your pain! Here are a few things you could try:

Monitor the S&F system and limit the historical writes in a way that only adds data to the S&F when the queue is below a certain threshold. This would remove the need for the delay.
You can pull data out of the quarantine, and filter out the data that is causing the quarantine. Won’t be overly efficient, but it is doable.
What backend are you using? SQL Server? Have you thought about trying an alternate store? Sales pitch here, but we have a module that is backed by InfluxDB, and we have also done away with the HyperSQL store, and replaced it with a more efficient on disk storage method.

Trent.Boudreaux · May 23, 2019, 3:15pm

Thanks for the response Brandon.
I thought about writing to the partitions directly, but my concern was with the complexity of managing the sqlth_te table and the high risk for breaking something.
I have access to the tag Ids in the queries, but my script currently does not interact with any of the ignition historian tables directly, and instead leverages storeTagHistory to save the history which handles finding the right tag id, adding new tag ids, or extending the retired dates in the sqlth_te table.

I was hoping I wouldn’t have to go the route of writing to the tables directly or modifying any of the tables. I’ve had issues with minor manual changes breaking the historian entirely in the past.

Trent.Boudreaux · May 23, 2019, 3:26pm

Thanks for the response Kyle.

I think this approach would be more efficient than just a straight delay, but I don’t know of a way to read the S&F queue size/status during the execution of the script.
One issue with exporting the quarantine file is that it truncates all of the milliseconds, so I’d have to account for that when checking to see if it already exists. This could be a good option otherwise. Basically the approach would be to import everything without checking if it exists first, then re-process the quarantined records by removing the duplicates and importing the rest. The quarantines can’t hold many records due to the size restrictions of the HSQL DB, so ideally I’d need a way to automate this within the import script.
Yes, we are using SQL Server for the Ignition Historian. I’ve looked into InfluxDB before and it looked interesting. Does your module use InfluxDB for the Historian Database or the S&F Disk Cache database?

Ultimately I’m looking for a History Import Tool that I can just point to a data source like CSV files or a SQL query, and after some (long) time, all of the history has been imported, without any other interaction.

brandon1 · May 23, 2019, 3:39pm

I totally understand your hesitation Trent. I have had to mess with the Historian tables enough that I have gotten more comfortable with editing there content. One thing you could consider is setting up a test database that you can try importing into and then if all goes well implement it on the production system.

MMaynard · May 23, 2019, 3:52pm

Since you are using SQL Server, have you thought about setting up a bulk insert job on the server?
Then you could call it from Ignition, passing it the CSV file name.

Kyle_Chase · May 23, 2019, 7:15pm

Hello Trent,

It would involve using some of the SDK from the gateway scope to pull that information out. This is what I have done in the past.
Yeah that makes sense, I know it is possible, but have not done this myself.
We use InfluxDB to store the history data, and use a different method (ring buffer) for storing data for the S&F Disk Cache. We have a few different historian modules that we use the ring buffer on, and it seems to work quite well.

I can give you a call today if you like to discuss further, just DM me your contact info.

Kyle

glenn.marquez · October 10, 2021, 1:00am

Hey guys, I am new to this site and new to Ignition SCADA as well. Please bear with for replying on a topic that has been posted 2 years ago. Don’t know what to do and where to start right now, I know this is a long shot. I have a similar issue right now on the same topic. Just wondering if someone can point me on the right directions or any idea where to start. Basically we have an Ignition SCADA running on MySQL database, it is running fine right now without any issues, I want to inject the history from 2016 that came from another Ignition SCADA server that ran on MySQL database. Thanks in advance.

brandon1 · October 11, 2021, 4:38pm

The safest option will be to use the system.tag.storeTagHistory function.
You will find discussions about it on the forum. Another thing is to make sure your connection properties has something like allowBatchUpdates specified? I can’t quite remember the exact syntax, but any new connection going forward should use that by default.

The last thing to look out for is setting every start time to 0 in the sqlth_te table. This is because otherwise the function listed above will make a new tagid which is not what you want.

Good luck!