Edge Historian on 7.9.12 Ignition Gateway stops working with SQLIteHistorian DB lock & requires gateway restart to fix

We recently began using the Edge Historian (Ignition 7.9.12) with a few tags to store high resolution short term history for some real-time charts in addition to the lower resolution long time history we store with a remote historian. It works great until it doesn’t and begins spamming the logs with ResultWriter errors:

ResultWriter error details

java.lang.RuntimeException: java.util.concurrent.TimeoutException: Query took over 30 seconds to produce data.
at com.inductiveautomation.gateway.tsdb.impl.BasicQueryResult.next(BasicQueryResult.java:128)
at com.inductiveautomation.gateway.tsdb.impl.BasicQueryResult.next(BasicQueryResult.java:22)
at com.inductiveautomation.gateway.tags.history.tsdb.HistorianQueryExecutor.moveForward(HistorianQueryExecutor.java:163)
at com.inductiveautomation.gateway.tags.history.tsdb.HistorianQueryExecutor.processData(HistorianQueryExecutor.java:177)
at com.inductiveautomation.ignition.gateway.sqltags.history.query.HistoryWriter.readData(HistoryWriter.java:308)
at com.inductiveautomation.ignition.gateway.sqltags.history.query.HistoryWriter.execute(HistoryWriter.java:230)
at com.inductiveautomation.ignition.gateway.sqltags.SQLTagsManagerImpl.queryHistory(SQLTagsManagerImpl.java:1280)
at com.inductiveautomation.ignition.gateway.sqltags.history.distributed.TagHistoryServiceImpl.lambda$queryTagHistory$3(TagHistoryServiceImpl.java:104)
at java.lang.Thread.run(Unknown Source)
Caused by: java.util.concurrent.TimeoutException: Query took over 30 seconds to produce data.
at com.inductiveautomation.gateway.tsdb.impl.BasicQueryResult.next(BasicQueryResult.java:116)
… 8 common frames omitted

Restarting the tag historian module looks like this in the logs:


After the tag historian module restart, charts in clients display remote historian data again (which was logged all along, but not displayed), but the Edge Historian is still offline (related to 20Nov2019 09:00:46 error above).

20Nov2019 09:00:46 Tags error details

org.sqlite.SQLiteException: [SQLITE_BUSY] The database file is locked (database is locked)

at org.sqlite.core.DB.newSQLException(DB.java:909)

at org.sqlite.core.DB.newSQLException(DB.java:921)

at org.sqlite.core.DB.throwex(DB.java:886)

at org.sqlite.core.NativeDB._exec_utf8(Native Method)

at org.sqlite.core.NativeDB._exec(NativeDB.java:87)

at org.sqlite.jdbc3.JDBC3Statement.executeUpdate(JDBC3Statement.java:116)

at com.inductiveautomation.gateway.tsdb.historians.SQLiteHistorian.runUpdateQuery(SQLiteHistorian.java:433)

at com.inductiveautomation.gateway.tsdb.historians.SQLiteHistorian$Tables.verifyTables(SQLiteHistorian.java:801)

at com.inductiveautomation.gateway.tsdb.historians.SQLiteHistorian.initialize(SQLiteHistorian.java:171)

at com.inductiveautomation.gateway.tags.history.tsdb.HistorianProvider.initialize(HistorianProvider.java:69)

at com.inductiveautomation.gateway.tags.history.edge.EdgeHistorianProviderType$EdgeHistorian.(EdgeHistorianProviderType.java:91)

at com.inductiveautomation.gateway.tags.history.edge.EdgeHistorianProviderType.createHistoryProvider(EdgeHistorianProviderType.java:52)

at com.inductiveautomation.ignition.gateway.sqltags.SQLTagsManagerImpl.instantiateHistoryProvider(SQLTagsManagerImpl.java:408)

at com.inductiveautomation.ignition.gateway.sqltags.SQLTagsManagerImpl.addSQLTagHistoryProviderType(SQLTagsManagerImpl.java:298)

at com.inductiveautomation.gateway.tags.history.module.TagHistorianGatewayHook.registerType(TagHistorianGatewayHook.java:107)

at com.inductiveautomation.gateway.tags.history.module.TagHistorianGatewayHook.setupTagHistorian(TagHistorianGatewayHook.java:115)

at com.inductiveautomation.gateway.tags.history.module.TagHistorianGatewayHook.setup(TagHistorianGatewayHook.java:102)

at com.inductiveautomation.ignition.gateway.modules.ModuleManagerImpl$LoadedModule.setup(ModuleManagerImpl.java:2282)

at com.inductiveautomation.ignition.gateway.modules.ModuleManagerImpl.startupModule(ModuleManagerImpl.java:1122)

at com.inductiveautomation.ignition.gateway.modules.ModuleManagerImpl$4.call(ModuleManagerImpl.java:813)

at com.inductiveautomation.ignition.gateway.modules.ModuleManagerImpl.executeModuleOperation(ModuleManagerImpl.java:860)

at com.inductiveautomation.ignition.gateway.modules.ModuleManagerImpl.restartModuleInternal(ModuleManagerImpl.java:792)

at com.inductiveautomation.ignition.gateway.modules.ModuleManagerImpl.restartModule(ModuleManagerImpl.java:784)

at com.inductiveautomation.ignition.gateway.web.pages.config.ModulePage$RestartAction.execute(ModulePage.java:561)

at com.inductiveautomation.ignition.gateway.web.components.ConfirmationPanel$1.onClick(ConfirmationPanel.java:49)

at org.apache.wicket.markup.html.link.Link.onLinkClicked(Link.java:189)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)

at java.lang.reflect.Method.invoke(Unknown Source)

at org.apache.wicket.RequestListenerInterface.internalInvoke(RequestListenerInterface.java:258)

at org.apache.wicket.RequestListenerInterface.invoke(RequestListenerInterface.java:216)

at org.apache.wicket.core.request.handler.ListenerInterfaceRequestHandler.invokeListener(ListenerInterfaceRequestHandler.java:240)

at org.apache.wicket.core.request.handler.ListenerInterfaceRequestHandler.respond(ListenerInterfaceRequestHandler.java:226)

at org.apache.wicket.request.cycle.RequestCycle$HandlerExecutor.respond(RequestCycle.java:814)

at org.apache.wicket.request.RequestHandlerStack.execute(RequestHandlerStack.java:64)

at org.apache.wicket.request.cycle.RequestCycle.execute(RequestCycle.java:253)

at org.apache.wicket.request.cycle.RequestCycle.processRequest(RequestCycle.java:210)

at org.apache.wicket.request.cycle.RequestCycle.processRequestAndDetach(RequestCycle.java:281)

at org.apache.wicket.protocol.http.WicketFilter.processRequest(WicketFilter.java:188)

at org.apache.wicket.protocol.http.WicketFilter.doFilter(WicketFilter.java:245)

at com.inductiveautomation.ignition.gateway.bootstrap.SRFilter.doFilter(SRFilter.java:80)

at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)

at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)

at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)

at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)

at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)

at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)

at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)

at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)

at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)

at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)

at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)

at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)

at org.eclipse.jetty.server.Server.handle(Server.java:518)

at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)

at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)

at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)

at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)

at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)

at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)

at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)

at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)

at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)

at java.lang.Thread.run(Unknown Source)

At this point, the Edge Historian local tag history provider shows as disabled on the gateway status page:


However, if we edit it, it shows as enabled:
image
After saving it as above (as it was when edit opened), it continues to show disabled. Restarting the gateway via gwcmd -r resolves the issue. Edge Historian local tag history begins logging and displaying again as expected… until the next error.

Any suggestions to diagnose/resolve this issue?

Historian issues started up a little earlier this morning:


The 03:47 ResultWrite errors correspond with the last stored values in the Edge Historian (after restarting local gateway, values show flat-lined from that point until a short gap during gateway restart around 06:00 after which new values begin recording when gateway comes back up. This is most obvious on the green line below. The purple line is from a remote tag history provider which records during the whole time local Edge Historian is not recording, though we can’t see the values (red overlay on charts) until we restart the local gateway Historian module.

Prior to restarting the historian module, the Edge Historian shows running, even though all of the client charts have a red overlay and display no data (including sparklines that access only the Edge Historian):
image
Something is causing the Edge Historian to quit recording values and requires a Historian module restart to even show remote tag history provider values (though they continue recording). It’s not till we restart the whole gateway that the Edge Historian begins recording values again.

It did it again so I gathered some more data before restarting the HIstorian module. At this point, the store & forward status page won’t load–it’ll stay like this forever (or at least more than five minutes… haven’t tested forever yet):


This page loads quickly as expected after restarting the Historian module.

Before restarting the Historian module, the tags status pane shows the local Edge Historian tag history status as good while the remote historian (second one) is errored (gateway not connected, although in the past when we’ve had this problem the gateway network was up and status pages showed remote historian running):
image
After restarting the Historian module, the local Edge Historian disappears from Tags status page:


This corresponds to it showing as Disabled on configuration page like in first post of this thread:
image
After restarting the gateway, this shows up as enabled again and is listed as expected on the tags status page.

I don’t think this is directly related to the problem as it hasn’t been the case when I’ve seen this happen before, but in this case Gateway network is down:


The interesting thing about the above is last connection four hours ago corresponds roughly to the time operators noted the graphs stopped working.

Thread viewer has pages and pages of these waiting threads (so many pages the browser takes a while to rerender the page if browser is resized):


And it looks like we have a winner–here’s the details on one of the many waiting threads–waiting for an SQLiteHistorian DB lock:

I checked some more and it looks like they’re all the same. After restarting the gateway via gwcmd -r the pages of those threads are gone.

What could cause this to get stuck with the internal SQLiteHistorian locked till gateway restart? What can we do to avoid or resolve it? Is there anything else we should look at to diagnose this?

We are experiencing the exact same issue. It was supposed to be fixed in 7.9.13, but we have the same situation after the upgrade.

Good to know we’re not the only ones; thanks @Ivan_a . We just upgraded this one to 7.9.13 yesterday. IA support has looked at a thread dump and wrapper log files from both gateways (when this was on 7.9.12). Now we’re waiting for the issue to recur at a time when they can get a live look at it. I plan to post an update here when this is resolved. Please let us know if you get it figured out first.

It recurred on 7.9.13 for us too.The issue looks the same as posted above, except the thread viewer doesn’t successfully load a list of threads (stays blank–EDIT: it did eventually load, just took a very long time to load a really long list of waiting threads). It also took a lot longer to recur than expected based on past patterns on 7.9.12, though that may be a coincidence.

The later recurrence in 7.9.13 appears to have been a coincidence. It happened just as frequently later. We’ve since upgraded to 8.0.7 and haven’t had a recurrence in six days of uptime. That’s too early to say for sure as we went about that long on 7.9.13 too, but so far so good.

And we just hit the same issue in 8.0.7 (a bunch of threads waiting on SQLiteHistorian DB lock). This time the server was saying the associated S&F didn’t exist (and didn’t show it on S&F status page). Restarting the gateway resolved the issue with the local historian (it reappeared and worked). We’ll be following up with IA support so they have some 8.0.7 thread dumps and logs to compare with the ones from 7.9.13.