Ignition crashing due to alarm journal

We just had an Ignition server crashing in a plant.

The clients lost connection, and when we looked at the gateway control utility, the service appeared to be dead.
Restarting the service caused the clients to reconnect, but after a few minutes, the service crashed again.

Looking at the wrapper logs, the first errors were related to the alarm journal. So we tried disabling the alarm journal, and the server keeps running for now.

When looking at the alarms in the DB, it seems like all alarms on all devices were triggered at once. It logged hundreds of alarms every second, presumably until the buffer was full. All our alarms are based on integer error codes, and the code always has to be equal before an alarm can happen, so it really doesn’t make sense that all alarms for all devices were triggered at once.

Does anyone have an idea what happened, and how to prevent it?

INFO   | jvm 5    | 2018/01/22 20:33:03 | <CTC|MES|I|278|0|Crane|2|1|0|100|0|00000000000000000|M|2|2|1|R|0|M|0|0|0|0|L|0|M|0|0|>
INFO   | jvm 5    | 2018/01/22 20:33:03 | E [a.J.Journal                   ] [19:33:03]: Error logging alarm event to the database. alarm-source=default/100 Vulzone/10_Devices_Digital/ROL/rol_A010a_Crossing_uitvoer_lege_kisten_rolbaan_kraan_3/cbMess/-- A010a Crossing uitvoer lege kisten rolbaan kraan -- 10: Error feedback run, alarm-name=-- A010a Crossing uitvoer lege kisten rolbaan kraan -- 10: Error feedback run
INFO   | jvm 5    | 2018/01/22 20:33:03 | java.lang.Exception: Unable to add data to memory buffer, buffer full.
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.MemoryStore.insertInBuffer(MemoryStore.java:171)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.MemoryStore.syncdStoreData(MemoryStore.java:79)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.AbstractStore.storeData(AbstractStore.java:233)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.MultiStageStore.syncdStoreData(MultiStageStore.java:149)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.AbstractStore.storeData(AbstractStore.java:233)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.DefaultStoreAndForwardEngine.storeData(DefaultStoreAndForwardEngine.java:175)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.HistoryManagerImpl.storeHistory(HistoryManagerImpl.java:187)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.journal.AlarmJournal.storeEvent(AlarmJournal.java:256)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.journal.AlarmJournal.storeEvent(AlarmJournal.java:204)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.journal.AlarmJournalManager.storeEvent(AlarmJournalManager.java:194)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.AlarmManagerImpl.transitionAlarm(AlarmManagerImpl.java:344)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.evaluation.Alarm.processTransition(Alarm.java:744)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.evaluation.Alarm.finishTransitionEvent(Alarm.java:708)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.evaluation.Alarm.initTransitionEvent(Alarm.java:632)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.evaluation.Alarm.evaluate(Alarm.java:515)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.AlarmManagerImpl$AlarmEvaluatorImpl.evaluate(AlarmManagerImpl.java:508)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.sqltags.execution.tags.AbstractExecutableTag.evaluateAlerts(AbstractExecutableTag.java:261)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.sqltags.execution.tags.AbstractExecutableTag.setCurrentValue(AbstractExecutableTag.java:250)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.sqltags.execution.tags.SubscribedNodeTag.setValue(SubscribedNodeTag.java:158)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.xopc2.client.OpcUaConnection$DataValueConsumer.accept(OpcUaConnection.java:1522)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.xopc2.client.OpcUaConnection$DataValueConsumer.accept(OpcUaConnection.java:1495)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at org.eclipse.milo.opcua.sdk.client.subscriptions.OpcUaMonitoredItem.lambda$setValueConsumer$0(OpcUaMonitoredItem.java:128)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at org.eclipse.milo.opcua.sdk.client.subscriptions.OpcUaMonitoredItem.onValueArrived(OpcUaMonitoredItem.java:176)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at org.eclipse.milo.opcua.sdk.client.subscriptions.OpcUaSubscriptionManager.lambda$null$28(OpcUaSubscriptionManager.java:547)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at org.eclipse.milo.opcua.stack.core.util.ExecutionQueue$PollAndExecute.run(ExecutionQueue.java:107)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.util.concurrent.FutureTask.run(Unknown Source)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.lang.Thread.run(Unknown Source)
INFO   | jvm 5    | 2018/01/22 20:33:03 | E [a.J.Journal                   ] [19:33:03]: Error logging alarm event to the database. alarm-source=default/100 Vulzone/10_Devices_Digital/ROL/rol_A010a_Crossing_uitvoer_lege_kisten_rolbaan_kraan_3/cbMess/-- A010a Crossing uitvoer lege kisten rolbaan kraan -- 49: Error load, alarm-name=-- A010a Crossing uitvoer lege kisten rolbaan kraan -- 49: Error load
INFO   | jvm 5    | 2018/01/22 20:33:03 | java.lang.Exception: Unable to add data to memory buffer, buffer full.
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.MemoryStore.insertInBuffer(MemoryStore.java:171)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.MemoryStore.syncdStoreData(MemoryStore.java:79)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.AbstractStore.storeData(AbstractStore.java:233)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.MultiStageStore.syncdStoreData(MultiStageStore.java:149)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.AbstractStore.storeData(AbstractStore.java:233)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.DefaultStoreAndForwardEngine.storeData(DefaultStoreAndForwardEngine.java:175)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.HistoryManagerImpl.storeHistory(HistoryManagerImpl.java:187)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.journal.AlarmJournal.storeEvent(AlarmJournal.java:256)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.journal.AlarmJournal.storeEvent(AlarmJournal.java:204)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.journal.AlarmJournalManager.storeEvent(AlarmJournalManager.java:194)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.AlarmManagerImpl.transitionAlarm(AlarmManagerImpl.java:344)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.evaluation.Alarm.processTransition(Alarm.java:744)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.evaluation.Alarm.finishTransitionEvent(Alarm.java:708)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.evaluation.Alarm.initTransitionEvent(Alarm.java:632)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.evaluation.Alarm.evaluate(Alarm.java:515)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.AlarmManagerImpl$AlarmEvaluatorImpl.evaluate(AlarmManagerImpl.java:508)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.sqltags.execution.tags.AbstractExecutableTag.evaluateAlerts(AbstractExecutableTag.java:261)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.sqltags.execution.tags.AbstractExecutableTag.setCurrentValue(AbstractExecutableTag.java:250)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.sqltags.execution.tags.SubscribedNodeTag.setValue(SubscribedNodeTag.java:158)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.xopc2.client.OpcUaConnection$DataValueConsumer.accept(OpcUaConnection.java:1522)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.xopc2.client.OpcUaConnection$DataValueConsumer.accept(OpcUaConnection.java:1495)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at org.eclipse.milo.opcua.sdk.client.subscriptions.OpcUaMonitoredItem.lambda$setValueConsumer$0(OpcUaMonitoredItem.java:128)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at org.eclipse.milo.opcua.sdk.client.subscriptions.OpcUaMonitoredItem.onValueArrived(OpcUaMonitoredItem.java:176)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at org.eclipse.milo.opcua.sdk.client.subscriptions.OpcUaSubscriptionManager.lambda$null$28(OpcUaSubscriptionManager.java:547)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at org.eclipse.milo.opcua.stack.core.util.ExecutionQueue$PollAndExecute.run(ExecutionQueue.java:107)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.util.concurrent.FutureTask.run(Unknown Source)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.lang.Thread.run(Unknown Source)
INFO   | jvm 5    | 2018/01/22 20:33:03 | E [a.J.Journal                   ] [19:33:03]: Error logging alarm event to the database. alarm-source=default/100 Vulzone/10_Devices_Digital/ROL/rol_A010a_Crossing_uitvoer_lege_kisten_rolbaan_kraan_3/cbMess/-- A010a Crossing uitvoer lege kisten rolbaan kraan -- 35: Multimotor Th, alarm-name=-- A010a Crossing uitvoer lege kisten rolbaan kraan -- 35: Multimotor Th
INFO   | jvm 5    | 2018/01/22 20:33:03 | java.lang.Exception: Unable to add data to memory buffer, buffer full.
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.MemoryStore.insertInBuffer(MemoryStore.java:171)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.MemoryStore.syncdStoreData(MemoryStore.java:79)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.AbstractStore.storeData(AbstractStore.java:233)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.MultiStageStore.syncdStoreData(MultiStageStore.java:149)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.stores.AbstractStore.storeData(AbstractStore.java:233)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.DefaultStoreAndForwardEngine.storeData(DefaultStoreAndForwardEngine.java:175)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.history.HistoryManagerImpl.storeHistory(HistoryManagerImpl.java:187)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.journal.AlarmJournal.storeEvent(AlarmJournal.java:256)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.journal.AlarmJournal.storeEvent(AlarmJournal.java:204)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.journal.AlarmJournalManager.storeEvent(AlarmJournalManager.java:194)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.AlarmManagerImpl.transitionAlarm(AlarmManagerImpl.java:344)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.evaluation.Alarm.processTransition(Alarm.java:744)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.evaluation.Alarm.finishTransitionEvent(Alarm.java:708)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.evaluation.Alarm.initTransitionEvent(Alarm.java:632)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.evaluation.Alarm.evaluate(Alarm.java:515)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.alarming.AlarmManagerImpl$AlarmEvaluatorImpl.evaluate(AlarmManagerImpl.java:508)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.sqltags.execution.tags.AbstractExecutableTag.evaluateAlerts(AbstractExecutableTag.java:261)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.sqltags.execution.tags.AbstractExecutableTag.setCurrentValue(AbstractExecutableTag.java:250)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.ignition.gateway.sqltags.execution.tags.SubscribedNodeTag.setValue(SubscribedNodeTag.java:158)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.xopc2.client.OpcUaConnection$DataValueConsumer.accept(OpcUaConnection.java:1522)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at com.inductiveautomation.xopc2.client.OpcUaConnection$DataValueConsumer.accept(OpcUaConnection.java:1495)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at org.eclipse.milo.opcua.sdk.client.subscriptions.OpcUaMonitoredItem.lambda$setValueConsumer$0(OpcUaMonitoredItem.java:128)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at org.eclipse.milo.opcua.sdk.client.subscriptions.OpcUaMonitoredItem.onValueArrived(OpcUaMonitoredItem.java:176)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at org.eclipse.milo.opcua.sdk.client.subscriptions.OpcUaSubscriptionManager.lambda$null$28(OpcUaSubscriptionManager.java:547)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at org.eclipse.milo.opcua.stack.core.util.ExecutionQueue$PollAndExecute.run(ExecutionQueue.java:107)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.util.concurrent.FutureTask.run(Unknown Source)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
INFO   | jvm 5    | 2018/01/22 20:33:03 | 	at java.lang.Thread.run(Unknown Source)

We now had a similar issue at a different plant, where Ignition wouldn’t crash, but the OPC tags didn’t come through anymore. Here, the solution was again to disable the alarm journal.

Could this fix in 7.9.6 be somewhat related to it? The versions that showed the problem are 7.9.5 and 7.9.4, both still in development, so it’s likely someone hit the “refresh” button.

1 Like

So after some more investigation, it turns out it was due to a growing string tag, used to visualise some logging.

Apparenly, the text was appended to on every communication action we do to a third party. But on every change, the entire text was also stored to the audit log.

This meant that the DataBase connection was very busy, and certain queries failed.

Limiting that tag in length fixed the crashes. Though we still don’t know why the entire service crashed, instead of just crashing f.e. the DB communication to that database.