External Monitoring of Ignition

I need to monitor Ignition better. The last version bombed out on me a couple of times: it wasn’t monitoring tags or sending alarms (this hasn’t happened in 7.6.4 yet). The fact remains that I need some good indicators to know whether Ignition is working.

A few ideas:
Monitor the logs (but not really sure of what to look for)
Monitor the tags (MSSQL - check a few tags for freshness)

I really need to know for sure that:

  1. Ignition is working
  2. Tags are working as expected
  3. Alarms will go out (via Voice and Email)

I can’t be the first person to run into this. What have other people done?

I have a heartbeat in Ignition that is monitored by one of the PLCs. The PLC also monitors an “anyActiveAlarm” bit. If the heartbeat stops or “anyActiveAlarm” stays on for 30 minutes, the PLC activates a local alarm and uses an Alarm Dialer to call out an alarm.

I like the idea of a heartbeat for my projects. You could even set up a timer script to simply write the current time into the database every 5 seconds. Then use whatever (non-Ignition) alarming you want to check how long it’s been since that timestamp and send out an alarm.
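For the non-Ignition side of that check, here is a minimal sketch in Python. It assumes something else has already read the last heartbeat timestamp out of the database; the function names and the 30-second limit are made up for illustration, not anything Ignition provides.

```python
from datetime import datetime, timedelta


def heartbeat_age(last_beat, now):
    """Return how long ago the heartbeat timestamp was written."""
    return now - last_beat


def heartbeat_is_stale(last_beat, now, max_age=timedelta(seconds=30)):
    """Return True when the heartbeat is missing or older than max_age.

    max_age is an assumed limit; with a 5-second heartbeat, 30 seconds
    allows a few missed writes before alarming.
    """
    if last_beat is None:
        # No row at all is treated the same as a dead heartbeat.
        return True
    return heartbeat_age(last_beat, now) > max_age
```

You would run something like this from a scheduled task on a machine other than the gateway, so the check still runs when the Ignition box itself is down.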

I have WIN-911 running with Ignition. It has heartbeat watchdog functionality built in. Essentially, it looks at the tag you tell it to watch, and if that tag doesn’t change within the watchdog time you set, it will call you up.

The cool thing is that my OPC server has a seconds tag, so I have Ignition write that tag to the database, and WIN-911 monitors it through the Kepware DB-to-OPC interface.

So essentially I am monitoring the complete health of my system by looking at that one tag. The OPC server has to be running for that tag to change. The OPC connection between Ignition and the OPC server has to be up. The database has to be up for Ignition to write the tag value so that WIN-911 can see it through the DB-to-OPC interface. So every part of the chain has to work correctly for the tag to keep changing.

We have also set up some other neat things, like sending out an alarm email every hour and having a Python script verify that the email really does arrive every hour. That way, if alarming stops, we get notified.
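A rough sketch of what the hourly email check could look like. This is not the actual script; it assumes the Date headers of the test emails have already been pulled from the mailbox, that each header carries a numeric UTC offset, and the 75-minute allowance is an arbitrary choice.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def latest_email_time(date_headers):
    """Parse RFC 2822 Date headers and return the newest time, or None."""
    times = [parsedate_to_datetime(header) for header in date_headers]
    return max(times) if times else None


def alarm_email_overdue(date_headers, now, max_gap=timedelta(minutes=75)):
    """Return True when no test email arrived within max_gap of now.

    max_gap is an assumed allowance: one hour plus some slack for
    mail delivery delays.
    """
    latest = latest_email_time(date_headers)
    return latest is None or now - latest > max_gap
```

The notification that the hourly email is missing should go out through a different path than the one being tested (a different mail server, SMS, a dialer), otherwise the same failure can silence both.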

Another thing we have done is write scripts that check comm status on our devices. We have some pretty large systems, so we might have 50 or so serial devices behind one IP address. The scripts check that we don’t have a large number of devices in comm failure, and emails get sent out if there are any problems.
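As an illustration of the comm status check, here is a minimal sketch that groups devices by the IP address they sit behind and flags any address with too many failures. The tuple layout, the function name, and the 20% threshold are all assumptions for the example; how you read comm status depends on your OPC server.

```python
from collections import defaultdict


def gateways_with_many_failures(devices, max_failed_fraction=0.2):
    """Find IP addresses where too many devices are in comm failure.

    devices: iterable of (ip_address, device_name, comm_ok) tuples.
    Returns {ip_address: [names of failed devices]} for every address
    whose failed fraction is above max_failed_fraction.
    """
    by_ip = defaultdict(list)
    for ip_address, name, comm_ok in devices:
        by_ip[ip_address].append((name, comm_ok))

    problems = {}
    for ip_address, entries in by_ip.items():
        failed = [name for name, ok in entries if not ok]
        if len(failed) / len(entries) > max_failed_fraction:
            problems[ip_address] = failed
    return problems
```

A non-empty result is what would trigger the email, with the failed device names in the body.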

Our OPC server hands us a last good poll time. Another thing I do is check whether that last poll time is more than, say, 3 hours old. If it is, I generate an email saying which device is down, how long it’s been down, and when the event occurred.
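A sketch of that last-good-poll check, assuming the poll times have already been read into a dict of device name to timestamp. The function name and message wording are made up for illustration; the 3-hour limit is the one mentioned above.

```python
from datetime import datetime, timedelta


def stale_device_report(last_good_polls, now, max_age=timedelta(hours=3)):
    """Build one report line per device whose last good poll is too old.

    last_good_polls: dict mapping device name to the datetime of its
    last good poll. Returns a list of strings, sorted by device name.
    """
    lines = []
    for device, last_poll in sorted(last_good_polls.items()):
        age = now - last_poll
        if age > max_age:
            hours_down = age.total_seconds() / 3600
            lines.append(
                "%s: down for %.1f hours, last good poll at %s"
                % (device, hours_down, last_poll.strftime("%Y-%m-%d %H:%M"))
            )
    return lines
```

The returned lines can be joined into the body of the email; an empty list means nothing needs to be sent.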

So there are a lot of things you can do to monitor system health, especially if your OPC server gives you good comm statistics.