Gateway Crashes Every Night

Can anyone share some troubleshooting tips on why my gateway crashes every night?

I haven't been able to locate anything in the logs.

Info below on server:

AWS Lightsail instance: 2 vCPUs, 2 GB RAM
Ubuntu (created with the OS Only option)
Ignition Version 8.1.39
Modules
Maker Modules
MQTT Distributor
MQTT Engine

No Perspective sessions running at night.
I've disabled the few scripts that run in the gateway.
There's nothing else on the gateway (no alarms, no reports, etc.).

How can I figure out whether it's just because the instance only has 2 vCPUs and 2 GB RAM, or whether something else is causing it to crash?

How much RAM did you tell Ignition it could use for the Java heap (via the settings in ignition.conf)? For a 2 GB VM, you probably shouldn't allow more than 1 GB of heap. You need to leave room in the VM for Java's overhead and for the OS to have buffer space.

Did you make initial memory allowance equal to max allowance? (You should.)
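For anyone following along, here is a minimal sketch of what "initial equal to max" looks like when applied to the text of ignition.conf. It assumes the heap is set through the `wrapper.java.initmemory` and `wrapper.java.maxmemory` keys with values in MB; check your own file before relying on that. The `set_heap_mb` helper is hypothetical, not something Ignition ships.

```python
import re

# Assumed key names for the heap settings in ignition.conf (values in MB).
HEAP_KEYS = ("wrapper.java.initmemory", "wrapper.java.maxmemory")


def set_heap_mb(conf_text: str, heap_mb: int) -> str:
    """Return conf_text with initial and max heap both set to heap_mb."""
    for key in HEAP_KEYS:
        # Match "key=<digits>" on its own line, tolerating spaces around "=".
        pattern = re.compile(
            rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*)\d+[ \t]*$",
            re.MULTILINE,
        )
        conf_text, count = pattern.subn(rf"\g<1>{heap_mb}", conf_text)
        if count == 0:
            raise ValueError(f"{key} not found in config text")
    return conf_text


# Illustrative input only, not a complete ignition.conf.
sample = "wrapper.java.initmemory=1024\nwrapper.java.maxmemory=2048\n"
updated = set_heap_mb(sample, 1024)
print(updated)
```

The gateway service has to be restarted before a changed heap setting takes effect.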

Where is your database? If in the same VM, that is probably contributing to your misery. Databases running in the same environment as Ignition tend to hog all the resources.

Are you running anything else in parallel with Ignition in that VM?

Have you enabled history for the [System] tags for gateway CPU and Heap Memory usage?

I didn't modify it; it's still at the default of 2046. I'll change it to 1024 so that it matches the initial heap size.

There's no database except for a few SQLite tables I created: 5 tables with no more than 50 rows. I'm not sure what I'm going to do here yet, since I want to keep costs low.

Nothing else is running on this VM, it was created with the OS-only option selected via the Lightsail GUI.

I have not enabled history for the system tags. Since I haven't created a real database yet, I haven't enabled tag history on anything.

In a 2GB VM, that would be sufficient to eventually crash your gateway. A JVM eventually claims the max it is allowed, and on a 2GB VM, that would invoke Linux's OOM-killer.
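An OOM kill comes from the kernel, so it would show up in the kernel log (`dmesg` or `journalctl -k`) rather than in Ignition's own logs. Below is a rough sketch for scanning such output for kill records. The message format in the regex ("Out of memory: Killed process <pid> (<name>)") is what recent kernels print as far as I know, but treat it as an assumption. `find_oom_kills` is a made-up helper name and the sample lines are invented for illustration.

```python
import re

# Assumed kernel message shape; older kernels print "Kill process" instead.
OOM_KILL = re.compile(
    r"Out of memory: Kill(?:ed)? process (?P<pid>\d+) \((?P<name>[^)]+)\)"
)


def find_oom_kills(lines):
    """Return (pid, process_name) for every OOM-kill record in lines."""
    kills = []
    for line in lines:
        match = OOM_KILL.search(line)
        if match:
            kills.append((int(match.group("pid")), match.group("name")))
    return kills


# Invented sample lines, not taken from a real gateway.
sample_log = [
    "[81234.101] systemd[1]: Started Daily apt upgrade and clean activities.",
    "[86400.552] Out of memory: Killed process 1234 (java) total-vm:3100000kB",
]
print(find_oom_kills(sample_log))
```

If a `java` process shows up in the result around the time of the nightly crash, that points at the heap setting rather than at anything inside the gateway.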

Got it. I have changed the initial and max to 1024.

I also added some historical logging of the CPU % and Memory % to my SQLite DB (with deadbands so I don't store an insane number of records) to see if there is anything helpful there when it crashes. Hopefully it won't crash at all now that I've changed the heap memory.
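For readers unfamiliar with deadbands, the idea can be sketched as follows: a sample is only stored when it differs from the last stored value by more than the deadband. This is a simplified absolute deadband written for illustration, not Ignition's actual historian logic; `deadband_filter` and the sample data are hypothetical.

```python
def deadband_filter(samples, deadband):
    """Keep only samples that moved more than deadband since the last kept one.

    samples is an iterable of (timestamp, value) pairs in time order.
    """
    kept = []
    last_value = None
    for timestamp, value in samples:
        # Always keep the first sample; after that, require a large enough change.
        if last_value is None or abs(value - last_value) > deadband:
            kept.append((timestamp, value))
            last_value = value
    return kept


# Made-up CPU % readings, one per minute.
cpu_samples = [(0, 40.0), (1, 40.5), (2, 43.0), (3, 42.0), (4, 46.5)]
print(deadband_filter(cpu_samples, 2.0))
# prints [(0, 40.0), (2, 43.0), (4, 46.5)]
```

A wider deadband stores fewer rows, at the cost of hiding small changes in the run-up to a crash.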

Time will tell because it has happened every night since I created this gateway.

If you are concerned about cost while learning Ignition, don't use a cloud VM. Set up your own hypervisor or dedicate a cheap old laptop to the purpose. Ignition is not a light CPU or RAM or storage load for any useful work.

It's not for learning Ignition. The cloud instance is used to control my home automation remotely and to showcase projects, kind of like a portfolio site.

Consider setting up a cheap cloud server with just a VPN endpoint and an HTTP reverse proxy. Connect to that VPN endpoint from your real gateway in your home, and expose (carefully) exactly and only what you wish via the reverse proxy.

That cloud server can be super cheap. Plus you can configure your home DNS to redirect the public DNS for your gateway to the real gateway--your home control won't be broken during internet outages.

To be clear, the cloud server would connect to my Ignition gateway running in my home network?
And not the other way around, with my machine connecting to the cloud server?

Both. Your home gateway would run a VPN client to connect privately to your cloud server. Your cloud server would run a VPN server endpoint to support that. With the VPN running, the cloud server's reverse proxy (nginx, Apache, whatever) can privately divert public traffic to the real gateway.

(I recommend OpenVPN running with certificate-based authentication. Easily installed in every Linux distro I know of.)
