Backup gateway constantly reboots with error when installing a module in 8.3 redundancy mode

Hey Inductive,

We're trying to develop a module that supports Redundancy in Ignition 8.3 like we did in 8.1.

We run two containers of the gateway (v8.3.3), one for the primary and one for the backup. Once both gateways are up, we configure them as follows:

Primary (running on http://localhost:9081, container name: ign-primary)

  • Redundancy Settings
    • Mode=Master
    • Port=8060 (default)
  • Network Settings
    • Require SSL=Off

Backup (running on http://localhost:9082, container name: ign-backup)

  • Redundancy Settings: Mode=Backup
  • Backup Node Settings:
    • Master Node Address=ign-primary
    • Port=8088
    • Use SSL=Off

Then I wait until the connection looks good on both.

Then I install the module in the master gateway.

The module we developed works well in independent mode, but when installed in redundancy mode, it causes the backup to reboot constantly due to an error. So I tried the dummy module provided on IA's GitHub:

https://github.com/inductiveautomation/ignition-sdk-examples/tree/ignition-8.3/webui-webpage

I install it on the master gateway and get the same thing: I restart the master gateway and, after a minute, the backup gateway restarts, then errors, then restarts, then errors, and so on (five times in total, I think) until it just stops trying.

Here are some error logs I managed to get from the backup while it was up (the REDACTED is mine; it's just an IP address):

-> outgoing local='ign-primary-backup' remote='ign-primary-master' method=handleConnectException: Connection attempt to 'ws://ign-primary:8088/system/ws-control-servlet?name=ign-primary-backup&uuid=14c4920f-d033-4459-ba0a-1cadffd9b0bb&serializer=protobuf&url=http://REDACTED:8088/system' failed! Verify that your host and port settings are correct. Error message='java.net.ConnectException: Connection refused'


-> outgoing local='ign-primary-backup' remote='ign-primary-master' method=handleConnectException: Connection attempt to 'ws://ign-primary:8088/system/ws-control-servlet?name=ign-primary-backup&uuid=14c4920f-d033-4459-ba0a-1cadffd9b0bb&serializer=protobuf&url=http://REDACTED:8088/system' failed! Response code='200', error message='org.eclipse.jetty.websocket.api.exceptions.UpgradeException: org.eclipse.jetty.websocket.core.exception.UpgradeException: Failed to upgrade to websocket: Unexpected HTTP Response Status Code: 200 OK'.
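The two log lines above describe different failures: "Connection refused" means nothing was listening on that host and port at the time, while a 200 OK response to a websocket upgrade means an HTTP server answered but did not switch protocols (a successful upgrade returns 101). As a rough sketch, assuming only the message format seen in these two lines, a small Python helper can pull out the target and the failure kind when scanning a long log. The function name and the "refused" / "upgrade-failed" labels are made up for illustration.

```python
import re
from urllib.parse import urlsplit

# Pattern based only on the two log lines quoted in this thread;
# other message formats are not covered.
_TARGET_RE = re.compile(r"Connection attempt to '([^']+)' failed!")


def classify_connect_failure(line):
    """Return (host, port, kind) for a handleConnectException line, or None."""
    match = _TARGET_RE.search(line)
    if match is None:
        return None
    target = urlsplit(match.group(1))
    if "Connection refused" in line:
        kind = "refused"          # nothing listening on host:port
    elif "Failed to upgrade to websocket" in line:
        kind = "upgrade-failed"   # HTTP answered, but no websocket handshake
    else:
        kind = "other"
    return target.hostname, target.port, kind
```

Run over the backup's log, this makes it easier to see whether the failures are mostly refusals (master not up yet) or failed upgrades (something answering on that port without completing the handshake).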

One weird thing I noticed: the top-left corner of the page on the backup gateway initially shows "ign-backup", but after its first restart it says "ign-primary".

Here's what it does (left is primary, right is backup):

So I'm wondering:

  • Are modules supported in redundancy mode in 8.3?
  • If so, did I simply misconfigure the redundancy and network settings at the beginning? Did I forget a step?

Sorry for the lengthy post, but I wanted to include as much detail as possible. Let me know if you need any more.

Thanks!

Yes, v8.3 supports 3rd party modules in redundancy. Significant changes from v8.1, though.

Keep in mind that a "full resync" involves a backup gateway restart. If that chokes, the resync won't finish, and therefore a full resync is still needed.

Study your backup gateway logs carefully.

Psssst! This forum is mostly not IA, though a few IA staff are regular contributors.

Hey, thanks for the reply.

It makes sense that a full resync involves a backup gateway restart given that module hot reloads are no longer supported in 8.3.

So, if 1) modules are supported in redundancy mode and 2) I'm using the module directly provided on IA's GitHub, then am I missing something in my redundancy/network configuration?

I tried to look at the gateway logs, but then the constant backup gateway restarts make it very hard to get a hold of them. I included what I could in my initial post.

Which IA module? I have looked lately, but I don't recall seeing an example that implemented everything needed for redundancy synchronization.

Look at the wrapper.log text file. As you noted, you cannot use the web interface to retrieve logs if it won't stay running.

I used this one:

https://github.com/inductiveautomation/ignition-sdk-examples/tree/ignition-8.3/webui-webpage

You are right that it doesn't provide redundancy synchronization, but it's just a simple React page with barely any backend logic, which shouldn't need synchronizing. During the 8.1 module development, I don't recall having had to deal with synchronization logic for purely front-end stuff.

Thanks for the wrapper.log pointer, I'll try to see how I can get access to this file.

Can you try 8.3.5, in case this is just an underlying sync bug that we've fixed?

Hey Paul, thanks for the reply.

Just tried and I'm getting the same thing with v8.3.5. Do you see anything wrong or missing with the configuration in my initial post?

Still trying to access wrapper.log in the container as Pturmel suggested, but no luck yet:

find / -type f -name "*.log"
find: ‘/home/ubuntu’: Permission denied
find: ‘/proc/tty/driver’: Permission denied
find: ‘/root’: Permission denied
find: ‘/var/cache/apt/archives/partial’: Permission denied
find: ‘/var/cache/ldconfig’: Permission denied
/var/log/alternatives.log
/var/log/apt/history.log
/var/log/apt/term.log
/var/log/bootstrap.log
/var/log/dpkg.log
/var/log/fontconfig.log

Docker images don't use the wrapper log; they log directly to the container, as far as I remember.

You said that you were using Docker containers? I am wondering if the backup gateway is downloading the module from the master, but then the .modl file disappears from the backup container when it restarts. That would cause another sync attempt when the backup gateway comes back up. I did try my own custom module in some 8.3 docker containers, and it did sync the module properly across the gateways. So conceptually it should work. Also, you should be able to use a command like docker logs my-backup-gateway or docker compose logs my-backup-gateway (if using compose) to get the wrapper log text from a container.
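Since the backup does not stay up long enough to read logs in the web interface, it can help to dump the container output to a file after each restart loop. Here is a minimal Python sketch of that, wrapping the docker logs and docker compose logs commands mentioned above. The function names and the default of 500 lines are arbitrary choices; ign-backup is the container name from the first post, and with compose the argument is the service name rather than the container name.

```python
import subprocess


def docker_logs_command(name, tail=500, compose=False):
    """Build the argv for fetching recent logs from a container or compose service."""
    base = ["docker", "compose", "logs"] if compose else ["docker", "logs"]
    return base + ["--tail", str(tail), name]


def save_logs(name, path, tail=500, compose=False):
    """Write the recent stdout and stderr of a container to a file.

    Returns the number of lines written. Raises CalledProcessError if
    the docker command fails (for example, unknown container name).
    """
    result = subprocess.run(
        docker_logs_command(name, tail, compose),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge both streams into one text
        text=True,
        check=True,
    )
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(result.stdout)
    return len(result.stdout.splitlines())
```

For example, save_logs("ign-backup", "ign-backup.log") would capture the last 500 lines, which should include the startup error that triggers each restart.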

Yes, possibly that is what happens. I remember in 8.1 I would sometimes get a "module version mismatch" error (or something similar) and maybe this is the 8.3 version of the same error.

If you succeeded in getting a module to work in redundancy mode in 8.3, do you mind sharing it (or a stripped-down version of it)? Or do you see anything wrong or missing in the configuration below?

Primary (running on http://localhost:9081, container name: ign-primary)

  • Redundancy Settings
    • Mode=Master
    • Port=8060 (default)
  • Network Settings
    • Require SSL=Off

Backup (running on http://localhost:9082, container name: ign-backup)

  • Redundancy Settings: Mode=Backup
  • Backup Node Settings:
    • Master Node Address=ign-primary
    • Port=8088
    • Use SSL=Off

Docker, while nice and quick for most things, I find difficult for module development. You cannot use the module install page on the gateway with Docker. Use VMs.

Hmm, I'm not sure what you mean by that. I use Docker to run the gateway, but the module is built outside of it. In the gateway, I install the .modl file in the Platform/Modules page. Did I understand correctly or maybe you meant something else?

I think this situation improved in 8.3... because I had the same initial thought, but then looked at the docs: Docker Image Examples | Ignition User Manual

Third party modules installed through the web interface are automatically routed to the external modules folder within the data volume. This means no special consideration is required for these modules when upgrading Ignition to a newer image version.

You have to jump through a bunch of hoops to get modules to persist in Docker if installing through the gateway web page. The instructions for modules in Ignition's Docker image require bind mounting the module file or constructing a custom image with the 3rd party module file in place, then putting the module ID in the appropriate environment variable.

It is not conducive to quick testing, in my experience. VMs with permanent installs of Ignition are much easier to work with for module development.

Hmmm. May have to try again.

I'll try to see if I can have a VM set up to test that. I am not convinced it is that though, as I also developed the 8.1 module using a gateway deployed with Docker and I didn't have these issues in redundancy mode. Just the very occasional "module mismatch" error I was talking about earlier.

@mgross any ideas? Is it possible something about third party modules in Docker and redundancy is broken?

Giving some more context:

We tried installing the WebUI tutorial module (the one from IA's GitHub) on the gateway deployed on a VM using the same steps and parameters as we did on Docker and redundancy worked. So it looks like it's something with Docker/redundancy/third-party modules.

I do all my development on docker, and have never run into this problem. I don’t do anything with redundancy though.

Yes, the issue is specific to redundancy.

---

We found the issue and now it even works in Docker.

First, we tried installing the module on both gateways instead of just the master and it worked. That means it had to be an issue with how Ignition replicates the module from master to backup.

Then, we tried signing the module and only installing it on master and it worked. We don't sign when we develop (we use -Dignition.allowunsignedmodules=true) but it looks like the flag is ineffective in redundancy mode.

@Kevin.Herron @paul-griffith @mgross Our conclusion is that an unsigned module installed on the master is not properly replicated to the backup gateway, even if both gateways run with the flag -Dignition.allowunsignedmodules=true. That seems to be a bug on Ignition's side.

cc @pturmel
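For anyone who wants to check the replication side directly, one option is to copy the module folder out of each container (for example with docker cp) and compare the .modl files by hash. The Python below is only a sketch under that assumption: the two directory arguments are local copies, the location of the module folder inside the container is not stated in this thread and is left to the reader, and the function names are made up.

```python
import hashlib
from pathlib import Path


def modl_digests(directory):
    """Map each .modl file name in a directory to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(Path(directory).glob("*.modl")):
        digests[path.name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_module_dirs(master_dir, backup_dir):
    """Report module files that are missing, extra, or different on the backup."""
    master = modl_digests(master_dir)
    backup = modl_digests(backup_dir)
    common = master.keys() & backup.keys()
    return {
        "missing_on_backup": sorted(master.keys() - backup.keys()),
        "extra_on_backup": sorted(backup.keys() - master.keys()),
        "different": sorted(n for n in common if master[n] != backup[n]),
    }
```

If the unsigned module shows up under missing_on_backup after a sync attempt, that would be consistent with the file never arriving or being discarded on the backup; an empty report would point the problem somewhere after the file transfer.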

I always sign in my build process, so I never would have found this.