Docker Swarm (arm) Gateway Cannot Start

I’m trying to run the Ignition Docker image on a Docker Swarm cluster (arm/v7 processors), using the following compose file:

networks:
  ignet:
    external:
      name: ignet
services:
  distributor:
    environment:
      GATEWAY_ADMIN_PASSWORD: password
      IGNITION_EDITION: full
    image: ignition_configured
    networks:
      ignet:
        ipv4_address: 172.25.1.2

The image is just the base kcollins/ignition image with the MQTT Distributor module installed and committed.

The network was created like this:

docker network create -d overlay ignet --subnet 172.25.0.0/16

The setup works in non-swarm mode (docker-compose up) and works on my amd system in both swarm and non-swarm modes. On the arm swarm, though, I get the following error:

jvm 1    | 2021/04/07 16:05:57 | I [g.ModuleManager               ] [16:05:56]: Setting up modules
jvm 1    | 2021/04/07 16:05:57 | I [g.LicenseManager              ] [16:05:56]: Trial time reset.  Time remaining = 7199.
jvm 1    | 2021/04/07 16:05:58 | I [G.L.A.AlarmNotificationService] [16:05:58]: Remote Alarm Notification Manager initialized successfully.
jvm 1    | 2021/04/07 16:05:58 | I [c.i.i.g.o.KeyStoreManager     ] [16:05:58]: Loading KeyStore at /usr/local/share/ignition/data/opcua/client/security/certificates.pfx
jvm 1    | 2021/04/07 16:06:00 | I [c.i.i.g.o.KeyStoreManager     ] [16:06:00]: Loading KeyStore at /usr/local/share/ignition/data/opcua/server/security/certificates.pfx
wrapper  | 2021/04/07 16:06:03 | INT trapped.  Shutting down.
jvm 1    | 2021/04/07 16:06:03 | I [o.e.j.s.AbstractConnector     ] [16:06:03]: Stopped ServerConnector@1d529a{HTTP/1.1,[http/1.1]}{0.0.0.0:8088}
jvm 1    | 2021/04/07 16:06:03 | I [o.e.j.s.AbstractConnector     ] [16:06:03]: Stopped ServerConnector@586b9e{SSL,[ssl, http/1.1]}{0.0.0.0:8060}
jvm 1    | 2021/04/07 16:06:03 | I [o.e.j.s.session               ] [16:06:03]: node0 Stopped scavenging
jvm 1    | 2021/04/07 16:06:03 | I [IgnitionGateway               ] [16:06:03]: Ignition[state=STARTING] ContextState = STOPPING
jvm 1    | 2021/04/07 16:06:03 | I [IgnitionGateway               ] [16:06:03]: Ignition Gateway shutting down...
jvm 1    | 2021/04/07 16:06:03 | I [g.ModuleManager               ] [16:06:03]: ModuleManager shutting down...
jvm 1    | 2021/04/07 16:06:04 | I [g.ModuleManager               ] [16:06:03]: ModuleManager shut down in 305ms
jvm 1    | 2021/04/07 16:06:04 | W [C.BasicExecutionEngine        ] [16:06:03]: Tried to unregister non existent unit [gatewayareanetworkconnectionmanager connection monitor].
jvm 1    | 2021/04/07 16:06:04 | E [IgnitionGateway               ] [16:06:04]: Error shutting down GatewayNetworkManager.
jvm 1    | 2021/04/07 16:06:04 | java.lang.NullPointerException: null
jvm 1    | 2021/04/07 15:49:30 | 	at com.inductiveautomation.metro.impl.services.ServiceManagerImpl.shutdown(ServiceManagerImpl.java:133)
jvm 1    | 2021/04/07 15:49:30 | 	at com.inductiveautomation.metro.impl.CentralManagerImpl.shutdown(CentralManagerImpl.java:189)
jvm 1    | 2021/04/07 15:49:30 | 	at com.inductiveautomation.ignition.gateway.gan.GatewayAreaNetworkManagerImpl.shutdownComms(GatewayAreaNetworkManagerImpl.java:522)
jvm 1    | 2021/04/07 15:49:30 | 	at com.inductiveautomation.ignition.gateway.gan.GatewayAreaNetworkManagerImpl.stop(GatewayAreaNetworkManagerImpl.java:250)
jvm 1    | 2021/04/07 15:49:30 | 	at com.inductiveautomation.ignition.gateway.gan.GatewayAreaNetworkManagerImpl.shutdown(GatewayAreaNetworkManagerImpl.java:231)
jvm 1    | 2021/04/07 15:49:30 | 	at com.inductiveautomation.ignition.gateway.IgnitionGateway.shutdownInternal(IgnitionGateway.java:1607)
jvm 1    | 2021/04/07 15:49:30 | 	at com.inductiveautomation.ignition.gateway.IgnitionGateway.shutdown(IgnitionGateway.java:2294)
jvm 1    | 2021/04/07 15:49:30 | 	at com.inductiveautomation.ignition.gateway.web.IgnitionWebAppImpl.onDestroy(IgnitionWebAppImpl.java:130)
jvm 1    | 2021/04/07 15:49:30 | 	at org.apache.wicket.Application.internalDestroy(Application.java:675)
jvm 1    | 2021/04/07 15:49:30 | 	at org.apache.wicket.protocol.http.WebApplication.internalDestroy(WebApplication.java:700)
jvm 1    | 2021/04/07 15:49:30 | 	at org.apache.wicket.protocol.http.WicketFilter.destroy(WicketFilter.java:591)
jvm 1    | 2021/04/07 15:49:30 | 	at com.inductiveautomation.ignition.gateway.bootstrap.GatewayFilter.destroy(GatewayFilter.java:67)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.servlet.FilterHolder.destroyInstance(FilterHolder.java:164)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.servlet.FilterHolder.doStop(FilterHolder.java:148)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:93)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.servlet.ServletHandler.doStop(ServletHandler.java:238)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:93)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:180)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:201)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:111)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.security.SecurityHandler.doStop(SecurityHandler.java:425)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.security.ConstraintSecurityHandler.doStop(ConstraintSecurityHandler.java:425)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:93)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:180)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:201)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:111)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.server.session.SessionHandler.doStop(SessionHandler.java:519)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:93)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:180)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:201)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:111)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.server.handler.ContextHandler.stopContext(ContextHandler.java:916)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.servlet.ServletContextHandler.stopContext(ServletContextHandler.java:367)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.webapp.WebAppContext.stopWebapp(WebAppContext.java:1450)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.webapp.WebAppContext.stopContext(WebAppContext.java:1415)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.server.handler.ContextHandler.doStop(ContextHandler.java:980)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.servlet.ServletContextHandler.doStop(ServletContextHandler.java:284)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.webapp.WebAppContext.doStop(WebAppContext.java:547)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:93)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:180)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:201)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:111)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:93)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:180)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:201)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:111)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:93)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:180)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:201)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:111)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:93)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:180)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:201)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:111)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.server.Server.doStop(Server.java:454)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:93)
jvm 1    | 2021/04/07 15:49:30 | 	at org.eclipse.jetty.util.thread.ShutdownThread.run(ShutdownThread.java:127)

...

jvm 1    | 2021/04/07 16:06:04 | I [IgnitionGateway               ] [16:06:04]: Ignition Gateway shut down in 721ms
jvm 1    | 2021/04/07 16:06:04 | 16:06:04,275 |-INFO in ch.qos.logback.classic.AsyncAppender[SysoutAsync] - Queue flush finished successfully within timeout.
jvm 1    | 2021/04/07 16:06:04 | 16:06:04,284 |-INFO in ch.qos.logback.classic.AsyncAppender[DBAsync] - Worker thread will flush remaining events before exiting.
jvm 1    | 2021/04/07 16:06:04 | 16:06:04,286 |-INFO in ch.qos.logback.classic.AsyncAppender[DBAsync] - Queue flush finished successfully within timeout.
wrapper  | 2021/04/07 16:06:05 | <-- Wrapper Stopped

The line "wrapper  | 2021/04/07 16:06:03 | INT trapped.  Shutting down." seems to be the first point where the behavior diverges from a normal startup.

Any help would be greatly appreciated!

Sorry that I missed this post from a while back… I think what you’re running into is the container healthcheck. When running on Pis (or other slower hardware, especially with slow disk I/O), you will likely have to override the healthcheck to allow more startup time. Otherwise, Swarm will restart the container over and over because the healthcheck fails before the gateway has finished starting. The healthcheck is informational when running containers directly on Docker Engine, but Swarm actually takes action based on its status.

Try adding this to your service in your YAML:

healthcheck:
  test: [ "CMD-SHELL", "curl --max-time 3 -f http://localhost:8088/StatusPing 2>&1 | grep RUNNING" ]
  interval: 60s
  timeout: 10s
  retries: 3
  start_period: 120s

I wish you could just override the timing aspects, but in order to override the health check at all, you have to provide a complete definition…
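
For reference, here is a minimal sketch of where that block would sit, assuming the service definition from the question is otherwise unchanged (the top-level networks section is omitted here and stays as it was). The comments are my own annotations. As far as I know, start_period also needs compose file format 3.4 or later, so that is worth checking if the key is rejected.

```yaml
services:
  distributor:
    image: ignition_configured
    environment:
      GATEWAY_ADMIN_PASSWORD: password
      IGNITION_EDITION: full
    networks:
      ignet:
        ipv4_address: 172.25.1.2
    # Complete healthcheck definition, given in full as described above.
    healthcheck:
      test: [ "CMD-SHELL", "curl --max-time 3 -f http://localhost:8088/StatusPing 2>&1 | grep RUNNING" ]
      interval: 60s
      timeout: 10s
      retries: 3
      # Failed probes during this window do not count toward retries.
      start_period: 120s
```

If the container still gets restarted before the gateway reaches RUNNING, start_period is probably the first value to raise; 120s is only a starting point for slower hardware.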