Gateway Restart Troubles & Complications after upgrade to 7.9.9

Hey folks,

Java: 1.8.0_161-b12

A few weeks ago I struggled with upgrading 7.9.7 to 7.9.9, in that the gateway service was unable to start after the upgrade. I managed to get it running after a series of uninstalls, reinstalls, and gateway restores. Walked away with it running 7.9.9 stable. I had no issues with the upgrade on the redundant machine.

Fast forward to today. The site lost power and on restart, the gateway service did not restart automatically. I restarted and pulled a gateway backup, moved it to the testbench, and it worked fine; reboots and all. So I uninstalled and reinstalled Ignition on the production device, restored to it, and still had issues on reboot. Even a clean gateway backup failed to restart on its own.

After a reboot, if I open the gateway control utility, this is what I get:

If I click Stop, it tells me the service isn’t running. If I hit start, it will start and eventually the fields will populate with the right information.

The Ignition Gateway service is properly configured in services.msc to start automatically.

The one thing that really stood out to me was that, even after an uninstall, deleting the remaining files, and a reinstall, the fresh install would suffer from the same issues.

Any thoughts on what’s going on? I have yet to try reinstalling Java.

What showed in the wrapper log file? There should be some form of fatal error at the end when the gateway has tried and failed to start.

Hi Phil,

As best as I can tell, this is the first thing that occurred after restart.

STATUS | wrapper  | 2018/11/29 12:05:17 | --> Wrapper Started as Service
STATUS | wrapper  | 2018/11/29 12:05:20 | Java Service Wrapper Standard Edition 64-bit 3.5.35
STATUS | wrapper  | 2018/11/29 12:05:20 |   Copyright (C) 1999-2018 Tanuki Software, Ltd. All Rights Reserved.
STATUS | wrapper  | 2018/11/29 12:05:20 |     http://wrapper.tanukisoftware.com
STATUS | wrapper  | 2018/11/29 12:05:20 |   Licensed to Inductive Automation for Inductive Automation
STATUS | wrapper  | 2018/11/29 12:05:20 | 
WARN   | wrapper  | 2018/11/29 12:05:31 | Child process: Java version: timed out
STATUS | wrapper  | 2018/11/29 12:05:31 | <-- Wrapper Stopped

Following that is all of the information from my manual restart.

If you’d like the full wrapper, I can do that, too.
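For anyone wanting to spot this pattern in a longer wrapper log, here is a minimal Python sketch that finds every wrapper session that stopped without ever launching a JVM. The message prefixes it looks for are taken from the log excerpt above; the function name `failed_starts` is made up for illustration, and the regex deliberately searches anywhere in the line so that a line with a truncated prefix in front of it is still picked up.

```python
import re

# Matches wrapper lines such as:
#   STATUS | wrapper  | 2018/11/29 12:05:17 | --> Wrapper Started as Service
# Not anchored at line start, so leftover fragments before the level are tolerated.
WRAPPER_RE = re.compile(
    r"(?P<level>[A-Z]+)\s*\|\s*wrapper\s*\|\s*"
    r"(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s*\|\s?(?P<msg>.*)$"
)


def failed_starts(lines):
    """Return (start_timestamp, last_warning) for each wrapper session
    that stopped without a 'Launching a JVM...' line in between."""
    failures = []
    start_ts = None      # timestamp of the current session's start line
    launched = False     # True once the wrapper reports launching a JVM
    last_warn = None     # most recent WARN message in the current session
    for line in lines:
        m = WRAPPER_RE.search(line.rstrip("\n"))
        if not m:
            continue
        msg = m.group("msg").strip()
        if msg.startswith("--> Wrapper Started"):
            start_ts, launched, last_warn = m.group("ts"), False, None
        elif msg.startswith("Launching a JVM"):
            launched = True
        elif m.group("level").endswith("WARN"):
            last_warn = msg
        elif msg.startswith("<-- Wrapper Stopped") and start_ts is not None:
            if not launched:
                failures.append((start_ts, last_warn))
            start_ts = None
    return failures
```

Usage would be something like `failed_starts(open(path_to_wrapper_log))`, where `path_to_wrapper_log` is whatever location your install writes the wrapper log to (the path is not shown in this thread, so it is left as a placeholder).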

Got me. Never seen that one before.

Nor have I.

I’m currently working on reinstalling Java to see if that will help the issue.

No luck. Updated, uninstalled, reinstalled; none of it helped. I’m kind of stumped. Guess I’ll reach out to support.

For what it’s worth, Phil, I rolled back to 7.9.7 and it fixed the issue. Upgrading from there to any of 7.9.8 through 7.9.10 yields the same result. In touch with support seeking answers.

Could you try changing the startup type for the Ignition service to be Automatic (Delayed Start)?
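The same change can be made from the command line instead of through services.msc. Below is a rough Python sketch that builds and runs the `sc.exe config ... start= delayed-auto` command. It assumes the Windows service name is `Ignition` (check the actual name in the service's Properties dialog) and that it is run from an elevated prompt; the helper names are made up for illustration.

```python
import subprocess


def delayed_start_command(service_name):
    """Build the sc.exe command that sets a service to Automatic (Delayed Start)."""
    # sc.exe wants "start=" and its value passed as two separate arguments.
    return ["sc.exe", "config", service_name, "start=", "delayed-auto"]


def set_delayed_start(service_name):
    """Apply the change; Windows only, and requires administrator rights."""
    subprocess.run(delayed_start_command(service_name), check=True)
```

Calling `set_delayed_start("Ignition")` would apply it, with `"Ignition"` being the assumed service name mentioned above.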

Hi Kurt,

I got the same fix suggested on my ticket. I’ll give it a shot at the next available downtime window (hopefully an evening this week) and let you know how it goes.

Thank you.

Kurt,

So, changing the startup to delayed start worked on the gateway that I had reinstalled Java on.
The other gateway, no such luck. Once I uninstalled and reinstalled Java (8_161 to 8_191), it worked like a charm.

Apparently the combination of my startup settings and the Java install/version was keeping me from progressing past 7.9.7.

Both gateways running on 7.9.10 now.

(For the record, these are both corporate servers running Oracle’s Java and Windows Server 2016. My redundant desktop server running Zulu and Windows 7 never had a problem with upgrading, nor did I have to reinstall Java.)

Thanks!

Hi Kurt,

Fast-forward a year and some change, we’ve recently begun experiencing the same restart issue.
Gateway is running 7.9.12 and Java is still the same, versions haven’t changed. Ignition is running delayed start, and my latest wrapper log shows the same Java timeout. We’ve largely left this server alone and first noticed a failed restart around Nov '19. Since then we’ve had 3 failed restarts.

INFO   | jvm 1    | 2020/02/21 12:03:03 | WSTATUS | wrapper  | 2020/02/21 14:22:40 | --> Wrapper Started as Service
STATUS | wrapper  | 2020/02/21 14:22:45 | Java Service Wrapper Standard Edition 64-bit 3.5.35
STATUS | wrapper  | 2020/02/21 14:22:45 |   Copyright (C) 1999-2018 Tanuki Software, Ltd. All Rights Reserved.
STATUS | wrapper  | 2020/02/21 14:22:45 |     http://wrapper.tanukisoftware.com
STATUS | wrapper  | 2020/02/21 14:22:45 |   Licensed to Inductive Automation for Inductive Automation
STATUS | wrapper  | 2020/02/21 14:22:45 | 
WARN   | wrapper  | 2020/02/21 14:23:03 | Child process: Java version: timed out
INFO   | wrapper  | 2020/02/21 14:23:05 | Wrapper Process has not received any CPU time for 15 seconds.  Extending timeouts.
STATUS | wrapper  | 2020/02/21 14:23:05 | <-- Wrapper Stopped

Any thoughts? I sent an email to support@inductiveautomation.com and it’s been about a week with no response.

Edit: Upgraded to 7.9.12 early 2019.

That sounds like too many other background services on the same machine. If IT won’t let you clean them out, maybe you should be running some other OS…

Unfortunately, not my IT, not my server, not my spec. I don’t have a lot of control over the server’s specs or configuration, other than managing the Ignition install.

I’ll take this back to IT and see what they have to say.

Additional information, if it proves helpful:

There is a second server, same specs & configuration that runs on the same physical hardware (starts 40 seconds before this server). They are identical, barring the Ignition projects they run.

This other server has no problem starting. Any thoughts?

Find out what is monopolizing the CPU. Ignition doesn’t like being starved of CPU, and things go south really badly when it is. Is it just Java (Ignition) taking up all the CPU, or are other processes using a lot of it at the time?

Thanks. Any thoughts on the best way to trace CPU usage during startup? I only have RDP connectivity, so I expect I’ll miss most of the activity trying to login and observe it.
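One source of evidence that does not require being logged in during boot is the wrapper log itself, since the wrapper reports when it "has not received any CPU time" for some number of seconds. This is only a sketch of how those lines could be pulled out in Python; it shows when the wrapper was starved, not which process was responsible, and the function name is made up for illustration.

```python
import re

# Matches the wrapper's own starvation notice, e.g.:
#   INFO | wrapper | 2020/02/21 14:23:05 | Wrapper Process has not received
#   any CPU time for 15 seconds.  Extending timeouts.
STARVED_RE = re.compile(
    r"(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s*\|\s*"
    r"Wrapper Process has not received any CPU time for (?P<secs>\d+) seconds"
)


def cpu_starvation_events(lines):
    """Return (timestamp, seconds_without_cpu) for each starvation notice."""
    events = []
    for line in lines:
        m = STARVED_RE.search(line)
        if m:
            events.append((m.group("ts"), int(m.group("secs"))))
    return events
```

Comparing these timestamps against the boot times of the other VMs on the host might help IT narrow down when the contention happens.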

Consider swapping the startup timing of the two servers (same hypervisor?) and seeing what happens. An over-committed hypervisor can create much grief.

Anyways, being starved of CPU means it really is IT’s problem, not yours.

Thanks, Phil.

I’ll suggest swapping the start-up order and see if we have the same issue. I just rebooted the gateway to test and wrapper log is as follows:

INFO   |STATUS | wrapper  | 2020/02/28 12:30:47 | --> Wrapper Started as Service
STATUS | wrapper  | 2020/02/28 12:30:49 | Java Service Wrapper Standard Edition 64-bit 3.5.35
STATUS | wrapper  | 2020/02/28 12:30:49 |   Copyright (C) 1999-2018 Tanuki Software, Ltd. All Rights Reserved.
STATUS | wrapper  | 2020/02/28 12:30:49 |     http://wrapper.tanukisoftware.com
STATUS | wrapper  | 2020/02/28 12:30:49 |   Licensed to Inductive Automation for Inductive Automation
STATUS | wrapper  | 2020/02/28 12:30:49 | 
INFO   | wrapper  | 2020/02/28 12:31:02 | Wrapper Process has not received any CPU time for 12 seconds.  Extending timeouts.
STATUS | wrapper  | 2020/02/28 12:31:02 | Launching a JVM...
INFO   | jvm 1    | 2020/02/28 12:31:04 | WrapperManager: Initializing...

Since it managed to start when only this single VM was rebooted, rather than after a power outage, I think your suggestion is correct. I’ll see what they have to say. Thank you.

Cheers!

Performance Monitor might be able to grab startup data for you (disclaimer: I haven’t tested setting up a data collector to catch startup data). See “How to use Performance Monitor on Windows 10” on Windows Central for an overview. For process-specific data, you’d be looking at the counters under the Process section; select “All instances” to get that data for all processes.

That said, @pturmel is right that this is an IT issue. With both of those VMs on the same hardware, it is likely that the first one to start is starving the second one of resources; Ignition tends to use a lot of CPU for a bit as it gets going.

Are there any other VMs on the same hardware? If so, bumping up the CPU allocation for your Ignition VMs may help. I had to get this done just this morning for one install: the VM was running at a steady 95-100% CPU, and Ignition was unable to keep up with what was being asked of it whenever a certain application was running on another server that proxied its tag subscriptions through this VM.

Thanks, Witman.

All of the facility’s VMs run on this server, and I’m not actually sure where the Ignition servers & SQL services fall in that list.

I will kick it to IT - thanks again.