Memory Usage, Gateway crash, JVM hung up

Looking for some pointers or resources to turn to about this. We had a gateway crash and reboot.

From wrapper.log

STATUS | wrapper  | 2023/06/29 15:18:42 | JVM appears hung: Timed out waiting for signal from JVM.  Restarting JVM.
STATUS | wrapper  | 2023/06/29 15:18:43 | JVM exited after being requested to terminate.
STATUS | wrapper  | 2023/06/29 15:18:48 | Reloading Wrapper configuration...
STATUS | wrapper  | 2023/06/29 15:18:48 | Launching a JVM...
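If you want to check whether this has happened before, the wrapper.log lines above are regular enough to scan with a short script. This is a rough sketch in plain Python (not Ignition's scripting API); the regex is only inferred from the four lines posted here, and `find_hung_restarts` is a made-up helper name.

```python
import re
from datetime import datetime

# Pattern inferred from the posted lines only:
# LEVEL | source | YYYY/MM/DD HH:MM:SS | message
LINE_RE = re.compile(
    r"^(?P<level>\w+)\s*\|\s*(?P<source>[^|]+?)\s*\|\s*"
    r"(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s*\|\s*(?P<msg>.*)$"
)


def find_hung_restarts(lines):
    """Return the timestamps of every 'JVM appears hung' wrapper message."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and "JVM appears hung" in match.group("msg"):
            events.append(datetime.strptime(match.group("ts"), "%Y/%m/%d %H:%M:%S"))
    return events


# The four lines from the post, used here as sample input.
SAMPLE = [
    "STATUS | wrapper  | 2023/06/29 15:18:42 | JVM appears hung: Timed out waiting for signal from JVM.  Restarting JVM.",
    "STATUS | wrapper  | 2023/06/29 15:18:43 | JVM exited after being requested to terminate.",
    "STATUS | wrapper  | 2023/06/29 15:18:48 | Reloading Wrapper configuration...",
    "STATUS | wrapper  | 2023/06/29 15:18:48 | Launching a JVM...",
]

hung_events = find_hung_restarts(SAMPLE)
for ts in hung_events:
    print("Wrapper restarted a hung JVM at", ts)
```

To run it against a real file, pass `open(path_to_wrapper_log)` instead of `SAMPLE`.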

This happened at the exact time I was executing a very large named query, so I think our database got hung up on it and everything that came after got hung up as well.

What is concerning is I went over the 2 GB allotted. Is this a resource issue? Should I increase the memory allotted for Ignition? Should I move the named query to a stored procedure and execute it on the SQL database?

Any advice is much appreciated.

I don't necessarily see a connection between the named query and the logs you posted here, beyond the sequence of events you describe. That aside, 2 GB is very small for Ignition, especially if you have Perspective projects. I can't say for sure from your post and logs, but I would not be surprised in the slightest if you ended up having resource issues with that setup. I would add more memory if you can, even without knowing anything else about what you have going on.

The named query ran almost exactly at the time of the reboot, which shows up in the graph as an empty space. The second spike was me trying to execute it again but cancelling sooner.

Is it a select query with a lot of rows?

Before I limited the query to 10,000 rows, it returned everything in the table (so yes, it was).

Oh yeah, that tracks then, IMO. I think your analysis is probably correct. If this isn't a prod environment, I bet that if you removed the limit and ran the NQ again, you could cause a gateway crash at will.
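For what it's worth, the general way around "select everything in the table" is to read the result in fixed-size chunks, so only one chunk is held in memory at a time. Here is a minimal sketch using plain Python and an in-memory SQLite database. It is not Ignition's named query API, and the `readings` table, the `iter_rows` helper, and the chunk size are all made up for illustration.

```python
import sqlite3


def iter_rows(conn, sql, chunk_size=1000):
    """Yield rows one at a time, fetching at most chunk_size rows per round trip."""
    cur = conn.cursor()
    cur.execute(sql)
    while True:
        rows = cur.fetchmany(chunk_size)
        if not rows:
            break
        for row in rows:
            yield row


# Hypothetical table standing in for the large table in the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO readings (value) VALUES (?)",
    [(i * 0.5,) for i in range(25000)],
)

# Walk the whole table without ever building one 25,000-row list.
total = sum(1 for _ in iter_rows(conn, "SELECT id, value FROM readings"))
print("rows processed:", total)
```

The same idea applies if the work moves into a stored procedure or a paged query: the point is to bound how many rows are in memory at once.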

Just to be pedantic/as clear as possible...

The snippet of logs you posted at the beginning doesn't actually mean the gateway wasn't responsive. It probably was struggling under GC load, which caused it to appear unresponsive to the service wrapper. However, you can increase that timeout to make the wrapper less aggressive. In particular, the fact that the JVM exited cleanly after the wrapper asked it to means that your gateway was probably still running (just struggling) and would have eventually completed this query.

2 GB is definitely under-provisioned, though.

I opened an official ticket and submitted the wrapper.log.

What would be a good amount of memory? 4GB?

If this is the first time you've noticed issues, then 4 GB is a reasonable place to start. It's impossible to give a definitive answer on how much is enough without knowing all the details of your system.

That's a very hard question to answer in a vacuum, but for new projects I like to start with 8 GB when possible and change from there as needed.

The system is a week old, and just today we added about 58 Vision clients with very tiny functionality (4 buttons that run a script). We also have 5 Perspective sessions open currently.

Except for extreme queries, Vision is a pretty light gateway load. Perspective, not so much. I'd make sure the server had a couple GB for every Perspective client that can do more than trivial trends, and plenty of cores.

That seems extreme, @pturmel. If I end up with 50 Perspective clients, are you recommending 50 GB of memory?

Depends on what you allow those clients to do, but yes.

I recommend you test thoroughly. I strongly recommend frontend/backend split setups for Perspective when there's more than one non-trivial client session expected. And sometimes even then.

Bogging down Perspective can disrupt your drivers and historization and any supervisory control in SFCs or timer events. If you are sizing a system for Perspective using your experience with Vision, you are screwing up.

Just think about how much RAM you give your Vision clients--add about that much on your gateway for every Perspective client, for similar activity.
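The rule of thumb above is just per-session arithmetic, which can be written out as a back-of-the-envelope sketch. The function name, the 2 GB per-session default (one reading of "a couple GB"), and the 4 GB baseline are assumptions for illustration, not official sizing guidance; test against your own workload.

```python
def estimate_gateway_heap_gb(perspective_sessions, gb_per_session=2.0, base_gb=4.0):
    """Rough gateway memory estimate: a baseline plus a per-session allowance.

    gb_per_session and base_gb are assumed defaults, not vendor figures.
    """
    if perspective_sessions < 0:
        raise ValueError("perspective_sessions must be non-negative")
    return base_gb + perspective_sessions * gb_per_session


# The 5 open Perspective sessions mentioned in the thread, with the assumed defaults.
print(estimate_gateway_heap_gb(5))

# The "50 clients, 50 GB" reading from the thread: 1 GB per session, no baseline.
print(estimate_gateway_heap_gb(50, gb_per_session=1.0, base_gb=0.0))
```

Vision clients are left out on purpose, since they were described above as a light gateway load apart from extreme queries.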

Heh... well, the system is in production now, so I'm building the plane while I fly it, I guess.

I am fairly confident my Perspective clients are all trivial (except for the larger queries).

For reference the gateway is executing 0.4 queries per minute and I was careful to only trigger queries when necessary.

Final word from support ticket.