Ephemeral port exhaustion

On a site with a fairly large Ignition project, we’ve suddenly started running into ephemeral port exhaustion. It has happened frequently over the last 3 days, although the project has hardly changed in the last month.

It happens on an RDP server that multiple accounts connect to, but it also happens on a stand-alone Windows PC that runs just one client.

When it happens, the tags and database connections time out, and the client starts using 100% CPU (I’m not sure whether that’s a result of the port exhaustion, or whether both come from a common software bug).

Has anyone experienced the same? What should we do to resolve this issue?

I don’t understand the network architecture. Where’s the Ignition Gateway? Is it the RDP server having port exhaustion, or the gateway?

The Ignition server is a separate virtual machine on the same physical hardware as the RDP server.
The Ignition server has no difficulties.
The RDP server is used to run a number of Ignition clients (different people connect with their own RDP sessions, some of those needing an Ignition session).
It’s the RDP server (the Windows instance) that generates error logs about running out of ephemeral ports. When we look at the machine at that point, there are Java processes consuming 100% CPU and an Ignition instance is stuck.
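
To see which process is actually holding the ports, one option is to tally the output of `netstat -ano` per PID and connection state. The snippet below is only a rough sketch in plain Python 3, not something taken from the project; the function names `count_tcp_by_pid` and `run_netstat` are made up for illustration, and it assumes the usual five-column TCP lines that Windows `netstat -ano` prints.

```python
import subprocess
from collections import Counter


def count_tcp_by_pid(netstat_text):
    """Count TCP connections per (pid, state) from `netstat -ano` text."""
    counts = Counter()
    for line in netstat_text.splitlines():
        parts = line.split()
        # TCP rows have exactly: proto, local, foreign, state, pid.
        # Header rows and UDP rows (no state column) are skipped.
        if len(parts) == 5 and parts[0] == "TCP":
            state, pid = parts[3], parts[4]
            counts[(pid, state)] += 1
    return counts


def run_netstat():
    """Return the raw text of `netstat -ano` (Windows flags: all, numeric, owning PID)."""
    return subprocess.check_output(["netstat", "-ano"], universal_newlines=True)
```

Calling `count_tcp_by_pid(run_netstat()).most_common(10)` on the affected machine would list the ten busiest (PID, state) pairs, which can then be matched against the Java processes in Task Manager.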

A different client (a stand-alone PC on physical hardware) also shows the same issues.

I hope this makes it clearer.

Yes and no. /:

My first reaction would be to try without the RDP server, but you’ve done that. Is there any scripting in the project that makes network requests elsewhere that might be recursing? Ephemeral ports are a system-wide resource. An RDP server would run out sooner in such a situation just by hosting more clients, but a run-away script could do it even stand-alone.

Such a script, if set to catch network exceptions and retry instantly, would explain the Java thread running at 100% CPU.
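
For illustration, the safer shape for that kind of retry is a bounded loop with a growing delay, so a dead endpoint can't burn through ports and CPU. This is a minimal sketch in plain Python 3, not Ignition-specific code; `call_with_backoff` and its parameters are hypothetical names.

```python
import time


def call_with_backoff(action, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call action(); on OSError wait base_delay, 2x, 4x, ... between tries."""
    for attempt in range(attempts):
        try:
            return action()
        except OSError:
            if attempt == attempts - 1:
                raise  # give up instead of looping forever
            # Each failed try opens a new socket, so pause before the next one.
            sleep(base_delay * (2 ** attempt))
```

An instant, unbounded retry is the same loop without the `sleep` call and without the attempt limit, which is what would open sockets as fast as the CPU allows.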

We are connected to printers via a TCP protocol. The client opens a socket and sends a command to the printer on a user action (a click on the send button). But I just checked, and the entire thing is wrapped in a try block, and the socket is closed in the finally clause. There’s no retry code; it just shows an error popup on failure. So that shouldn’t cause issues, I guess.
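
For reference, the pattern described above looks roughly like this. It is a sketch in plain Python 3 rather than the project's actual script; `send_to_printer`, its parameters, and the boolean return value (standing in for the error popup) are assumptions made for the example.

```python
import socket


def send_to_printer(host, port, payload, timeout=5.0):
    """Send one command over TCP; return True on success, False on a socket error."""
    sock = None
    try:
        # The timeout covers both the connect and the send.
        sock = socket.create_connection((host, port), timeout=timeout)
        sock.sendall(payload)
        return True
    except OSError:
        # No retry: the caller reports the failure to the user.
        return False
    finally:
        # Always release the socket, whether or not the send worked.
        if sock is not None:
            sock.close()
```

Even with a clean close, the side that closes first keeps the connection in TIME_WAIT for a while, so the ephemeral port is not reusable immediately. With one send per button click that should stay far below any limit.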

We also use an IP camera viewer with a varying endpoint (bound to a property), so it shows an image related to the currently selected line. Could that be the issue?

Apart from that, I can’t think of any special connections made from the client directly. Everything repetitive is done from server scripts, so the clients shouldn’t notice this. The remaining connections are common things like SQL bindings and Tag bindings from the client to the server.

After further investigation: at the moment, it looks like the problem comes from the antivirus (or something the antivirus is blocking), and Ignition just seems to be the first victim because of the amount of communication it needs.
