Large request from Gateway to SQL server

lrose · January 5, 2026, 4:50pm

Recently, our IT departement swapped out our network harware in favor of SD-WAN technology. After that change we noticed a degredation in performance across our clients (95% of our clients are Vision, though there is some perspective being used).

To make a long story short, in looking into potential bottlenecks, they found that there was a significant amount of data passing from the Gateway to the SQL server. The new technology now uses 256bit encryption (except on one tunnel which they changed to 128bit and saw an increase in performance).

What they saw was a 3GB request in a single session.

My question, while I understand that a session can consist of multiple transactions, which in turn can consist of multiple queries, 3GB still seems like a huge ask for a query.

Am I just unfamiliar enough with how the JDBC driver works, that what I see as odd is actually expected? Or is there some thing that we have done in our project which is leading to these large requests.

Of note, the large requests seem to be on a 2 min. cycle. I have IT running a Trace on the SQL server to see if I can narrow down, where the query is originating from. Also of note, the return size of this particular request is around 11GB, which also seems very large to me, but if were making a 3GB request, then I can understand the size.

All gateway logs are fine, no extar long running queries are being reported. Performance metrics on the Gateway also look good to me.

Has anyone seen anything like this before. Any information or discussion around this would be appriciated.

Ignition Version: 8.1.48
Windows OS
MSSQL

pturmel · January 5, 2026, 5:10pm

Does this mean your production history database connection is over a WAN? Ewww!

Don't get hung up on DB "sessions", as Ignition pools them and re-uses them. Any particular query can run on any of the open sessions. Sessions won't combine multiple queries into a transaction unless you have deliberately invoked Ignition's transaction start/end functions.

lrose · January 5, 2026, 5:14pm

No, we have multiple gateways, production history is connected locally to the SQL DB.

This connection is from the Central Gateway to the Local SQL DB.

Leor_Fishman · January 5, 2026, 5:23pm

What sorts of queries are generally running inside your system? a 3 GB query looks like it’s some kind of ‘batching together a bunch of smaller queries into a unioned runprepquery to optimize for minimizing transit hops between the db and the gateway’, but this is probably going to require digging into the code

lrose · January 5, 2026, 5:26pm

90% of the queries are Tag History queries, and the vast majority of those are generated from an Easy Chart binding.

Leor_Fishman · January 5, 2026, 5:26pm

can you instrument deeply enough to see if the 3 GB query is tag history?

lrose · January 5, 2026, 5:29pm

No, that's why I have IT running a trace. To try and figure out what query (or combination of) is being sent.

Nol · January 5, 2026, 5:49pm

Do you have any trends running in realtime mode? In Perspective if I open a powerchart with 5 pens, 1 day range, and set to realtime then I'll see 20MB/s sustained traffic on our gateway until the trend is closed or switched to historical mode.

Unsure about Vision but that was a large source of traffic I noticed initially and then had to change the trends to use historical instead. Initially I thought it'd be smart enough to only poll the new data but it seems like it queries the entire history every poll.

Chris_Bingham · January 5, 2026, 7:10pm

Long Running Queries aside...

Are these servers on different subnets, by chance?

lrose · January 5, 2026, 7:35pm

They are

lrose · January 5, 2026, 7:42pm

Periodically, perhaps, but not constantly.

David_Stone · January 5, 2026, 8:20pm

If you run a traceroute (I assume you are running windows on one of these servers) between the gateway and the SQL, what sort of latency and how many hops are there between the two?

I have seen this sort of thing before where IT sets up an SD-WAN where they have a brand new set of routing tables system wide. This means you find new cases where routing will redirect subnet-subnet traffic via main routers that are on remote systems for odd reasons also they may have installed new firewalls that do the SD-WAN that are slower at routing than the old routers, so the higher traffic and encryption requirements will slow the whole process down.

Definitely worth the traceroute and make sure the path that is being taken is the expected path for the data.

Matrix_Engineering · January 5, 2026, 8:27pm

Available on Linux too, and in my experience executes way faster compared to CMD.

David_Stone · January 5, 2026, 8:31pm

Hah, but of course... oddly I have never used it on Linux. Different spelling of the command.

Steve_Laubach · January 5, 2026, 8:32pm

There’s an open source program called “WinMTR” that is great for detecting packet loss on a path also if you’re looking to monitor for packet loss.

Steve_Laubach · January 5, 2026, 8:45pm

I googled “SD-wan 2 min cycle” and found something about Cisco IOS XE SD-WAN device behind a NAT device rotating through a list of base ports to establish DTLS connections when a connection attempt is unsuccessful. It could be a coincidence that you’re seeing large requests on a 2 minutes cycle but I thought I’d mention this in case it’s helpful.

Search this documentation for “2 minutes” to find what I’m talking about.

Here’s a copy/paste of the text of the part of the manual I am referring to:

For a tunnel interface (TLOC) on a Cisco IOS XE SD-WAN device behind a NAT device, you can configure the interface to rotate through a pool of preselected OMP port numbers, known as base ports, to establish DTLS connections with other WAN edge devices when a connection attempt is unsuccessful. By default, port hopping is enabled on Cisco IOS XE SD-WAN devices and on all tunnel interfaces on Cisco IOS XE SD-WAN devices.

There are five base ports: 12346, 12366, 12386, 12406, and 12426. These port numbers determine the ports used for connection attempts. The first connection attempt is made on port 12346. If the first connection does not succeed after about 1 minute, port 12366 is tried. After about 2 minutes, port 12386 is tried; after about 5 minutes, port 12406; after about 6 minutes, port 12426 is tried. Then the cycle returns to port 12346.

If you have configured a port offset with the port-offset command, the five base ports are a function of the configured offset. For example, with a port offset of 2, the five base ports are 12348, 12368, 12388, 12408, and 12428. Cycling through these base ports happens in the same way as if you had not configured an offset.

Chris_Bingham · January 5, 2026, 10:24pm

Worthwhile running a trace to see if any obvious route issues jump out. Even if the route looks good...I would see if 'they' are onboard with a temporary test to place the SQL server and Ignition server on the same subnet...invasive as it might be.

Even if performance can be restored to clients, 3GB (in a short? duration) seems excessive (even with asynchronous routing). Might be worthwhile taking advantage of the poor performance to pinpoint & fix any views which have long-running (large) queries configured by default.