Speed of system.net.httpClient()

Clayton.Balfour · February 9, 2024, 7:17pm

I have a question regarding the system.net.httpClient() function. I have a script that loops through a list of API tag IDs and queries an API for the latest data from each tag. I noticed it was fairly slow (about 4 seconds per tag). My original code creates a client instance and then reuses it for each tag in the loop:

client = system.net.httpClient(bypass_cert_validation=True, cookie_policy = "ACCEPT_ALL")

for tagID in tags:
	url = "%s/sources/%s/values?apiKey=%s&start_date=%s&end_date=%s&lang=en" % (host_url, tagID, apiKey, start, end)
	client.get(url=url)

But when I was playing around with it, I noticed it was much faster (0.3 seconds per tag) if I created a new client instance each time:

for tagID in tags:
	
	client = system.net.httpClient(bypass_cert_validation=True, cookie_policy = "ACCEPT_ALL")
	
	url = "%s/sources/%s/values?apiKey=%s&start_date=%s&end_date=%s&lang=en" % (host_url, tagID, apiKey, start, end)
	client.get(url=url)

I noticed the manual says this is not recommended, but I don't think it will be fast enough using the first method. Is there a major concern with the second method? Is there something I am missing for why the first method is so slow?

Excerpt from manual:

Be aware that httpClient instances are heavyweight, so they should be created sparingly and reused as much as possible. For ease of reuse, consider instantiating a new httpClient as a top-level variable in a project library script.
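As a minimal sketch of what the manual's advice could look like in a project library script: the module keeps one cached client and hands it to every caller. The function name `getClient` and the `factory` parameter are assumptions for illustration (the parameter exists only so the sketch can be exercised outside Ignition); inside Ignition the default falls back to `system.net.httpClient`, which is available automatically in scripts.

```python
# Hypothetical project library script. All names here are illustrative.
_client = None  # module-level cache shared by every caller of getClient()


def getClient(factory=None):
    """Return the shared HTTP client, creating it on first use only."""
    global _client
    if _client is None:
        if factory is None:
            # `system` is injected by Ignition; it is not importable elsewhere.
            factory = system.net.httpClient
        _client = factory(bypass_cert_validation=True, cookie_policy="ACCEPT_ALL")
    return _client
```

Callers would then use `getClient().get(url=url)` rather than building their own client in each loop iteration.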

PGriffith · February 12, 2024, 5:52pm

That's really weird, to the point I'm suspicious of your testing methodology.

As in, given this test code making a "real" (albeit trivial) request to an external service:

Code

from string import ascii_uppercase
from random import choice
from java.lang import System
from java.util import Locale
from com.inductiveautomation.ignition.common import FormatUtil

def timeIt(function, *args, **kwargs):
	n = System.nanoTime()

	result = function(*args, **kwargs)
	
	elapsed = FormatUtil.formatDuration(
		Locale.getDefault(), 
		(System.nanoTime() - n) / 1000000,
		FormatUtil.DurationFormatStyle.ABBREVIATED,
		True
	)
	
	print "Ran", function.__name__, "in", elapsed, result if result is not None else ""

def clientFirst(tags):
	client = system.net.httpClient(bypass_cert_validation=True, cookie_policy = "ACCEPT_ALL")
	for tagID in tags:
		url = "https://httpbin.org/headers"
		headers = {
			"X-Tag-ID": tagID,
		}
		client.get(url=url, headers = headers)

def clientLoop(tags):
	for tagID in tags:
		client = system.net.httpClient(bypass_cert_validation=True, cookie_policy = "ACCEPT_ALL")
		url = "https://httpbin.org/headers"
		headers = {
			"X-Tag-ID": tagID,
		}
		client.get(url=url, headers = headers)

tags = [
	''.join(choice(ascii_uppercase) for _ in xrange(20))
	for _ in xrange(10)
]

timeIt(clientFirst, tags)
timeIt(clientLoop, tags)

I get about the results I expected - instantiating the client once is significantly faster:

>>> 
Ran clientFirst in 1s, 757ms 
Ran clientLoop in 3s, 838ms

Besides the fact that every other time I've tested it, reusing an HTTP client has been much faster, creating a new client for each request is also prone to socket leaks (mostly on Windows hosts), due to underlying issues with the Java implementation.

Clayton.Balfour · February 14, 2024, 3:19pm

Thanks for the info! Here is my simple test code:

sTime = system.date.now()
response = client.get(url=url)
eTime = system.date.now()
queryTime = system.date.millisBetween(sTime, eTime) / 1000.0
print "queryTime = %s seconds" %queryTime

print response.json

Here is the result from my clientFirst version:

queryTime = 0.391 seconds
[[u'20240214T145400Z', u'0.0'], [u'20240214T145500Z', u'0.0'], [u'20240214T145600Z', u'0.0'], [u'20240214T145700Z', u'0.0'], [u'20240214T145800Z', u'0.0']]
queryTime = 9.135 seconds
[[u'20240214T145400Z', u'0.0'], [u'20240214T145500Z', u'0.0'], [u'20240214T145600Z', u'0.0'], [u'20240214T145700Z', u'0.0'], [u'20240214T145800Z', u'0.0']]
queryTime = 4.625 seconds
[[u'20240214T145400Z', u'8'], [u'20240214T145500Z', u'8'], [u'20240214T145600Z', u'8'], [u'20240214T145700Z', u'8'], [u'20240214T145800Z', u'8']]

And here is the result from the clientLoop version:

queryTime = 0.375 seconds
[[u'20240214T145400Z', u'0.0'], [u'20240214T145500Z', u'0.0'], [u'20240214T145600Z', u'0.0'], [u'20240214T145700Z', u'0.0'], [u'20240214T145800Z', u'0.0']]
queryTime = 0.397 seconds
[[u'20240214T145400Z', u'0.0'], [u'20240214T145500Z', u'0.0'], [u'20240214T145600Z', u'0.0'], [u'20240214T145700Z', u'0.0'], [u'20240214T145800Z', u'0.0']]
queryTime = 0.374 seconds
[[u'20240214T145400Z', u'8'], [u'20240214T145500Z', u'8'], [u'20240214T145600Z', u'8'], [u'20240214T145700Z', u'8'], [u'20240214T145800Z', u'8']]

I noticed that the query for the first tag has roughly the same query time. I'm wondering if there is something limiting the frequency of queries on the server side that I am somehow getting around by making a new HTTP client every time.
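One way to make this comparison repeatable is to record a duration for every request under both strategies, so a pattern like "first request fast, later ones slow" shows up directly. The sketch below is only an assumption about how that could be structured: `timeRequests`, `makeClient`, and `reuse` are illustrative names, and `makeClient` stands in for a zero-argument function that returns a client such as `system.net.httpClient(...)`.

```python
import time


def timeRequests(urls, makeClient, reuse):
    """Return one elapsed time (in seconds) per URL.

    reuse=True  -> a single client is created up front and shared.
    reuse=False -> a new client is created before every request.
    Only the GET itself is timed; client creation is excluded, which
    mirrors the timing code shown in the post above.
    """
    timings = []
    client = makeClient() if reuse else None
    for url in urls:
        if not reuse:
            client = makeClient()
        started = time.time()
        client.get(url=url)
        timings.append(time.time() - started)
    return timings
```

Calling it once with `reuse=True` and once with `reuse=False` against the same list of URLs gives two lists that can be compared position by position.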

victordcq · February 14, 2024, 3:23pm

This was a real issue for us once, before we started using a single client.

The warning is there for a reason!

pascal.fragnoud · February 14, 2024, 3:43pm

Try using the same methodology as Paul to do your testing. Just replace the URL.

Clayton.Balfour · February 14, 2024, 4:07pm

Ok, to be thorough I rewrote my code to use Paul's methodology. Here are the results when querying for 3 tags:

Ran clientFirst in 14s, 135ms 
Ran clientLoop in 1s, 148ms

Full Code

from string import ascii_uppercase
from random import choice
from java.lang import System
from java.util import Locale
from com.inductiveautomation.ignition.common import FormatUtil

def timeIt(function, *args, **kwargs):
	n = System.nanoTime()

	result = function(*args, **kwargs)
	
	elapsed = FormatUtil.formatDuration(
		Locale.getDefault(), 
		(System.nanoTime() - n) / 1000000,
		FormatUtil.DurationFormatStyle.ABBREVIATED,
		True
	)
	
	print "Ran", function.__name__, "in", elapsed, result if result is not None else ""



parameters  = {"type":"api"}
securityData = system.dataset.toPyDataSet(system.db.runNamedQuery('security',parameters))

api_details = {}

for row in securityData:
	item = {}
	item['user'] = row['user']
	item['password'] = row['password']
	item['apikey'] = row['apikey']
	item['host_url'] = row['host_url']
	api_details[row['site']] = item

		
def getUtcTimeNow():
	
	now = system.date.now()
	timezoneOffset = system.date.getTimezoneOffset()
	
	utcTime = system.date.addHours(now, int(-timezoneOffset))
	
	# NOTE: this date object still carries the (now incorrect) Eastern timezone. It is dropped when the date is formatted.
	return utcTime
	

def clientFirst(site, tags):
	
	client = system.net.httpClient(bypass_cert_validation=True, cookie_policy = "ACCEPT_ALL")
	
	for tagID in tags:
		
		host_url = api_details[site.upper()]['host_url']
		apiKey = security.decrypt(api_details[site.upper()]['apikey'])
		
		endDate = getUtcTimeNow()
		endDateFormat = system.date.format(endDate, "yyyyMMdd'T'HHmmss'Z'")
		startDate = system.date.addMinutes(endDate, -5)
		startDateFormat = system.date.format(startDate, "yyyyMMdd'T'HHmmss'Z'")
		
		
		url = "%s/sources/%s/values?apiKey=%s&start_date=%s&end_date=%s&lang=en" % (host_url, tagID, apiKey, startDateFormat, endDateFormat )
		
		response = client.get(url=url)

def clientLoop(site, tags):
	
	for tagID in tags:
	
		client = system.net.httpClient(bypass_cert_validation=True, cookie_policy = "ACCEPT_ALL")
		
		host_url = api_details[site.upper()]['host_url']
		apiKey = security.decrypt(api_details[site.upper()]['apikey'])
		
		endDate = getUtcTimeNow()
		endDateFormat = system.date.format(endDate, "yyyyMMdd'T'HHmmss'Z'")
		startDate = system.date.addMinutes(endDate, -5)
		startDateFormat = system.date.format(startDate, "yyyyMMdd'T'HHmmss'Z'")
		
		
		url = "%s/sources/%s/values?apiKey=%s&start_date=%s&end_date=%s&lang=en" % (host_url, tagID, apiKey, startDateFormat, endDateFormat )
		
		response = client.get(url=url)

tags = ['d26e3134-f049-11eb-9dab-42010afa015a', 'e6c5ab80-f049-11eb-9dab-42010afa015a', 'fc22e4ca-f049-11eb-9dab-42010afa015a']


timeIt(clientFirst, 'mus', tags)
timeIt(clientLoop, 'mus', tags)

pturmel · February 14, 2024, 4:14pm

Yep, I'd bet this is it. The API is probably rate limiting by session cookie. If this isn't your own API server, you might be inadvertently violating its terms of service.

PGriffith · February 14, 2024, 4:24pm

If that's the case, you might be able to work around it by specifying a cookie policy of ACCEPT_NONE in the httpClient call so that each request appears to be unique. You may also want to (or have to) specify a unique user agent per request.
I am not a lawyer, but this is definitely a "grey area" depending on who owns the server(s) you're talking to. That said, your API key should be enough for a cranky server admin to send you a nasty email if they really have a problem.

Clayton.Balfour · February 14, 2024, 6:14pm

Thanks all!
I'll ask some questions to see if there are any concerns with the frequency of requests. I'm also told there will soon be a way to get data from multiple tags in one request, which should solve this problem.

I tried changing the cookie policy to ACCEPT_NONE, but it did not appear to have an effect in this case.
I'm not really sure what "specify a unique user agent per request" means. Would this entail having multiple API keys? I'm currently not logging in with a user/password, just an API key. Either way, I'll hold off for now until I get more info on the multiple-tags-per-request option.

PGriffith · February 14, 2024, 6:28pm

User-Agent is just a standard HTTP header. If you don't specify a more specific one, the User-Agent of system.net.httpClient is always "Ignition". It's possible the server is batching requests by the user agent header.
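For illustration, a unique User-Agent per request could be produced by a small helper that builds the headers dictionary. This is only a sketch: the helper name `uniqueAgentHeaders` and the "base/random-suffix" value format are assumptions, and whether the server actually groups requests by this header is unknown.

```python
import uuid


def uniqueAgentHeaders(base="Ignition"):
    """Build a headers dict whose User-Agent value differs on every call."""
    # uuid4 yields a random identifier, so repeated calls should not collide.
    return {"User-Agent": "%s/%s" % (base, uuid.uuid4().hex)}
```

The result can be passed through the `headers` argument already used earlier in the thread, for example `client.get(url=url, headers=uniqueAgentHeaders())`.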