Native Client Launcher Improvements

The native client launcher is a great addition. Here are a couple of things that would make it even better and relatively intuitive/gotcha-free to use:

1 - Check Gateway availability before attempting client auto logon: Have clientlauncher (or the client itself, which probably makes more sense) check that the Ignition gateway is ready for logon before attempting auto logon. Presently, if clientlauncher is started too soon on Windows startup, it tries to log in before the Ignition gateway is ready (although the Ignition Windows service has been running for some time), and the auto logon fails, giving a logon prompt instead (see this thread, especially the Jan. 9, 2014 2:01pm post, for details). Fixing this would make auto startup of a client on a Windows box as easy as putting a shortcut to clientlauncher.exe in the Windows Startup folder (with command line arguments, if necessary). It would eliminate the need for delays in batch/command files, which are an unnecessary complication as well as a possible future failure point: a later change on the computer may increase Ignition Gateway startup time, making the delay too short and causing failed auto logins to return.

2 - Add GUI for clientlauncher command line parameters: this is an insignificant improvement compared to the above, but would save new users a trip or two to the help file.
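
To illustrate why item 1 asks for a readiness check rather than a fixed delay, here is a minimal Python sketch of polling with a timeout. It is not how clientlauncher works internally; `wait_until`, `is_ready`, and the default timings are hypothetical names and values chosen for illustration, and the thread does not say how gateway login readiness would actually be probed.

```python
import time
from typing import Callable


def wait_until(
    is_ready: Callable[[], bool],
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_ready() until it returns True or timeout_s elapses.

    is_ready is a caller-supplied (hypothetical) probe, e.g. something that
    checks whether the gateway accepts logins. Returns True if the probe
    succeeded, False if the timeout expired first.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Unlike a fixed delay in a command file, this waits only as long as needed and keeps working if gateway startup gets slower later, up to the chosen timeout.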

This post is based on the Windows Native Client Launcher in Ignition 7.6.4 (b2013112117) | 32-bit, but may apply to Linux and OSX client launch as well (have not tested them).

1 - Check Gateway availability before attempting client auto logon
For what it’s worth, the native client launchers were designed in part for your type of situation (where the launcher runs upon machine startup). Normally, the launcher waits until the Gateway has reached RUNNING status before attempting to start a client. In your case, the client’s login function is taking longer than the client is expecting, which bounces you to the login page. Are you using something like Active Directory to authenticate the user? The big question to figure out is why login is taking so long. If that can be resolved, then you should not need to use a separate .bat file to run the launchers.

2 - Add GUI for clientlauncher command line parameters
That could be a possibility in the future, depending on demand.

Thanks Matt. We are using the default authentication profile:
[attachment=0]User Sources.jpg[/attachment]
It has been like this since we installed Ignition (7.6.4 (b2013112117) | 32-bit). Installation is on a 1.86GHz dual core (4 CPU, usually 1 or more parked) PC running Windows 7 32-bit on an SSD with 4GB RAM (normally running at 50% RAM available). The auto login failure occurs if auto login is attempted too soon after Gateway is started (typically on Windows restart, but same thing with Gateway restart via gcu.exe, though in this latter case it doesn’t seem to need quite as much time, likely because it is not competing with other Windows processes to get running).

I’m not sure what would be taking so long for the login, other than perhaps that the Gateway service, though running, is not fully ready for login yet. Perhaps we are throwing things at it faster than an HDD-based installation would; with everything on SSD, the processor may be the bottleneck during startup. Or maybe the check for running is good, but there needs to be another check that the Gateway is actually ready to process logins.

We would love to ditch the command file; please let us know what we can do to help figure this out.

I don’t think the following is related as the failure for the local client (same PC as Gateway) to auto login occurs whether any other (external) clients or designers are open or not. However, I’m throwing it in here just in case it has any relevance: I did notice that when the Gateway is starting up, clients and designers already running elsewhere connect, disconnect, and reconnect (all within a couple seconds). They always follow this pattern. It looks like the first reconnect fails as session is no longer valid, with errors something like this:

ERROR [Gateway-TagManager-thread-1] No session found. You must re-login. (message type=199, func=SQLTags.poll)
WARN [DesignerContextImpl-GatewayConnection-thread-1] Unable to obtain lock in synchronization routine.
INFO [GatewayConnectionManager-GatewayConnection-thread-1] Stopping reconnect thread.
After that last INFO log, the designer is reconnected.

Ok, something just isn’t adding up here.

The client launcher, as Matt mentioned, was designed explicitly to handle this situation gracefully. It does check that the gateway is not just started up, but ready to handle logins.

A few things:

  1. Can you “tail” wrapper.log during the startup to make sure that the client launcher isn’t letting the client start until you see:
    [tt]Ignition["/main", state=STARTING] ContextState = RUNNING[/tt]

  2. I’m assuming that once the client launches, it takes some time before it fails back to the login screen? Is this assumption true? If so, can you grab a gateway thread dump during this time?
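
For point 1, if a tail utility is not handy, the check can be approximated with a short Python sketch that scans wrapper.log lines for the marker text quoted above. The function name is hypothetical, and the location of wrapper.log is not given in this thread, so the caller has to supply the lines.

```python
from typing import Iterable, Optional, Tuple

# Marker text taken from the log line quoted in point 1.
RUNNING_MARKER = "ContextState = RUNNING"


def first_running_line(lines: Iterable[str]) -> Optional[Tuple[int, str]]:
    """Return (1-based line number, line text) of the first line that
    contains the RUNNING marker, or None if no line contains it."""
    for number, line in enumerate(lines, start=1):
        if RUNNING_MARKER in line:
            return number, line.rstrip("\n")
    return None
```

Passing an open file object works because iterating a text file yields its lines. Comparing the timestamp on the returned line with the time the client window appears shows whether the launcher waited for RUNNING.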

Thanks Carl,

  1. Edit: I’ll test this and post back with results.
  2. Yes, there is a delay. I’ll grab a gateway thread dump and post back with it.

Regarding Carl’s Point 1 Above: We removed the command script with delay and put a shortcut to the client launcher in the Windows Startup menu, along with a shortcut to baretail.exe (configured to highlight the line you note) with a command line parameter to load wrapper.log. 1 designer and 3 external clients (not on the gateway PC) were already running when we did a Windows restart on the gateway PC. Below is the wrapper.log tail.

In all cases the client launcher (not the client) was displayed until the first seven (of eight total) lines in the log below had appeared (note that lines 2-7 varied in order with each restart). During this time it says, “[color=#008000]Connect error, trying again…[/color]” Then the client launcher briefly states “[color=#008000]Downloading projects…[/color]” and disappears. A moment later the client shows up with the status box displaying startup progress in the middle of the screen (it gets to “[color=#008000]Establishing Session[/color]”), followed by the login screen. No more lines were added to the log below after the login screen came up, and when testing without a designer running, the last line is omitted:

[quote=“wrapper.log”]INFO | jvm 1 | 2014/01/15 15:56:55 | INFO [SRContext ] [15:56:55,816]: [color=#0000BF]Ignition["/main", state=STARTING] ContextState = RUNNING[/color]
INFO | jvm 1 | 2014/01/15 15:56:56 | INFO [CompactLogixDriver[L33ER]] [15:56:56,863]: Processor info: Vendor=1, Product Type=14, Product Code=107, Revision=20.13, Product Name=1769-L33ER/A LOGIX5333ER
INFO | jvm 1 | 2014/01/15 15:56:57 | INFO [TagHistoryDatasourceSink ] [15:56:56,964]: SQLTags history tables verified successfully.
INFO | jvm 1 | 2014/01/15 15:56:58 | INFO [CompactLogixDriver[L33ER]] [15:56:57,985]: [L33ER] Using cached tag data, processor edit number(48640) matches cached edit number(48640).
INFO | jvm 1 | 2014/01/15 15:57:00 | INFO [ActivateSessionService ] [15:57:00,609]: User “{schedule=Always, username=opcuauser, lastname=, language=null, firstname=, notes=}” connected.
INFO | jvm 1 | 2014/01/15 15:57:02 | INFO [Projects$ProjectChangeMonitor ] [15:57:02,123]: Starting up client project monitor. project=A, uuid=148df963-8779-ccd2-d75e-30f732efd43c, editCount=193, scope=4, version=Published
INFO | jvm 1 | 2014/01/15 15:57:02 | INFO [Projects$ProjectChangeMonitor ] [15:57:02,142]: Starting up client project monitor. project=A, uuid=148df963-8779-ccd2-d75e-30f732efd43c, editCount=193, scope=4, version=Published
INFO | jvm 1 | 2014/01/15 15:57:10 | WARN [SQL_Bridge-A ] [15:57:10,151]: Tried to perform RPC operation on an unrecognized session. May indicate that the previous session was lost. Creating RPC listener for new session id.[/quote]
We tested as stated above a few times and then retried with no Designer (just the 3 external clients) running during the gateway Windows restart. All produced the results detailed above.

After all of the above tests (each repeated two or more times), I closed an external client, so we were down to no designers and 2 external clients. This time the log was as above except that the second “[color=#0000FF]Projects$ProjectChangeMonitor[/color]” line came after the client started, and auto login did NOT fail. I tested this three times with the same results. I then re-opened the third external client and re-tested several times, with the same results noted prior to this paragraph (auto login failure and log as noted above). Later on, after everything noted further down this post, I tried this one more time: auto login failed, but the “[color=#0000FF]Projects$ProjectChangeMonitor[/color]” line did not occur at all (it always had on previous restarts, and I’m not sure why the difference here). At least in this case, it appears the auto login failure does NOT occur unless 3 or more external (not on gateway PC) clients are open during the gateway PC Windows restart.

a) Throughout the testing I noticed there seems to be some very noticeable variation in how long it takes for the gateway to reach [color=#0000BF]Ignition["/main", state=STARTING] ContextState = RUNNING[/color].
b) The Ignition Gateway web interface does not show the client on the login screen as one of the connected clients.

All of the above tests and results occurred during Windows restarts (not just an Ignition Gateway restart). [color=#BF8000]After it all we tried the following (still with 3 external clients open, which caused a failed auto login consistently for the local gateway client on Windows restart as noted above):
1 - Close local client
2 - Stop gateway via Ignition Gateway Control Utility
3 - Start local client
4 - Start gateway via Ignition Gateway Control Utility
First Test Result: As in Windows restart tests above, except successful auto login with 3 external clients running and “[/color][color=#0000FF]Projects$ProjectChangeMonitor[/color][color=#BF8000]” line occurred only one time in the log, after the client launched.
Second Test Result: As in Windows restart tests above with 3 external clients running (auto login failed). “[/color][color=#0000FF]Projects$ProjectChangeMonitor[/color][color=#BF8000]” line did not occur at all in the log.[/color]

Regarding Carl’s Point 2 Above: We’re working on getting this; it’s a bit of a trick to get gcu.exe up and running soon enough after a Windows restart to catch this.

Regarding Carl’s Point 2 Above: We used the last test case (restarting with gcu.exe rather than Windows restart) which does not fail every time in order to get around the issue of things getting too far before gcu.exe is up and running after a Windows restart. GatewayThreadDump_1 was grabbed while the client status showed “Establishing Session”, just a moment before it went to the login screen. The save dialog came up after the login screen did. There is actually very little delay between client status “Establishing Session” and the login screen appearing.

Please let me know if you need an earlier thread dump or if we can help with anything else.

When the auto-login fails and brings you to the login screen:

  1. Is there an error message, or is the failure silent?
  2. About how long does it take between “establishing session” and the login screen appearing?

The fact that the number of other clients running at the same time is a determining factor is troubling to me. Is the gateway that over-taxed? I think it should have a full 60-second timeout before it gives up. Does that seem accurate?

Carl, the delay between “Establishing Session” and the login screen appearing is very short: low single-digit seconds. I have not noticed any error message popups (anything else I should check for?). I can test this a bit later and confirm CPU usage at the time.

Hmmm, when the auto-login fails, it logs a message to the console. Unfortunately, you can’t see the console at this point in time and it happens before the visual console you get through the client diagnostics screen kicks in.

Here’s what I want you to try. In the clientlauncher.log file, you should see the full command that it is using to launch the actual client.

I want you to run this command in a command prompt; that way, the logged error should show up in the command prompt window.

Here’s the final error:

[quote=“Command prompt”]ERROR [ClientLaunchHook-Thread-5] Auto login failed.
com.inductiveautomation.ignition.client.gateway_interface.GatewayException: Unable to log in, too many clients running.
at com.inductiveautomation.factorypmi.application.runtime.ClientGatewayConnection.doLogin(
at com.inductiveautomation.ignition.client.gateway_interface.AbstractGatewayConnection.loginEncoded(
at com.inductiveautomation.ignition.client.gateway_interface.AbstractGatewayConnection.login(
at com.inductiveautomation.factorypmi.application.runtime.ClientLaunchHook.setup(
at Source)
Caused by: com.inductiveautomation.ignition.common.LocalizedMessageException: Unable to log in, too many clients running.
… 7 more[/quote]
This explains the very quick auto login failure, but brings up a new puzzle. We do plan to gradually upgrade to unlimited clients, but we are currently licensed for 4 clients:
[attachment=1]Ignition Gateway License.jpg[/attachment]
And we had 3 clients running:
[attachment=0]Ignition Gateway Status.jpg[/attachment]
At this point, the only reason we had 3 clients running (no one is there right now and no one but me is playing with the system) is that I opened an extra one (normally there are 2 left running). The weird thing is, immediately after I saw this error, I opened up the status page on the gateway and it showed 4 clients running. That’s weird, I thought as I clicked on clients to figure out where the 4th one came from. But when the detail screen came up it only showed 3 clients. Upon refreshing the status page, it now showed 3 too. I’m 99% sure I refreshed it immediately before it showed 4 clients. There’s a slight possibility my gray matter got mixed up on this, but then why would it fail with the above error?

I am very confident that there have not been 4 external clients running during all of the tests with failed auto logins detailed in this thread, and 99% confident there have not been 4 external (not on gateway PC) clients running during any of the failed auto logins detailed in this thread. I am 100% confident there was no local (on gateway PC) client already running in any of these tests.

It looks like some client is getting counted and then dropped from the count. I don’t know if it is the one attempting auto login or one of the already running clients that connects, disconnects, and reconnects every time the server starts up. Is there anything further I can help with to get this sorted out?

Ah ha, now we’re getting somewhere.

Okay, here’s what I want you to try to do. When the gateway shuts down, it creates a file called “.sessions” in the data directory. I want you to delete this file before starting the gateway up (hopefully via a script?), and see if the behavior is corrected.

I think that the gateway is deserializing its old sessions and then waiting for them to time out, and in the meantime, they’re counting against your limit.
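
A toy Python model of that hypothesis may make the arithmetic clearer. This is only an illustration and not the gateway's actual session handling: the class, its methods, the session names, and the 60-second session timeout are all made up for the example. Only the limit of 4 clients and the 3 running external clients come from this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class SessionTable:
    """Illustrative model of a client limit with session timeouts."""
    limit: int          # licensed client count
    timeout_s: float    # hypothetical idle timeout for a session
    last_seen: Dict[str, float] = field(default_factory=dict)

    def restore(self, session_ids: Iterable[str], now: float) -> None:
        # Sessions reloaded at startup count as active from 'now'.
        for sid in session_ids:
            self.last_seen[sid] = now

    def touch(self, sid: str, now: float) -> None:
        # A live client polling the gateway keeps its session fresh.
        if sid in self.last_seen:
            self.last_seen[sid] = now

    def login(self, sid: str, now: float) -> bool:
        # Drop sessions idle for timeout_s or longer, then apply the limit.
        self.last_seen = {
            s: t for s, t in self.last_seen.items() if now - t < self.timeout_s
        }
        if len(self.last_seen) >= self.limit:
            return False  # corresponds to "too many clients running"
        self.last_seen[sid] = now
        return True


table = SessionTable(limit=4, timeout_s=60.0)
# Three external clients plus the pre-restart local client are restored.
table.restore(["ext-1", "ext-2", "ext-3", "old-local"], now=0.0)
early = table.login("new-local", now=5.0)    # ghost session still counted
for sid in ("ext-1", "ext-2", "ext-3"):
    table.touch(sid, now=30.0)               # live clients keep polling
late = table.login("new-local", now=61.0)    # ghost session has expired
```

In this model `early` is False and `late` is True, which matches the observed pattern: auto login fails right after startup with 3 external clients open, but works with only 2, or once the stale session has timed out.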

Thanks for your help, Carl! A *.cmd file set to run under saved administrative credentials as a scheduled startup task works to delete the file on Windows startup:

:: Delete Ignition .sessions file to avoid auto logon failure due to delay in removing previous session
:: Run as scheduled startup task set to run as a local admin with saved credentials
@echo off
del "C:\Program Files\Inductive Automation\Ignition\data\.sessions" /F
After that, a simple clientlauncher.exe shortcut (with desired command line parameters) works to start the local client without auto login failures due to the old local client’s session still being counted. I verified the ghost session was the old local client session by refreshing the gateway’s session details web page right when auto login failed: it showed the old local client along with an uptime that exceeded the amount of time since reboot. Refreshing a minute later removed the old local client.
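
For anyone who prefers a script over a command file, the same cleanup could be sketched in Python as below. This is an assumed equivalent and not something tested in this thread: the function name is hypothetical, the default path is the one from the command file above, and it still needs to run before the gateway starts and with rights to delete files in that directory.

```python
from pathlib import Path

# Path copied from the command file in this thread; other installs may differ.
SESSIONS_FILE = Path(r"C:\Program Files\Inductive Automation\Ignition\data\.sessions")


def remove_sessions_file(path: Path = SESSIONS_FILE) -> bool:
    """Delete the saved-sessions file if it exists.

    Returns True if a file was deleted and False if there was nothing to
    delete. Other errors (e.g. insufficient permissions) are raised.
    """
    try:
        path.unlink()
        return True
    except FileNotFoundError:
        return False
```

Treating a missing file as a normal outcome keeps the startup task from failing on a first boot or after a shutdown that never wrote the file.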

Deleting the .sessions file resolves the auto login failures caused by a past session being counted in error until its timeout expires. This is great; eliminating the *.cmd file with delay speeds start-ups after reboot (from power outage, downtime, or whatever). The HMI is up at least a minute earlier than before.

Are there any bad side effects to deleting this file? Could the file be eliminated to avoid the need for this workaround?

Great, glad we got to the bottom of the issue. No, there are no bad side effects of removing this file. It is questionable whether or not we should even have the file there - or at the very least, it shouldn’t have client sessions in it, just designer sessions. But that is on us to resolve, so in the meantime, just keep your script around that removes the file.

Thanks for confirming Carl; will do. Resolving this will make already existing improvement #1 noted in initial post above fully functional/robust on limited client systems. Thanks again for your help!