OPC Sessions to Prevent State Corruption / Unexpected Results

cmaynes · June 24, 2021, 3:09am

Problem(s):

Certain PLCs should be written to by only one process/computer/station at a time. Station configuration, specifically the mapping of PLCs to stations, is done by people. Occasionally, when switching computers or moving things around on the floor, people make mistakes and map multiple stations to the same PLC, so multiple HMIs try to communicate with it at the same time. This can lead to dangerous behavior from equipment that starts receiving conflicting commands.
There are some cases where multiple processes/threads/computers need to share networked equipment registered in Ignition. One process may be in the middle of multiple operations when another process starts writing, corrupting state or producing unexpected results. There is no way to temporarily lock access to equipment while one process performs multiple write operations. One process should wait for the other to finish before it performs its own operations.

Current Solution(s):

Use a database to prevent registering networked devices on more than one station. Throw an error to the user when an attempt is made to register a device on a new station before removing it from the other station.
This works to solve the first problem, but cannot handle the second, where multiple stations need access.

Other Possible Solution(s):

Use global tags (not a database, because of the high IO) to store access tokens, and build a system around them to handle sessions with the networked devices in Ignition.
This will probably work OK, likely introducing <100ms of lag per IO operation for token management, but I wanted to see if anyone had a “cleaner” solution.
Run an OPC client on the server which controls access to each PLC.
This is OK, but it means leaving Ignition, which is not preferable.

Question(s):

Is there session management for OPC-UA server interactions with devices, which can minimally handle the multiple readers and one writer problem, holding a lock across multiple operations?

victordcq · June 24, 2021, 11:57am

Hmmm, idk much about PLCs tbh.
But it could be possible to monitor whether it's busy with a boolean tag like “…/tag/isWorking”.
Only perform actions if this boolean is false, and let the PLC write it to false every time it's done doing something?
And ofc write it to true when it gets a new command or input.

pturmel · June 24, 2021, 1:42pm

I think you need to re-examine this constraint because, with typical PLC communication protocols, there simply is no way to enforce it. Instead, treat this as a policy, and make your PLC code robust in the face of multiple writers. ("Robust" can be as simple as faulting when multiple writers produce an invalid combination.)

Mutual exclusion locks and related techniques simply aren't applicable to PLC communications, and trying to utilize them in the SCADA layer is meaningless if there are write paths outside the SCADA's one OPC connection.

That said, you can make your gateway perform application tasks as if in a critical section on behalf of clients. Such a "critical section" could be used to control batched writes to selected tags. An approach to consider:

Configure critical OPC tags as read-only.
Construct a gateway message handler that will mimic the functionality of system.tag.writeBlocking() or .writeAsync(), but using system.opc.writeValues() to bypass the read-only tag setting. (The implementation can simply read out the OPC Server and OPC Item attributes to construct the necessary lists of operations.)
Submit lists of operations as Runnables to an Execution Manager that has a thread pool with a single thread. My later.py script module's emulation of invokeLater() in gateway scope is an example. This provides the emulation of a critical section for the submitted operations.
If emulating writeBlocking, use a CompletableFuture from the submitted runnable to report back to the message handler. Examples also in later.py.
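
The single-thread queue idea in the steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the later.py implementation: it uses the standard `concurrent.futures` module rather than Ignition's Jython/Java executors, and `fake_opc_write`, `device`, `write_log`, `write_batch_blocking`, and `write_batch_async` are hypothetical names. In a gateway, the worker function would presumably call system.opc.writeValues() instead of writing to a dictionary.

```python
from concurrent.futures import ThreadPoolExecutor

# One worker thread: submitted jobs run strictly one at a time, in FIFO order.
_writer = ThreadPoolExecutor(max_workers=1)

# Hypothetical stand-ins for the device and a record of every write made.
device = {}
write_log = []


def fake_opc_write(paths, values):
    """Placeholder for the real OPC write; runs only on the worker thread."""
    for path, value in zip(paths, values):
        device[path] = value
        write_log.append((path, value))


def write_batch_blocking(paths, values, timeout=5.0):
    """Queue one batch and wait for it to finish (writeBlocking-style)."""
    future = _writer.submit(fake_opc_write, paths, values)
    # result() re-raises any exception thrown inside the worker.
    future.result(timeout=timeout)


def write_batch_async(paths, values):
    """Queue one batch and return its Future immediately (writeAsync-style)."""
    return _writer.submit(fake_opc_write, paths, values)
```

Because there is only one worker, a batch is never interleaved with another batch, no matter how many callers submit at once. Nothing stops code elsewhere from writing to the device directly, which is the "policy, not enforcement" caveat.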

Note that the mutual exclusion is obtained by policy, not by enforcement. The policy is to use such a subsystem instead of calling system.opc.* functions directly. If you have developers who violate that policy, that's on them (or you).

cmaynes · June 24, 2021, 3:50pm

Hey @pturmel, thanks for the detailed reply.

I have complete confidence all the PLC programs running can handle any kind of asynchronous writing from multiple writers without falling into a corrupted state. I don’t believe management through policy is a solution in this case because the issue is more a programming one than an access one. I have complete control of all IO to these devices and all of it is getting funneled through either an Ignition Client or a Gateway.

Per my original post, I am dealing with multiple processes trying to access the same equipment. Each process needs to do the following with the networked equipment:

start “session”
write a few tag values
do some other stuff
write some more tag values
…repeat…
close “session”

The queuing system you described is great, but it only serializes IO operations, rather than preventing interruption of a sequence of operations spread across multiple write calls which cannot be combined into one collection as you described.

The specific issue I am facing is that I have multiple clients which need to be open on the same computer, accessing the same equipment. Forcing users to shut down the clients and only have one open will not work because the programs need to be used together by the user and in some cases it is just inconvenient to wait for each client to open.

Another situation I am facing is that I need to share equipment between two stations/computers running at the same time because the equipment is expensive but only runs for short periods of time, allowing two stations to wait a short while for the equipment to be free before running.

There are many ways to skin this cat and perhaps using gateway messages or the WebDev module on the server is the way to go, with something similar to what you described.

From what you are saying, it sounds as though this is not a “solved problem” where I can use an existing solution. I am a little surprised others do not have this problem, but maybe this situation is uniquely complicated?

pturmel · June 24, 2021, 4:04pm

An execution manager with a single assigned thread does not have to run just OPC writes. That was just an example based on your initial description. Such an execution manager can be used to singulate access to any desired resource. You could create such an execution manager for each machine.

Your idea of what seems to be a long-lived client “session” seems extraordinarily fragile, IMNSHO. The only machine in an Ignition deployment that can do any such session management is the gateway. You will need to have the clients submit “session” requests to the gateway to then run on the single-threaded queue (in the gateway) for the specific machine. If the “session” needs further client interaction in the middle, you will have a complicated nightmare.
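
A rough plain-Python sketch of one single-threaded queue per machine, with a whole "session" submitted as one job, might look like the following. It assumes the session needs no client input in the middle. The names `executor_for` and `run_session` are hypothetical, and a gateway implementation would presumably use Java executors from Jython rather than `concurrent.futures`.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

_executors = {}                    # machine name -> its single-threaded executor
_executors_lock = threading.Lock()


def executor_for(machine):
    """Return the machine's executor, creating it on first use."""
    with _executors_lock:
        if machine not in _executors:
            _executors[machine] = ThreadPoolExecutor(max_workers=1)
        return _executors[machine]


def run_session(machine, steps, timeout=30.0):
    """Run every step back to back on the machine's queue and wait.

    `steps` is a list of zero-argument callables (writes, "other stuff",
    more writes). Sessions for the same machine never overlap; sessions
    for different machines run independently.
    """
    def session():
        return [step() for step in steps]

    return executor_for(machine).submit(session).result(timeout=timeout)
```

A second caller targeting the same machine simply waits in the queue until the first session's last step has returned.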

cmaynes · June 24, 2021, 5:11pm

Would it be possible to elaborate on what you mean by fragile? Where do you foresee problems?

pturmel · June 24, 2021, 5:17pm

Client crashes/disconnects/reconnects/restarts are the biggest concern. Client IDs can change, and with multiples on one client machine, you cannot trust IP or MAC addresses. Seems to me that ownership of a session would have lots of corner cases to avoid deadlock and/or perform cleanup. This would be a concern for any “session” that would need client input in the middle of the process.

cmaynes · June 24, 2021, 5:33pm

I see your point. Some sort of token management handled by the server would need to be done, independent of anything specific to any caller. As long as a caller provides the appropriate token, they have access. A timeout would need to be employed for each token as well, so that after X seconds of inactivity the token is invalidated and other tokens can access the machine.
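
As a minimal sketch of that token idea, assuming a server-side object that hands out one token per machine and expires it after a period of inactivity: `LeaseManager` and its methods are hypothetical names, and this is plain Python 3 rather than gateway Jython (`time.monotonic` is not available in Python 2.7).

```python
import threading
import time
import uuid


class LeaseManager(object):
    """One live token per machine; a token expires after inactivity."""

    def __init__(self, timeout_s=10.0, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock            # injectable so expiry can be tested
        self._lock = threading.Lock()
        self._leases = {}              # machine -> (token, last_used)

    def acquire(self, machine):
        """Return a new token, or None while another live token holds the machine."""
        with self._lock:
            now = self._clock()
            lease = self._leases.get(machine)
            if lease is not None and now - lease[1] < self._timeout_s:
                return None
            token = uuid.uuid4().hex
            self._leases[machine] = (token, now)
            return token

    def touch(self, machine, token):
        """Check a token before an operation and refresh its inactivity timer."""
        with self._lock:
            now = self._clock()
            lease = self._leases.get(machine)
            if lease is None or lease[0] != token:
                return False
            if now - lease[1] >= self._timeout_s:
                return False
            self._leases[machine] = (token, now)
            return True

    def release(self, machine, token):
        """Give the machine up early; does nothing if the token is not the holder."""
        with self._lock:
            lease = self._leases.get(machine)
            if lease is not None and lease[0] == token:
                del self._leases[machine]
                return True
            return False
```

Access depends only on presenting the current token, not on client ID, IP, or MAC address. A caller that crashes simply stops calling `touch`, and the machine frees up once the timeout passes.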

This will require some thought. Thanks, @pturmel, for your insight. It is really appreciated.

pturmel · June 24, 2021, 5:45pm

Personally, I would implement the process state machines in the PLC. Let any authorized client start a process when idle, and let any authorized client see the current state and supply intermediate inputs until done. Let a lesser collection of authorized clients cancel a process in progress.
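
To make the shape of that suggestion concrete, here is a small Python model of such a state machine. It is only a model of logic that would live in the PLC program itself, and every name in it (`ProcessModel`, the state names, the `operators` and `supervisors` groups, the `reset` step) is an assumption for illustration.

```python
IDLE, WAITING_INPUT, DONE = "IDLE", "WAITING_INPUT", "DONE"


class ProcessModel(object):
    """Toy model of a PLC-side process that any authorized client may drive."""

    def __init__(self, operators, supervisors):
        self.operators = set(operators)      # may start, view, supply inputs
        self.supervisors = set(supervisors)  # smaller group that may cancel
        self.state = IDLE
        self.inputs = []

    def start(self, client):
        """Start a process, but only from IDLE."""
        if client in self.operators and self.state == IDLE:
            self.state = WAITING_INPUT
            self.inputs = []
            return True
        return False

    def supply_input(self, client, value, last=False):
        """Accept an intermediate input from any operator, not just the starter."""
        if client in self.operators and self.state == WAITING_INPUT:
            self.inputs.append(value)
            if last:
                self.state = DONE
            return True
        return False

    def cancel(self, client):
        """Abort a process in progress; restricted to supervisors."""
        if client in self.supervisors and self.state == WAITING_INPUT:
            self.state = IDLE
            return True
        return False

    def reset(self, client):
        """Return to IDLE once the finished result has been taken."""
        if client in self.operators and self.state == DONE:
            self.state = IDLE
            return True
        return False
```

Because the state lives in one place and every request is checked against it, a second station that tries to start while a process is running is simply refused, and a client that disconnects mid-process leaves nothing to clean up on the SCADA side.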