Second-guessing architecture

I’m currently writing maintenance documentation for what is officially my first Ignition deployment in the wild. The system is a data concentrator that monitors multiple identical machines and writes a consistent set of data to two different DB tables. The first table has one row per machine and is updated in real time with various state values. The second table archives other data on the completion of that machine’s current instruction, and has one row per instruction. There is no UI on this project; it just sits on a server and quietly does its job.

I’m documenting how someone can go about modifying the project, either by adding new machines or by adding additional data points to each machine. I’ve optimized the project for ease of adding a new machine, but now I am second-guessing the overall architecture, as adding a new data point involves editing multiple disparate parts of the project.

So I’m throwing out an open invitation to suggest how (or if) my architecture could be improved for ease of future maintenance. The current architecture does work, and I have no budget for a redesign, but I am looking to see what I could do better in the future. I’m even curious to know whether all this could have been done better with a custom module (not that I know how to build one, but knowing the benefits of going down that path would be interesting).

OK … with that preamble over, here is a summary of the current architecture:

  • Each OPC UA connection to a machine has a well defined name.
  • A single UDT defines everything known about a machine.
  • Each instance of the UDT expects to read data from an OPC connector with the same name as its own instance name.
  • Data is written into two separate tables in the DB through one of two Named Queries: an Update query that writes the real-time data into the “real time” table, and an Insert query that writes the archive data into the “archive” table.
  • A Gateway timer script iterates over every instance of the UDT, marshals the status data of each machine, and sends it to the Update query via parameters (see the sketch after this list).
  • A value-change script (on the instruction number) in the UDT marshals the archive data and sends it to the Insert query via parameters.
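
To make the pattern concrete, here is a minimal sketch of what such a timer script does. This is illustrative only: the folder layout, UDT member names, parameter names, and the Named Query path are all stand-ins, not the project’s real ones.

machines = system.tag.browse("[default]Machines", {"tagType": "UdtInstance"})
for result in machines.getResults():
    name = str(result["name"])
    base = "[default]Machines/%s" % name
    # Read the status members of this machine's UDT instance.
    qvs = system.tag.readBlocking([base + "/State", base + "/InstructionNumber"])
    params = {
        "machineName": name,
        "state": qvs[0].value,
        "instruction": qvs[1].value,
    }
    # In gateway scope you may need to pass the project name as the first argument.
    system.db.runNamedQuery("RealTime/Update", params)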

Note that I chose to use Named Queries so that the SQL for the update and insert would be visible and in a well defined location in the project. I could have used Transaction Groups to define the SQL, but they (as of 8.1.3) apparently can’t be parameterized - so I would end up with one Transaction Group per machine per query type, which means that adding a new data item would involve individually editing every transaction group.

With this design, adding a new machine involves:

  1. Creating the OPC connector with the defined name
  2. Adding an instance of the UDT with the same name

And everything happens automagically.

However, if I want to add a new data item to the system, I need to:

  1. Add the item to the variables of the UDT
  2. Edit the value-change script buried on one specific UDT variable to marshal the value of the new item to the Insert Named Query
  3. Edit the GW timer script to marshal the value of the new item to the Update Named Query
  4. Edit the Insert Named Query to receive the new parameter and map it to the new SQL column
  5. Edit the Update Named Query to receive the new parameter and map it to the new SQL column

I am unhappy with all the fiddling needed to add a new data item, e.g. manually editing two different scripts in two different locations, and then manually editing the two different Named Queries.

I could possibly see creating dynamic SQL in the scripts in order to eliminate the Named Queries. But I’m not sure if that is a responsible idea or not.

Another improvement could be actually having a UI somewhere that defines the mapping of which UDT item goes to which table. Adding a new UDT item to the queries would then require only configuration rather than editing. (I believe this would also require going down the dynamic SQL route. Also, we never paid for either the Vision or Perspective modules!)

What other suggestions would people make?

If I understand your question, you are asking if there is a way to dynamically edit your SQL table, scripts, and UDT tags in one simple location?

While I think you could possibly find ways to simplify this, I don’t think there is any easy way to fully automate it. The only thing I can think of that might simplify things would be to create an additional tag on the UDT to run the Named Query (that eliminates the two scripts), but there will always be the SQL table update (assuming you are adding more columns), the Named Query update, and the UDT update.

However, you could rethink your design a little with regard to how you store the data in SQL, or how you have set up your scripts and Named Queries. I have two thoughts for storing your parameters, and I’m not sure which is more beneficial, so I will discuss them both.

Parameter Option 1
To make some of this easier, you could break your SQL tables into two: one for the machine-specific data (i.e. name) and one for the parameters you want to store, with a unique ID between them (machine name, if it’s unique) so you can join (relate) them back later. You would then break the script up into two Named Queries, one to update the machine table and one to update the parameter table. Instead of modifying the UPDATE Named Query to handle all parameters, you would issue one call per parameter. This lets you marshal the data (hopefully you store all parameters in a common folder you can loop over) and easily execute one or many calls, as in the sketch below. Archiving would effectively work the same way.
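
A rough sketch of that per-parameter loop, assuming a “Params” folder under each UDT instance and a Named Query that updates one row per parameter (every name here is an assumption):

paramFolder = "[default]Machines/Machine01/Params"
for tag in system.tag.browse(paramFolder).getResults():
    tagName = str(tag["name"])
    value = system.tag.readBlocking(["%s/%s" % (paramFolder, tagName)])[0].value
    # One Named Query call per parameter row in the long-format table.
    # (In practice you would batch the readBlocking call for performance.)
    system.db.runNamedQuery("Params/Update", {
        "machine": "Machine01",
        "parameter": tagName,
        "value": value,
    })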

Parameter Option 2
The other method would be to modify your SQL table to have a parameter column set up as a BLOB or some large string type, so you can store the parameters as a JSON object. When doing it this way, I think you can use Ignition’s built-in system functions to get the tag paths (hopefully a folder with all the parameters as children) and store the values of all parameters in the DB table as a JSON string, as sketched below. This makes both the SQL table and the Named Query fixed.
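
Sketched out, it might look like this (paths and the Named Query are assumptions; system.util.jsonEncode does the serialization):

paramFolder = "[default]Machines/Machine01/Params"
results = system.tag.browse(paramFolder).getResults()
names = [str(r["name"]) for r in results]
qvs = system.tag.readBlocking(["%s/%s" % (paramFolder, n) for n in names])
# Build {parameterName: value} and store it in a single JSON column.
payload = dict(zip(names, [qv.value for qv in qvs]))
system.db.runNamedQuery("RealTime/UpdateJson", {
    "machine": "Machine01",
    "params": system.util.jsonEncode(payload),
})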

Machine Automation
On the machine part, you could do something very similar (if you truly want to fully automate this) and loop over the different UDT instances, ensuring your UPDATE query is changed to something like:

INSERT INTO table (id, name, age) VALUES (1, 'A', 19)
ON DUPLICATE KEY UPDATE name = 'A', age = 19;

so that it will insert new records when they don’t exist and update them when they do. (That’s MySQL’s upsert syntax; other databases have equivalents such as MERGE or ON CONFLICT.) The other option is a try/except block that falls back to the other Named Query if an error occurs, as sketched below. In this manner, your script wouldn’t need to change (unless you add a completely new top-level path), and the rest remains the same.
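
The fallback version might look like this (query paths and parameters are assumptions; note that in Jython the database layer throws Java exceptions, so you catch those as well):

from java.lang import Exception as JavaException

# "params" is the dict of values marshalled earlier in the script.
try:
    # Try the INSERT first; it fails with a duplicate-key error if the row exists.
    system.db.runNamedQuery("RealTime/Insert", params)
except (Exception, JavaException):
    # Row already exists, so update it instead.
    system.db.runNamedQuery("RealTime/Update", params)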

I’ve been thinking about this very same thing for storing recipe information and batch data that varies from machine to machine, and site to site.

Hopefully that helps get the juices flowing for you!

This was my first thought, which is technically termed converting the table from wide format to long format.

Using a blob would effectively destroy the ability to run any future queries on the data without post-processing the results. That alone would turn me away from this option.

The below assumes that you convert the table(s) from wide format to long format.

I think this part is unavoidable, unless you write scripts that use system.tag.configure to do it for you, but that would also mean adding a GUI. Given the complexity this would add (the script isn't difficult, as the sketch below suggests, but the supporting interfaces to it all add up) and the insignificant time it takes to simply add a new tag to a UDT definition, to me this doesn't make sense unless you're doing it all the time and the operators will be performing it.
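
For reference, the script part really is short. A sketch, assuming a UDT definition named "MachineUDT" in the default provider (the member configuration here is entirely made up):

newMember = {
    "name": "NewDataPoint",
    "tagType": "AtomicTag",
    "valueSource": "opc",
    "dataType": "Float8",
    "opcServer": "Ignition OPC UA Server",
    "opcItemPath": "ns=1;s=[{InstanceName}]NewDataPoint",
}
# UDT definitions live under the provider's _types_ folder;
# collision policy "m" merges the new member into the existing definition.
system.tag.configure("[default]_types_/MachineUDT", [newMember], "m")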

Firstly, move these scripts out into a script library and reference that, so that all relevant scripts are defined in one location.
Secondly, I would make inserting rows dynamic, whereby you look up the properties to insert based on either a configuration dataset that stores the name of each tag to store data for, or a system.tag.browse of the tags in the UDT (or UDT definition). It really depends how dynamic you want this. You would then call your Named Query once for each tag, along the lines of the sketch below.
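
A sketch of the library-plus-list approach, with the list of logged tags kept in the script library as the single place to edit (all names here are assumptions):

# The one hand-edited list: which UDT members get logged.
LOGGED_TAGS = ["State", "Temperature", "InstructionNumber"]

def insertArchiveRow(machineName):
    base = "[default]Machines/%s" % machineName
    qvs = system.tag.readBlocking(["%s/%s" % (base, t) for t in LOGGED_TAGS])
    for tagName, qv in zip(LOGGED_TAGS, qvs):
        # One Named Query call per tag (long-format table).
        system.db.runNamedQuery("Archive/InsertValue", {
            "machine": machineName,
            "tag": tagName,
            "value": qv.value,
        })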

These steps are made redundant by converting to long format.

You could also write a script to create the device and UDT instance for you. It really depends how often you think UDT instances will be created.

@Herbie @nminchin
I may not have been totally clear on my current architecture. I currently have 2 separate SQL tables:

  • The real time data table that has one row per machine, and has the machine name as the primary (and only) key
  • The archive table that has one row per instruction, and has the instruction number as the primary (and only) key (and the machine name and other data are non-keyed columns)

This gives me a single row of truth at any given point in time. And as there are no more than 10 or so columns in each table, it’s not unwieldy. Each Named Query is built to populate only its matching table type (e.g. the Insert populates the archive and the Update populates the real-time table).

Theoretically I could decompose these tables even further, into the ultimate long form, by giving each individual value its own row and creating two mapping tables: one from “machine name” to “real time value type ID” for the real-time table, and one from “instruction number” to “archive value type ID” for the archive table. But I feel that is getting unnecessarily complex, especially as the people who are pulling the data out of these tables are (how can I say it politely?) … SQL challenged. :roll_eyes:

However, there are a couple of things that I picked up from your replies that would definitely be improvements:

  1. I do think I totally missed the boat by not having all my scripts in a library. That would simplify locating the things that need to be changed.
  2. Dynamically creating the SQL would allow me to eliminate both of the Named Queries, and thus remove another step.
  3. And if I had some configuration data somewhere (like a simple, easily hand-edited list of the names of the fields in the UDT that need to be logged), I could have the scripts dynamically build the SQL from that list and eliminate the need to even edit my scripts, while avoiding a fancy GUI that isn’t really needed (or even possible with the current project). Something like the sketch below.
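
To spell out what I have in mind for point 3 (the list, table, column, and database names here are placeholders, and I’m assuming the columns match the UDT member names):

# The only thing to edit when adding a data point (besides the table itself).
LOGGED_FIELDS = ["State", "Temperature", "InstructionNumber"]

def updateRealTime(machineName):
    base = "[default]Machines/%s" % machineName
    qvs = system.tag.readBlocking(["%s/%s" % (base, f) for f in LOGGED_FIELDS])
    # Build "SET col1 = ?, col2 = ?, ..." from the field list.
    setClause = ", ".join("%s = ?" % f for f in LOGGED_FIELDS)
    sql = "UPDATE realtime SET %s WHERE machine_name = ?" % setClause
    args = [qv.value for qv in qvs] + [machineName]
    system.db.runPrepUpdate(sql, args, "MyDatabase")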

But I do have some questions:

  • Do Named Queries cache their actual queries? And if I move to dynamic SQL, will I lose caching (or will I even notice that I lost it)?
  • A benefit of the Named Queries is that handy SQL schema browser. I’m not sure if that is just a browser, or if the Named Queries actually stop you from screwing up the column names.
  • If I were relying on dynamic SQL, should I recreate it on the fly with every query invocation, or should I cache the constructed query myself?
  • Where would be an appropriate place to store a list of tag names within the UDT that could be parsed by a script to build the dynamic SQL? I’m thinking of something like a text file stored as a resource in the project that can be edited in situ with whatever text editors are available in the project itself (maybe even a dummy script file?).

I kinda like what you’ve done. Keep in mind that parameterization has its own cost, so I don’t know that I’d follow Nick’s advice (yet). Plus, I despise tall format (aka “long” aka interop hell). Wide is wonderful.

Probably the only thing I’d do differently is to avoid the “real-time” table entirely. That’s what tags are for. If a DB consumer needs a real-time row, they can just query the latest insert. Make that a view with the current real-time table’s name, and you probably won’t need to change any current consumer.
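
Something along these lines, assuming the archive carries the same columns the real-time table has today (all identifiers here are made up):

CREATE VIEW realtime AS
SELECT a.*
FROM archive a
JOIN (
    SELECT machine_name, MAX(instruction_number) AS latest
    FROM archive
    GROUP BY machine_name
) m ON m.machine_name = a.machine_name
   AND m.latest = a.instruction_number;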

Thanks for the support!

And we have a real-time table because it was sold to the customer as having a real-time table, because in the requirements the customer specified a real-time table. :roll_eyes:

Personally I would have loved to build a nice responsive Perspective screen, but displaying the data wasn’t in our scope, and for some reason the sales people and whatnot above me are obscuring the fact that we used an HMI system merely as a data aggregation platform. :roll_eyes: :roll_eyes: :roll_eyes:

I do agree with this: tall/long tables are far more difficult to work with. For circumstances where you have variable "columns", though, they're the only way to go. That isn't really the case here; however, long format would mean not having to edit the SQL table definition each time a new tag is added, which is where I was coming from. The frequency of such changes needs to be considered as well: additions every day might call for a more dynamic process, but if new columns only come along every so often, who cares? Make the process a little longer, and make it static and wide.
For a dev with a SQL background, editing the SQL table definition isn't really an issue, but for an operator or "lesser" dev, it might be.

When I started this process I was a bit disappointed that I couldn't parameterize a Transaction Group (which includes functionality for automatically modifying the table to keep it in sync with the query), and that Named Queries are parameterized (thus only having to deal with one SQL query) but don't include the table-modifying functionality.

So for my application I seemed to have the worst of both worlds.

To be honest, I don’t think I’d ever use Transaction Groups. Maybe they have a place, but given the price, the fixed formats, and the slight learning curve (I actually still have no idea how to use them, so maybe that also sways my opinion), I would much rather just script my queries to insert data :man_shrugging:

I just don’t feel that I would have the flexibility I might need to do the things I’d want to do, and I suspect I’d end up doing it in script anyway.

I was lured by the empty promise of not having to maintain the SQL table.

I prioritize laziness over having to design something :rofl:

Laziness often means KISS is being applied :wink: but sometimes the butter knife option just doesn’t cut the tomato.

The purpose of transaction groups is to provide a drag-and-drop experience for people needing to set up wide-format recordings, and custom recipe handling (also via drag-and-drop).

I use them for my clients’ sake. So they can take care of them without me.

The primary exception is when what would have been a transaction group has a handshake requirement with the machine cycle. The success/failure flags in transaction groups have to be reset externally, and that violates one of my rules to avoid race conditions. ):

I use a state machine when I can, so there is never a chance of a race condition.