I am still in the process of building it and I am not 100% sure I have it right, but I am generally a trial-and-error type of person. That said, I have MQTT transmission going to AWS IoT Core, and MQTT Engine on the cloud gateway (EC2) is picking up the feed there. Kinesis also picks up the feed from IoT Core and sends it to S3 after a Lambda function converts it to Parquet. The raw S3 bucket holds just Parquet files; the cleansed bucket has Iceberg tables on top of Parquet files after some minimal normalization, deduplication, and ISO date standardization.
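To make the cleanse step concrete, here is a minimal sketch of the normalization/dedupe/ISO-date logic. The record shape (`site`, `tag`, `value`, `ts`) is my assumption, not what the actual Lambda does:

```python
from datetime import datetime, timezone

def cleanse(records):
    """Normalize, dedupe, and ISO-standardize a batch of raw telemetry records.

    Assumes each record is a dict with 'site', 'tag', 'value', and a 'ts'
    field that may be epoch seconds or an ISO-ish string (assumed shape).
    """
    seen = set()
    out = []
    for rec in records:
        ts = rec["ts"]
        if isinstance(ts, (int, float)):
            iso = datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
        else:
            iso = datetime.fromisoformat(ts).astimezone(timezone.utc).isoformat()
        tag = rec["tag"].upper()  # normalize tag casing
        key = (rec["site"], tag, iso)  # dedupe on site/tag/timestamp
        if key in seen:
            continue
        seen.add(key)
        out.append({"site": rec["site"], "tag": tag, "value": rec["value"], "ts": iso})
    return out
```

In the real pipeline this would run before the Iceberg write, so the cleansed tables only ever see one row per site/tag/timestamp.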
Context for this: these SCADA systems are tied to a product that is installed in buildings and owned by the building, but the DBOM has a long-term contract for the O&M part. In many cases these systems are installed in multiple buildings owned by the same management companies across multiple locations. That means there needs to be a way for, say, senior leadership of the DBOM to see ALL the sites for each management/owner group individually, with roll-up reports and other items for that group.
At the same time, if there are systems belonging to different management groups in a metro area, the DBOM operators need to be able to see all of the sites they are assigned to for operations, which could span multiple management/owner groups.
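Those two access patterns (leadership scoped by owner group, operators scoped by assignment) can be sketched as a simple visibility filter. The data model here is entirely hypothetical, just to pin down the idea:

```python
def visible_sites(user, sites):
    """Return the sites a user may see under the two access patterns above.

    Hypothetical data model: each site dict has 'owner_group' and 'site_id';
    a user dict has 'role' plus either 'owner_groups' (leadership) or
    'assigned_sites' (operator).
    """
    if user["role"] == "leadership":
        # Leadership sees every site for the owner groups they represent.
        return [s for s in sites if s["owner_group"] in user["owner_groups"]]
    if user["role"] == "operator":
        # Operators see their assigned sites regardless of owner group.
        return [s for s in sites if s["site_id"] in user["assigned_sites"]]
    return []
```

In practice this would live behind whatever identity provider handles the SSO, with the owner-group and site assignments coming from its claims.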
So, how can we do that while also building data for use in multiple ways - regulatory compliance, AI, finance, and even plain integration with other BMS systems - while making the introduction of a new site fairly seamless? Things like SSO for operators, one-click ad hoc trending, long-term trending, and predictive analytics to drive maintenance and efficient operation of the system. Eventually, automated operations using ML to reduce the reliance on on-site operators.
Infrastructure setup I am planning is:
I am working on understanding and hopefully creating a successful Unified Namespace (UNS). I hope I am thinking about it right, but here is where I landed:
- Enterprise = [Company]
- Product = [Product Name]
- Area = Commercial | Industrial | Residential
- Production Type = PT1 | PT2 | PT3 (could be a combination also)
- Project = [Project Site]
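As a sanity check on the hierarchy, here is a minimal sketch of building a UNS topic path from those levels. The ordering follows the list above; the trailing tag segment, and the exact mapping to an MQTT topic, are my assumptions:

```python
def uns_topic(enterprise, product, area, production_type, project, tag):
    """Join the UNS hierarchy levels into an MQTT-style topic path.

    Level order follows the hierarchy above; the trailing 'tag' segment is
    an assumption about where individual tag data would live.
    """
    levels = [enterprise, product, area, production_type, project, tag]
    for lvl in levels:
        # '/' would create a bogus extra level; '#' and '+' are MQTT wildcards.
        if "/" in lvl or "#" in lvl or "+" in lvl:
            raise ValueError("illegal character in level: %r" % lvl)
    return "/".join(levels)
```

Writing it down this way makes it obvious that every site publish lands under a predictable prefix, which is what lets the per-owner-group and per-operator views subscribe to exactly the slices they need.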
Then, in terms of the UDT setup, I had to build from what is already there, because I have existing sites with these tags that I have to revamp.
| Tag Prefix | Description | UDT Name | Category |
|------------|-------------|----------|----------|
| LIT | Level Transmitter | Sensor | Sensors |
| LS | Level Switch | Sensor | Sensors |
| P | Pump | Pump | Pumps |
| PI | Pressure Gauge | Sensor | Sensors |
| PIT | Pressure Transducer (Liquid) | Sensor | Sensors |
| PT | Pressure Transducer (Gas) | Sensor | Sensors |
| PR | Pressure Regulator | Sensor | Sensors |
| CP | Control Panel | Control | Control |
| T | Tank | Tank | Tanks |
| F | Filter | Sensor | Sensors |
| FIT | Flow Meter | Meter | Meters |
| B | Blower | Blower | Blowers |
| FCV | Flow Control Valve | Valve | Valves |
| XV | Actuated Valve | Valve | Valves |
| UV | Ultraviolet | UV | UV |
| AIT | Analog Instrument Sensor | Sensor | Sensors |
| S[a-z] | Spray | Sensor | Sensors |
| EPB | Emergency Pushbutton | Button | Buttons |
| PB | Pushbutton | Button | Buttons |
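One wrinkle with a prefix table like this is overlap: `PIT` must win over `PI`, which must win over `P`. A small sketch of how I think about resolving a tag name to its UDT (assuming tags are the prefix followed by a number, e.g. `PIT101`):

```python
import re

# Prefix -> UDT, straight from the table above. 'S[a-z]' is handled as a
# regex below since it is a pattern, not a literal prefix.
PREFIX_TO_UDT = [
    ("LIT", "Sensor"), ("LS", "Sensor"), ("P", "Pump"), ("PI", "Sensor"),
    ("PIT", "Sensor"), ("PT", "Sensor"), ("PR", "Sensor"), ("CP", "Control"),
    ("T", "Tank"), ("F", "Sensor"), ("FIT", "Meter"), ("B", "Blower"),
    ("FCV", "Valve"), ("XV", "Valve"), ("UV", "UV"), ("AIT", "Sensor"),
    ("EPB", "Button"), ("PB", "Button"),
]

def udt_for_tag(tag):
    """Resolve a tag name like 'PIT101' to its UDT via longest-prefix match."""
    if re.match(r"S[a-z]", tag):  # spray tags per the table
        return "Sensor"
    # Try longest prefixes first so 'PIT' beats 'PI' beats 'P'.
    for prefix, udt in sorted(PREFIX_TO_UDT, key=lambda kv: -len(kv[0])):
        if tag.startswith(prefix):
            return udt
    return None
```

Something like this could drive a migration script that walks the existing sites' tag lists and reports which UDT each legacy tag should map onto.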
Hopefully I understood what I was reading about UDTs and the UNS so that when I set up the template project I do it right.
So - why did I not use the AWS Injector? I am not confident I understand how to make it work within the UNS concept; I think I know how to make plain MQTT work correctly, though.
The other things I am working on getting right:
- The template Ignition project - I have a template project stored in GitHub Enterprise that I am working on making the baseline starting point for each project. To create a new project, you clone the repo into a new repo for the specific SCADA system, named for the site in question, and modify the template to meet the needs of that project. This has all the base views, scripts, and UDTs.
- The Docker images - Deploying the edge as a Docker image means I can have a fairly automated install of the project on the local HMI PC, maintain it separately from the panel PC's main OS, and easily port it to new hardware if something goes wrong. It also makes it easier to bolt on something like local ML functionality: if we build an aggregate data set remotely and use cloud compute to do the heavy lifting of training a model, deploying a local copy of that model with Docker is fast and easy, and running inference against the local model doesn't rely on the cloud being up 100% of the time.
- Store and forward for data - As far as I know this is still available with MQTT and is limited mainly by the local storage available to the SQLite database behind the scenes. I haven't focused much on this yet, but ultimately my goal is a secondary ~2 TB NVMe drive in each HMI PC, with 50% allocated to store and forward, to cover an outage longer than a week. The built-in store and forward might not be able to use that drive, so I might have to write a Jython script to offload the stored data periodically so we don't lose it. This might not even be necessary, but I am a planner.
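If the built-in buffer can't target the secondary drive, the offload script could be as dumb as snapshotting the cache file once it grows past a threshold. This is only a sketch: the paths are hypothetical, and the real Ignition store-and-forward location and schema would need checking before touching anything:

```python
import os
import shutil
import time

CACHE_DB = "/var/lib/ignition/data/datacache/store.db"  # hypothetical path
OFFLOAD_DIR = "/mnt/nvme/saf_offload"                   # the 50% of the 2 TB drive

def offload_if_large(cache_db=CACHE_DB, offload_dir=OFFLOAD_DIR,
                     max_bytes=512 * 1024 * 1024):
    """Copy the store-and-forward SQLite file aside once it exceeds max_bytes.

    Returns the path of the snapshot, or None if no offload was needed.
    Snapshot only - never deletes or truncates the live cache.
    """
    if not os.path.exists(cache_db) or os.path.getsize(cache_db) < max_bytes:
        return None
    os.makedirs(offload_dir, exist_ok=True)
    dest = os.path.join(offload_dir, "saf_%d.db" % int(time.time()))
    shutil.copy2(cache_db, dest)
    return dest
```

Run on a gateway timer script (Jython is close enough to this that the port is trivial), with a separate replay job pushing the snapshots back through MQTT once connectivity returns.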
That is probably a ton more information than you were after, but maybe it helps and maybe someone reads it and tells me why this is all a terrible idea.
