Insights - Large Scale Ignition Project

Hi Community:

I'm seeking insights from anyone who has experience with a "large"-scale Ignition project integration that includes the following characteristics:

  • Purdue Model (all layers)
  • Enterprise Architecture (several sites)
  • Perspective (~50 OT sessions & ~300+ IT sessions)
  • ~6-14 million tags
  • ~1k device connections (spread across the sites)
  • Ignition Edge (~600 nodes) with MQTT
  • MQTT brokers in the DMZs
  • Backend and frontend gateways

The core questions: how many tags per server have you successfully allocated? What are the specs of these backend and frontend servers, and how many of each do you run? Have you had to tweak the JVM GC to handle a heavier load of tags and scripts?

Any comment or experience is more than welcome.

Send email if willing to talk in private:
jespin@asecuador.com
jespinmartin1@gmail.com

I can't speak to a project on that scale, but these resources are from IA themselves and might be helpful:

The Purdue Model And Ignition | Inductive Automation

Ignition Server Sizing and Architecture Guide | Inductive Automation

Also, in general, keep in mind that Perspective uses more resources on your Gateway than a Vision client. So that means you will need more RAM and perhaps more CPU depending on what you are doing.

1 Like

Thanks,

I have seen those resources several times.
Agreed; it's not only RAM and CPU but also vCPU count when using Perspective. I have done testing with a load balancer for the frontends.

I would recommend a scale-out architecture for a client load that high. At the very least it allows you to scale the visualization load without impacting the backend systems.

Totally agree.

In the comments I have mentioned that both the backend and the frontend are in our scope. How many tags per backend would you recommend?
The mix is roughly 60% OPC, 10% SQL, 10% expression, and the rest memory tags, for a set of about a million tags.

One of our questions: if we can theoretically allocate 128-250 GB of RAM to a single server, would Ignition and the JVM heap actually take advantage of such large server specs?

That depends on things such as what type of PLCs you're connecting to and the way you're connecting to them.

Are you connecting to AB, Modbus, Siemens or something else?

If you're connecting to an AB PLC and pulling a large number of tags it's good to keep an eye on layer 3 comms on each PLC to make sure you're not overloading them. If you overload layer 3 comms it can impact use of PLC programming software. I know this happens on AB PLCs with large tag counts.
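Not a substitute for watching the PLC side, but on the Ignition side a small gateway timer script can at least flag device connections that drop when comms get saturated. A minimal sketch using system.device.listDevices(); the logger name, the timer-script placement, and the assumption that a healthy connection reports "Connected" are mine:

```python
# Gateway timer script (sketch): warn about enabled devices whose connection is not healthy.
# system.device.listDevices() returns a dataset with Name, Enabled, State, and Driver columns.
logger = system.util.getLogger("DeviceWatch")

devices = system.device.listDevices()
for row in range(devices.getRowCount()):
    name = devices.getValueAt(row, "Name")
    enabled = devices.getValueAt(row, "Enabled")
    state = str(devices.getValueAt(row, "State"))
    if enabled and state != "Connected":  # assumes the healthy state string is "Connected"
        logger.warn("Device %s is %s" % (name, state))
```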

@pturmel has drivers for optimizing various types of PLC communications. It would be good to look into that for sure.

Be careful with SQL tags. It's ok to use some but you want to use them sparingly because they can limit your scalability. Also, you really should optimize the queries they are running because a long load time on a SQL tag can be a very bad thing.
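One cheap way to keep those queries honest is to time them the same way the tag would run them and log anything slow. A rough sketch, where the query, datasource name, and 250 ms threshold are all placeholders I made up:

```python
# Sketch: time a query that backs a SQL tag and log it if it runs long.
# Query, datasource name, and threshold are hypothetical placeholders.
import time

logger = system.util.getLogger("SqlTagTiming")

start = time.time()
result = system.db.runPrepQuery(
    "SELECT status FROM machine_status WHERE machine_id = ?",
    [42],
    "ProductionDB",
)
elapsedMs = (time.time() - start) * 1000.0

if elapsedMs > 250:
    logger.warn("SQL tag query took %.0f ms" % elapsedMs)
```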

What database technology are you planning to use?

If you're using MS SQL Server you should look into SGAM contention and make sure you don't have queries that create SGAM locks in SQL tags. This can cause a deadlock in some circumstances. It's good to design around those problems rather than stumble into them.
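For what it's worth, tempdb allocation contention (SGAM is page 3 of tempdb's first file, i.e. 2:1:3) shows up as PAGELATCH waits, so you can poll the waiting-tasks DMV while load testing. A sketch; the datasource name is a placeholder and the login needs VIEW SERVER STATE:

```python
# Sketch: look for tempdb allocation contention while load testing.
# SGAM contention typically appears as PAGELATCH_* waits on pages like 2:1:3.
# Datasource name is a placeholder; the DB login needs VIEW SERVER STATE.
rows = system.db.runQuery(
    """
    SELECT session_id, wait_type, wait_duration_ms, resource_description
    FROM   sys.dm_os_waiting_tasks
    WHERE  wait_type LIKE 'PAGELATCH%'
    """,
    "ProductionDB",
)
for row in rows:
    print row["session_id"], row["wait_type"], row["resource_description"]
```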

3 Likes

Most of them are Modbus TCP; some others go through Kepware, and others come in via MQTT Engine.

The SQL tags are event-driven, with infrequent changes. We are using SQL Server. I will definitely take a look at SGAM. However, we are still struggling to figure out the question mentioned above:

If we can theoretically allocate 128-250 GB of RAM to a single server, would Ignition and the JVM heap take advantage of such large server specs?

Maybe. Ignition gets its biggest boost from more cores. (In VMs, be sure they are isolated from other VMs, so idle time really is idle.)

1 Like

The JVM is capable of it, but just because the memory and CPU are there doesn't mean a single Ignition instance would handle 6-14 million tags. That's beyond the scale I think we've ever really tested or considered. You'd have to be willing to be a bit of a pioneer.

1 Like

A lot of the things that could cause this problem are probably not allowed in a SQL tag, but it's something to consider when you're doing large-scale projects.

Long ago, before I knew about this problem, I wrote an app that ran on 4 fork trucks and called a stored procedure that used a table variable to get each truck's workload. They each refreshed at 1 second. It deadlocked the database. I should have used a subquery.
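For anyone hitting the same wall, the fix is to let one set-based query do the work per poll instead of filling a table variable in tempdb every second. A hypothetical "after" sketch; every table, column, and datasource name here is invented for illustration:

```python
# Hypothetical "after" version: one prepared query with a subquery,
# instead of a stored procedure that fills a table variable on every poll.
# Table, column, and datasource names are invented for illustration.
truckId = 3

workload = system.db.runPrepQuery(
    """
    SELECT w.order_id, w.from_location, w.to_location
    FROM   work_queue w
    WHERE  w.assigned_truck = ?
      AND  w.order_id IN (SELECT order_id FROM orders WHERE status = 'RELEASED')
    ORDER  BY w.priority
    """,
    [truckId],
    "WarehouseDB",
)
```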

1 Like

Good catch. We feel comfortable with ESXi and Proxmox for CPU affinity. Any others you might suggest?

I find VMware difficult to properly configure, as it seems to be wholly devoted to squeezing every CPU cycle out of idle VMs into heavily loaded VMs. Proxmox is Linux KVM which I find relatively easy to understand and configure, but I do it via libvirt and virt-manager. I understand larger deployments would use something like oVirt to manage libvirt, but I haven't seen a physical site that needed more than two or three physical hypervisors.

1 Like

Yes, I agree that there are additional considerations beyond sizing: hardware brands and categories, CPU frequency, cache sizes, network constraints, just to mention a few. But sizing is still important. Do you think that tweaking the JVM is mandatory when the heap is that large? Is there any GC available in JDK 17 that performs better for backends than G1GC?
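For reference, I assume whichever collector we test would end up as wrapper flags in data/ignition.conf, so it should be cheap to benchmark alternatives against a copy of the real tag load. A sketch of the kind of entries involved; the heap sizes, pause target, and property indices below are placeholders, not a recommendation, and the indices must not collide with the ones already in the file:

```
# data/ignition.conf (Java Service Wrapper format) -- placeholder values, not a recommendation
wrapper.java.initmemory=16384
wrapper.java.maxmemory=131072

# G1 is the JDK 17 default; an explicit pause-time goal is the usual first tweak
wrapper.java.additional.10=-XX:+UseG1GC
wrapper.java.additional.11=-XX:MaxGCPauseMillis=100

# An alternative worth benchmarking on very large heaps in JDK 17 is ZGC:
# wrapper.java.additional.10=-XX:+UseZGC

# GC logging so the comparison is based on data
wrapper.java.additional.12=-Xlog:gc*:file=logs/gc.log:time,uptime:filecount=5,filesize=20M
```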

Great information. Which isolation level is recommended for SQL tags? Do SQL tags use the same connection pool as the database connection, and the same execution thread? How does that work under the hood?

Nice, great information. I would need to check those.

Is oVirt to VMs what Portainer is to containers?

Not sure, I've never used Portainer. oVirt is a web interface for running and coordinating many hypervisors that expose libvirt for management. More like vSphere for VMware.

If I recall correctly the guidance at the really big building outside of Reno NV was to spin up a new tag server once you got to around 500K tags per gateway. They had their gateways pretty well separated into tags only, clients only, etc. and utilized the gateway network connections pretty heavily.

You could probably go above that in terms of raw tag count if you were more detailed with scan classes, direct vs. leased, etc.
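To make the split concrete: once the tags live on dedicated tag-server gateways, the frontend gateways just reference them through remote tag providers over the gateway network, so scripts and Perspective bindings read them like local tags. A small sketch; the provider name and tag path are made up:

```python
# On a frontend gateway: read tags served by a backend tag server
# through a remote tag provider (provider name and path are hypothetical).
paths = ["[SiteA_Tags]Area1/Line3/Pump7/Speed"]
values = system.tag.readBlocking(paths)
print values[0].value, values[0].quality
```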

3 Likes

Valuable experience. We are using different tag groups for different use cases.