What are some strategies for automating data collection in a constantly changing manufacturing process?

I'm looking for advice, techniques, strategies, and approaches for automating data collection in a fast paced and constantly changing manufacturing environment. It's a problem that I've been struggling with for the past year and a half, when my company started increasing their production, adding to the product line, and changing manufacturing processes.

I'm a relatively inexperienced software developer and I'm interested to know how others might handle a situation like this, specifically within Ignition or more generally. If you have books, articles, podcasts, personal anecdotes, or rants about this issue, I'm interested in hearing them!

Our Ignition project only reads from the PLCs, it's role is data collection and reporting. Recently the reports have been failing because I cannot update the data collection, aggregation, and reporting scripts as fast as new manufacturing processes are used in our facility.

Historian or transaction groups are the obvious pre-built solutions.

Lesser used but viable are using a timer script and manually grabbing the data you want and saving it to a database, or doing so on a tag change script.

What do you mean your reports are failing? How are you collecting data now? Seems like someone is messing with the database so frequently your data sources queries are failing when they used to work?

As much as possible ...

  • Decide on a naming convention for your machines. You'll be using this to name the UDT below and as a parameter in the views and reports.
  • Decide on a common set of data to be collected from each machine. This might include:
    • Cycle count1.
    • Reject count1.
    • Start / stop events.
    • Stop cause / first-out alarm.
    • Product / recipe.
  • Create a UDT containing each of these. Set each one as a memory tag and its value to 0 or 'n/a' as appropriate.
  • Create an instance of the UDT for each machine using the naming convention above. Edit the instance and convert the individual tags to OPC tags and point them to the PLC tags. Where PLC tags don't exist leave the memory tags and set them to null or 'n/a' as appropriate so that your views and reports know how to handle them.

1 Use never-resetting counters for these if possible. This makes it much easier to calculate production between any two dates.

  • I prefer to create a production log table and write to that using gateway events on the run / stop tag change. I also schedule a record every 15 minutes.

Create a standard view and a standard report that can poll your data history and extract the information.

Now for each new machine you just need to configure the UDT and add the machine name to the view / report selector dropdown. This can be automated by retrieving a list of unique machine names from the UDT configurations.


When all you have is a hammer everything looks like a nail.
Conversely, when all you have is a nail everything looks like a hammer...

The first strategy is to separate the problem into chunks. In this case, separate your production system from your reporting system. HMI/SCADA is about machine control / process control / supervisory control / data acquisition. Once the data has been acquired and put into a database reporting and analytics can be done with another tool.

The second strategy is to abstract the data you are collecting enough that new production lines and new processes don't require new scripts / databases / screens / reports. Just add another option to the List of Values for the dimension (production line, product, rollup timeframe, inspection item, ...).

After that you can start picking technologies and tools. Star Schemas or Kafka or MongoDB or whatever. Not everything is a tag or a relational database (Boyce-Codd isn't the answer to everything.)

1 Like

The reports are typically missing data (rarely they'll run into an error from a database query returning None). Each batch has an ID, but the IDs change if they switch to a different recipe mid-production, blend batches, split batches. Then its a scattered mess with reports trying to generate for batch IDs that only have packaging data but everything else is missing.

Our controls and monitoring software is separate from Ignition. Ignition can only read from the PLCs, it collects and aggregates data for reports, typically excel or CSV files. Excel is used for analytics.

Would you mind elaborating on this?

I'll have to look into this, I'm not familiar with them. Would these be for reporting data or for analytics?

To add my two pennies, it is easy enough to handle resets and rollovers. If you decide to use non-resetting counters, what would you do if you download an older program copy? There's a real good chance the historical value will be overridden with an incorrect one. If you can handle rollovers, on first scan of the program you can reset and should be covered.

Sure, this:

Is related to this:

Using a standard naming convention and UDT for your tags enables you to build standard views (embedded or otherwise) and Ignition reports without having to redo them for each production line.

It will also enable you to save to a single table in the database, with foreign keys to the minor entity tables (production line, product type, etc.). When you present your user with a picklist (List of Values) to select the line or product or inspection or whatever... you get the list from these configuration tables. Adding a line is as easy as adding a record to the table. (That's the dream of course...)

Your production database is probably relational and optimized for transactional use. Most of us learned about normalizing our data, and then denormalizing as necessary for performance.
Over the years other ways to store data have been developed for reporting and analytics and logs and cold storage of history and event processing and big data and other stuff.
Many teams resist having a second data store for reporting (or whatever use isn't "transactional"). After all, they have a perfectly good hammer that they learned to use in their second year. Eventually they make it work in the relational database or they relent and use a different tool that is appropriate for the task.

A star schema is a way to organize data for data warehouses. It opens up your data to lots of powerful reporting tools. You may have reached the point in your project where it is worthwhile to have one. Or maybe not.

The others that I mention are also different technologies that have different paradigms on how to organize and store data. The universe is bigger than a database oriented 3NF paradigm. You probably don't need them now, but read about them and keep them in the back of your mind. Automated data collection can generate lots of data and your transactional database probably isn't the right place for most of it.

And if you are still storing your data in Excel spreadsheets then start with a good relational database and forget the rest of it for a while...

1 Like