Document Storage and Ignition

I have quite a bit of experience with MSSQL and other traditional relational databases, but the concept of document storage is rather new to me. Working so consistently in Perspective for the past year has really opened my eyes to the wonders of JSON and documents, and I've started to question my view of the best way to store certain things.

For instance, we are working on storing machine data and realizing that creating hyper-flexible structures in a relational DB just doesn’t always make sense. Today I may have 12 sensors’ worth of data to log at the end of every cycle, but tomorrow I might have 20, and my database and report just need to “figure it out” with minimal changes. (Note: the tag historian might work in this case, but I’m heavily simplifying my example for the sake of clarity, so I’m avoiding it here; at the end of the day I’m going to have something more like a FactoryTalk Batch journal.)

So I have a few options:

  1. Store the data in a table with a column for each sensor and a timestamp for each cycle (“bad” database practice, as scaling out to more sensors means altering the schema)
  2. Store it in an EAV format where each sensor is a different attribute of the machine, and pivot the data into the format the customer is looking for when querying it out (“good” database practice, but the pivot queries are complex and not easy to maintain or scale)
  3. Create a JSON object that includes header information like BatchNumber, Recipe, Start, and End, and inside that object keep a list of dictionaries, one per cycle, so all of the information for the batch lives in one place and is added to the database in one chunk, in a JSON-typed column (the “easiest” solution, as creating the JSON in a Document-type memory tag is minimal effort, and then it’s stored as-is; see the sketch after this list)
  4. Take the aforementioned JSON and store it in a document storage platform like MongoDB (“harder,” but only because I don’t know much about MongoDB yet, so there’s some info I would need to learn first; it sounds like the “best practice” route)
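
To make option 3 concrete, here's a minimal sketch of what I have in mind. The table and column names (batch_log, batch_json) and the sample payload are invented for illustration, not an actual schema:

```sql
-- Hypothetical schema for option 3: one row per batch, all cycle data in
-- a single JSON column. Names are illustrative.
CREATE TABLE batch_log (
    batch_id   INT IDENTITY PRIMARY KEY,  -- SERIAL / IDENTITY in PostgreSQL
    batch_json NVARCHAR(MAX) NOT NULL     -- json / jsonb in PostgreSQL
);

-- The whole batch goes in as one chunk; going from 12 sensors to 20
-- changes the document, not the schema.
INSERT INTO batch_log (batch_json) VALUES (N'{
  "BatchNumber": "B-1001",
  "Recipe": "Recipe-A",
  "Start": "2024-01-01T08:00:00",
  "End": "2024-01-01T09:30:00",
  "Cycles": [
    {"Cycle": 1, "Sensor01": 72.4, "Sensor02": 1.08},
    {"Cycle": 2, "Sensor01": 73.1, "Sensor02": 1.11}
  ]
}');
```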

All 4 of these options sound like they would work, with varying levels of complexity in creating the data, storing it, querying it, and reporting it. I’m sitting in the camp for option 3 right now and leaning towards option 4, but I'm looking for any info from others with more experience in this area.

Especially once you bring into the mix that the application will all be visualized in Perspective, where every property and dataset is just JSON, it feels like storing the data as JSON would make everything much easier to build as well.

I am really curious if anyone has dealt with this same conundrum and how they have handled it, what experiences they have had, and what they’ve learned along the way.

Thanks for any advice!
Keith G.

I wouldn’t call #4 best practice. I am also in the #3 camp, though I don’t do much batch work. As a PostgreSQL partisan, I like the fact that its native json and jsonb column types enforce legal JSON syntax:

https://www.postgresql.org/docs/10/datatype-json.html
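
A quick sketch of what that buys you, using the same hypothetical batch_log table from above (PostgreSQL flavor):

```sql
-- jsonb validates on write: a malformed document is rejected at INSERT time.
CREATE TABLE batch_log (
    batch_id   SERIAL PRIMARY KEY,
    batch_json JSONB NOT NULL
);

-- This fails with a syntax error (missing closing brace), which is the point:
INSERT INTO batch_log (batch_json) VALUES ('{"BatchNumber": "B-1001"');

-- jsonb also gets you operators and indexing, e.g.:
SELECT batch_json ->> 'BatchNumber' AS batch_number
FROM   batch_log
WHERE  batch_json @> '{"Recipe": "Recipe-A"}';  -- containment; GIN-indexable
```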

The benefit I can see with Postgres is the ability to have relational and document data in the same place. However, depending on where I end up on the internet, it looks like Postgres has had some performance problems with those two data types. I know storing JSON in MSSQL as an nvarchar can slow things down a little bit, but in your experience, how well are they handled in Postgres? Have you tried storing large documents, or just smaller, simpler JSON?
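
(For reference, the MSSQL side I'm comparing against looks roughly like this; a sketch assuming the hypothetical batch_log table from earlier and SQL Server 2016+, which added ISJSON and OPENJSON:)

```sql
-- No native JSON type in SQL Server, but a CHECK constraint gives similar
-- validation on the NVARCHAR(MAX) column:
ALTER TABLE batch_log
    ADD CONSTRAINT chk_batch_json CHECK (ISJSON(batch_json) = 1);

-- OPENJSON shreds the stored document back into rows at report time,
-- one row per cycle:
SELECT b.batch_id, c.[Cycle], c.Sensor01
FROM   batch_log AS b
CROSS APPLY OPENJSON(b.batch_json, '$.Cycles')
    WITH ([Cycle] INT '$.Cycle', Sensor01 FLOAT '$.Sensor01') AS c;
```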

Only trivial stuff so far.

I have been using an “advanced” EAV model for a while, with the only real performance issue arising when a user wants to see “all the data in one place”. One of our applications is exactly your example, where sensor data is recorded once per minute. We store sensor-specific information in one table and the recorded data, one record per sensor per reading, in another related table. A few Ignition windows provide canned reports, and one dynamically builds the SQL pivot for ad-hoc investigations. About 100 readings per minute have been recorded for the last 8 years; no data has been deleted.
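
For anyone picturing it, the shape is roughly this. Table and sensor names are invented, and the static pivot below is the hand-written equivalent of what the dynamic window generates (SQL Server syntax):

```sql
-- Sensor metadata in one table, one row per sensor per reading in another.
CREATE TABLE sensor (
    sensor_id   INT PRIMARY KEY,
    sensor_name VARCHAR(50) NOT NULL,
    units       VARCHAR(20)
);

CREATE TABLE sensor_reading (
    sensor_id INT       NOT NULL REFERENCES sensor (sensor_id),
    read_at   DATETIME2 NOT NULL,
    value     FLOAT     NOT NULL,
    PRIMARY KEY (sensor_id, read_at)
);

-- "All the data in one place" means pivoting rows into columns, which is
-- the expensive, hard-to-maintain part:
SELECT read_at, [Temperature], [Pressure]
FROM (
    SELECT r.read_at, s.sensor_name, r.value
    FROM   sensor_reading AS r
    JOIN   sensor AS s ON s.sensor_id = r.sensor_id
) AS src
PIVOT (MAX(value) FOR sensor_name IN ([Temperature], [Pressure])) AS p;
```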