Multiple databases

zacharyw.larson · August 5, 2022, 3:45pm

If I have 100 machines, the Microsoft SQL database cpu is spiking 75%, and I have multiple projects using both the same database, tag provider, and historian-

Then, would I benefit from having a second SQL database for some of the queries?
Would it be possible to split my tables in this manner to reduce CPU strain?

I want to learn about scaling up what I can do with queries, and how fast I can do those things.

Matrix_Engineering · August 5, 2022, 3:58pm

Would help, but would also be a nightmare to maintain. Throw more cores at your current DB.

PGriffith · August 5, 2022, 5:34pm

Scaling vertically is far, far, far easier than scaling horizontally.

Get familiar with profiling SQL server and the diagnostic tools you have available, also. It’s very likely you’re leaving performance on the table in some way(s).

zacharyw.larson · August 5, 2022, 5:56pm

When you say scaling vertical, you mean adding cores?

I want to better understand the options and tradeoffs.
What is the nightmare part for more than one database?

I thought it was typical to have more than one database.

Matrix_Engineering · August 5, 2022, 9:20pm

Have you checked out the server sizing guide? Search the main IA site or this forum, should be a link to it.

Multiple databases are common, but they are generally designed for architectural reasons, net performance issues.

Say you have two sites, both have Ignition both have their own DB. Site A DB has site A data, and also a second SQL server which is a replication of Site B. Site B has the opposite.

Maybe you would have one DB for core Ignition, another for MES and another for ERP.

DBA is a full time job. I wouldn’t claim to be one, especially for MS.

zacharyw.larson · August 8, 2022, 12:56pm

Thanks

What is an example of a nightmare situation that people want to avoid with having a second database?

Locations having different databases makes sense to me.

I am wondering about situations where someone wants to use a database in a more traditional way.
Another person wants to stretch the boundaries of the realtime data collection and display while not disrupting the core usages.

So I want to learn about the nightmare situations, and tradeoffs. I want to be informed about the tradeoffs, advantages, and disadvantages such that I can take a position for or against different options and that I can guide away from perilous options, or at least be aware of intelligent ways to mitigate those.

edited for streamlining

bmeyers · August 8, 2022, 1:13pm

Stupid question but are you using SQL Server Express?

Some things you should be doing now:

Don’t use star queries (SELECT * FROM table;)
Don’t let your tables grow excessively; separate active tables from historical tables.
Call queries from the gateway; not the clients.

zacharyw.larson · August 8, 2022, 3:36pm

I appreciate the help.
For the purpose of this thread, I want to learn about having more databases.

Matrix_Engineering · August 8, 2022, 4:05pm

More work. With one, you have the default DB, and only have to specify the DB connection based on what scope you’re working in.

Potential for error. Writing to the wrong DB, dropping a table on the wrong DB, deleting data on the wrong table.

More management, your DB should be backed up and or replicated. Add more DBs’ and your backups and replication needs grow too.

I’m trying to be constructive here, I hope it comes across that way - but your OP said DB CPU is spiking 75%. The easiest way to solve that is to throw more CPU/vCPU at that DB server.

zacharyw.larson · August 8, 2022, 4:21pm

Thanks, yes.

Goal
Have one database to really push the limits of.
Have one database to use more conservatively.
Avoid the two impacting their separate performance.

Concerns
If I understand correctly, potential errors in designating the proper database are the primary concern, deleting and dropping specifically.

Second concern is backups and replication. Needs more memory.

Mitigation
Projects can be isolated from using each other’s database while still sharing the same tag provider?
I think I should know this, but I am not certain.

Feels like walking on ice, unsure of solid foundation or thin ice. Sorry if my question or logic is silly.

Matrix_Engineering · August 8, 2022, 4:42pm

Sure, one tag provider, just select on a per tag basis in the history option, which DB it is going to.

If you use transaction groups, the same thing, you can only select one.

Note that, to my knowledge, a named query etc can only select from one DB.

If you have data A in DB A and data B in DB B, and you want to run a “join”, you have to do this in a script, by running two separate selects, and then appending your own dataSet.

zacharyw.larson · August 8, 2022, 4:44pm

Can I block one project from seeing one of the databases?

Matrix_Engineering · August 8, 2022, 4:47pm

Don’t think so, DB connections are global. Willing to be sat corrected if someone else wants to chime in.

You can, however, in your DB, give permissions that if you are using that connection (JDBC driver), they can only read/write to those tables, if that helps - don’t think it does because if User A on Project A wants to read/write DB B, if it’s a global connection as above they can go ahead and do so.

Matrix_Engineering · August 8, 2022, 4:51pm

tom.leijen · August 11, 2022, 10:12pm

If you are looking to push databases to their boundaries, have you considered using a cloud-based database service like Azure - they allow massive scaling, load balancing etc with less administration overhead than running on-prem DBs (cloud based servers handle backups and guarantee uptime and can be configured to scale automatically), you could then also take things even further by adding cloud based data analytics and presentation, possibly even AI systems (predictive maintenance for your plant etc).

If your budget doesn’t allow for the level of cost associated with a cloud-based database, I would suggest running two separate database servers (on different VMs or different hardware), one to push the limits of and another to use more conservatively.
As already mentioned in this thread, there is a LOT to be gained from setting up your DB structure right and writing efficient queries. Think about things like, how much detail is an operator really needing to visualise and how often would this need to be refreshed, how quickly does your data really change, what level of precision in data do you need to capture etc etc.
For example:

Temperature data is slow to change, rather than querying your database every 100ms, or for every 0.1 degree change, you could take an average every 5 minutes and only query the database every 5 minutes to insert the data.
When you have a graphic on a page to show some data from your database, you only need to refresh it when someone is looking at it, you could write some scripts to only query the database when the graphic is visible, or caching data by running a query that appends to a local dataset object, so that you only ever query new data rather than continuously querying the full dataset from the database.