The application I am working on can be very slow at times. I want to do all I can to optimize the database performance (This application is a GUI for database CRUD operations).
It uses Ignition 7.9.3. Would upgrading to 7.9.8 and using Named Queries be faster? Currently, most queries are hacked together by concatenating a bunch of different custom variable expressions into a string, and then that string is used as the data query for our tables and such. I know Named Queries with parameters would at the very least make the project much more maintainable, but is there a noticeable performance difference as well?
We also use MySQL 5.6. Would upgrading to MySQL 8.0 actually improve query performance in any meaningful way?
The last idea I personally had was to turn some of our bigger, commonly used queries into functions/procedures.
So besides upgrading Ignition/Using named queries, upgrading MySQL, and using more MySQL functions/procedures, are there any other ideas or guidelines to speed things up?
There are definitely a number of things you can do. Some are simpler than others.
Named queries will only speed up performance if you cache the results. Results are cached per query and per set of parameter values, and allow multiple clients to use the same set of results without rerunning the query. Otherwise, they are essentially identical in performance to legacy queries.
Stored procedures can be a little faster, but only because for most dbs the query plan is already done and saved with the stored procedure.
Usually when databases are slow, I start looking at the database itself. Does it have enough memory? Database queries are memory hogs, and if you’re having to go to virtual memory to complete a query, your response time will suffer a lot. Do you have a fast disk with plenty of free space? Disk access is the slowest part of any query. Is the db on a machine that’s doing other things, and so it can’t get CPU time (or other resources) when needed?
Assuming these things didn’t give you enough improvement, you need to look at whether your problem is individual queries or the sheer number of queries. If the problem is the sheer number of queries, and you can’t reduce that through caching, look at making the db into something with higher availability. If it’s individual queries, the db status page can tell you what the recent long running queries are – take a couple of them and look at their execution plans (use your db’s version of the EXPLAIN command.) Things like full table scans are signs that you need to rethink your indexes. But don’t just add indexes all over the place – indexes speed some things up and slow some things down. If you’re doing a lot of calculations like sum and count on historical data, consider data warehousing – this is where you preprocess those calculations into another db, which only has the historical calculations. This keeps from having to repeat calcs that won’t change (or only rarely change, say once a day).
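To make the EXPLAIN step concrete, here's a minimal sketch (the `orders` and `customers` tables and the index name are hypothetical, just stand-ins for your own schema):

```sql
-- EXPLAIN shows the execution plan without actually running the query.
EXPLAIN
SELECT o.id, c.name
FROM orders AS o
LEFT JOIN customers AS c ON c.id = o.customer_id
WHERE o.created_at >= '2018-01-01';

-- In the output, type = ALL means a full table scan. If the WHERE or
-- JOIN column has no usable index, adding one may fix that:
CREATE INDEX idx_orders_created_at ON orders (created_at);
```

Re-run the EXPLAIN after adding the index and confirm the plan changed (e.g. type becomes ref or range) before moving on to the next slow query.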
Looking forward to Phil popping in with a dozen things I forgot.
I’m sure the G1GC garbage collecting change will pop up at some point
Kathy, thank you for all the suggestions. The one I think would help me the most is preprocessing calculations - the query in question takes 18 seconds to run, and it's updated at most a few times a day, which I suppose could be a trigger to reprocess. Could you point me in the right direction on how to implement that, or what to google? I can't find anything explaining the methodology of this; I'm still relatively new to databases (compared to the people here, at least).
Edit: Think I found an answer in case any one viewing this needs to know - https://stackoverflow.com/questions/2698401/how-to-store-mysql-query-results-in-another-table
Make sure you have an appropriately sized innodb_buffer_pool_size. The default is far too small for most production workloads.
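A quick way to sanity-check the buffer pool (a sketch; the 2G figure below is just an example, size it to your data and available RAM):

```sql
-- Current buffer pool size, in bytes:
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Rough health check: if Innodb_buffer_pool_reads (reads that had to
-- go to disk) is high relative to Innodb_buffer_pool_read_requests,
-- the pool is likely too small.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
```

On MySQL 5.6 the value is set in my.cnf under `[mysqld]` (e.g. `innodb_buffer_pool_size = 2G`) and requires a server restart; online resizing wasn't added until 5.7.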
Also make sure you have the correct indexes on the tables.
Could you give some more examples of queries, the amount of data, and the table structure? With that I could probably provide some more targeted help.
Sure, here is my current conundrum which I think I understand now after learning more about how MySQL operates under the hood.
The main table on my opening main window runs a query that takes about 18 seconds, joining 8 different tables and 3 different views. Because the result is not itself a table but a bunch of joins, I believe it never gets stored in the "Open Table" cache (though the underlying tables are commonly used and must be there), so even though the query is cached and whatnot, since any of those other tables can be edited elsewhere and separately, it doesn't seem to receive any caching benefits.
So I am trying to set up a data-warehouse type situation, but the issue is that this table allows editing of rows, which affects some of those 8 tables, which means the data warehouse table goes out of date. The only upside to all this is that the table only shows 350 rows, so maybe after an edit or an addition to any one of these tables, it'd be possible to have the gateway run a script to drop and recreate this table in the background? Or maybe it should be edited in place? I'm not really sure where to go from here. Using this premade results table has cut the load time of my opening screen from 35 seconds to 17 seconds, but I don't know how to handle the CRUD operations on any of the 8 dependent tables, so I may have to scrap it?
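One common way to do the background rebuild without readers ever seeing a half-built table is to build into a staging table and swap names, since RENAME TABLE in MySQL swaps atomically. A sketch, with `main_summary` and friends as hypothetical names and the expensive join elided:

```sql
-- Build the fresh results off to the side:
CREATE TABLE main_summary_new LIKE main_summary;

INSERT INTO main_summary_new
SELECT ...;  -- the expensive multi-join query goes here

-- Atomic swap: readers see either the old table or the new one,
-- never an empty or partial table.
RENAME TABLE main_summary TO main_summary_old,
             main_summary_new TO main_summary;

DROP TABLE main_summary_old;
```

The gateway script (or a trigger/timestamp check) would just run this whole sequence after any edit to the underlying tables.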
Not a DBA, but isn’t this exactly what views are for?
- Is this an external system you're interfacing with, or something custom built?
- If it's custom built, can you modify the DB structure?
- How are you joining? Is your SQL specifying the fields to join on?
- What type of indexes are on the tables? Are all your joins making use of PK indexes?
Honestly, 18 seconds is way off for something with less than thousands of rows. We have queries running a dozen joins and returning several thousand rows in well under a second. It really seems like you need to tune your queries and DB structure before looking at a data warehouse type implementation.
Yeah, honestly I think I went down a rabbit hole earlier of different MySQL optimizations and got so stuck optimizing what was already written by someone else that I forgot to consider turning it into a view.
Custom built; it's our in-house software we use for billing etc.
Yes, I can modify the DB structure, tables, etc. I have admin access to the DB.
It’s all left joins.
The main table that everything else joins onto has 1 PK, plus multiple MUL (non-unique) keys pointing at the PKs of the supporting tables.
After doing a DESCRIBE on the views that are joined, I realized there are absolutely no primary, unique, or MUL keys in them.
I just looked at the EXPLAIN of this: the initial full index scan is of the largest table, 110,000 rows, and then it does a bunch of unique key lookups and 4 non-unique key lookups.
I was just thinking: would it be faster to first query LIMIT 350 from that initial huge table into a temporary table, and then run all these joins against that?
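For what it's worth, the idea looks something like this sketch (table and column names are hypothetical; note that LIMIT without an ORDER BY returns an arbitrary 350 rows, so you'd want to order by whatever defines "the rows the screen shows"):

```sql
-- Materialize just the rows of interest first...
CREATE TEMPORARY TABLE tmp_main AS
SELECT id, customer_id, status
FROM big_table
ORDER BY id DESC
LIMIT 350;

-- ...then join the small set instead of the full 110,000 rows.
SELECT t.*, s.some_column
FROM tmp_main AS t
LEFT JOIN supporting_table AS s ON s.id = t.customer_id;
```

Whether this actually wins depends on the indexes, as noted below; with proper indexes the optimizer may already be limiting the work.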
I forgot to mention that 2 of the three views are also built on top of other very similar views, ha. I think going directly to the table for those specific columns would be faster than going through a view that goes to a view that goes to a table, right?
The #1 thing that has hurt our programming in the past is poorly performing views! We all but got rid of views and only use them if our DBA optimizes and reviews them prior to use.
Limiting the results from some of the tables should not make much of a difference provided you are using the proper indexes.
I would think this is hurting you. Why do you have so many tables with non-unique keys? This seems like a flaw in the DB design that's causing a performance hit.
Ahhh… DB Viewception… hehe. If the views were properly crafted it could be less code maintenance but it sounds that the poorly written views are causing you to struggle.
I assume your SQL is specifying the join columns using the ON clause like below?
SELECT t1.name, t2.salary
FROM employee AS t1 INNER JOIN info AS t2 ON t1.name = t2.name;
Yeah, unfortunately I completely inherited this project; the person who worked on it before is no longer here.
I think when the tables were created (just my hunch), instead of specifically declaring a column as a FK, they just made it an INT and stored other tables' PKs in it.
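If that's the case, promoting those bare INT columns to declared foreign keys is straightforward. A sketch with hypothetical table names; a useful side effect is that InnoDB requires an index on the referencing column and creates one automatically if none exists, which is often exactly the index the joins were missing:

```sql
-- Promote a bare INT column to a real FK. InnoDB will add an index
-- on customer_id automatically if the column doesn't already have one.
ALTER TABLE salesorder
  ADD CONSTRAINT fk_salesorder_customer
  FOREIGN KEY (customer_id) REFERENCES customer (id);
```

This will fail if existing rows contain orphaned values, so it also doubles as a data-integrity check on the inherited schema.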
There's a lot of view usage, including multiple views like "view_salesorder2" which is basically just "view_salesorder" with a few additional columns, haha; all three views in this query are like that.
Yes, that is how the JOIN clause is written though it is a left join.
Also, there was a GROUP BY clause with no aggregate function, so it was pointless; simply deleting it cut seconds off my query time, ha.
Edit: In fact, when I run it in Workbench with SQL_NO_CACHE, the original query with the GROUP BY took 15 seconds, while the statement with the pointless GROUP BY removed reported 0.000 seconds, which can't be right, but it was obviously much quicker.
Sorry. It's MySQL; I don't use it much. In general, slow databases suffer from either:
- Insufficient/inefficient indices, or
- Too much frick’n data.
The former can be reworked. The latter needs better hardware and/or custom caching.
Nah, sane people don’t write databases in Java.
I was referring to the project's performance in general, not just the DB. I figured the title of the post covered overall performance in conjunction with the main focus of DB optimizing? Either way, I don't see how it could do anything but help, imo.
I thought I didn’t need a <sarc/> tag for that. /-: