Trend performance depends on number of pens

Setup:
Ignition 7.2.2
Historical chart. The history database (Postgres) has partitions of 5 to 8 million rows each; approximately 1200 tags.

I'm trying to track down the cause of trend performance issues.
If I use the trend chart with three pens, the performance is OK. It's running this query:

SELECT "tagid","intvalue","floatvalue","stringvalue","datevalue","t_stamp","dataintegrity" FROM sqlt_data_1_968 WHERE "t_stamp">=? and "t_stamp"<=? and ("tagid"=? OR "tagid"=? OR "tagid"=?) ORDER BY "t_stamp" ASC

If I add a fourth pen, this query is run, and it usually times out:

SELECT "tagid","intvalue","floatvalue","stringvalue","datevalue","t_stamp","dataintegrity" FROM sqlt_data_1_968 WHERE "tagid" IN (?) and "t_stamp"<? ORDER BY "t_stamp" DESC LIMIT 1

The explain plan on the first shows a single index scan:
"Index Scan using sqlt_data_1_974t_stampndx on sqlt_data_1_974 (cost=0.00..9.00 rows=1 width=556)"
The explain plan on the second shows a reverse index scan followed by a limit:
" -> Index Scan Backward using sqlt_data_1_974t_stampndx on sqlt_data_1_974 (cost=0.00..30.07 rows=7 width=556)"
Note the big difference in the cost.

Questions:
:question: Why does the second query have a LIMIT 1 clause?
:question: Why is the query different when there are more than 3 pens?

Just for fun I added a fourth 'OR "tagid"=' to the first query and got the same explain plan as before.
Conventional wisdom has it that the only thing more expensive than IN is NOT IN, and IN is certainly more expensive than a bunch of ORs.

That query, the “order desc limit 1” query, is the “seed value query”. The difference between your two executions is that one is querying the “seed” values, and the other isn’t.

The strange thing is that whether or not that query is executed is dictated by the return size, not a property of the tag. Is it possible you're also changing something there between your two tests? You can see the exact parameters being used for the query by looking for the logger that ends in "QueryExecutor" and turning it to DEBUG.

A return size of "-1", or "raw" in the UI, won't run that query. The other modes will.

That particular query is a well known performance culprit. Whenever someone complains about SQLTags history performance, I can be fairly certain it’s due to the seed value query. The purpose of the query is to retrieve the first value previous to the requested query start. Due to the fact that we only log on-change, it’s necessary to get this value, otherwise we might not have anything to return if the value stays constant over the span of the query.

Perhaps that query can be improved by getting rid of "IN" and actually listing the tag ids being queried. Beyond that, SQLTags History performance is going to be a key focus of mine for 7.3, and this issue is definitely going to be at the top of the list to work on.

Regards,

I can pretty much guarantee that the only thing that is changing is the pens (tags).

I’ll have a look tomorrow at what the complete query is in both cases.

Rereading this in the morning (in a new light you might say) I noticed this key phrase

My 4 pen graph does indeed have one pen with a non-changing value.

I will play around with sql to try and get a faster query for you guys to implement.

Ah, I see: so perhaps it's running that query for all tags; it's just that the others aren't noticeably slow because the previous value was very close to the start time. With the non-changing value it has to scan more, possibly into another partition.

Let me know what you find, I’m open to suggestions.

Regards,

:open_mouth: WARNING :open_mouth: SQL tuning can be dangerous to your health. Don’t try this at home unless you’re a DBA. (Which I’m not :unamused: )

So, what I have so far:
The 'IN' clause should be replaced with an '=', and you should only be checking for one tag at a time. It makes no sense to select more than one tag and then limit the result set to 1, unless a SQL guru can craft a query that returns one row per tag for a bunch of tags (maybe something like the sketch below).
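
For what it's worth, on Postgres something like DISTINCT ON might get one row per tag in a single query (just a sketch, and I haven't timed it; tagid 44 is from the example above, the other ids are made up):

-- Sketch only: latest row before the query start, one per tag in the list.
-- DISTINCT ON keeps the first row per tagid after the sort, i.e. the newest t_stamp.
SELECT DISTINCT ON ("tagid")
  "tagid","intvalue","floatvalue","stringvalue","datevalue","t_stamp","dataintegrity"
FROM sqlt_data_1_968
WHERE "tagid" IN (44, 45, 46) AND "t_stamp" < 1297728000000
ORDER BY "tagid", "t_stamp" DESC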

You can do a very quick test to see if a tag is even in a partition with:
select count("tagid") from sqlt_data_1_974 where tagid = ?
(edit: see below)

The big killer is the reverse sort. Depending on the number of actual records in the partition, it may be faster to get them all and then find the one with the biggest timestamp yourself.

The biggest killer is the limit 1. Let me explain (pun intended)

SELECT "tagid","intvalue","floatvalue","stringvalue","datevalue","t_stamp","dataintegrity" FROM sqlt_data_1_968 WHERE "tagid" IN (44) and "t_stamp"<1297728000000 ORDER BY "t_stamp" DESC limit 1
when EXPLAIN’ed gives

"Limit (cost=0.00..12.92 rows=1 width=556)" " -> Index Scan Backward using sqlt_data_1_968t_stampndx on sqlt_data_1_968 (cost=0.00..350178.72 rows=27094 width=556)" " Index Cond: (t_stamp < 1297728000000::bigint)" " Filter: (tagid = 44)"
This is the query that is taking over a minute if the data is not cached. Note that the index being used is the timestamp index, the one with roughly 9,000,000 unique entries (near enough anyway).
If we remove the limit 1 from the query, the EXPLAIN looks like this:

"Sort (cost=66772.95..66840.68 rows=27094 width=556)" " Sort Key: t_stamp" " -> Bitmap Heap Scan on sqlt_data_1_968 (cost=591.49..51254.06 rows=27094 width=556)" " Recheck Cond: (tagid = 44)" " Filter: (t_stamp < 1297728000000::bigint)" " -> Bitmap Index Scan on sqlt_data_1_968tagidndx (cost=0.00..584.71 rows=27097 width=0)" " Index Cond: (tagid = 44)"
This time we are using the tag index (with 1200 unique entries) to find the two records of interest and return them in milliseconds.

Just for fun I ran the above queries using a tag that had 30,000+ rows in the table. The explain plans did not change, but the run times reversed, i.e. the LIMIT 1 query ran in milliseconds and the non-limited query ran in 16 seconds.

:bulb: So then I thought this whole thing could be sped up just by counting the number of rows for a tag first and then running the correct query. The count query took milliseconds for the tag with 2 rows and 10 seconds for the tag with 30,000 rows :cry:

I think the answer is to assume that if you get a lot of data points back for a tag, you can use the LIMIT 1. If you don't get any data points back, you should not use the limit clause.

Of course this is for a Postgres DB. MySQL, Oracle, SQL Server, etc. probably optimize differently. :cry:

Thanks for all the info. I was also confused about the “IN” statement with the limit at first, but here’s the deal: The same tag path can be registered multiple times. So, it is only 1 tag per query, but it’s possible that there will be multiple ids for that tag. And we only want 1 result. At any rate, we can definitely change it to be “=” instead of “IN”.

There may be some room to run with calculating how much a tag changes and doing different operations based on the result. The biggest thing we can do is reduce the number of situations that require the use of this query. For example, right now we run the query for all tags in a query that isn’t set to “raw” return mode. However, if not raw, there’s a certain window size. If we grab the first values for the query and see which tags have values inside the first window, we don’t need to run seed queries for those.
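
Roughly, that check might look something like this (purely illustrative, not the actual implementation; the window length and the tag id list here are made up):

-- Sketch: find which tags already have a row inside the first window of the
-- requested range; a seed query would only be needed for tags missing from this result.
SELECT DISTINCT "tagid"
FROM sqlt_data_1_968
WHERE "t_stamp" >= 1297728000000           -- query start
  AND "t_stamp" < 1297728000000 + 60000    -- start plus one window (60 seconds here)
  AND "tagid" IN (44, 45, 46, 47)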

Anyhow, between improved logic like that and better caching on our side, we hope to make a big impact for 7.3.

Ok, the multiple tag thing makes sense. I think it's more of a clarity thing. The EXPLAIN produced the same plan regardless of whether it was = or IN.

Soooooooooo,
When is 7.3 due again?

We’re shooting for July

That would be Awesome.

Especially if it includes everything that’s been mentioned would/could be in it.

Any chance of an early beta?

There will be a beta, but we’re not even close to that yet.

I wanted to follow up here to say that we've recently been looking a lot at the performance benefits of using clustered indexes in SQL Server, and it seems significant. By changing the timestamp index to be "clustered", you can drastically improve query speed (and I think we've found it's even better if you make a dual clustered index on t_stamp and tagid).

Unfortunately, this idea varies dramatically between databases. Postgres, for instance, doesn't have clustered indexes in the same way SQL Server and MySQL (which I'll talk about next) do. It has the idea of "clustering a table" according to an index, but this is a one-time operation that sorts the data into index order; after that, new data will still be stored out of order until the table is re-clustered. However, with our partitioning, it might make sense to cluster a partition once it is no longer being written to, for performance on bigger queries. But we haven't evaluated the boost for Postgres yet.
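
For reference, clustering a closed-out partition on Postgres would be a one-off operation along these lines (a sketch; the index name is the timestamp index from the examples earlier in the thread):

-- One-time physical reorder of a partition by its timestamp index (Postgres).
-- Rows written later are not kept in this order, so this only makes sense for
-- partitions that are no longer being written to.
CLUSTER sqlt_data_1_968 USING sqlt_data_1_968t_stampndx;
-- Refresh planner statistics after the reorder.
ANALYZE sqlt_data_1_968;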

On MySQL, you can't specify which index is the clustered one. However, it apparently uses the first "unique" index it finds (this is with InnoDB, by the way). So you can create a unique index (or primary key) on the combination of t_stamp and tagid. Again, we haven't done any timing tests yet, but it makes such a big difference for SQL Server that it's worth investigating.

On SQL Server, we saw history queries that previously took 15 seconds drop to milliseconds.

PostgreSQL was never my first choice. It’s just what we were already using.

I’ll see if I can get some free cycles to test our setup with different db’s.

So I've spent the last couple of days playing around with MySQL and changing the index types, and I'm underwhelmed. It doesn't seem to matter what I do with the indexes (B-tree, hash, primary auto-indexing, combined tag/time); it returns the trend data in constant time, about 45 seconds from a table with 7,000,000 rows (and growing).

It is definitely disk bound, so some SSDs would help a lot, I'm sure.

Hi Robert,

How much data are you querying from that 7m rows? That is, what’s the range of the chart?

Also, if you go to the console and turn the loggers under "History.*" to "Debug", you'll see the breakdown between the seed queries and the data partition queries. That might help identify where the problem is. If it's the seed values, for example, I've found a few different index setups that seem to help. For data reading, I haven't found the magic bullet yet, but I have been working on it.

You might know this (I think I've mentioned it elsewhere), but it's easy enough to play with different ideas: you can create a separate table to try something new and just update the sqlth_partitions table to point to it. That way, you don't have to modify your original table.

You might try the following: use "create table xxx like yyy" to clone a table's structure. In the administrator, edit the new table: remove all indexes, then create a new primary key index consisting of t_stamp and the tag id, making sure t_stamp is listed first. Next, use "insert into xxx select * from yyy" to copy the data over. Finally, update the partition table to point to xxx where it used to point to yyy. (I'm assuming the InnoDB data engine here.)
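
Spelled out in SQL, those steps would look roughly like this (a sketch only; the new table name is made up, the index names assume the same naming pattern as the Postgres examples above, and the sqlth_partitions column name is from memory):

-- Clone the structure of an existing partition (InnoDB assumed).
CREATE TABLE sqlt_data_1_968_clustered LIKE sqlt_data_1_968;

-- Drop the copied secondary indexes and add the clustered primary key, t_stamp first.
ALTER TABLE sqlt_data_1_968_clustered
  DROP INDEX sqlt_data_1_968t_stampndx,
  DROP INDEX sqlt_data_1_968tagidndx,
  ADD PRIMARY KEY (t_stamp, tagid);

-- Copy the data; InnoDB stores it physically in primary key order.
INSERT INTO sqlt_data_1_968_clustered SELECT * FROM sqlt_data_1_968;

-- Point the history system at the new table instead of the original.
UPDATE sqlth_partitions
  SET pname = 'sqlt_data_1_968_clustered'
  WHERE pname = 'sqlt_data_1_968';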

Using an index scheme like that against a test data set of 28 million rows, I'm getting very fast seed value queries and reasonable (<5 second) data queries for time spans under 8 hours. (Also worth noting that I'm using MySQL 5.5.)

Let me know if it makes a difference,

Just for fun I tried Oracle. It seems to be very cache sensitive. As the row count climbed into the millions, the first time I ran a trend query it would be very slow or time out; refreshing the trend, i.e. re-querying, would produce the data in a second or two. I'm sure it can be tuned to give better results, but I'm not the guy to do it.

@Colby
The chart data I'm retrieving is for 24 hours for 3 or 4 pens. If it's 4 pens, one will always be 0.
Each tag changes every 10 seconds, so my math says 25,920 rows (3 changing pens at 8,640 samples each per 24 hours).

Thanks for the logger tip. I know it's timing out on the data reading. I have never seen the seed SQL running, so it must be running very fast. As this is all demo data, I'm not concerned about saving it; I just modify the indexes on the active table. (I learned that from Kyle :smiling_imp: )

Postgres 8.4
Mysql 5.5 (InnoDB)
Oracle 10g