I have included a brief overview of the issue we're facing regarding inefficient queries. At the bottom of this post is a more detailed explanation of our Ignition historian architecture that may answer further questions.
We often see this query in the database stats:
SELECT "tagid","intvalue","floatvalue","stringvalue","datevalue","t_stamp","dataintegrity" FROM sqlth_1_data WHERE "tagid" IN (?) and "t_stamp" <= ? ORDER BY "t_stamp" DESC LIMIT 1
I believe this query originates from the Chart and Easy Chart components in Ignition, but I am not positive. It appears to be fetching the latest t_stamp at or before the right bound for the tags in a given realtime trend, in order to fill in the red bar.
The issue is that without a lower bound on t_stamp, the query planner (PostgreSQL in our case, though I assume the same holds for other engines) often produces poor execution plans.
It seems to me that adding a lower bound on t_stamp corresponding to the maximum range window of the trend would be reasonable.
On a historical trend it would be whatever the selected range was; on a realtime trend, the preferred "now" value minus the selected interval. In the case below, the upper bound would be the current unix epoch in ms and the lower bound the current unix epoch in ms minus 1 day:
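As a sketch of what I have in mind, only the extra lower-bound predicate would change; the 1-day window and the parameter layout here are illustrative, not what Ignition actually emits:

    SELECT "tagid","intvalue","floatvalue","stringvalue","datevalue","t_stamp","dataintegrity"
    FROM sqlth_1_data
    WHERE "tagid" IN (?)
      AND "t_stamp" <= ?   -- upper bound: current unix epoch in ms
      AND "t_stamp" >= ?   -- lower bound: upper bound minus the trend's max range (86,400,000 ms for 1 day)
    ORDER BY "t_stamp" DESC
    LIMIT 1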
That being said, if that is not where this query originates from, where is the source and is there a way to modify it?
Since it is hard to tell the originating source, does the "Manual" mode for trends add a lower bound to the query? In my testing it appears that it does; if so, we could transition all of our trends to Manual mode to work around this issue.
You can see the query performance summary and details for an example query here:
https://app.pgmustard.com/#/explore/b24abef0-df41-44b9-aaad-a879cea80075
Here is the performance summary of an optimized version where I manually added a lower bound timestamp 24 hours before the upper bound (24 hours is the default range for most of our trends):
https://app.pgmustard.com/#/explore/7a81e5dc-fa26-428e-a070-7abec041ccc0
The result is 3.7 million fewer discarded rows and a drop from 2 GB of buffer utilization to 2 MB.
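If anyone wants to reproduce this kind of comparison, the plan output pgMustard consumes can be captured with something along these lines; the tag ID and epoch-ms bounds below are placeholders, not our real values:

    EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
    SELECT "tagid","intvalue","floatvalue","stringvalue","datevalue","t_stamp","dataintegrity"
    FROM sqlth_1_data
    WHERE "tagid" IN (1234)
      AND "t_stamp" <= 1700000000000                 -- upper bound (epoch ms)
      AND "t_stamp" >= 1700000000000 - 86400000      -- hand-added 24-hour lower bound
    ORDER BY "t_stamp" DESC
    LIMIT 1;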
If we were able to change this, the server VM for our historian could be downsized significantly, a cost savings of around 36,000 USD/yr.
Additionally, because the query selects every column, an effective covering index must include all the columns in the table. This is a smaller issue, but if the query asked only for the columns applicable to the tags in the trend (e.g. only intvalue and floatvalue), based on the tag's data type in Ignition or the datatypes listed in sqlth_te for those tags, it would allow for smaller indexes, which would also improve performance, though this is a much smaller factor.
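To illustrate the index-size point (the index names are made up, and this assumes plain PostgreSQL covering indexes): an index-only scan for the current query needs something like the first index below, whereas a trend made up of float tags only could be served by the much narrower second one:

    -- Covers every column the current query selects
    CREATE INDEX idx_sqlth_1_data_cover_all
        ON sqlth_1_data (tagid, t_stamp DESC)
        INCLUDE (intvalue, floatvalue, stringvalue, datevalue, dataintegrity);

    -- Would be sufficient if the query only asked for float values
    CREATE INDEX idx_sqlth_1_data_cover_float
        ON sqlth_1_data (tagid, t_stamp DESC)
        INCLUDE (floatvalue, dataintegrity);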
Ignition historian architecture:
We are using a self-managed instance of PostgreSQL on Ubuntu with the TimescaleDB extension (GitHub - timescale/timescaledb: An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.) to handle partitioning. Our partitions are 24 hours and Ignition's own partitioning is off. Timescale creates a daily partition (chunk) that is transparent to Ignition (and anything else querying the SQL database).
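For anyone unfamiliar with Timescale, a minimal sketch of this kind of setup looks roughly like the following, assuming t_stamp is the bigint epoch-millisecond column Ignition writes (migration details omitted):

    -- Convert the Ignition data table into a hypertable with 24-hour chunks.
    -- t_stamp is a bigint in unix epoch ms, so the chunk interval is given in ms.
    SELECT create_hypertable(
        'sqlth_1_data',
        't_stamp',
        chunk_time_interval => 86400000,  -- 24 h in ms
        migrate_data        => true       -- only needed if the table already holds data
    );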
The query planner excludes chunks based on the bounds of the time-series column. Without a lower bound it performs an index-only scan on every chunk; with a lower bound, it only scans the chunks that fall within the time range.
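As a rough illustration (again assuming the bigint epoch-ms t_stamp column), you can list the chunks a query bounded to the last 24 hours could be restricted to, versus the full chunk list an unbounded query has to consider:

    -- Chunks overlapping the last 24 hours; an unbounded query must consider every chunk instead.
    SELECT show_chunks(
        'sqlth_1_data',
        newer_than => (extract(epoch FROM now()) * 1000)::bigint - 86400000
    );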