Timeseries DB for PostgreSQL

I have been running Postgres with TimescaleDB for several years now. Not with Ignition (yet), but to collect application telemetry from several application servers. I am a big fan.

Here are two use cases.

  1. Log file searching

Table size 137M records. 11 columns. Mostly text. Max text message around 2048 bytes. Table growth is about 800K records per day. I concatenated the text fields into a tsvector via a trigger and placed a GIN index on it.
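In case it helps, the setup is roughly this. Table and column names below are simplified placeholders, not the real schema:

```sql
-- Placeholder log table; the real one has 11 columns, mostly text.
CREATE TABLE app_log (
    ts         timestamptz NOT NULL,
    host       text,
    level      text,
    message    text,        -- up to ~2048 bytes in practice
    search_vec tsvector
);

-- TimescaleDB hypertable chunked on the timestamp column.
SELECT create_hypertable('app_log', 'ts');

-- Trigger keeps the concatenated tsvector current on every insert/update.
CREATE FUNCTION app_log_tsvector_update() RETURNS trigger AS $$
BEGIN
    NEW.search_vec := to_tsvector('english',
        coalesce(NEW.host, '')  || ' ' ||
        coalesce(NEW.level, '') || ' ' ||
        coalesce(NEW.message, ''));
    RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER app_log_tsvector_trg
    BEFORE INSERT OR UPDATE ON app_log
    FOR EACH ROW EXECUTE FUNCTION app_log_tsvector_update();

-- GIN index over the tsvector is what makes the text searches fast.
CREATE INDEX app_log_search_idx ON app_log USING gin (search_vec);
```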

Searching small windows is insanely fast: tens of milliseconds for a 10-day window, three seconds for 40 days. Searching through 3.5 months of data takes about 28 seconds.

  2. Application Metrics

Table size 137M records. 10 columns. Mostly int32 and int64. Table growth is 800K/day too. I only have an index on the datetime column. Query performance suffers compared to the log table above because there are no indexes on the other fields.
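The shape of that table is roughly this (again, placeholder names):

```sql
-- Placeholder metrics table; the real one has ~10 mostly-integer columns.
CREATE TABLE app_metrics (
    ts        timestamptz NOT NULL,
    host_id   integer,
    metric_id integer,
    value     bigint
);

-- create_hypertable also builds the default index on the time column.
SELECT create_hypertable('app_metrics', 'ts');

-- Queries that filter on the other fields would likely want a composite
-- index such as this one, at some cost to insert throughput.
CREATE INDEX app_metrics_metric_ts_idx ON app_metrics (metric_id, ts DESC);
```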

During our review, MS SQL Server Standard did not even compare; it stayed in sleep mode. Postgres with TimescaleDB was 30%-50% faster than stock Postgres.

Hope this is helpful.


It's been a while; any update on your experience of using TimescaleDB with Ignition?
I'm really curious to know whether it is still stable and whether you can query your data faster than the Ignition SQL historian.
We struggle a lot with read performance when querying a month-long range.

Have you tried Pre-Processed Partitions for large range queries?
It seems to be a way to improve response time for large queries.

https://docs.inductiveautomation.com/display/DOC81/Tag+History+Providers#TagHistoryProviders-Pre-ProcessedPartitions

Yes, I use it and it helps, but in general it is not as good as a real TSDB solution, and it also uses a lot of space.
I'd like to see IA integrate at least one free TSDB engine into the product.

Yes, we have two applications running TSDB with Ignition. One project is over a year old and the other was started in Oct 2022, and to date, no issues. Performance seems fine, but unfortunately, neither of these projects has users that query extended ranges of data.


I wanted to ask about storage: does using TSDB require less disk storage, or does it not make much difference?
On the other hand, I already have several sqlth_1_data tables because I have Ignition partitioning configured; in that case, do I have to convert each of those tables into a hypertable?

IIUC, to get the most benefit from Timescale, you turn off Ignition's partitioning and put all the raw data into a single hypertable. I'm not sure what trouble you will have moving existing data into that architecture.
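Something like this is what I have in mind, though I haven't done the migration myself. Treat the column list, chunk interval, and partition name below as placeholders to verify against your own sqlth_1_data tables:

```sql
-- One consolidated table using Ignition's usual data-partition columns.
CREATE TABLE sqlth_data (
    tagid         integer NOT NULL,
    intvalue      bigint,
    floatvalue    double precision,
    stringvalue   text,
    datevalue     timestamptz,
    dataintegrity integer,
    t_stamp       bigint NOT NULL   -- epoch milliseconds, as Ignition stores it
);

-- With an integer time column, chunk_time_interval is in the same units
-- (here, roughly one week of milliseconds).
SELECT create_hypertable('sqlth_data', 't_stamp',
                         chunk_time_interval => 604800000);

CREATE INDEX sqlth_data_tag_ts_idx ON sqlth_data (tagid, t_stamp DESC);

-- Then copy each existing Ignition partition across, one table at a time.
-- (Example partition name; yours will differ.)
INSERT INTO sqlth_data
SELECT tagid, intvalue, floatvalue, stringvalue, datevalue, dataintegrity, t_stamp
FROM sqlth_1_data_2023_01;
```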

Hey!

Has anyone migrated old data into TimescaleDB?

I have data from 2019 onward with monthly partitions. The data is about 5 TB, including all the indexes, in PostgreSQL. I would like to test out the migration process; any tips?

Yes. We are using NiFi to transfer from MS-SQL historian to TimescaleDB. During the process we transfer tag definitions (Path, ID, Gateway, Provider) and Values (TagID, timestamps, floatvalue, and intvalue). I end up with two tables in Timescale: Tags and TagValues. Tags is about 60K records, TagValues is about 65B records. This merges data from about 45 gateways.
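Simplified, the two tables look something like this (a sketch, not the exact production DDL):

```sql
-- Tag definitions coming from each gateway's sqlth_te table.
CREATE TABLE tags (
    tagid    integer PRIMARY KEY,
    path     text NOT NULL,
    gateway  text,
    provider text
);

-- Raw samples; this is the table holding the tens of billions of rows.
CREATE TABLE tagvalues (
    tagid      integer NOT NULL,
    t_stamp    timestamptz NOT NULL,
    floatvalue double precision,
    intvalue   bigint
);

SELECT create_hypertable('tagvalues', 't_stamp');
CREATE INDEX tagvalues_tag_ts_idx ON tagvalues (tagid, t_stamp DESC);
```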

Queries against Timescale are insanely fast, given the number of records in the table. However, you HAVE to be careful about query construction...mostly with open date ranges.
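For example, a query shaped like the first one below stays fast because Timescale can exclude chunks by time; the second has to touch every chunk (tag 12345 and the dates are just examples):

```sql
-- Fast: explicit time bounds let TimescaleDB exclude chunks entirely.
SELECT t_stamp, floatvalue
FROM   tagvalues
WHERE  tagid = 12345
AND    t_stamp >= '2023-01-01'
AND    t_stamp <  '2023-02-01';

-- Slow: with an open-ended date range the planner must visit every chunk.
SELECT t_stamp, floatvalue
FROM   tagvalues
WHERE  tagid = 12345;
```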


Hi @dtompkins, could you please share an example of how you are using NiFi to transfer data from Ignition to TimescaleDB?
We are trying something very similar.

I have two flows. One moves the Tags, the other moves the tag values:

Tags:

It's important to note that each tag has one or more tag IDs. The IDs are tied to scan class as well as tag path, so if you edit the tag you will end up with multiple Tag IDs for the same tag.

Steps:

  1. Perform an ExecuteSQLRecord to retrieve the largest TagID from the target system.
  2. Execute EvaluateJSONPath to move the result to an attribute. This moves the single scalar value from the flow file to an attribute so that it can be used in subsequent queries. There may be other ways to do this.
  3. Perform an ExecuteSQL to grab any records in the source system sqlth_te table where the TagID is larger than the TagID attribute retrieved above (the SQL is sketched after this list).
  4. Perform an Insert/Update to append the record(s) to the target table.
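The two queries boil down to something like this (a sketch only; ${max_tag_id} is the flow-file attribute from step 2, and the column list is up to you):

```sql
-- Step 1: current high-water mark in the target (Timescale) tags table.
SELECT max(tagid) AS max_tag_id FROM tags;

-- Step 3: newer tag definitions from the source sqlth_te table.
-- ${max_tag_id} is the attribute captured by EvaluateJSONPath in step 2;
-- trim the selected columns to whatever you actually carry over.
SELECT *
FROM   sqlth_te
WHERE  id > ${max_tag_id};
```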

I have added various RetryFlowFile blocks to handle scenarios where the target database is down for maintenance.

TagValues:

In our system, Ignition is managing the local historian pruning and partitions.

Steps:

  1. Perform an ExecuteSQL to read recent partitions from the partitions table sqlth_partitions (sketched after this list). Make sure that you set Rows/FlowFile to 1.
  2. Perform ConvertAvroToJSON. This helps with the next step.
  3. Perform EvaluateJSONPath. This moves the flow file content into attributes for use in later queries.
  4. Perform GenerateTableFetch against the partition table. This creates the SQL statements used in subsequent processors; it helps cap flow file size and limit query times.
  5. Perform ExecuteSQLRecord. This executes the above-mentioned SQL statements.
  6. Perform PutDatabaseRecord. This pushes the data into the target system.
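Step 1 amounts to something like the query below (check the sqlth_partitions column names against your own install; the LIMIT is just an example):

```sql
-- Step 1: read the most recent partitions from Ignition's partition catalog.
-- pname is the data table name and "start"/"end" are epoch-millisecond
-- bounds; verify these column names against your schema.
SELECT pname, "start", "end"
FROM   sqlth_partitions
ORDER  BY "start" DESC
LIMIT  3;

-- Steps 4/5 then page through ${pname} via GenerateTableFetch, producing
-- queries roughly of the form:
--   SELECT tagid, intvalue, floatvalue, t_stamp
--   FROM   ${pname}
--   WHERE  t_stamp > ${last_max_t_stamp}
--   ORDER  BY t_stamp
--   LIMIT  10000;
```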

I have added various RetryFlowFile steps over time to reprocess data that originally failed.

Comments:

  1. Make sure you lag your tag values by some reasonable time. I think the FORWARD portion of the store-and-forward mechanism works independently from the real-time logging, which allows NEW data with later timestamps to be inserted and transferred before older data has been inserted. If you transfer too eagerly, that results in missing data in the target system.
  2. We are migrating to a local historian using TimescaleDB and moving the pruning/partitioning from Ignition to TimescaleDB. This will allow us to add an InsertTimeStamp column to the data tables. Keying the transfer off InsertTimeStamp will be more fault tolerant than keying off the sample timestamp.
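For reference, the pruning and compression side in TimescaleDB is just a couple of policies. The intervals below are placeholders, and tagvalues stands in for whatever the data hypertable ends up being called:

```sql
-- Columnar compression on older chunks; a big disk-space win.
ALTER TABLE tagvalues SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'tagid'
);
SELECT add_compression_policy('tagvalues', INTERVAL '30 days');

-- Retention policy takes over the pruning that Ignition used to do.
SELECT add_retention_policy('tagvalues', INTERVAL '2 years');

-- The InsertTimeStamp column mentioned above, defaulted at insert time.
ALTER TABLE tagvalues
    ADD COLUMN inserttimestamp timestamptz NOT NULL DEFAULT now();
```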