Data Quarantining

I’m a bit confused about when data is quarantined …

[quote]Quarantined data is data that has errored out multiple times during attempts to forward it. It has been removed from the forward queue in order to allow other data to pass. The most common reason for data quarantining is an invalid schema in the database for the data that is being stored.[/quote]

In my development environment, if I force an invalid schema error my data goes directly into quarantine. If I kill my database service, my data is cached into the local store until it eventually overflows and is dropped.

Why wouldn’t my data be quarantined in my second scenario (dead database)? Can I configure Ignition to always quarantine data instead of dropping it?

Another question … how is quarantined data stored? Can I get access to the quarantined data? … Can I query my quarantined data?

Thanks in advance!

Bryan,

Once the connection is re-established you can go in store and forward and push data out to the DB.

[quote=“AxisIt”]Once the connection is re-established you can go in store and forward and push data out to the DB.[/quote]

I’m aware of the ability to retry the forward. In my second question, I’m asking if I can access the raw quarantined data before I retry the forward … I’d like to view the quarantined data, perhaps run a query on it …

Thanks for the response!

The quarantined data is stored in an internal database that you can’t query. Data only gets quarantined when there is a problem (error) when sending it to the database. If the service drops out it is the same as if the connection went down so data just gets cached.

[quote=“Travis.Cox”]The quarantined data is stored in an internal database that you can’t query. Data only gets quarantined when there is a problem (error) when sending it to the database. If the service drops out it is the same as if the connection went down so data just gets cached.[/quote]

Ok, thanks for the response … I suspected that was the case …

I’m a fan of the data quarantining feature … it’d be nice if it could be applied to scenarios where the cache is overflowing.

Another question … in the User Manual under Gateway Configuration > Store & Forward > Engine Configuration, the paragraph for Memory Buffer Size says (see bold text) …

[quote]The number of records that can be stored in the memory buffer, the first stage of the store and forward chain. Other settings define when the data will move from the memory buffer forward, this setting only determines the maximum size. If the max size is reached, additional data will error out and be discarded. The memory buffer cannot quarantine data, so if there are errors and the disk cache is not enabled, the data will be lost.[/quote]

… what (and where) are these settings?

I find this page of the manual a tad confusing so I apologize if the answer is staring me in the face …

Overflowing the cache into the quarantine doesn’t make sense. If you don’t want the cache to overflow just make it bigger.

Let me see if I can clear some things up. There are 4 terms I’m going to use:
buffer: The memory buffer that all data is initially placed in
cache: The local cache of pending and quarantined records that is stored in an internal database on the Gateway’s hard drive
sink: The actual target database that you’d like to store the records in
available: Whether or not a connection to the sink is currently working

All records initially go into the buffer. When they move out of the buffer depends on 1) the Write Size/Time settings, 2) whether or not the cache is enabled, and 3) whether or not the sink is available.

Let’s assume we’re dealing with a cache-enabled scenario.

Records move out of the buffer after the Write Size has been reached or the Write Time has elapsed, whichever comes first. If the cache has records in it that are being written, the records from the buffer go into the cache so that ordering is maintained. If the sink is unavailable, the records will go into the cache. If the sink is available and the cache is empty, the records go straight to the sink.
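
To make the routing above concrete, here is a rough Python sketch of the decision. It is only an illustration of the rules as described in this post, not Ignition’s actual code; the function and parameter names (should_flush, next_stage, write_size, write_time_s, cache_pending_count) are made up for the example, and "records in the cache that are being written" is approximated by a simple pending count.

```python
import time


def should_flush(buffered_count, last_flush_time, write_size, write_time_s, now=None):
    """Return True once the Write Size OR the Write Time threshold is reached.

    All parameter names are hypothetical stand-ins for the engine settings.
    """
    now = time.monotonic() if now is None else now
    return buffered_count >= write_size or (now - last_flush_time) >= write_time_s


def next_stage(sink_available, cache_pending_count):
    """Decide where records leaving the buffer go (cache-enabled scenario)."""
    if cache_pending_count > 0:
        # Records are already waiting in the cache: queue behind them
        # so that ordering is maintained.
        return "cache"
    if not sink_available:
        # Connection to the sink is down: hold the records on disk.
        return "cache"
    # Sink is up and nothing is waiting: go straight to the sink.
    return "sink"
```

Note that nothing in this step ever routes to the quarantine; at this point the only question is "cache or sink".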

Here’s the important part.
If the sink is available but spits back an error when trying to store a record, that record is quarantined. The reason is that the connection to the sink was up, but the sink said “hey - these records are no good, I can’t store them”. So rather than constantly erroring out, we put them aside (in the quarantine) for a human to make a decision: “Ok, try again - I’ve changed whatever was causing them to fail” - OR - “Just forget about it - they can be dropped”

So the quarantine has very different semantics than the cache. The cache is simply a non-volatile temporary storage area for records that will be stored to the sink as soon as it is available.
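
The difference between the two cases can be sketched in Python like this. Again, this is just an illustration under assumptions, not the real implementation: the exception classes, the forward function and the store callback are all hypothetical, and the sketch quarantines on the first error, whereas the manual says data is quarantined after erroring out multiple times.

```python
class SinkUnavailable(Exception):
    """Hypothetical: the connection to the sink is down."""


class SinkRejected(Exception):
    """Hypothetical: the sink was reachable but returned an error for the record."""


def forward(records, store, pending, quarantine):
    """Try to store each record in order; return how many were stored.

    store is a caller-supplied function that writes one record to the sink.
    pending and quarantine are plain lists standing in for the two parts
    of the local cache.
    """
    stored = 0
    for i, record in enumerate(records):
        try:
            store(record)
            stored += 1
        except SinkRejected:
            # Sink is up but refused this record: set it aside for a human
            # to retry or drop, and let the rest of the data pass.
            quarantine.append(record)
        except SinkUnavailable:
            # Sink is down: keep this record and everything after it pending
            # and stop until the connection comes back.
            pending.extend(records[i:])
            break
    return stored
```

So a dead database only ever grows the pending side of the cache, while an invalid schema (with the connection up) is what feeds the quarantine.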

Hope this helps,

I read this 12 years after your post and found it helpful. Thanks, Carl!