Bad_CommunicationError: A low level communication error occurred

8.1.21

Hi folks,
Having a comms issue with a particular RS485 device.
I am fairly certain this is not an end device issue: if I use 3rd party software, hitting the device at a 1 second rate, I get 0 comms drop-offs.

However today, when investigating the issue, I get the above error on the tag, and the logs are showing some checksum errors at times too.

Architecture is an Onlogic IPC with Ignition Edge installed.
We have 5 device connections utilised.
We have a Modbus 485 device connection to COM port 1 of the IPC.

Slave address 1 with no other slaves on daisy chain.

I was seeing a lot of comms issues last week until I changed the tag group data mode to polled; after that I got 0 comms drop-offs in over 1 week.

The IPC was shut down and has been powered back up today, and we now have the problem occurring again.

Below is the mix of errors I am seeing.

Below is my Device Config:

When I get comms drop-offs now (tag going to Bad quality), the timestamp is quite strange, showing a date of 1601-01-01.


Any tips greatly appreciated.

The Bad_CommunicationError quality/status is because of the checksum failures. The timestamp you're seeing is the null/0 value for an OPC UA timestamp.
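For reference, an OPC UA DateTime is a 64-bit count of 100-nanosecond intervals since 1601-01-01 00:00:00 UTC, so a null/0 value renders as exactly that date. A minimal Python sketch of the conversion (the function name is made up for illustration, it is not part of Ignition):

```python
from datetime import datetime, timedelta, timezone

# OPC UA DateTime epoch: 1601-01-01 00:00:00 UTC, counted in 100 ns ticks.
OPCUA_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def opcua_ticks_to_datetime(ticks: int) -> datetime:
    """Convert an OPC UA DateTime tick count to a timezone-aware datetime.

    Hypothetical helper for illustration only.
    """
    # 10 ticks per microsecond; integer division drops sub-microsecond detail.
    return OPCUA_EPOCH + timedelta(microseconds=ticks // 10)


# A null/0 timestamp lands on the epoch itself.
print(opcua_ticks_to_datetime(0).date())  # 1601-01-01
```

So the odd date does not carry any information about when the failure happened; it only means no timestamp was set.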

Other than checking that your serial settings are correct I'm not sure what else you can do besides find some kind of serial port sniffer or tap and look at the serial traffic as a sanity check.

edit: you might try turning off the "reconnect after consecutive timeouts" setting as well. If nothing else this will clean up the logs a little, but I'm also seeing a bug where a checksum mismatch response doesn't reset the counter that counts timed out requests, so eventually what looks like just one request timing out could cause a reconnect, which causes all outstanding requests to fail.

Thanks for coming back Kevin.
I didn't think a sniffer would be usable in a 485 scenario, given the issues with multiple masters?

I already have "reconnect after consecutive timeouts" disabled, which left me surprised to be getting the disconnects aligned with the timeouts.

I'm not really familiar with the difference between 232 and 485, but is there something about 485 that makes multiple masters possible?

I don't believe so.
Any time I would have needed multiple masters before, I would have always put in a TCP gateway and hit it from whatever server needed to poll.

Well I'm not sure what you're asking about then. If you have multiple masters on this line then it's not surprising you get inconsistent results. If you don't, then I don't know what you were referencing.

I think there is some confusion on the original ask.
Let me see if I can explain.

I do not have multiple masters.
You suggested using a sniffer, which I thought would not work with RS485 RTU only allowing 1 master.

You also suggested turning off Reconnect after 3 Consecutive timeouts.
I replied saying it was already turned off, and that raised another question: why is the device connection showing disconnected in the logs?

I highlighted that I don't think this is field related, because if I use 3rd party software instead of Ignition (not in parallel, because that is the same as 2 masters and does not work), I am not getting any errors.

Still a mystery. Without the actual logs I can't confirm if you're just seeing logs from other Modbus devices or there's just some bug causing that.

I'm talking about using something like this, which just analyzes traffic on an already open port. Unfortunately this particular software does not support RS485 in the free version.

What software? Running on the same server or a different machine? Are you certain it checks the CRC? Is it requesting the same registers as Ignition?

So that log is filtered for the Device name so they are all coming from that particular device connection.

The 3rd party software I am using is below.

You can see in its features on the right-hand side that it does offer Reverse CRC.
Is this the same as what you are asking when you say "checks the CRC"?

Yes I am requesting the same registers.
I am only polling 6 registers.
Ranging from HR6 - HR34

I have Span gaps enabled. I have previously tried dropping the max number of holding registers per request, but it does not change the outcome.

HR34 is the one register I want to write to. However, because of my lack of experience with how this behaves, I have disabled this tag to remove it from the question for now.

Have you tried to access the requested registers with tools like QModMaster in order to ensure that Modbus RTU connectivity is working properly? Of course you need to temporarily disable Ignition, as it's acting as the RTU master on the 485 serial chain.

Set the log level for the logger you find searching "ReadHoldingRegistersRequest" to TRACE and let's see if the request/response bytes are making sense.

Not that this is necessarily an issue, but you're actually polling more registers than you think in this configuration. It's going to ask for 28 registers starting at offset 6 (5, really, at the protocol layer, unless you have zero-based addressing on). This is because span gaps is on. If you turned it off then you'd have more requests, with fewer registers in each request.
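To make the span gaps trade-off concrete, here is a simplified Python sketch of how tag addresses could be grouped into read requests. It is not the driver's actual algorithm, `plan_requests` is a made-up name, and the six tag addresses are hypothetical (the thread never lists them); it assumes one-based addressing and the Modbus limit of 125 registers per read.

```python
def plan_requests(addresses, span_gaps, max_per_request=125):
    """Group 1-based holding register addresses into (offset, quantity) reads.

    Hypothetical helper: a simplified model, not the Ignition driver's logic.
    The returned offset is the 0-based value that goes on the wire.
    """
    requests = []
    start = prev = None
    for addr in sorted(set(addresses)):
        if start is None:
            start = prev = addr
            continue
        contiguous = addr == prev + 1
        fits = addr - start + 1 <= max_per_request
        if (span_gaps or contiguous) and fits:
            # Extend the current request, reading any unused registers between.
            prev = addr
        else:
            requests.append((start - 1, prev - start + 1))
            start = prev = addr
    if start is not None:
        requests.append((start - 1, prev - start + 1))
    return requests


tags = [6, 10, 14, 20, 26, 29]  # hypothetical tag addresses (HR6 ... HR29)
print(plan_requests(tags, span_gaps=True))   # [(5, 24)]
print(plan_requests(tags, span_gaps=False))
# [(5, 1), (9, 1), (13, 1), (19, 1), (25, 1), (28, 1)]
```

With span gaps on, six tags become one request that also reads the registers in between; with it off, the same tags become six small requests.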

I have used different software, just not your exact suggestion.
Details of trying this are in the replies above.

Ok, I have turned off Span Gaps and it actually makes the problem worse.
Now I seem to be getting values back on HR6 much more consistently than the rest. However, it does still drop off in data quality, so it doesn't seem to have fixed anything.

Let me try the logger as you have defined.

Just after I grabbed the above screenshots,
I turned allow span gaps back on.
We can see a number of good requests and then the checksum issue starts again.

Can you export and upload these logs somewhere?

From the logs it just looks like there's something wrong with your device. Looking at where (one of) the first CRC mismatch occurs, you see this request:

01 03 00 05 00 18 55 C1 

SlaveId=0x01, FC=0x03, Offset=0x0005, Quantity=0x0018

It's met with this response from the device:

01 03 03 30 01 F1 00 39

SlaveId=0x01, FC=0x03, Byte Count = 0x03, Data = 0x30, 0x01, 0xF1, CRC == 0x0039

Starting at the byte count this is a bogus response. A register response should never indicate an odd number of bytes, and we requested 24 registers.
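The same sanity checks can be reproduced offline on the two frames quoted above. This is a minimal Python sketch, not the driver's code; the function names are made up for illustration. It uses the standard Modbus RTU CRC-16 (initial value 0xFFFF, reflected polynomial 0xA001, sent low byte first).

```python
def crc16_modbus(data: bytes) -> int:
    """Standard Modbus RTU CRC-16 (init 0xFFFF, reflected poly 0xA001)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def crc_ok(frame: bytes) -> bool:
    """True if the last two bytes (low byte first) match the computed CRC."""
    return crc16_modbus(frame[:-2]) == int.from_bytes(frame[-2:], "little")


def check_fc03_response(request: bytes, response: bytes) -> list:
    """Return a list of problems found in a Read Holding Registers response.

    Hypothetical helper for illustration only.
    """
    problems = []
    quantity = int.from_bytes(request[4:6], "big")  # registers requested
    byte_count = response[2]
    if byte_count % 2:
        problems.append("odd byte count %d" % byte_count)
    if byte_count != 2 * quantity:
        problems.append("expected %d data bytes, got %d" % (2 * quantity, byte_count))
    if not crc_ok(response):
        problems.append("CRC mismatch")
    return problems


request = bytes.fromhex("01 03 00 05 00 18 55 C1")
response = bytes.fromhex("01 03 03 30 01 F1 00 39")

print(crc_ok(request))  # True
for problem in check_fc03_response(request, response):
    print(problem)
```

The request checks out, while the response fails on every count: a request for 24 registers should come back with 48 data bytes.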

I don't know if something is wrong with the device itself or at the serial layer, but the driver appears to be doing the best it can with the bad data being returned.

Thanks for the help here Kevin, let me see what I can trace with the information above.

Hi again Kevin,
As a temporary fix, have I any way of ignoring this type of response?

Does this device use Linux as its OS? If so, you might find the advice in my own Modbus Module's user manual helpful:

https://www.automation-pros.com/modbus/UserManual.pdf

It's Windows, Phil.