Modbus Connection

peadair · July 13, 2012, 1:34pm

I recently lost the connection to a Modbus driver on a project and did not realise it for a week. When I noticed, I restarted the OPC-UA module and the connection came back. What could have caused the connection to stay down until I restarted the OPC-UA module?

Kevin.Herron · July 13, 2012, 2:08pm

Can’t tell much without the logs. What version are you running? Can you export the logs.bin.gz file and post it here? There may still be info in it.

Restarting the UA module forces the Modbus driver module to restart, which forces a reconnect. The real question is why didn’t the connection to your Modbus device come back up.

peadair · July 13, 2012, 2:55pm

I am currently running 7.3.3 (b570). Please see the attached logs.
Wexfordlogs.bin.gz (411 KB)

Kevin.Herron · July 13, 2012, 3:27pm

I don’t see anything other than the connection to the device going up and down periodically.

Do you remember what day/time the problem started? Or when you restarted the UA module?

peadair · July 13, 2012, 3:31pm

We lost the connection on the 29th of June at 14:30 and restarted on the 6th of July at 11:30.

Kevin.Herron · July 13, 2012, 3:42pm

Which device lost the connection? There’s no indication that anything went down for the duration of that timeframe. Were the tags in the designer/client of bad quality while this happened?

peadair · July 13, 2012, 3:52pm

The device name is EMR. I cannot remember exactly what state the tags were in, but they did have a red overlay on them.

Kevin.Herron · July 13, 2012, 3:57pm

OK. Your device wasn’t down that entire time. But it does periodically get disconnected and have to reconnect.

For whatever reason, the device you’re talking to occasionally starts returning Exception Code 0x02 (IllegalDataAddress) in response to read requests. After returning these responses it closes the TCP connection and the device has to reconnect. Sometimes this cycle just continues until eventually whatever you’re talking to decides that the addresses are valid again.

It is a coincidence that the addresses became valid again at the same time you restarted the UA module.

You need to start troubleshooting whatever gateway or device you’re talking to and find out why it periodically returns the IllegalDataAddress response.
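For anyone capturing the traffic while troubleshooting, here is a minimal sketch of how an exception reply like the one described above can be recognised in a raw Modbus TCP frame. It relies only on the standard framing: a 7-byte MBAP header, then a function code with the high bit set, then the exception code. The function name `parse_exception` and the sample frame are illustrative assumptions, not output from these logs or code from the driver.

```python
import struct

# Subset of the standard Modbus exception codes.
EXCEPTION_NAMES = {
    0x01: "IllegalFunction",
    0x02: "IllegalDataAddress",
    0x03: "IllegalDataValue",
    0x04: "SlaveDeviceFailure",
}

def parse_exception(frame: bytes):
    """Return (function_code, exception_name) if the frame is a Modbus TCP
    exception response, otherwise None. Hypothetical helper for illustration."""
    if len(frame) < 9:
        return None
    # MBAP header: transaction id, protocol id, length, unit id (7 bytes).
    _tid, proto, _length, _unit = struct.unpack(">HHHB", frame[:7])
    function, code = frame[7], frame[8]
    # Protocol id is 0 for Modbus; an exception reply sets bit 7 of the function code.
    if proto != 0 or not function & 0x80:
        return None
    return function & 0x7F, EXCEPTION_NAMES.get(code, f"Unknown(0x{code:02X})")

# Made-up frame: exception reply to a Read Holding Registers (0x03) request.
sample = bytes.fromhex("000100000003" "01" "83" "02")
print(parse_exception(sample))
```

Seeing replies like this in a capture, followed by the remote end closing the socket, would match the pattern described above and point at the gateway or device rather than at the driver.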

peadair · July 13, 2012, 4:01pm

Thanks, I will look into that. But if my device was not down for that entire time, how come I have no data whatsoever for that time frame?

Kevin.Herron · July 13, 2012, 4:15pm

Because the logs are full of store and forward errors that I’m passing to another developer to look at.

Colby.Clegg · July 13, 2012, 4:50pm

Hi,

There are many errors in there for “duplicate primary key” problems. What this means is that as values were being inserted, the database claimed that values for the same tag id and timestamp were already there. This would cause the data to become quarantined.

Changes were made in 7.3.6 to prevent this type of error from happening. If the duplicates were in the same transaction, the transaction would be rolled back, and you would likely end up with no data (or only sporadic data) for that time.
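To illustrate why a single duplicate can take a whole batch with it, here is a small sketch using an in-memory SQLite database. The table name `history`, its columns, and the sample values are assumptions made up for the example; they are not the actual store and forward or history schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical history table keyed on (tagid, t_stamp).
conn.execute(
    "CREATE TABLE history (tagid INTEGER, t_stamp INTEGER, value REAL, "
    "PRIMARY KEY (tagid, t_stamp))"
)

# The last row repeats the key of the first row (same tag id and timestamp).
batch = [(1, 1000, 20.5), (2, 1000, 7.1), (1, 1000, 20.6)]

try:
    with conn:  # one transaction: commit on success, roll back on any error
        conn.executemany("INSERT INTO history VALUES (?, ?, ?)", batch)
    stored = True
except sqlite3.IntegrityError:
    # Duplicate primary key: the whole batch is rolled back, not just one row.
    stored = False

row_count = conn.execute("SELECT count(*) FROM history").fetchone()[0]
print(stored, row_count)
```

In this sketch the two valid rows are lost along with the duplicate, which is the same effect as the gaps described above.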

The data (or most of it) should be present as quarantined data in the store and forward system. I would recommend the following:

1. Stop the server.
2. Locate the store and forward database under “{InstallDir}\data\datacache”. It will be the folder with the name of your database connection.
3. Zip that folder up, and then rename or delete the original. Upload the zip to ticket number 7665. I should be able to take a look, and hopefully extract the data into a SQL file that you can load up.
4. Upgrade to the latest 7.3 (7.3.8).

If you upload the cache I can give you a better idea of what data is there. I think that it should have just about everything, though there are a few strange error messages in the log that might indicate some corruption.

Regards,

Colby.Clegg · July 13, 2012, 9:02pm

Also, even though this shouldn’t happen or cause a problem, it may be possible that multiple sqltags exist in the internal database with the same path/name. Try this:

Go to Console>Advanced in the gateway config area, and bypass the warning.
Run:

SELECT path,name,count(*) FROM sqltag GROUP BY path, name, providerid, tagtype HAVING count(*)>1

Does anything show up? If so, I’d recommend deleting all but the most recent. You can run the following:
WARNING: These are custom built for this specific user’s case. If you think you have stumbled on this thread with what you think is a similar issue, please contact us separately.

[code]
delete from sqltagprop where tagid in (select min(sqltag_id) from sqltag group by path, name, providerid, tagtype having count(*)>1)

delete from sqltag where sqltag_id in (select min(sqltag_id) from sqltag group by path, name, providerid, tagtype having count(*)>1)
[/code]

You’ll then need to restart the gateway for the changes to take effect.
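As a rough illustration of what the check and the two deletes do, here is a sketch that runs the same statements against a throwaway in-memory SQLite database. The `sqltag` column names come from the queries above; the `sqltagprop` columns other than `tagid`, and all of the sample rows, are assumptions invented for the example. It is not meant to be run against a real gateway database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sqltag (sqltag_id INTEGER PRIMARY KEY, path TEXT, name TEXT,
                     providerid INTEGER, tagtype INTEGER);
-- Only tagid is taken from the thread; name/val are hypothetical columns.
CREATE TABLE sqltagprop (tagid INTEGER, name TEXT, val TEXT);
INSERT INTO sqltag VALUES (1, 'EMR', 'Flow', 1, 0),
                          (2, 'EMR', 'Flow', 1, 0),
                          (3, 'EMR', 'Temp', 1, 0);
INSERT INTO sqltagprop VALUES (1, 'p', 'old'), (2, 'p', 'new'), (3, 'p', 'x');
""")

GROUPING = "GROUP BY path, name, providerid, tagtype HAVING count(*) > 1"
FIND = f"SELECT path, name, count(*) FROM sqltag {GROUPING}"
OLDEST = f"SELECT min(sqltag_id) FROM sqltag {GROUPING}"

duplicates_before = conn.execute(FIND).fetchall()

with conn:
    # Properties first, while the duplicate rows still exist in sqltag,
    # because the subquery needs them to identify the oldest id per group.
    conn.execute(f"DELETE FROM sqltagprop WHERE tagid IN ({OLDEST})")
    conn.execute(f"DELETE FROM sqltag WHERE sqltag_id IN ({OLDEST})")

duplicates_after = conn.execute(FIND).fetchall()
remaining_ids = [r[0] for r in conn.execute("SELECT sqltag_id FROM sqltag ORDER BY sqltag_id")]
print(duplicates_before, duplicates_after, remaining_ids)
```

Because the subquery selects `min(sqltag_id)`, each pass removes only the oldest row in each duplicated group. If a tag existed three or more times, the check would still return rows after one pass and the deletes would need to be repeated until it comes back empty.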

Regards,