The database connection FAULTED

Hi all,
I am getting an error in scripting task where i am executing multiple queries.

I am getting at line 15 and line 17 at some instance that the query is not able to execute due to database connection faulted.

I have lots of other tags which runs every 1, 5 and 10 sec but those queries are running properly. but the query in scripting gives me a database connection faulted error. in script there is also more queries but this two queries is at the start of the script.

any advice what should i looked into and how can i debug further what is causing this problem.

any help is appreciated to resolve this issue.

Below is the detail message of those error.

Error executing global timer script: QBit_2019/QBit_IgnToMES @5,000ms . Repeat errors of this type will be logged as 'debug' messages.
 	
Traceback (most recent call last):
File "<TimerScript:QBit_2019/QBit_IgnToMES @5,000ms >", line 17, in <module>

java.lang.Exception: java.lang.Exception: Error executing system.db.runQuery(SELECT def.id FROM [IgnitionServer].[dbo].[defectflag], def.id ASC, IgnitionLocalhost, )

Caused by: java.lang.Exception: Error executing system.db.runQuery(SELECT def.id FROM [IgnitionServer].[dbo].[defectflag] , def.id ASC, IgnitionLocalhost, )
Caused by: com.inductiveautomation.ignition.gateway.datasource.FaultedDatabaseConnectionException: The database connection 'IgnitionLocalhost' is FAULTED. See Gateway Status for details.
more
ERROR	2020-11-03 5:18:50 PM	IgnitionLocalhost	Database connection FAULTED.
 	
com.microsoft.sqlserver.jdbc.SQLServerException: The connection to the host helmssql04, named instance instanz1 has failed. Error: "java.net.SocketTimeoutException: Receive timed out". Verify the server and instance names, check that no firewall is blocking UDP traffic to port 1434, and for SQL Server 2005 or later verify that the SQL Server Browser Service is running on the host.

The gateway log may provide a clue.

@JordanCClark

it give me following message.

com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP connection to the host helmssql04, port 50472 has failed. Error: "connect timed out. Verify the connection properties, check that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port, and that no firewall is blocking TCP connections to the port.".

another error is

com.microsoft.sqlserver.jdbc.SQLServerException: The connection to the host helmssql04, named instance instanz1 has failed. Error: "java.net.SocketTimeoutException: Receive timed out". Verify the server and instance names, check that no firewall is blocking UDP traffic to port 1434, and for SQL Server 2005 or later verify that the SQL Server Browser Service is running on the host.

There is no firewall active.

any idea where to look or how i can narrow down to more specific problem.

What i found out is that it happens every 20 minute. I have checked SQL jobs which runs on server but there is no any job which executes during production time.

Reards,
T.Patel

If there are errors at the same time in the SQL server event logs, then you have a configuration problem in SQL Server. if there isn’t then you have a network issue.

@JordanCClark

Thanks for you feedback.

I have checked the sql log and i find some message as following.

SQL Server has encountered 2 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [E:\Program Files\Microsoft SQL Server\MSSQL11.INSTANZ1\MSSQL\DATA\IgnitionServer_log.ldf] in database [IgnitionServer] (8).  The OS file handle is 0x0000000000000ACC.  The offset of the latest long I/O is: 0x00000224e34400

i am trying to understand what it means.

Regads,
Tarun Patel

I/O requests are a pain in the backside. Use perfmon to monitor the process.

If this is happening on the same database file I'd add the SQLServer:Databases:Log Growth and SQLServer:Databases:Log Shrinks counters as well. It may be its growth rate is set too low.

If the log file is large you may have to set the log type to Simple Transaction (or something like that) and manually shrink it.

Some good information here, as well. It turns into a plug for their software, but some good stuff at the top. :wink:

One more thing…

If it’s only happening during this script, you’re queries within the script may be taking a long time. Don’t try to pull in all the rows if you don’t need them.

@JordanCClark
Thanks for your reply.

It turns out that there is issue with Disc or Network Related issue.
using performance monitor tool we can observed the wait statistics during writing the log files and its takes 28 second.

we can also see some warning messages

1. *The system failed to flush data to the transaction log. Corruption may occur in VolumeId: [\\?\Volume{849ef063-224a-11eb-80ec-005056bd5f3c}](file://%3f/Volume%7b849ef063-224a-11eb-80ec-005056bd5f3c%7d), DeviceName: \Device\HarddiskVolume243.*

*(The specified request is not a valid operation for the target device.)*

we are investigating further and hope to resolve this issue soon. I will update the post as soon as i have any update.

Regards,
Tarun