How does the TCP KeepAlive mechanism work?

Information

 
Title: How does the TCP KeepAlive mechanism work?
URL Name: P98399
Article Number: 000152388
Environment
Product: Progress OpenEdge
Version: All supported versions
OS: All supported platforms
Question/Problem Description
How does the TCP KeepAlive mechanism work?
How does OS KEEPALIVE know when a TCP connection is dead?
Does the KeepAlive mechanism disconnect idle TCP/IP connections?
Resolution
The KeepAlive mechanism does not disconnect idle TCP/IP connections:

When there is an established socket connection, and the connection is idle, no packets are transmitted. There is therefore no way to tell if the connection is still valid without sending some data and seeing if an error is returned.  

Firewalls, on the other hand, often time out inactive connections. For example, when a firewall times out an idle socket connection, a WebSpeed Agent is unaware that its connection to the database(s) has been broken until the next request occurs. That communication attempt then fails with errors 778 and 735, which are reported when the agent is informed (by the TCP/IP stack, through the KeepAlive mechanism) that the connection has been dropped.

KeepAlive detects situations where one side of the connection is no longer listening:

The KeepAlive mechanism does this by sending low-level probe messages to see if the other side responds. If the other side does not respond to a certain number of probes within a certain amount of time, the TCP/IP stack assumes the connection is dead, and the process using the socket detects this through an error indication.
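As an illustration of how a process opts a socket in to this probing, here is a minimal Python sketch. It is not OpenEdge code: the helper name enable_keepalive and its parameters are hypothetical. It assumes the standard SO_KEEPALIVE socket option, and it applies the per-socket timer options (TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT) only where the platform exposes them. When no overrides are passed, the system-wide settings apply.

```python
import socket


def enable_keepalive(sock, idle_s=None, interval_s=None, probes=None):
    """Turn on TCP keepalive for one socket; optionally override its timers.

    Hypothetical helper for illustration. Returns True if the operating
    system reports keepalive as enabled on the socket afterwards.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket overrides are platform-specific, so each one is applied only
    # if this Python build exposes the constant (Linux exposes all three).
    overrides = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before giving up
    )
    for name, value in overrides:
        if value is not None and hasattr(socket, name):
            sock.setsockopt(socket.IPPROTO_TCP, getattr(socket, name), value)

    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("keepalive enabled:", enable_keepalive(s, idle_s=300))
```

Once keepalive is enabled, a dead peer typically surfaces to the application as an error on a later send or receive call, which matches the "error indication" described above.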

The system-wide timeout parameter that controls how long a connection has to be idle before probing starts, and how often probes are sent, is TCP_KEEPALIVE (the exact parameter names vary by operating system). The default idle time before probes begin is two hours (7,200,000 ms). It can probably be lowered to 5 minutes without too many unwanted side effects, but be aware that the change affects the whole system.
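To see what lowering the idle time means in practice, the rough time needed to declare a silent connection dead can be estimated as the idle time plus the probe interval multiplied by the number of unanswered probes. The sketch below is illustrative arithmetic only. The function name is hypothetical, and the 75-second interval and 9 probes are assumed values (they match common Linux defaults) rather than figures from this article.

```python
def detection_time_seconds(idle_s, interval_s, probes):
    """Approximate time from the last traffic until the connection is declared dead.

    The first probe goes out after idle_s seconds of silence; the connection
    is given up after `probes` probes, sent interval_s apart, go unanswered.
    """
    return idle_s + interval_s * probes


if __name__ == "__main__":
    # Two-hour default idle time, with an assumed 75 s interval and 9 probes.
    print(detection_time_seconds(7200, 75, 9))  # 7875 seconds, about 2 h 11 min
    # Idle time lowered to 5 minutes, same assumed interval and probe count.
    print(detection_time_seconds(300, 75, 9))   # 975 seconds, about 16 min
```

Under these assumptions, reducing the idle time from two hours to five minutes shortens detection from roughly 131 minutes to roughly 16 minutes.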

For example:
  • The Database Broker will write an error message to the database log when it is informed (by the TCP/IP stack) that the connection has been dropped. 
  • In OpenEdge Replication, the same applies: an error message is written to the database log when the RPLS or RPLA is informed (by the TCP/IP stack) that the connection has been dropped, and it then tries to reconnect based on the configured parameters. This is also why the repl-keep-alive feature was introduced as a logical/application-level implementation, for cases where the Replication Server or Agent blocks while trying to send a message and the failure is not recognized. This is discussed further in the article "How long does RPLS take to detect network outage to the RPLA?"
Last Modified Date: 11/20/2020 7:16 AM