Setting the SO_KEEPALIVE Option (so_keepalive setting)
Established connections can sit idle for long periods. For example, a portfolio manager at a mutual fund company might open a telnet session to a stock quotation service, perform a few initial inquiries, and then leave the connection open in case he wants to go back for more. In the meantime, the connection remains idle, possibly for hours at a time.
Any server that thinks it has a connected client must dedicate some resources to it. If the server is of the forking type, then an entire Linux process with its associated memory is dedicated to that client. When things are going well, this scenario does not present any problem. The difficulty arises when a network disruption occurs and all 578 of your clients become disconnected from your stock quotation service. After network service is restored, those 578 clients will attempt to reconnect, so the server faces 578 new connections on top of the 578 stale ones it still holds. This is a real problem because your server has not yet realized that it lost the idle clients earlier. This is where the SO_KEEPALIVE option comes to the rescue.
The following example shows how to enable SO_KEEPALIVE on a socket s so that a disconnected idle connection can eventually be detected:
Example
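#include <stdio.h>      /* perror() */
#include <sys/socket.h> /* setsockopt(), SOL_SOCKET, SO_KEEPALIVE */

#define TRUE 1
#define FALSE 0

int z; /* Status code */
int s; /* Socket s */
int so_keepalive; /* Option value: nonzero enables keep-alive */
. . .
so_keepalive = TRUE;
z = setsockopt(s,
        SOL_SOCKET,
        SO_KEEPALIVE,
        &so_keepalive,
        sizeof so_keepalive);
if ( z ) {
    perror("setsockopt(2)");
}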
The preceding example enables the SO_KEEPALIVE option so that when the socket connection is idle for long periods, a probe message is sent to the remote end. By default, the first probe is usually sent after two hours of inactivity.
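To confirm that the option actually took effect, the value can be read back with getsockopt(2). The following is a minimal sketch rather than a definitive implementation; the helper name enable_keepalive() is hypothetical and does not appear in the original example.

```c
#include <stdio.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (name chosen for illustration only):
 * enable SO_KEEPALIVE on socket s, then read the option back.
 * Returns 1 if keep-alive is enabled, 0 if it is not, -1 on error.
 */
int enable_keepalive(int s)
{
    int on = 1;                     /* Nonzero turns the option on */
    int current = 0;                /* Value read back from the kernel */
    socklen_t len = sizeof current;

    if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) == -1) {
        perror("setsockopt(SO_KEEPALIVE)");
        return -1;
    }
    if (getsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &current, &len) == -1) {
        perror("getsockopt(SO_KEEPALIVE)");
        return -1;
    }
    return current != 0;
}
```

A forking server would typically call such a helper once on each socket returned by accept(2), before handing the connection to the child process.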
There are three possible outcomes of a keep-alive probe message:
- The peer responds appropriately to indicate that all is well. No indication is returned to the application, because this is the application's assumption to begin with.
- The peer responds indicating that it knows nothing about the connection. This usually means that the peer has been rebooted since the last communication with it. The error ECONNRESET is then returned to the application on the next socket operation.
- No response is received from the peer. In this case, the kernel makes several more attempts to make contact. TCP usually gives up after approximately 11 minutes if no response is elicited, and the error ETIMEDOUT is returned on the next socket operation.
Other errors, such as EHOSTUNREACH, can be returned if the network can no longer reach the host (for example, because of bad routing tables or router failures).
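Because these errors surface only on the next socket operation, a server usually notices a dead peer while reading from or writing to the socket. The sketch below shows one way to recognize the three errors named above after a failed recv(2). The function names keepalive_failure_reason() and read_client() are hypothetical, and the messages are illustrative only.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: describe an errno value that can result from a
 * failed keep-alive probe. Returns NULL for any other error.
 */
const char *keepalive_failure_reason(int err)
{
    switch (err) {
    case ECONNRESET:
        return "peer no longer knows the connection (possibly rebooted)";
    case ETIMEDOUT:
        return "peer did not answer the keep-alive probes";
    case EHOSTUNREACH:
        return "network can no longer reach the peer";
    default:
        return NULL;
    }
}

/*
 * Hypothetical wrapper around recv(2). Returns the number of bytes read,
 * 0 on an orderly shutdown by the peer, or -1 on error. On -1 the caller
 * is expected to close the socket and release the client's resources.
 */
ssize_t read_client(int s, void *buf, size_t len)
{
    ssize_t n = recv(s, buf, len, 0);

    if (n == -1) {
        int err = errno;            /* Save errno before any other call */
        const char *why = keepalive_failure_reason(err);

        if (why != NULL)
            fprintf(stderr, "dropping client: %s\n", why);
        else
            fprintf(stderr, "recv failed with errno %d\n", err);
    }
    return n;
}
```

In a forking server, the child process would simply exit when read_client() returns -1, which releases the process and its memory.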
The time frames involved for SO_KEEPALIVE limit its general usefulness. The probe message is sent only after approximately two hours of inactivity. Then, when no response is elicited, it might take another 11 minutes before the connection returns an error. Nevertheless, this facility does eventually allow idle disconnected sockets to be detected, and then closed by the server. Consequently, servers that support potentially long idle connections should enable this feature.
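The time frames quoted above match the usual Linux defaults: 7200 seconds of idle time, then up to 9 probes sent 75 seconds apart (9 x 75 = 675 seconds, or about 11 minutes). The text above does not cover tuning, but on Linux these values can be overridden per socket with the TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT options from <netinet/tcp.h>. The following is a Linux-specific sketch under those assumptions; the function names are hypothetical, and other systems may use different option names or not support per-socket tuning at all.

```c
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT (Linux) */
#include <stdio.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: worst-case number of seconds before a dead peer
 * is reported, given the idle time, probe interval, and probe count.
 */
long keepalive_detect_seconds(int idle, int interval, int count)
{
    return (long)idle + (long)interval * (long)count;
}

/*
 * Hypothetical helper (Linux only): enable keep-alive on TCP socket s
 * and override the timing for this socket.
 *   idle     - seconds of inactivity before the first probe
 *   interval - seconds between unanswered probes
 *   count    - unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on error.
 */
int tune_keepalive(int s, int idle, int interval, int count)
{
    int on = 1;

    if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) == -1 ||
        setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof idle) == -1 ||
        setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof interval) == -1 ||
        setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof count) == -1) {
        perror("setsockopt(keep-alive tuning)");
        return -1;
    }
    return 0;
}
```

For example, tune_keepalive(s, 60, 10, 5) would report a dead peer after roughly 60 + 10 x 5 = 110 seconds instead of more than two hours, at the cost of more probe traffic on idle connections.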
See Also
SO_PASSCRED & SO_PEERCRED, SO_BROADCAST, SO_LINGER, SO_REUSEADDR, SO_OOBINLINE, SO_TYPE