Understanding Sockets
It is important that you understand some of the concepts behind the socket
interface before you try to apply them. This section outlines some of the
high-level concepts surrounding the sockets themselves.
Defining a Socket
To communicate
with someone using a telephone, you must pick up the handset, dial the other
party's telephone number, and wait for them to answer. While you speak to that
other party, there are two endpoints of communication established:
• Your telephone, at your location
• The remote party's telephone, at their location
As long as both
of you communicate, there are two endpoints involved, with a line of
communication in between them. Figure 1.2 shows an illustration of two
telephones as endpoints, each connected to the other, through the telephone
network.
Figure 1.2: Without the telephone network, each endpoint of a telephone line is
nothing more than a plastic box.
A socket under Linux is quite similar to a telephone. Sockets represent
endpoints in a line of communication. In between the endpoints exists the data
communications network.
Sockets are like telephones in another way. To telephone someone, you dial the
telephone number of the party you want to contact. Sockets have network
addresses instead of telephone numbers. By indicating the address of the remote
socket, your program can establish a line of communication between your local
socket and that remote endpoint. Socket addresses are discussed in my post,
"Domains and Address Families." You can conclude, then, that a socket is merely
an endpoint in communication.
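To make the telephone analogy concrete, here is a minimal sketch in C of one socket "dialing" another by address. The helper name connect_over_loopback() is hypothetical, and the use of the loopback address 127.0.0.1 with a kernel-chosen port is an assumption made only to keep the example on one machine. Address structures themselves are covered in the post mentioned above.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: one socket "dials" another over the loopback
 * interface. Returns 0 if the line of communication was established,
 * -1 otherwise. */
int connect_over_loopback(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int listener, caller, answered;
    int result = -1;

    /* The endpoint that waits for the call. */
    listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener == -1)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* 127.0.0.1 (assumption) */
    addr.sin_port = htons(0);                      /* let the kernel pick a port */

    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(listener, 1) == -1 ||
        /* Ask which "telephone number" (address and port) was assigned. */
        getsockname(listener, (struct sockaddr *)&addr, &len) == -1) {
        close(listener);
        return -1;
    }

    /* The endpoint that places the call, using the remote address. */
    caller = socket(AF_INET, SOCK_STREAM, 0);
    if (caller != -1) {
        if (connect(caller, (struct sockaddr *)&addr, sizeof addr) == 0) {
            answered = accept(listener, NULL, NULL); /* the call is answered */
            if (answered != -1) {
                result = 0;
                close(answered);
            }
        }
        close(caller);
    }
    close(listener);
    return result;
}
```

A program could call connect_over_loopback() from main() and treat a return value of 0 as "the call went through." Both endpoints live in the same process here purely for brevity; in practice they would usually belong to different programs.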
Using Sockets
You might think
that Linux sockets are treated specially, because you've already learned that
sockets have a collection of specific functions that operate on them. Although
it is true that sockets have some special qualities, they are very similar to
file descriptors that you should already be familiar with.
NOTE
Any reference to
a function name like pipe(2) means that you should have online documentation
(man pages) on your Linux system for that function. For information about
pipe(2) for example, you can enter the command:
$ man 2 pipe
where the 2
represents the manual section number, and the function name can be used as the
name of the manual page. Although the section number is often optional, there
are many cases where you must specify it in order to obtain the correct
information.
For example, when you open a file using the Linux open(2) call, a file
descriptor is returned to you if the call is successful. After you have this
file descriptor, your program uses it to read(2), write(2), lseek(2), and
close(2) the specific file that was opened. Similarly, a socket, when it is
created, is referenced just like a file descriptor. You can use the same file
I/O functions to read, write, and close that socket. You learn in Chapter 15,
"Using the inetd Daemon," that sockets can be used for standard input (file
unit 0), standard output (file unit 1), or standard error (file unit 2).
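As a rough illustration of using the ordinary file I/O calls on sockets, the following sketch sends a short message through one socket and reads it back from another with write(2), read(2), and close(2). The helper name echo_through_socketpair() is hypothetical, and socketpair(2) with AF_UNIX is used here only as a convenient assumption for obtaining two already-connected sockets in one process.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: writes msg into one socket of a connected pair and
 * reads it back from the other, using only the usual file I/O calls.
 * Returns the number of bytes read, or -1 on error. */
ssize_t echo_through_socketpair(const char *msg, char *buf, size_t bufsize)
{
    int sv[2];          /* two connected socket descriptors */
    ssize_t n = -1;
    size_t len = strlen(msg);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* write(2) and read(2) are the same calls you would use on a file. */
    if (write(sv[0], msg, len) == (ssize_t)len)
        n = read(sv[1], buf, bufsize);

    /* close(2) releases a socket just as it releases a file. */
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Calling echo_through_socketpair("hello", buf, sizeof buf) with a sufficiently large buf should leave the same five bytes in buf. Note that read(2) does not add a terminating null byte.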
NOTE
Sockets are referenced by file unit numbers in the same way that opened files
are. These unit numbers share the same "number space". For example, you cannot
have both a socket with unit number 4 and an open file on unit number 4 at the
same time.
There are some differences, however, between sockets and opened files. The
following list highlights some of these differences:
- You cannot lseek(2) on a socket (this restriction also applies to pipes).
- Sockets can have addresses associated with them. Files and pipes do not have network addresses.
- Sockets have different option capabilities that can be queried and set using getsockopt(2) and setsockopt(2), and in some cases ioctl(2).
- Sockets must be in the correct state to perform input or output. Conversely, opened disk files can be read from or written to at any time.
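Two of the differences listed above can be observed directly. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, and getsockopt(2) with the SO_TYPE option is my choice of a simple option to query, not something taken from the text. The first function checks that lseek(2) is refused on a socket, and the second asks a socket for its type.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if lseek(2) on a new socket fails with
 * ESPIPE, 0 if it does not, and -1 if the socket could not be created. */
int lseek_fails_on_socket(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int refused;

    if (fd == -1)
        return -1;

    errno = 0;
    /* Sockets have no file position, so seeking is an error. */
    refused = (lseek(fd, 0, SEEK_SET) == (off_t)-1 && errno == ESPIPE);

    close(fd);
    return refused;
}

/* Hypothetical helper: returns the socket type reported for fd
 * (for example SOCK_STREAM), or -1 if the query fails. */
int query_socket_type(int fd)
{
    int type = 0;
    socklen_t len = sizeof type;

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) == -1)
        return -1;
    return type;
}
```

Passing a descriptor that refers to a regular file to query_socket_type() should return -1, because the option query only makes sense for sockets.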
Referencing Sockets
When you open a new file using the open(2) function call, the lowest available
file descriptor is returned by the Linux kernel. This file descriptor, or file
unit number as it is often called, is a non-negative integer value that is used
to refer to the file that was opened. This "handle" is used in all other
functions that operate upon opened files. Now you know that file unit numbers
can also refer to specific sockets.
NOTE
When a new file unit (or file descriptor) is needed by the kernel, the lowest
available unit number is returned. For example, if you were to close standard
input (file unit number 0), and then open a file successfully, the file unit
number returned by the open(2) call will be zero.
Assume for a moment that your program already has file units 0, 1, and 2 open
(standard input, output, and error) and that the following sequence of program
operations is carried out. Notice how the file descriptors are allocated by the
kernel:
- The open(2) function is called to open a file.
- File unit 3 is returned to reference the opened file. Because this unit is not currently in use, and is the lowest file unit presently available, the value 3 is chosen to be the file unit number for the file.
- A new socket is created using an appropriate function call.
- File unit 4 is returned to reference that new socket.
- Yet another file is opened by calling open(2).
- File unit 5 is returned to reference the newly opened file.
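The sequence above can be sketched in C as follows. The helper names are hypothetical, and /dev/null is used only as an assumed, always-present file to open. The first function repeats the file, socket, file sequence and records the descriptors. The last function illustrates the "lowest available" rule by closing a file and checking that a new socket receives the same unit number.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a file, create a socket, open another file,
 * and record the three descriptors in the order they were allocated.
 * Returns 0 on success, -1 if any of the calls failed. */
int allocate_three(int fds[3])
{
    fds[0] = open("/dev/null", O_RDONLY);       /* a file   */
    fds[1] = socket(AF_INET, SOCK_STREAM, 0);   /* a socket */
    fds[2] = open("/dev/null", O_RDONLY);       /* a file   */
    return (fds[0] == -1 || fds[1] == -1 || fds[2] == -1) ? -1 : 0;
}

/* Closes whichever of the three descriptors were successfully allocated. */
void close_three(const int fds[3])
{
    for (int i = 0; i < 3; i++)
        if (fds[i] != -1)
            close(fds[i]);
}

/* Hypothetical helper: returns 1 if a socket created right after closing
 * a file receives the file's old unit number, 0 if not, -1 on error. */
int lowest_unit_is_reused(void)
{
    int file_fd, sock_fd, reused;

    file_fd = open("/dev/null", O_RDONLY);
    if (file_fd == -1)
        return -1;
    close(file_fd);                 /* the unit number is free again */

    sock_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (sock_fd == -1)
        return -1;
    reused = (sock_fd == file_fd);  /* same number space for both */

    close(sock_fd);
    return reused;
}
```

In a program that starts with only units 0, 1, and 2 open, printing the three values filled in by allocate_three() would be expected to show 3, 4, and 5, matching the list above. If other descriptors are already open, the values differ but still increase in allocation order.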
Notice how the
Linux kernel makes no distinction between files and sockets when allocating
unit numbers. A file descriptor is used to refer to an opened file or a network
socket. This means that you, as a programmer, will use sockets as if they were
open files. Being able to reference files and sockets interchangeably by file unit
number provides you with a great deal of flexibility. This also means that
functions like read(2) and write(2) can operate upon both open files and
sockets.
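One practical consequence is that a single routine can serve both kinds of descriptor. The sketch below is a minimal example under stated assumptions: put_line() and put_line_everywhere() are hypothetical names, /dev/null stands in for an ordinary file, and a socketpair(2) stands in for a network socket. For brevity it does not retry when write(2) is interrupted by a signal.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: writes all of text to fd, whatever fd refers to.
 * Returns 0 on success, -1 on error. */
int put_line(int fd, const char *text)
{
    size_t left = strlen(text);

    while (left > 0) {
        ssize_t n = write(fd, text, left);  /* may write fewer bytes */
        if (n <= 0)
            return -1;
        text += n;
        left -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: uses the same put_line() on a file and on a socket.
 * Returns 0 if both writes succeeded, -1 otherwise. */
int put_line_everywhere(void)
{
    int sv[2];
    int ok = -1;
    int file_fd = open("/dev/null", O_WRONLY);

    if (file_fd == -1)
        return -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == 0) {
        if (put_line(file_fd, "hello\n") == 0 &&  /* descriptor is a file   */
            put_line(sv[0], "hello\n") == 0)      /* descriptor is a socket */
            ok = 0;
        close(sv[0]);
        close(sv[1]);
    }
    close(file_fd);
    return ok;
}
```

The point of the design is that put_line() never asks what kind of object the descriptor refers to; the kernel routes the write(2) call to the file or the socket as appropriate.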