
Today I look at the last remaining aspect of TCP/IP this course covers: the socket
interface for programming. This information is intended to convey the process needed to
integrate an application with TCP/IP and as such involves some basic programming
functions. It is not necessary to understand programming to understand this information.
The functions involved in the socket programming interface help you understand the steps
TCP/IP goes through when creating connections and sending data.
Understanding the socket interface is helpful even if you never intend to write a line
of TCP/IP code, because all the applications you will work with use these principles and
procedures. Debugging or troubleshooting a problem is much easier when you understand what
is going on behind the user interface. Today I don't attempt to show the complete socket
interface. Instead I deal only with the primary functions necessary to create and maintain
a connection. This chapter is not intended to be a programming guide, either.
Because the original socket interface was developed for UNIX systems, today's text has
a decidedly UNIX-based orientation. However, the same principles apply to most other
operating systems that support TCP/IP.
TCP/IP benefits from a well-defined application programming interface
(API), which dictates how an application uses TCP/IP. This solves a basic problem that has
plagued many other communications protocols, which offer several approaches to the same
problem, each incompatible with the others. The TCP/IP API is portable (it is available on
practically every operating system and hardware platform that supports TCP/IP),
language-independent (it doesn't matter which language you use to write the application),
and relatively uncomplicated.
The Socket API was developed at the University of California at Berkeley as part of
their BSD 4.1c UNIX version. Since then the API has been modified and enhanced but still
retains its BSD flavor. Not to be outdone, AT&T (BSD's rival in the UNIX market)
introduced the Transport Layer Interface (TLI) for TCP and several other protocols. One of
the strengths of the Socket API and TLI is that they were not developed exclusively for
TCP/IP but are intended for use with several communications protocols. The Socket
interface remains the most widespread API in current use, although several newer
interfaces are being developed.
The basic structure of all socket programming commands lies with the unique structure
of UNIX I/O. With UNIX, both input and output are treated as simple pipelines, where the
input can be from anything and the output can go anywhere. The UNIX I/O system is
sometimes referred to as the open-read-write-close system, because those are the
steps that are performed for each I/O operation, whether it involves a file, a device, or
a communications port.
Whenever a file is involved, the UNIX operating system gives the file a file
descriptor, a small number that uniquely identifies the file. A program can use this
file descriptor to identify the file at any time. (The same holds true for a device; the
process is the same.) A file operation uses an open function to return the file
descriptor, which is used for the read (transfer data to the user's process) or write
(transfer data from the user process to the file) functions, followed by a close function
to terminate the file operation. The open function takes a filename as an argument. The
read and write functions use the file descriptor number, the address of the buffer in
which to read or write the information, and the number of bytes involved. The close
function uses the file descriptor. The system is easy to use and simple to work with.
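The open-read-write-close cycle can be sketched in a few lines. This is only an illustration, assuming Python's os module, whose open, write, read, and close functions are thin wrappers over the UNIX calls of the same names. The function name and the scratch file are hypothetical.

```python
import os
import tempfile


def copy_through_descriptor(message: bytes) -> bytes:
    """Write message to a scratch file and read it back using raw descriptors."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.txt")  # hypothetical scratch file

    # open takes a filename and returns a small integer: the file descriptor.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    os.write(fd, message)             # write: descriptor plus buffer
    os.close(fd)                      # close: descriptor only

    fd = os.open(path, os.O_RDONLY)
    data = os.read(fd, len(message))  # read: descriptor plus maximum byte count
    os.close(fd)

    os.remove(path)
    os.rmdir(directory)
    return data
```

Every later step refers to the file only through the integer returned by open, which is the same pattern the socket functions follow.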
TCP/IP uses the same idea, relying on numbers to uniquely identify an end point for
communications (a socket). Whenever the socket number is used, the operating system can
resolve it to the underlying connection. An essential difference between a
file descriptor and a socket number is that a socket requires some functions to be
performed (such as initialization) before it is ready for use. In
techno-speak, a file descriptor is bound to a specific file or device when the open
function is called, but a socket can be created without binding it to a specific
destination at all (necessary for UDP), or it can be bound later (for TCP, when the remote
address is provided). The same open-read-write-close procedure is used with sockets.
The process was actually used literally with the first versions of TCP/IP. A special
file called /dev/tcp was used as the device driver. The complexity added by networking
made this approach awkward, though, so a library of special functions (the API) was
developed. The essential steps of open, read, write, and close are still followed in the
protocol API.
There are three types of socket interfaces defined in the TCP/IP API. A socket can be
used for TCP stream communications, in which a connection between two machines is
created. It can be used for UDP datagram communications, a connectionless method of
passing information between machines using packets of a predefined format. Or it can be
used as a raw datagram process, in which the datagrams bypass the TCP/UDP layer and
go straight to IP. The latter type arises from the fact that the socket API was not
developed exclusively for TCP/IP.
The presence of all three types of interfaces can lead to problems with some parameters
that depend exclusively on the type of interface. You must always bear in mind whether TCP
or UDP is used.
There are six basic communications commands that the socket API addresses through the
TCP layer: open, send, receive, status, close, and abort.
All six operations are logical and used as you would expect. The details for each step
can be quite involved, but the basic operation remains the same. Many of the functions
have been seen in previous days when dealing with specific protocols in some detail. Some
of the functions (such as open) comprise several other functions that are available if
necessary (such as establishing each end of the connection instead of both ends at once).
Despite the formal definition of the functions within the API specifications, no formal
method is given for how to implement them. There are two logical choices: synchronous, or blocking,
in which the application waits for the command to complete before continuing execution;
and asynchronous, or nonblocking, in which the application continues executing
while the API function is processed. In the latter case, a function call further in the
application's execution can check the API functions' success and return codes.
The problem with the synchronous or blocking method is that the application must wait
for the function call to complete. If timeouts are involved, this can cause a noticeable
delay for the user.
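The difference between the two styles is easy to see in a small experiment. The sketch below assumes Python's socket module and uses socketpair to get two already-connected sockets; the function name is hypothetical. With the socket switched to nonblocking mode, a receive with no data pending returns control at once (Python reports this as BlockingIOError) instead of waiting.

```python
import socket


def try_receive_without_blocking() -> str:
    """Show that a nonblocking receive returns control immediately."""
    left, right = socket.socketpair()   # two already-connected sockets
    try:
        right.setblocking(False)        # switch this end to nonblocking mode
        try:
            right.recv(16)              # nothing has been sent yet
            outcome = "data"
        except BlockingIOError:
            # A blocking socket would have waited here instead.
            outcome = "would block"

        left.sendall(b"ping")
        right.setblocking(True)         # back to the default blocking mode
        outcome += ", then " + right.recv(16).decode()
        return outcome
    finally:
        left.close()
        right.close()
```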
The Transmission Control Block (TCB) is a complex data structure that contains details
about a connection. The full TCB has over fifty fields in it. The exact layout and
contents of the TCB are not necessary for today's material, but the existence of the TCB
and the nature of the information it holds are key to the behavior of the socket
interface.
The API lets a user create a socket whenever necessary with a simple function call. The
function requires the family of the protocol to be used with the socket (so the operating
system knows which type of socket to assign and how to decode information), the type of
communication required, and the specific protocol. Such a function call is written as
follows:
socket(family, type, protocol)
The family of the protocol actually specifies how the addresses are interpreted.
Examples of families are TCP/IP (coded as AF_INET), Apple's AppleTalk (AF_APPLETALK), and
UNIX local sockets, which are addressed by filesystem pathnames (AF_UNIX). The exact
protocol within the family is specified as the protocol parameter. When used, it
specifically indicates the type of service that is to be used.
The type parameter indicates the type of communications used. It can be a
connectionless datagram service (coded as SOCK_DGRAM), a stream delivery service
(SOCK_STREAM), or a raw type (SOCK_RAW). The result from the function call is an integer
that can be assigned to a variable for further checking.
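As an illustration, the call can be tried from Python, whose socket.socket constructor takes the same three arguments in the same order. This is a sketch rather than a definitive usage pattern, and the function name is hypothetical. Python wraps the integer descriptor in an object; fileno exposes the underlying number.

```python
import socket


def create_sockets() -> dict:
    """Create one stream and one datagram socket and report their properties."""
    # family, type, protocol (0 lets the system pick the usual protocol)
    tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
    udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, 0)
    try:
        return {
            "descriptor_is_integer": isinstance(tcp_sock.fileno(), int),
            "family_is_inet": tcp_sock.family == socket.AF_INET,
            "tcp_is_stream": tcp_sock.type == socket.SOCK_STREAM,
            "udp_is_datagram": udp_sock.type == socket.SOCK_DGRAM,
        }
    finally:
        tcp_sock.close()
        udp_sock.close()
```

Neither socket has an address yet at this point; that is the job of the later calls.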
Because a socket can be created without any binding to an address, there must be a
function call to complete this process and establish the full connection. With the TCP/IP
protocol, the socket function does not supply the local port number, the destination port,
or the IP address of the destination. The bind function is called to establish the local
port address for the connection.
Some applications (especially on a server) want to use a specific port for a
connection. Other applications are content to let the protocol software assign a port. A
specific port can be requested in the bind function. If it is available, the software
allocates it and returns the port information. If the port cannot be allocated (it might
be in use), a return code indicates an error in port assignment.
The bind function has the following format:
bind(socket, local_address, address_length)
socket is the integer number of the socket to bind; local_address
is the local address to which the socket is bound; and address_length is an
integer that gives the length of the address in bytes. The address is not passed as a
simple number but has the structure shown in Figure 14.1.
Figure 14.1. Address structure used by the socket API.
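Both behaviors of bind (letting the software pick a port, and being refused a port that is in use) can be sketched as follows. The example assumes Python's socket module, where the address structure is written as a (host, port) pair and a failed bind is reported as OSError; the function name is hypothetical.

```python
import socket


def bind_examples():
    """Bind once to a system-chosen port, then try to reuse that port."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the protocol software to pick any free port.
        first.bind(("127.0.0.1", 0))
        first.listen(1)
        assigned_port = first.getsockname()[1]

        # Requesting a port that is already in use produces an error,
        # which Python raises as OSError.
        try:
            second.bind(("127.0.0.1", assigned_port))
            conflict = False
        except OSError:
            conflict = True
        return assigned_port, conflict
    finally:
        first.close()
        second.close()
```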
The address data structure (which is usually called sockaddr for socket
address) has a 16-bit Address Family field that identifies the protocol family of the
address. The entry in this field determines the format of the address in the following
field (which might contain other information than the address, depending on how the
protocol has defined the field). The Address field can be up to 14 bytes in length,
although most protocols do not need this amount of space.
The use of a data structure instead of a simple address has its roots in the UNIX operating system and the closely allied C programming language. The formal structure of the socket address enables C programs to use a union of structures for all possible address families. This saves a considerable amount of coding in applications.
TCP/IP has an address family value of 2 (AF_INET), following which the Address field
contains both a protocol port number (16 bits) and the IP address (32 bits). The remaining
eight bytes are unused. This is shown in Figure 14.2. Because the address family defines
how the Address field is decoded, there should be no problem with TCP/IP applications
understanding the two pieces of information in the Address field.
Figure 14.2. The address structure for TCP/IP.
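To make the layout concrete, the 16 bytes can be assembled by hand. The sketch below follows the field sizes given in the text (a 16-bit family, a 16-bit port, a 32-bit IP address, and eight unused bytes) and assumes the port and address are stored in network byte order. Some systems split the first two bytes into a length byte and a family byte, so treat this as an illustration of the idea, not a portable definition. The function name is hypothetical.

```python
import socket
import struct


def pack_inet_address(ip: str, port: int) -> bytes:
    """Build the 16-byte TCP/IP address layout described in the text."""
    family = struct.pack("=H", socket.AF_INET)  # 16-bit address family (value 2)
    port_field = struct.pack("!H", port)        # 16-bit port, network byte order
    ip_field = socket.inet_aton(ip)             # 32-bit IPv4 address
    padding = bytes(8)                          # remaining eight bytes, unused
    return family + port_field + ip_field + padding
```

Python programs normally never build this structure themselves; the socket module does it when given a (host, port) pair.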
After a local socket address and port number have been assigned, the destination socket
can be connected. A one-ended connection is referred to as being in an unconnected
state, whereas a two-ended (complete) connection is in a connected state. After
a bind function, an unconnected state exists. To become connected, the destination socket
must be added to complete the connection.
Connectionless protocols such as UDP do not require a connected state to function. They can, however, be connected to enable transfer between the two sockets without having to specify the destination address each time. Connection-based protocols such as TCP require both ends of the connection to be specified.
To establish a connection to a remote socket, the connect function is used. The connect
function's format is
connect(socket, destination_address, address_length)
The socket is the integer number of the socket to which to connect; the destination_address
is the socket address data structure for the destination address (using the same format as
shown in Figure 14.1); and the address_length is the length of the destination
address in bytes.
The manner in which connect functions is protocol-dependent. For TCP, connect
establishes the connection between the two endpoints and records the remote socket's
address for the application. If a connection can't be established, an error code
is returned. For a connectionless protocol such as UDP, the connect function is optional;
when it is used, it only stores the destination address for the application.
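The two behaviors of connect can be compared side by side. This is a minimal sketch using Python's socket module on the loopback address; the function name is hypothetical. The TCP connect completes a handshake with a listening socket, whereas the UDP connect merely remembers the destination so that later sends need no address.

```python
import socket


def connect_examples():
    """Connect a TCP socket and a UDP socket to local endpoints."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    udp_receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp_receiver.bind(("127.0.0.1", 0))
    udp_receiver.settimeout(5)          # avoid waiting forever if nothing arrives

    tcp_client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    udp_client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # TCP: connect performs the handshake with the listening socket.
        tcp_client.connect(listener.getsockname())
        tcp_peer = tcp_client.getpeername()

        # UDP: connect only records the destination; nothing is transmitted.
        udp_client.connect(udp_receiver.getsockname())
        udp_client.send(b"hi")          # no destination argument needed
        datagram, _sender = udp_receiver.recvfrom(64)

        return tcp_peer == listener.getsockname(), datagram
    finally:
        for sock in (listener, udp_receiver, tcp_client, udp_client):
            sock.close()
```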
The open command prepares a communications port for communications. This is an
alternative to the combination of the functions shown previously, used by applications for
specific purposes. There are really three kinds of open commands, two of which set a
server to receive incoming requests and the third used by a client to initiate a request.
With every open command, a TCB is created for that connection.
The three open commands are an unspecified passive open (which enables a server to wait
for a connection request from any client), a fully specified passive open (which enables a
server to wait for a connection request from a specific client), and an active open (which
initiates a connection with a server). The input and output expected from each command are
shown in Table 14.1.
Table 14.1. Open command parameters.
| Type | Input | Output |
| Unspecified passive open | local port. Optional: timeout, precedence, security, maximum segment size | local connection name |
| Fully specified passive open | local port, remote IP address, remote port. Optional: timeout, precedence, security, maximum segment size | local connection name |
| Active open | local port, destination IP address, destination port. Optional: timeout, precedence, security, maximum segment size | local connection name |
When an open command is issued by an application, a set of functions within the socket
interface is executed to set up the TCB, initiate the socket number, and establish
preliminary values for the variables used in the TCB and the application.
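The passive open command is issued by a server to wait for incoming requests. With the
TCP (connection-based) protocol, the passive open issues the following function calls:
socket (to create the socket), bind (to assign the local port), listen (to set up a queue
for connection requests), and accept (to wait for a request).
The active open command is issued by a client. For TCP, it issues two functions:
socket (to create the socket) and connect (to identify the server's IP address and port
and attempt to establish communications).
If the exact local port to use is specified as part of the open command, a bind function
call is issued before the connect function.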
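The two kinds of open can be sketched as a pair of small helper functions. This is an illustration under stated assumptions, not a production server: it uses Python's socket module, runs both ends in one process on the loopback address, and all function names (passive_open, active_open, echo_once, round_trip) are hypothetical.

```python
import socket
import threading


def passive_open(host: str = "127.0.0.1") -> socket.socket:
    """Server side: socket, bind, listen (accept is called by the caller)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))      # port 0: let the system choose a free port
    server.listen(5)
    return server


def active_open(address) -> socket.socket:
    """Client side: socket followed by connect."""
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(address)
    return client


def echo_once(server: socket.socket) -> None:
    """Accept one connection and send back whatever arrives first."""
    connection, _client_address = server.accept()
    with connection:
        connection.sendall(connection.recv(1024))


def round_trip(message: bytes) -> bytes:
    """Run one passive open and one active open and exchange a message."""
    server = passive_open()
    worker = threading.Thread(target=echo_once, args=(server,))
    worker.start()
    with active_open(server.getsockname()) as client:
        client.sendall(message)
        reply = client.recv(1024)
    worker.join()
    server.close()
    return reply
```

The server thread blocks in accept until the client's connect arrives, which mirrors the waiting behavior of a passive open.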
There are five functions within the Socket API for sending data through a socket. These
are send, sendto, sendmsg, write, and writev. Not surprisingly, all these functions send
data from the application to TCP. They do this through a buffer created by the application
(for example, it might be a memory address or a character string), passing the entire
buffer to TCP. The send, write, and writev functions work only with a connected socket
because they have no provision to specify a destination address within their function
call.
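The format of the send function is simple. It takes the local socket connection number,
the buffer address for the message to be sent, the length of the message in bytes, and a
flags argument as parameters. The flags carry control options such as a request for
urgent (out-of-band) delivery; the Push flag and optional timeout of the abstract TCP send
command have no separate parameters in this call. The function returns the number of
bytes accepted for transmission, or an error indication. The format is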
send(socket, buffer_address, length, flags)
The sendto and sendmsg functions are similar except they enable an application to send
a message through an unconnected socket. They both require the destination address as part
of their function call. The sendmsg function is simpler in format than the sendto
function, primarily because another data structure is used to hold information. The
sendmsg function is often used when the format of the sendto function would be awkward and
inefficient in the application's code. Their formats are
sendto(socket, buffer_address, length, flags, destination, address_length)
sendmsg(socket, message_structure, flags)
The last two parameters in the sendto function are the destination address and the
length of the destination address. The address is specified using the format shown in
Figure 14.1. The message_structure of the sendmsg function contains the information
left out of the sendto function call. The format of the message structure is shown in
Figure 14.3.
Figure 14.3. The message structure used by sendmsg.
The fields in the sendmsg message structure give the destination socket address, the size
of that address, a pointer to the iovector, which contains information about the message
to be sent, the length of the iovector, and a pointer to any accompanying control
information (access rights in the original BSD definition) together with its length.
The sendmsg function uses the message structure to simplify the function call. It also has another advantage: the recvmsg function uses the same structure, simplifying an application's code.
The iovector is an address for an array that points to the message to be sent. The
array is a set of pointers to the bytes that comprise the message. The format of the
iovector is simple. For each 32-bit address to a memory location with a chunk of the
message, a corresponding 32-bit field holds the length of the message in that memory
location. This format is repeated until the entire message is specified. This is shown in
Figure 14.4. The iovector format enables a noncontiguous message to be sent. In other
words, the first part of the message can be in one location in memory, and the rest is
separated by other information. This can be useful because it saves the application from
copying long messages into a contiguous location.
Figure 14.4. The iovector format.
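The benefit of the iovector is easiest to see when a message lives in several separate buffers. In the sketch below, which assumes Python's socket.sendmsg on a UNIX-like system, a list of buffers plays the role of the iovector and the destination is passed as the fourth argument, so no connect is required. The function name and the buffer contents are hypothetical.

```python
import socket


def send_in_pieces() -> bytes:
    """Send one datagram assembled from three separate buffers."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(5)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        header = b"HDR:"
        body = bytearray(b"payload")
        trailer = b":END"
        # The list acts as the iovector: each entry is one chunk of the
        # message. The empty list means no ancillary data, 0 means no flags.
        sender.sendmsg([header, body, trailer], [], 0, receiver.getsockname())
        data, _source = receiver.recvfrom(64)
        return data
    finally:
        sender.close()
        receiver.close()
```

The three chunks arrive as a single datagram even though the application never copied them into one contiguous buffer.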
The write function takes three arguments: the socket number, the buffer address of the
message to be sent, and the length of the message to send. The format of the function call
is
write(socket, buffer_address, length)
The writev function is similar to write except it uses the iovector to hold the
message. This lets it send a message without copying it into another memory address. The
format of writev is
writev(socket, iovector, length)
where length is the number of entries in iovector.
The type of function chosen to send data through a socket depends on the type of
connection used and the level of complexity of the application. To a considerable degree,
it is also a personal choice of the programmer.
Not surprisingly, because there are five functions to send data through a socket, there
are five corresponding functions to receive data: read, readv, recv, recvfrom, and
recvmsg. They all accept incoming data from a socket into a reception buffer. The receive
buffer can then be transferred from TCP to the application.
The read function is the simplest and can be used only when a socket is connected. Its
format is
read(socket, buffer, length)
The first parameter is the number of the socket or a file descriptor from which to read
the data, followed by the memory address in which to store the incoming data, and the
maximum number of bytes to be read.
As with writev, the readv command enables incoming messages to be placed in
noncontiguous memory locations through the use of an iovector. The format of readv is
readv(socket, iovector, length)
length is the number of entries in the iovector. The format of the iovector is
the same as mentioned previously and shown in Figure 14.4.
The recv function also can be used with connected sockets. It has the format
recv(socket, buffer_address, length, flags)
which corresponds to the send function's arguments.
The recvfrom and recvmsg functions enable data to be read from an unconnected socket.
Their formats include the sender's address:
recvfrom(socket, buffer_address, length, flags, source_address, address_length)
recvmsg(socket, message_structure, flags)
The message structure in the recvmsg function corresponds to the structure in sendmsg.
(See Figure 14.3.)
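A short sketch shows how an unconnected socket learns who sent a datagram. It assumes Python's socket module, in which recvfrom returns the data together with the source address instead of filling in caller-supplied pointers; the function name is hypothetical.

```python
import socket


def receive_with_source():
    """Receive a datagram on an unconnected socket and learn who sent it."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(5)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.bind(("127.0.0.1", 0))
    try:
        # sendto names the destination explicitly on every call.
        sender.sendto(b"status?", receiver.getsockname())
        # recvfrom hands back both the data and the sender's address.
        data, source = receiver.recvfrom(64)
        return data, source == sender.getsockname()
    finally:
        sender.close()
        receiver.close()
```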
A server application that expects clients to call in to it has to create a socket
(using socket), bind it to a port (with bind), then wait for incoming requests for data.
The listen function handles problems that could occur with this type of behavior by
establishing a queue for incoming connection requests. The queue prevents bottlenecks and
collisions, such as when a new request arrives before a previous one has been completely
handled, or two requests arrive simultaneously.
The listen function establishes a buffer to queue incoming requests, thereby avoiding
losses. The function lets the socket accept incoming connection requests, which are all
sent to the queue for future processing. The function's format is
listen(socket, queue_length)
where queue_length is the size of the incoming buffer. If the buffer has room,
incoming requests for connections are added to the buffer and the application can deal
with them in the order of reception. If the buffer is full, the connection request is
rejected.
After the server has used listen to set up the incoming connection request queue, the
accept function is used to actually wait for a connection. The format of the function is
accept(socket, address, length)
socket is the socket on which to accept requests; address is a pointer to
a structure similar to Figure 14.1; and length is a pointer to an integer showing
the length of the address.
When a connection request is received, the protocol places the address of the client in
the memory location indicated by the address parameter, and the length of that address in
the length location. It then creates a new socket that has the client and server connected
together, and returns the descriptor of that new socket to the server application. The
socket on which the request was received remains open for other connection requests. This
enables multiple requests for a connection to be processed, whereas if that socket were
closed down with each connection request, only one client/server process could be handled
at a time.
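The relationship between the listening socket and the sockets produced by accept can be checked directly. In this sketch (Python's socket module, loopback address, hypothetical function name) two clients connect before the server accepts either one, so both requests sit in the listen queue until accept collects them.

```python
import socket


def accept_two_clients():
    """Accept two connections on one listening socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)              # queue up to five pending requests
    address = listener.getsockname()

    # Both clients connect first; their requests wait in the queue.
    clients = [socket.create_connection(address) for _ in range(2)]
    accepted = []
    for _ in clients:
        connection, client_address = listener.accept()
        accepted.append((connection, client_address))

    # Each accept produced a distinct descriptor; the listener is unchanged.
    descriptors = {listener.fileno()} | {conn.fileno() for conn, _ in accepted}
    peers_match = sorted(addr for _, addr in accepted) == sorted(
        c.getsockname() for c in clients
    )

    for conn, _ in accepted:
        conn.close()
    for c in clients:
        c.close()
    listener.close()
    return len(descriptors), peers_match
```

Three distinct descriptors exist at the end: the original listening socket plus one new socket per accepted client.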
One possible special occurrence must be handled on UNIX systems. It is possible for a
single process to wait for a connection request on multiple sockets. This reduces the
number of processes that monitor sockets, thereby lowering the amount of overhead the
machine uses. To provide for this type of process, the select function is used. The format
of the function is
select(num_desc, in_desc, out_desc, excep_desc, timeout)
num_desc is the number of descriptors to examine (in practice, one more than the
highest-numbered socket or descriptor monitored); in_desc and out_desc are pointers to
bit masks that indicate the sockets or file descriptors to monitor for input and output,
respectively; excep_desc is a pointer to a bit mask that specifies the sockets or file
descriptors to check for exception conditions; and timeout is a pointer to a time value
that indicates how long to wait (a null pointer means wait indefinitely, and a time of
zero means return immediately). To use the select function, a server creates all the
necessary sockets first, then calls select to determine which ones are ready for input or
output or have exceptions pending.
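A single process watching several sockets might look like the following sketch. It assumes Python's select.select, which takes lists of sockets in place of the bit masks and computes the descriptor count itself; in this wrapper a timeout of 0 returns immediately and a timeout of None waits indefinitely. The function name is hypothetical.

```python
import select
import socket


def wait_on_several_listeners():
    """Monitor two listening sockets from a single process."""
    listeners = []
    for _ in range(2):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.bind(("127.0.0.1", 0))
        sock.listen(1)
        listeners.append(sock)

    # Nothing is pending yet, so a zero timeout returns at once with no hits.
    idle, _, _ = select.select(listeners, [], [], 0)

    # Connect to the second listener only, then ask again (wait up to 5 s).
    client = socket.create_connection(listeners[1].getsockname())
    ready, _, _ = select.select(listeners, [], [], 5)

    result = (len(idle), ready == [listeners[1]])
    client.close()
    for sock in listeners:
        sock.close()
    return result
```

Only the listener with a pending connection request is reported as ready for input, so the server knows which socket to call accept on.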
Several status functions are used to obtain information about a connection. They can be
used at any time, although they are typically used to establish the integrity of a
connection in case of problems or to control the behavior of the socket.
The status functions require the name of the local connection, and they return a set of
information, which might include the local and remote socket names, local connection name,
receive and send window states, number of buffers waiting for an acknowledgment, number of
buffers waiting for data, and current values for the urgent state, precedence, security,
and timeout variables. Most of this information is read from the Transmission Control
Block (TCB). The format of the information and the exact contents vary slightly, depending
on the implementation.
The function getsockopt enables an application to query the socket for information. The
function format is
getsockopt(socket, level, option_id, option_result, length)
socket is the number of the socket; level indicates whether the function
refers to the socket itself or the protocol that uses it; option_id is a single
integer that identifies the type of information requested; option_result is a
pointer to a memory location where the function should place the result of the query; and length
is the length of the result.
The corresponding setsockopt function lets the application set a value for the socket.
The function's format is the same as getsockopt except that option_result points to
the value that is to be set, and length is the length of the value.
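As a brief illustration, the sketch below reads one option and changes another. It assumes Python's socket module, where getsockopt returns the value directly instead of writing it through a pointer; SO_TYPE and SO_REUSEADDR are standard socket-level options, and the function name is hypothetical.

```python
import socket


def query_and_set_options():
    """Read one socket option and change another."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Level SOL_SOCKET refers to the socket itself, not the protocol.
        sock_type = sock.getsockopt(socket.SOL_SOCKET, socket.SO_TYPE)

        before = sock.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        after = sock.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)

        # Some systems report an enabled option as a nonzero value other than 1.
        return sock_type == socket.SOCK_STREAM, before, after != 0
    finally:
        sock.close()
```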
Two functions provide information about the addresses at the two ends of a socket. The getpeername
function returns the address of the remote end. The getsockname function returns the local
address of a socket. They have the following formats:
getpeername(socket, destination_address, address_length)
getsockname(socket, local_address, address_length)
The addresses in both functions are pointers to a structure of the format shown in
Figure 14.1.
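The symmetry between the two calls can be demonstrated on a loopback connection. This sketch assumes Python's socket module, where both functions return the address as a (host, port) pair; the function name is hypothetical.

```python
import socket


def name_both_ends():
    """Compare the local and remote addresses seen from each end."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    try:
        # One end's local address is the other end's remote address.
        return (
            client.getsockname() == server_side.getpeername(),
            client.getpeername() == server_side.getsockname(),
        )
    finally:
        client.close()
        server_side.close()
        listener.close()
```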
Two host name functions for BSD UNIX are gethostname and sethostname, which enable an
application to obtain the name of the host and set the host name (if permissions allow).
Their formats are as follows:
sethostname(name, length)
gethostname(name, length)
The name is the address of an array that holds the name, and the length
is an integer that gives the name's length.
A similar set of functions provides for domain names. The functions setdomainname and
getdomainname enable an application to obtain or set the domain names. Their formats are
setdomainname(name, length)
getdomainname(name, length)
The parameters are the same as with the sethostname and gethostname functions, except
for the format of the name (which reflects domain name format).
The close function closes a connection. It requires only the local connection name to
complete the process. It also takes care of the TCB and releases any variable created by
the connection. No output is generated.
The close function is initiated with the call
close(socket)
where the socket name is required. If an application terminates abnormally, the
operating system closes all sockets that were open prior to the termination.
The abort function instructs TCP to discard all data that currently resides in send and
receive buffers and close the connection. It takes the local connection name as input. No
output is generated. This function can be used in case of emergency shutdown routines, or
in case of a fatal failure of the connection or associated software.
The abort function is usually implemented by the close() call, although some special
instructions might be available with different implementations.
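One common way to get abort-like behavior out of close on UNIX-like systems is the SO_LINGER option with a linger time of zero. The sketch below illustrates that technique only; it is not necessarily how any particular implementation provides abort. It assumes Python's socket module and a platform where the linger structure is two C integers (on/off flag, then seconds); the function name is hypothetical.

```python
import socket
import struct


def abortive_close() -> str:
    """Close a connection abruptly and report what the peer observes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(5)
    try:
        # SO_LINGER with on/off = 1 and linger time = 0 asks close to discard
        # unsent data and reset the connection instead of closing gracefully.
        client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.pack("ii", 1, 0))
        client.close()
        try:
            data = server_side.recv(16)
            return "orderly close" if data == b"" else "data"
        except ConnectionResetError:
            return "reset"
    finally:
        server_side.close()
        listener.close()
```

Without the option, the same close would normally be seen by the peer as an orderly end of data, not as a reset.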
UNIX has two system calls that can affect sockets: fork and exec. Both are frequently
used by UNIX developers because of their power. (In fact, forks are one of the most
powerful tools UNIX offers, and one that most other operating systems lack.) For
simplicity, I deal with the two functions as though they perform the same task.
A fork call creates a copy of the existing application as a new process and starts
executing it. The new process has all the original's file descriptors and socket
information. This can cause a problem if the application programmer didn't take into
account the fact that two (or more) processes try to use the same socket (or file)
simultaneously. Therefore, applications that can fork have to take into account potential
conflicts and code around them by checking the status of shared sockets.
The operating system itself keeps a table of each socket and how many processes have
access to it. An internal counter is incremented or decremented with each process's open
or close function call for the socket. When the last process using a socket is terminated,
the socket is permanently closed. This prevents one forked process from closing a socket
when its original is still using it.
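The effect of the reference count can be observed with a small experiment. This sketch assumes a UNIX-like system and Python's os.fork and socket.socketpair; the function name is hypothetical. After the fork, both processes hold both descriptors, and each closes the end it does not need without disturbing the other process's copy.

```python
import os
import socket


def share_socket_across_fork() -> bytes:
    """Fork, let the child use an inherited socket, and read its message."""
    parent_end, child_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: it inherited both descriptors; keep only the one it needs.
        try:
            parent_end.close()
            child_end.sendall(b"hello from child")
            child_end.close()
        finally:
            os._exit(0)      # leave at once without running parent cleanup

    # Parent: closing its copy does not close the child's copy, because the
    # system tracks how many processes still hold the socket.
    child_end.close()
    message = parent_end.recv(64)
    os.waitpid(pid, 0)
    parent_end.close()
    return message
```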
Today you have seen the basic functions performed by the socket API during
establishment of a TCP or UDP call. You have also seen the functions that are available to
application programmers. Although the treatment has been at a high level, you should be
able to see that working with sockets is not a complex, confusing task. Indeed, socket
programming is surprisingly easy once you have tried it.
Not everyone wants to write TCP or UDP applications, of course. However, understanding
the basics of the socket API helps in understanding the protocol and troubleshooting. If
you are interested in programming sockets, one of the best books on the subject is UNIX
Network Programming, by W. Richard Stevens (Prentice Hall).
What is the socket interface used for?
The socket interface enables you to write applications that make optimal use of the
TCP/IP family of protocols. Without it, you would need another layer of application to
translate your program's calls to TCP/IP calls.
What is the difference between blocking and nonblocking functions?
A blocking function waits for the function to terminate before enabling the application
to continue. A nonblocking function enables the application to continue executing while
the function is performed. Both have important uses in applications.
What does binding do?
Binding associates a socket with a local address and port. Until a socket is bound
(explicitly, or implicitly by the protocol software), it has no local port through which
other machines can reach the application.
What happens when an active open command is executed?
An active open command creates a socket and binds it, then issues a connect call to
identify the IP address and port. The active open command then tries to establish
communications.
What is the difference between an abort and a close operation?
A close operation closes a connection. An abort abandons whatever communications are
currently underway and closes the connection. With an abort, any information in the send
and receive buffers is discarded.