The
Berkeley sockets application programming
interface (API) comprises a
library for developing applications in
the
C programming language
that perform
inter-process
communication, most commonly for communications across a
computer network.
Berkeley sockets (also known as the
BSD socket API) originated
with the 4.2BSD
Unix operating system (released in 1983) as an
API.
Only
in 1989, however, could UC
Berkeley
release versions of its operating system and
networking library free from the licensing constraints of AT&T's copyright-protected Unix.
The Berkeley socket API forms the
de
facto standard
abstraction for
network sockets. Most other programming
languages use an interface similar to the C API. The API is also
used for
Unix domain sockets,
which are an interface to interprocess communication (IPC) channels
within a single computer.
The
STREAMS-based
Transport Layer Interface (TLI)
API offers an alternative to the socket API. However, recent
systems that provide the TLI API also provide the Berkeley socket
API.
The
Berkeley socket interface, an
application programming
interface (API), allows communications between
hosts or between processes on one
computer, using the concept of an
Internet socket. It can work with many
different
I/O devices and
drivers, although support for these depends on
the
operating-system
implementation. This interface implementation is implicit for
TCP/IP, and it is therefore
one of the fundamental technologies underlying the
Internet.
It was first developed at the University of
California, Berkeley
for use on Unix systems. All modern
operating systems now have some implementation of the Berkeley
socket interface, as it became the standard interface for
connecting to the Internet.
Socket interfaces are accessible at three different levels, most
powerfully and fundamentally at the
raw
socket level. Very few applications need the degree of control
over outgoing communications that this provides, so raw sockets
support was intended to be available only on computers used for
developing Internet-related technologies. In recent years, most
operating systems have implemented support for it anyway, including
Windows XP.
Header files
The Berkeley socket interface is defined in several header files.
The names and content of these files differ slightly between
implementations. In general, they include:
- ;
socket.h>
- : Core BSD socket functions and data structures.
- ;
in.h>
- : AF_INET and AF_INET6 address families and their corresponding
protocol families PF_INET and PF_INET6. Widely used on the
Internet, these include IP addresses and TCP and UDP port
numbers.
- ;
un.h>
- : PF_UNIX/PF_LOCAL address family. Used for local communication
between programs running on the same computer. Not used on
networks.
- ;
inet.h>
- : Functions for manipulating numeric IP addresses.
- ;
- : Functions for translating protocol names and host names into
numeric addresses. Searches local data as well as DNS.
Socket API functions
This list is a summary of functions or methods provided by the
Berkeley sockets API library:
socket() creates a new socket of a certain socket
type, identified by an integer number, and allocates system
resources to it.
bind() is typically used on the server side, and
associates a socket with a socket address structure, i.e. a
specified local port number and IP address.
listen() is used on the server side, and causes a
bound TCP socket to enter listening state.
connect() is used on the client side, and assigns
a free local port number to a socket. In case of a TCP socket, it
causes an attempt to establish a new TCP connection.
accept() is used on the server side. It accepts a
received incoming attempt to create a new TCP connection from the
remote client, and creates a new socket associated with the socket
address pair of this connection.
send() and recv(), or
write() and read(), or
sendto() and recvfrom(), are used for
sending and receiving data to/from a remote socket.
close() causes the system to release resources
allocated to a socket. In case of TCP, the connection is
terminated.
gethostbyname() and gethostbyaddr()
are used to resolve host names and addresses.
select() is used to prune a provided list of
sockets for those that are: ready to read, ready to write or have
errors
poll() is used to check on the state of a socket.
The socket can be tested to see if it can be written to, read from
or has errors.
getsockopt() is used to retrive the current value
of a particular socket option for the specified socket.
setsockopt() is used to set a particular socket
option for the specified socket.
Further details are given below.
socket()
socket() creates an endpoint for communication and
returns a
file descriptor for the
socket.
socket() takes three arguments:
- domain, which specifies the protocol family of the
created socket. For example:
PF_INET for network protocol IPv4 or
PF_INET6 for IPv6.
PF_UNIX for local socket (using a file).
- type, one of:
SOCK_STREAM (reliable stream-oriented service or
Stream Sockets)
SOCK_DGRAM (datagram service or Datagram Sockets)
SOCK_SEQPACKET (reliable sequenced packet
service), or
SOCK_RAW (raw protocols atop the network
layer).
- protocol specifying the actual transport protocol to
use. The most common are
IPPROTO_TCP,
IPPROTO_SCTP, IPPROTO_UDP, IPPROTO_DCCP. These protocols are specified in
in.h>. The value “0” may be used to select a
default protocol from the selected domain and
type.
The function returns -1 if an error occurred. Otherwise, it returns
an integer representing the newly-assigned descriptor.
- Prototype:
int socket(int domain, int type, int protocol);
bind()
bind() assigns a socket an address. When a socket is
created using
socket(), it is only given a protocol
family, but not assigned an address. This association with an
address must be performed with the bind() system call before the
socket can accept connections to other hosts.
bind()
takes three arguments:
sockfd, a descriptor representing the socket to
perform the bind on
my_addr, a pointer to a sockaddr
structure representing the address to bind to.
addrlen, a socklen_t field specifying
the size of the sockaddr structure.
Bind() returns 0 on success and -1 if an error occurs.
- Prototype:
int bind(int sockfd, const struct sockaddr *my_addr, socklen_t
addrlen);
listen()
After a socket has been associated with an address,
listen() prepares it for incoming connections.
However, this is only necessary for the stream-oriented
(connection-oriented) data modes, i.e., for socket types
(
SOCK_STREAM,
SOCK_SEQPACKET). listen()
requires two arguments:
sockfd, a valid socket descriptor.
backlog, an integer representing the number of
pending connections that can be queued up at any one time. The
operating system usually places a cap on this value.
Once a connection is accepted, it is dequeued. On success, 0 is
returned. If an error occurs, -1 is returned.
- Prototype:
int listen(int sockfd, int backlog);
accept()
When an application is listening for stream-oriented connections
from other hosts, it is notified of such events (cf.
select function) and must initialize the
connection using the
accept() function. Accept()
creates a new socket for each connection and removes the connection
from the listen queue. It takes the following arguments:
sockfd, the descriptor of the listening socket
that has the connection queued.
cliaddr, a pointer to a sockaddr structure to
receive the client's address information.
addrlen, a pointer to a socklen_t
location that specifies the size of the client address structure
passed to accept(). When accept() returns, this
location indicates how many bytes of the structure were actually
used.
The accept() function returns the new socket descriptor for the
accepted connection, or -1 if an error occurs. All further
communication with the remote host now occurs via this new
socket.
Datagram sockets do not require processing by accept() since the
receiver may immediately respond to the request using the listening
socket.
- Prototype:
int accept(int sockfd, struct sockaddr *cliaddr, socklen_t
*addrlen);
connect()
The
connect() system call
connects a socket,
identified by its file descriptor, to a remote host specified by
that host's address in the argument list.
Certain types of sockets are
connectionless, most commonly
user datagram protocol
sockets. For these sockets, connect takes on a special meaning: the
default target for sending and receiving data gets set to the given
address, allowing the use of functions such as send() and recv() on
connectionless sockets.
connect() returns an integer representing the error code:
0 represents success, while -1 represents an error.
- Prototype:
int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t
addrlen);
gethostbyname() and gethostbyaddr()
The
gethostbyname() and
gethostbyaddr()
functions are used to resolve host names and addresses in the
domain name system or the local
host's other resolver mechanisms (e.g., /etc/hosts lookup). They
return a pointer to an object of type
struct hostent,
which describes an
Internet
Protocol host. The functions take the following
arguments:
- name specifies the name of the host. For example:
www.wikipedia.org
- addr specifies a pointer to a struct
in_addr containing the address of the host.
- len specifies the length, in bytes, of
addr.
- type specifies the address family type (e.g.,
AF_INET) of the host address.
The functions return a NULL pointer in case of error, in which case
the external integer
h_errno may be checked to see
whether this is a temporary failure or an invalid or unknown host.
Otherwise a valid
struct hostent * is returned.
These functions are not strictly a component of the BSD socket API,
but are often used in conjunction with the API functions.
Furthermore, these functions are now considered legacy interfaces
for querying the domain name system. New functions that are
completely protocol-agnostic have been defined. These new function
are
getaddrinfo and getnameinfo, and are
based on a new
addrinfo data
structure.
- Prototypes:
struct hostent *gethostbyname(const char *name);struct hostent
*gethostbyaddr(const void *addr, int len, int type);
Protocol and address families
The socket API is a general interface for Unix networking and
allows the use of various network protocols and addressing
architectures.
The following lists a sampling of protocol families (preceded by
the standard symbolic identifier) defined in a modern
Linux or
BSD implementation:
PF_LOCAL, PF_UNIX, PF_FILE
Local to host (pipes and file-domain)
PF_INET IP protocol family
PF_AX25 Amateur Radio AX.25
PF_IPX Novell Internet Protocol
PF_APPLETALK Appletalk DDP
PF_NETROM Amateur radio NetROM
PF_BRIDGE Multiprotocol bridge
PF_ATMPVC ATM PVCs
PF_X25 Reserved for X.25 project
PF_INET6 IP version 6
PF_ROSE Amateur Radio X.25 PLP
PF_DECnet Reserved for DECnet project
PF_NETBEUI Reserved for 802.2LLC project
PF_SECURITY Security callback pseudo AF
PF_KEY PF_KEY key management API
PF_NETLINK, PF_ROUTE
routing API
PF_PACKET Packet family
PF_ASH Ash
PF_ECONET Acorn Econet
PF_ATMSVC ATM SVCs
PF_SNA Linux SNA Project
PF_IRDA IRDA sockets
PF_PPPOX PPPoX sockets
PF_WANPIPE Wanpipe API sockets
PF_BLUETOOTH Bluetooth sockets
A socket for communications using any family is created with the
socket() function (see above), by specifying the desired
protocol family (
PF_-identifier) as an
argument.
The original design concept of the socket interface distinguished
between protocol types (families) and the specific address types
that each may use. It was envisioned that a protocol family may
have several address types. Address types were defined by
additional symbolic constants, using the prefix
AF_ instead of
PF_. The
AF_-identifiers are intended for all data
structures that specifically deal with the address type and not the
protocol family.However, this concept of separation of protocol and
address type has not found implementation support and the
AF_-constants were simply defined by the
corresponding protocol identifier, rendering the distinction
between
AF_ versus
PF_
constants a technical argument of no significant practical
consequence. Indeed, much confusion exists in the proper usage of
both forms.
Options for sockets
After creating a socket, it is possible to set options on it. Some
of the more common options are:
TCP_NODELAY disables the Nagle algorithm.
SO_KEEPALIVE enables periodic 'liveness' pings, if
supported by the OS.
Blocking vs. non-blocking mode
Berkeley sockets can operate in one of two modes: blocking or
non-blocking. A
blocking socket will not return control
until it has sent (or received) all data specified for the
operation. This is true only in Linux systems. In other systems,
such as FreeBSD, it is normal for a blocking socket not to send all
data. The application must check the return value to determine how
many bytes have been sent or received and it must resend any data
not already processed
[12726]. It also may cause problems if a socket
continues to listen: a program may hang as the socket waits for
data that may never arrive.
A socket is typically set to blocking or nonblocking mode using the
fcntl() or
ioctl()
functions.
Terminating sockets
The operating system does not release the resources allocated to a
socket until a
close() call occurs on the socket
descriptor. This is especially important if the
connect() call fails and may be retried. Each
successful call to
socket() must have a matching call
to
close() in all possible execution paths. The header
file defines the close function.
When the
close() system call is initiated by an
application, only the interface to the socket is destroyed, not the
socket itself. It is the kernel's responsibility to destroy the
socket internally. Sometimes, a socket may enter a
TIME_WAIT state, on the server side, for up to 4
minutes.
Client-server example using TCP
The
Transmission Control
Protocol (TCP) provides the concept of a
connection,
which is a stateful network association between two hosts with a
variety of error correction and performance features. A process
creates a TCP socket by calling the
socket() function
with the parameters for the protocol family (
PF_INET,
PF_INET6),
SOCK_STREAM (
Stream
Sockets) and the IP protocol identifier
IPPROTO_TCP.
Server
Setting up a simple TCP server involves the following steps:
- Creating a TCP socket, with a call to
socket().
- Binding the socket to the listen port, with a call to
bind(). Before calling bind(), a
programmer must declare a sockaddr_in structure, clear
it (with memset()), and the sin_family
(AF_INET), and fill its sin_port (the
listening port, in network byte
order) fields. Converting a short int to network
byte order can be done by calling the function htons()
(host to network short).
- Preparing the socket to listen for connections (making it a
listening socket), with a call to
listen().
- Accepting incoming connections, via a call to
accept(). This blocks until an incoming connection is
received, and then returns a socket descriptor for the accepted
connection. The initial descriptor remains a listening descriptor,
and accept() can be called again at any time with this
socket, until it is closed.
- Communicating with the remote host, which can be done through
send() and recv() or write()
and read().
- Eventually closing each socket that was opened, once it is no
longer needed, using
close(). Note that if there were
any calls to fork(), each process must close the
sockets it knew about (the kernel keeps track of how many processes
have a descriptor open), and two processes should not use the same
socket at once.
/* Server code in C */
#include types.h>
#include socket.h>
#include in.h>
#include inet.h>
#include
#include
#include
#include
int main(void)
{
struct sockaddr_in stSockAddr;
int SocketFD = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
if(-1 == SocketFD)
{
perror("can not create socket");
exit(EXIT_FAILURE);
}
memset(&stSockAddr, 0, sizeof(struct sockaddr_in));
stSockAddr.sin_family = AF_INET;
stSockAddr.sin_port = htons(1100);
stSockAddr.sin_addr.s_addr = INADDR_ANY;
if(-1 == bind(SocketFD,(const struct sockaddr *)&stSockAddr, sizeof(struct sockaddr_in)))
{
perror("error bind failed");
close(SocketFD);
exit(EXIT_FAILURE);
}
if(-1 == listen(SocketFD, 10))
{
perror("error listen failed");
close(SocketFD);
exit(EXIT_FAILURE);
}
for(;;)
{
int ConnectFD = accept(SocketFD, NULL, NULL);
if(0 > ConnectFD)
{
perror("error accept failed");
close(SocketFD);
exit(EXIT_FAILURE);
}
/* perform read write operations ... */
shutdown(ConnectFD, SHUT_RDWR);
close(ConnectFD);
}
return 0;
}
Client
Setting up a TCP client involves the following steps:
- Creating a TCP socket, with a call to
socket().
- Connecting to the server with the use of
connect(), passing a sockaddr_in
structure with the sin_family set to
AF_INET, sin_port set to the port the
endpoint is listening (in network byte order), and
sin_addr set to the IP address of the listening server
(also in network byte order.)
- Communicating with the server by using
send() and
recv() or write() and
read().
- Terminating the connection and cleaning up with a call to
close(). Again, if there were any calls to
fork(), each process must close() the
socket.
/* Client code in C */
#include types.h>
#include socket.h>
#include in.h>
#include inet.h>
#include
#include
#include
#include
int main(void)
{
struct sockaddr_in stSockAddr;
int Res;
int SocketFD = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
if (-1 == SocketFD)
{
perror("cannot create socket");
exit(EXIT_FAILURE);
}
memset(&stSockAddr, 0, sizeof(struct sockaddr_in));
stSockAddr.sin_family = AF_INET;
stSockAddr.sin_port = htons(1100);
Res = inet_pton(AF_INET, "192.168.1.3", &stSockAddr.sin_addr);
if (0 > Res)
{
perror("error: first parameter is not a valid address family");
close(SocketFD);
exit(EXIT_FAILURE);
}
else if (0 == Res)
{
perror("char string (second parameter does not contain valid ipaddress");
close(SocketFD);
exit(EXIT_FAILURE);
}
if (-1 == connect(SocketFD, (const struct sockaddr *)&stSockAddr, sizeof(struct sockaddr_in)))
{
perror("connect failed");
close(SocketFD);
exit(EXIT_FAILURE);
}
/* perform read write operations ... */
shutdown(SocketFD, SHUT_RDWR);
close(SocketFD);
return 0;
}
Client-server example using UDP
The
User Datagram Protocol
(UDP) is a
connectionless protocol
with no guarantee of delivery. UDP packets may arrive out of order,
multiple times, or not at all. Because of this minimal design, UDP
has considerably less overhead than TCP. Being connectionless means
that there is no concept of a stream or permanent connection
between two hosts. Such data are referred to as datagrams (
Datagram Sockets).
UDP address space, the space of UDP port numbers (in ISO
terminology, the
TSAPs), is completely disjoint
from that of TCP ports.
Server
Code may set up a UDP server on port 7654 as follows:
- include
- include
- include
- include socket.h>
- include types.h>
- include in.h>
- include /* for close() for socket */
- include
int main(void){
int sock = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);
struct sockaddr_in sa;
char buffer[1024];
size_t fromlen, recsize;
memset(&sa, 0, sizeof(sa));
sa.sin_family = AF_INET;
sa.sin_addr.s_addr = INADDR_ANY;
sa.sin_port = htons(7654);
if (-1 == bind(sock,(struct sockaddr *)&sa, sizeof(struct sockaddr)))
{
perror("error bind failed");
close(sock);
exit(EXIT_FAILURE);
}
for (;;)
{
printf ("recv test....\n");
recsize = recvfrom(sock, (void *)buffer, 1024, 0, (struct sockaddr *)&sa, &fromlen);
if (recsize 0)
fprintf(stderr, "%s\n", strerror(errno));
printf("recsize: %d\n ",recsize);
sleep(1);
printf("datagram: %s\n",buffer);
}
}
This infinite loop receives any UDP datagrams to port 7654 using
recvfrom(). It uses the parameters:
- socket
- pointer to buffer for data
- size of buffer
- flags (same as in recv or other receive socket function)
- address struct of sending peer
- length of address struct of sending peer.
Client
A simple demo to send a UDP packet containing "Hello World!" to
address 127.0.0.1, port 7654 might look like this:
- include
- include
- include
- include
- include socket.h>
- include types.h>
- include in.h>
- include /* for close() for socket */
int main(int argc, char *argv[]){
int sock;
struct sockaddr_in sa;
int bytes_sent, buffer_length;
char buffer[200];
buffer_length = snprintf(buffer, sizeof buffer, "Hello World!");
sock = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);
if (-1 == sock) /* if socket failed to initialize, exit */
{
printf("Error Creating Socket");
exit(EXIT_FAILURE);
}
memset(&sa, 0, sizeof(sa));
sa.sin_family = AF_INET;
sa.sin_addr.s_addr = htonl(0x7F000001);
sa.sin_port = htons(7654);
bytes_sent = sendto(sock, buffer, buffer_length, 0,(struct sockaddr*)&sa, sizeof (struct sockaddr_in));
if (bytes_sent 0)
printf("Error sending packet: %s\n", strerror(errno));
close(sock); /* close the socket */
return 0;
}
In this code,
buffer provides a pointer to the data to
send, and
buffer_length specifies the size of the
buffer contents.
See also
References
- Ruby-Doc.org: Documenting the Ruby
Language
- UNIX Network Programming Volume 1, Third Edition: The
Sockets Networking API, W. Richard Stevens, Bill Fenner, Andrew M.
Rudoff, Addison Wesley, 2003.
- terminating sockets
The "de jure" standard definition of the Sockets interface is
contained in the POSIX standard, known as:
- IEEE Std. 1003.1-2001 Standard for
Information Technology -- Portable Operating System Interface
(POSIX).
- Open Group Technical Standard: Base Specifications, Issue 6,
December 2001.
- ISO/IEC 9945:2002
Information about this standard and ongoing work on it is available
from
the
Austin website.
The IPv6 extensions to the base socket API are documented in RFC
3493 and RFC 3542.
External links