A Quick Guide to TCP/IP
N.B. This article was originally written in 1996 and was intended as an introduction for mainframe personnel to the world of TCP/IP. Although it may not reflect the latest technology, it is still relevant to the communication protocols currently in use. If you have no idea what TCP/IP is, then it is (in the most general terms) the connection method used to transmit and receive data between computers linked by the Internet and related networks. If you would like to read a full selection of my network-related documentation then please feel free to contact me. See the SNA page for a TCP/IPer's guide to SNA.
- 1.0 General Background
- 2.0 Basic Internetworking Concepts
- 2.1 The Concept of "Layering"
- 2.2 Comparison to the OSI Model
- 2.3 Packet Movement within the TCP/IP Stack
- 2.4 Internetwork Addresses
- 2.5 Internet Gateway Connections
- 2.6 The TCP/IP Protocol Suite
- 2.7 TCP versus UDP
- 2.8 Sockets Programming, APIs and RPCs
- 2.9 Operator Commands in the TCP/IP Environment
1.0 General Background
Transmission Control Protocol/Internet Protocol (TCP/IP) is a suite of protocols designed to allow communication between networks regardless of the technologies implemented in each network. The concept that underlies the creation and development of TCP/IP is that of internetworking, a topic which has recently developed as a major focus of commercial, technological and cultural interest. Within the area of business applications in particular there are undoubted benefits, and fundamental changes, to be considered.
The explosion in network design in the late 1960s and 70s led to the development of multiple network models (e.g. packet switching, collision detection LANs, hierarchical networks, etc.), as well as the emergence of protocol layering (allowing data exchange between applications). This evolution has led to the creation of multifarious physical network designs as transports for many different protocols. Considered individually these networks are proven designs; unfortunately the downside of this vast proliferation has been the development of a great many diverse networks which are isolated from one another due to physical differences and non-standard protocols. These issues are addressed by the principle of internetworking, i.e. the definition of a set of protocols to allow inter-application communication regardless of underlying network technology and operating systems.
Contemporary with these developments, the (American) Defense Advanced Research Projects Agency (DARPA) instigated research into network connectivity with a view to creating a large network. The requirement was for a network that could link sites of various platforms and operating systems and that could withstand the loss of major components (e.g. in the event of a "nuclear incident", as it is referred to in RFC 1000 - The History of the Internet & TCP/IP). The robust functionality of the network was provided by dynamic re-routing, i.e. automatic routing around lost hosts in the network.
In 1978 DARPA adopted TCP/IP as their standard protocol for the handling of internetwork communication, via a backbone known as ARPANET. In 1983 TCP/IP became the mandatory suite of protocols for anyone wishing to connect to the ARPANET. Originally this network could only support 256 addresses (the maximum number possible from 8 bit addressing). The adoption of different classes of networks (Class A, B or C) and addressing based on dotted decimal notation has greatly increased the number of possible addresses.
The original network has proliferated globally into a plethora of sites providing a huge array of information and services based on the ubiquitous Internet. The network itself is overseen by the Internet Architecture Board (IAB, known until 1992 as the Internet Activities Board) and this body sets the standards for TCP/IP. Under the IAB are several groups, two of which are charged with the future development of TCP/IP - the Internet Engineering Steering Group (IESG) and the Internet Research Task Force (IRTF). The standards that define the Internet and TCP/IP are referred to as RFCs (Requests for Comments) and these are available online (go to the Internet Society website for full details). References to RFCs are commonplace in TCP/IP; we have already mentioned the text RFC 1000.
As well as achieving the original connectivity goals, TCP/IP concepts are at the core of the development of client/server based systems. The traditional mainframe has also found new areas of development along the lines of client/server relationships, as well as the emerging data-warehousing approach. TCP/IP also played a large role in the emergence of UNIX based systems and this networking capability is one of the most notable features of the UNIX systems. Whichever data-processing solution does emerge ahead of its competitors in the next few years, it would appear that TCP/IP will be fundamental to it.
2.0 Basic Internetworking Concepts
2.1 TCP/IP - The Concept of "Layering"
An important fundamental idea that defines the internetwork architecture is the concept of layering. Below is a model representing layering when applied specifically to TCP/IP.
This concept is also known as modularity. The above architectural model can be broken down as follows;
2.2 Comparison to the OSI Model
The International Organization for Standardization (ISO) developed the Open Systems Interconnection (OSI) model as a reference model for vendors. By the late 1970s the ISO was promoting OSI as the de facto standard for creating what it referred to as interoperability between systems on different network architectures. OSI failed to catch on in the way the ISO had hoped, partly due to the complexity of the technical specification but also due to the success of TCP/IP (TCP/IP actually had a ten year head-start in terms of development). However, the modular approach that laid the foundation of OSI has given TCP/IP a definitive model to follow. The OSI model actually specifies more layers than TCP/IP, but both models are basically analogous (the shaded area in the table represents the TCP/IP suite of protocols).
| TCP/IP Model | OSI Model | Function | Example Protocols |
|---|---|---|---|
| Application | Application | The "user" level, e.g. e-mail & FTP. | RPC, SMTP, FTP, TFTP, DNS, NFS & telnet. |
| Application | Presentation | Converts files between formats (if required). | |
| Application | Session | Co-ordinates sessions between hosts in the internetwork. | |
| Transport | Transport | Verifies correct reception of packets & places them in the correct order (TCP does this "better" than UDP - see later). | TCP, UDP & ICMP. |
| Internetwork | Network | Routes packets to correct address. Also optimises routing when possible. | IP, IPng, ARP & RARP. |
| Data-Link | Data-Link | Splits data into packets & is hardware-based (e.g. Token-Ring or ethernet). | n/a. |
| Physical | Physical | Cable, radio-wave, satellite, etc. and network interface card. | n/a. |
2.3 Packet Movement within the TCP/IP Stack
As a packet is received by a host it enters the stack (i.e. the sequence of layers that make up the TCP/IP model) at the lowest point, i.e. the physical layer. Each layer then proceeds to "strip out" the data it needs and passes the remaining packet (if it is required to) to the next layer up in the stack. Likewise, when the application layer creates a packet it will pass it down to the next layer, where further information is added by it and by every subsequent layer until it emerges onto the physical cable as a "full" packet.
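The journey down and up the stack can be sketched in a few lines of Python. This is only an illustration under loose assumptions: the bracketed "headers" are plain text tags rather than real TCP, IP or Ethernet header formats, and the names `send_down` and `receive_up` are invented for the example.

```python
# Illustrative only: each "header" is a plain text tag, not a real protocol header.
LAYERS = ["TCP", "IP", "ETH"]  # transport, internetwork, data-link (top to bottom)


def send_down(app_data: bytes) -> bytes:
    """Wrap application data with one header per layer on the way down the stack."""
    packet = app_data
    for layer in LAYERS:
        packet = f"[{layer}]".encode() + packet
    return packet


def receive_up(packet: bytes) -> bytes:
    """Strip one header per layer on the way up, lowest layer first."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not packet.startswith(header):
            raise ValueError(f"expected {layer} header")
        packet = packet[len(header):]
    return packet


frame = send_down(b"put report.txt")
print(frame)              # b'[ETH][IP][TCP]put report.txt'
print(receive_up(frame))  # b'put report.txt'
```

The header added last on the way down is the first to be removed on the way up, which is why each layer only ever needs to understand its own header.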
Below is an example of an application to application connection; in this case it is the "put" command under FTP.
2.4 Internetwork Addresses
To be able to identify an internetwork host, each host is assigned an address, the IP address. TCP/IP associates each host name with an IP address. In turn, the IP address is associated with a unique hardware address. This IP to hardware association is made by the Address Resolution Protocol (ARP) and can be supplemented by the Routing Information Protocol (RIP) and Open Shortest Path First (OSPF) protocols.
ARP stores this network information in ARP tables and compiles these from information held on other hosts' network interface cards. This information is extracted from the network interface cards by generating an ARP request. This is done when a physical address is found to be missing from a host's ARP table following a connection request. The ARP request is a broadcast message asking a host to respond with its physical address. Each host maintains an ARP table and benefits from responses to all ARP requests, even ones it didn't issue.
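The ARP behaviour described above can be modelled as a toy simulation. Nothing here touches a real network: the `Host` and `Lan` classes, the addresses and the hardware addresses are all hypothetical, and the learning rule simply follows the description in the text (every host on the segment caches the answer), which simplifies what a real ARP implementation does.

```python
from typing import Dict, List, Optional


class Host:
    def __init__(self, ip: str, mac: str):
        self.ip = ip
        self.mac = mac
        self.arp_table: Dict[str, str] = {}  # IP address -> hardware address


class Lan:
    """A toy broadcast segment; every host sees every ARP exchange."""

    def __init__(self, hosts: List[Host]):
        self.hosts = list(hosts)

    def resolve(self, requester: Host, target_ip: str) -> Optional[str]:
        # 1. Use the cached entry if the requester already has one.
        if target_ip in requester.arp_table:
            return requester.arp_table[target_ip]
        # 2. Otherwise "broadcast" the request; only the owner of the IP replies.
        for host in self.hosts:
            if host.ip == target_ip:
                # 3. Every other host on the segment caches the answer.
                for listener in self.hosts:
                    if listener is not host:
                        listener.arp_table[target_ip] = host.mac
                return host.mac
        return None  # nobody answered


a = Host("192.0.2.1", "aa:aa:aa:aa:aa:01")
b = Host("192.0.2.2", "aa:aa:aa:aa:aa:02")
c = Host("192.0.2.3", "aa:aa:aa:aa:aa:03")
lan = Lan([a, b, c])

print(lan.resolve(a, "192.0.2.2"))  # aa:aa:aa:aa:aa:02
print(c.arp_table)                  # {'192.0.2.2': 'aa:aa:aa:aa:aa:02'}
```

Host `c` never issued a request, yet its table now holds the entry for `b`, which is the benefit the paragraph above refers to.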
Any one host's IP address (or internetwork address) consists of two parts;
IP Address = Network Address / Host Address
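As a rough sketch of this split, the function below divides a dotted decimal address into its network and host parts using the Class A, B and C boundaries mentioned earlier. The function name `split_classful` is invented for the example, the sample addresses are arbitrary, and modern networks normally use explicit prefix lengths rather than classes.

```python
import ipaddress


def split_classful(address: str):
    """Return (class letter, network part, host part) for a Class A, B or C address."""
    ip = ipaddress.IPv4Address(address)
    first_octet = int(ip) >> 24
    if first_octet < 128:
        cls, prefix = "A", 8
    elif first_octet < 192:
        cls, prefix = "B", 16
    elif first_octet < 224:
        cls, prefix = "C", 24
    else:
        raise ValueError("Class D/E addresses have no network/host split")
    host_bits = 32 - prefix
    # Clear the host bits to get the network address; mask them to get the host number.
    network = ipaddress.IPv4Address((int(ip) >> host_bits) << host_bits)
    host = int(ip) & ((1 << host_bits) - 1)
    return cls, str(network), host


print(split_classful("10.1.2.3"))    # ('A', '10.0.0.0', 66051)
print(split_classful("172.16.5.9"))  # ('B', '172.16.0.0', 1289)
print(split_classful("192.0.2.77"))  # ('C', '192.0.2.0', 77)
```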
The Domain Name System (DNS) translates (or resolves) the host name into a numeric value (represented as dotted decimal notation). For example;
This website = www.francome.com
The same address when translated by DNS into a value recognisable to the hosts;
This website = 18.104.22.168
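Dotted decimal notation is simply a readable way of writing a 32-bit number, one decimal value per byte. A minimal sketch of the conversion in both directions follows; the helper names and the sample address 192.0.2.1 are chosen purely for illustration.

```python
def dotted_to_int(address: str) -> int:
    """Convert dotted decimal notation to the 32-bit value it represents."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted decimal IPv4 address: {address!r}")
    value = 0
    for octet in octets:
        value = (value << 8) | octet  # shift in one byte at a time
    return value


def int_to_dotted(value: int) -> str:
    """Convert a 32-bit value back to dotted decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(dotted_to_int("192.0.2.1"))  # 3221225985
print(int_to_dotted(3221225985))   # 192.0.2.1
```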
Each host in the internet must have access to either DNS or maintain its own hosts file (it is preferable to have both and update the hosts file with the most frequently used addresses from DNS).
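The "hosts file first, DNS second" arrangement can be sketched as below. The sample hosts text, the names `parse_hosts` and `resolve`, and the host name `fileserver` are all assumptions made for the example; a real system reads its own hosts file and applies its own lookup order.

```python
import socket

# Hypothetical hosts-file content, held in a string so the example is self-contained.
SAMPLE_HOSTS = """\
# address      name(s)
127.0.0.1      localhost
192.0.2.10     fileserver fs
"""


def parse_hosts(text: str) -> dict:
    """Build a name -> address map from hosts-file style text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), address)
    return table


def resolve(name: str, hosts: dict) -> str:
    """Try the local hosts table first, then ask the system resolver (DNS)."""
    if name.lower() in hosts:
        return hosts[name.lower()]
    return socket.gethostbyname(name)  # raises socket.gaierror if unresolvable


hosts = parse_hosts(SAMPLE_HOSTS)
print(resolve("fileserver", hosts))  # 192.0.2.10
```

Names found in the local table are answered without any network traffic, which is the reason for keeping the most frequently used addresses there.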
2.5 Internet Gateway Connections
Building on the previously discussed concepts, below is an example of how a LAN based user may connect to an external address through two physical networks. This can be compared with the previous architectural model. The diagram refers to IP datagrams; these are the fundamental units of information passed across the internetwork. They contain source and destination addresses, the actual data and several fields that flag the datagram disposition. As shown below, the actual gateway function is performed by IP.
The source address here is address A (network address X/local host A), the destination address B (network address Y/local host B).
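The decision IP makes for each datagram can be sketched as follows: if the destination lies on the sender's own network it is delivered directly, otherwise it is handed to the gateway. The function name `next_hop`, the prefix length and all of the addresses are assumptions for illustration only.

```python
import ipaddress


def next_hop(source: str, destination: str, prefix_len: int, gateway: str) -> str:
    """Return the address the datagram should be handed to next."""
    # strict=False lets us derive the network from a host address.
    local_net = ipaddress.ip_network(f"{source}/{prefix_len}", strict=False)
    if ipaddress.ip_address(destination) in local_net:
        return destination  # same network: deliver directly
    return gateway          # different network: hand to the gateway


print(next_hop("192.0.2.10", "192.0.2.20", 24, "192.0.2.1"))    # 192.0.2.20
print(next_hop("192.0.2.10", "198.51.100.7", 24, "192.0.2.1"))  # 192.0.2.1
```

In the terms used above, the second call is the case where network X differs from network Y, so the datagram travels via the gateway.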
2.6 The TCP/IP Protocol Suite
When referring to TCP/IP it is important to bear in mind that it is a suite of protocols, services and applications. The TCP part, in fact, may not necessarily be involved in the transmission of a packet. Below are listed some of the more important protocols, although this list is by no means comprehensive;
There are also two gateway (or router) specific protocols;
2.7 TCP versus UDP
At the transport layer TCP and UDP are usually the most active protocols. RPC (Remote Procedure Call) traffic can use either protocol, and each one has relative strengths. UDP messages are smaller and more efficient than TCP messages. UDP does not provide guaranteed delivery or error recovery (it offers no more than a checksum); under UDP these operations are handled by the application layer. TCP, by contrast, does provide these controls and therefore the application does not have to. However, TCP segments carry a larger header (at least 20 bytes against UDP's 8) and the acknowledgements TCP exchanges result in greater network traffic. Refer to the next section for further information regarding RPC and sockets programming.
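The difference shows up directly in the sockets interface. The sketch below sends the same bytes over UDP and over TCP on the loopback address of a single machine, using Python's standard `socket` module. The function names are invented for the example, and because both ends run in one process it demonstrates only the programming model, not real network behaviour.

```python
import socket

LOOPBACK = "127.0.0.1"


def udp_roundtrip(message: bytes) -> bytes:
    """Send one datagram: no connection set-up, no delivery guarantee."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind((LOOPBACK, 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)      # a lost datagram would otherwise block forever
        sender.sendto(message, receiver.getsockname())
        data, _addr = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()


def tcp_roundtrip(message: bytes) -> bytes:
    """Send the same bytes over a connection that TCP sets up and acknowledges."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind((LOOPBACK, 0))
        listener.listen(1)
        client.connect(listener.getsockname())  # the connection is set up here
        conn, _addr = listener.accept()
        with conn:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)     # signal end of data
            chunks = []
            while True:
                chunk = conn.recv(4096)
                if not chunk:                   # empty read: sender has finished
                    break
                chunks.append(chunk)
            return b"".join(chunks)
    finally:
        client.close()
        listener.close()


print(udp_roundtrip(b"hello"))
print(tcp_roundtrip(b"hello"))
```

Note that the UDP version needs a timeout of its own, because nothing in the protocol will report a lost datagram; the TCP version can simply read until the sender closes its side.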
2.8 Sockets Programming, APIs and RPCs
When a TCP/IP application is started on one host and asks for a connection to a remote host, the following steps are invoked;
1. At the application level your client (the TCP/IP application running on your host) creates a socket. A socket is required on each host for a valid connection (i.e. two sockets per connection) and each host will keep track of the sockets it is currently using. Each socket contains -
2. The local host sends a connection request to the server application on the remote host, asking for permission to use the remote service. Within this initial request will be the remote host's IP address, the port number of the requested service and the local socket (i.e. the originating socket from the first host) number.
3. The remote host creates its own socket to use for the duration of the connection. The remote host then replies with an acknowledgement which contains its own socket information, directed back to the originating socket address.
4. A connection has now been established and confirmed. The client application on the originating host can then use the remote server application. When the connection is terminated both hosts will then recycle the port addresses on which the sockets had existed.
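The four steps above can be followed with Python's standard `socket` module, here with both ends on the loopback address of one machine. In the Berkeley sockets interface a socket address is an (IP address, port) pair, and the sketch reports the pair held at each end. The function name `connect_and_describe` is invented for the example.

```python
import socket


def connect_and_describe() -> dict:
    """Open a loopback connection and report the socket address held at each end."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # step 1: client socket
    try:
        server.bind(("127.0.0.1", 0))         # port 0 asks the OS for any free port
        server.listen(1)
        client.connect(server.getsockname())  # step 2: connection request
        conn, peer = server.accept()          # step 3: remote side gets its own socket
        with conn:
            return {
                "client_local": client.getsockname(),
                "client_remote": client.getpeername(),
                "server_local": conn.getsockname(),
                "server_remote": peer,
            }
    finally:
        client.close()  # step 4: both ends release their sockets
        server.close()


info = connect_and_describe()
print(info)
```

Each end's "remote" address is the other end's "local" address, which is how the two sockets together identify one connection.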
APIs and RPCs
Once a connection is established the actual data transfer can take place. The Application Programming Interface (API) describes the interfaces a program must use to call TCP/IP functions. Functions are standard programs supplied with the TCP/IP software (although you can supplement these with your own bespoke functions). These functions are stored in libraries; under Windows and NetWare these are termed dynamic link libraries (DLLs). Parameters may be required for a chosen function. For each function the API description lists the number of parameters required, in which order the parameters must appear and the data-type of each parameter.
In addition to APIs, Remote Procedure Calls (RPCs) function in much the same way but on different hosts, i.e. the calling program and the called routine reside on different hosts. An RPC follows a similar routine to that described above for sockets with the communication optimised for better performance. RPCs lie at the heart of the client/server relationship.
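To give a feel for the idea, the sketch below uses the XML-RPC support in Python's standard library as a stand-in. This is not the RPC mechanism the article has in mind, and for convenience the "remote" routine runs in a background thread on the same machine; the names `add` and `call_remote_add` are invented for the example.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """The "remote" procedure; in real use it would live on another host."""
    return a + b


def call_remote_add(a, b):
    """Start a throwaway RPC server, call add() through it, then shut it down."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    try:
        port = server.server_address[1]
        proxy = xmlrpc.client.ServerProxy(f"http://127.0.0.1:{port}/")
        return proxy.add(a, b)  # looks like a local call, runs in the server
    finally:
        server.shutdown()
        server.server_close()


print(call_remote_add(2, 3))  # 5
```

The calling code never opens a socket itself; the RPC layer packages the parameters, sends them and returns the result, which is what makes the call resemble an ordinary function call.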
Sockets, APIs and RPCs are very much part of the programmer's responsibility, and unless you are involved in detailed application development, knowledge of these topics is not a prerequisite for a good understanding of TCP/IP.
2.9 Operator Commands in the TCP/IP Environment
The TCP/IP suite does provide various commands and methods that can be used in problem determination exercises (which is handy, as they are used frequently). Any user can employ these commands to check what the network is (or is not) doing. The most often used commands are;