Assigned Reading
Chapter 10
Network Protocols
A protocol is the set of conventions (i.e. rules) governing the exchange of information between paired entities in two systems.
What Defines a Protocol?
Syntax
The syntax of a network protocol is the structure used in communicating information, such as the formatting of data and control information. Consider the Transmission Control Protocol or TCP as an example:
<tcp header image>
Lexicon
The lexicon of a network protocol is the collection of valid entries within the structure of the syntax. In other words, the lexicon is the vocabulary of the protocol. Continuing our TCP example, the TCP lexicon dictates that the RST, SYN, and FIN flags are mutually exclusive: at most one of them should be set in any given TCP header.
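As a minimal sketch of how an implementation might check this part of the lexicon, the C function below reads the rule as "RST, SYN, and FIN are mutually exclusive". The flag bit values follow the standard TCP header layout; the helper name `tcp_flags_valid` is hypothetical and not taken from any real TCP stack.

```c
#include <stdbool.h>
#include <stdint.h>

/* TCP flag bits as laid out in the flags byte of the TCP header. */
#define TCP_FLAG_FIN 0x01u
#define TCP_FLAG_SYN 0x02u
#define TCP_FLAG_RST 0x04u
#define TCP_FLAG_PSH 0x08u
#define TCP_FLAG_ACK 0x10u

/* Hypothetical helper: true when at most one of RST, SYN, FIN is set. */
bool tcp_flags_valid(uint8_t flags)
{
    unsigned exclusive = flags & (TCP_FLAG_FIN | TCP_FLAG_SYN | TCP_FLAG_RST);

    /* x & (x - 1) clears the lowest set bit, so the result is zero
     * exactly when x has zero or one bits set. */
    return (exclusive & (exclusive - 1u)) == 0;
}
```

A SYN+ACK header passes this check, while a header with both SYN and FIN set is rejected.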
Semantics
The semantics of a network protocol is the meaning of the data or control information exchanged, including the actions that result from the data or control information. Building further on our TCP example, if a duplicate acknowledgement (i.e. ACK) is received in a TCP connection, the software implementing TCP on the end host will ignore the duplicate.
    if (rx_tcp_packet->ack_num > TCP_CURRENT_ACK_NUM) {
        /* process packet */
    } else {
        /* ignore packet -- it was a duplicate */
    }
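One caveat about the comparison above: TCP acknowledgement numbers are 32 bits wide and wrap around to zero, so a plain `>` gives the wrong answer near the wrap point. A common remedy is to compare in modular arithmetic. The sketch below shows one way to do that; the function name `ack_is_newer` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when acknowledgement number a comes after b in 32-bit modular
 * sequence space, i.e. a is ahead of b by less than half the space. */
bool ack_is_newer(uint32_t a, uint32_t b)
{
    uint32_t ahead = a - b;  /* unsigned subtraction wraps modulo 2^32 */

    return ahead != 0 && ahead < 0x80000000u;
}
```

With this helper, an ACK number of 5 is correctly treated as newer than 0xFFFFFFF0, because the counter wrapped in between.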
Timing
The timing of a network protocol is the use of a clock to internally generate events not initiated by an external stimulus, such as the arrival of a packet. An example of timing in TCP is the use of timers to detect TCP connection timeouts and to retransmit lost data.
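To illustrate a timer-generated event, here is a minimal sketch of a retransmission timer driven by a millisecond tick counter. It is not how any particular TCP implementation does it: the structure and function names are hypothetical, and the caller is assumed to supply the current tick value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical retransmission timer driven by a millisecond tick counter. */
struct retx_timer {
    uint32_t started_ms;  /* tick value when the data was sent */
    uint32_t timeout_ms;  /* how long to wait for an acknowledgement */
    bool     armed;       /* true while the data is unacknowledged */
};

/* Called when data is sent. */
void retx_timer_start(struct retx_timer *t, uint32_t now_ms, uint32_t timeout_ms)
{
    t->started_ms = now_ms;
    t->timeout_ms = timeout_ms;
    t->armed = true;
}

/* Called when the ACK arrives: the external stimulus cancels the timer. */
void retx_timer_stop(struct retx_timer *t)
{
    t->armed = false;
}

/* Polled on every clock tick. Returns true when the timeout has elapsed
 * with no ACK, meaning the protocol itself must trigger a retransmission.
 * Unsigned subtraction keeps this correct across counter wraparound. */
bool retx_timer_expired(const struct retx_timer *t, uint32_t now_ms)
{
    return t->armed && (uint32_t)(now_ms - t->started_ms) >= t->timeout_ms;
}
```

Note that the retransmission is triggered by the clock alone; no packet has to arrive for `retx_timer_expired` to become true.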
Finite State Machine
Together, the semantics and timing of a network protocol define the protocol's behavior. This behavior can be represented by a finite state machine. Typically, the finite state machine of a network protocol is drawn as a state diagram. Consider a simplified state diagram for TCP:
<simple TCP diagram>
For those who are curious, this is the actual, far more complex state diagram for TCP:
<real TCP diagram>
Don't worry, we won't cover the actual state machine of TCP in ECE 4400/6400. However, if you're interested, ECE 4380/6380 is highly recommended.
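To make the idea of a protocol state machine concrete, here is a minimal sketch in C of a toy, client-side connection state machine. It is a deliberate simplification and does not reproduce either diagram above: the state and event names are hypothetical, and real TCP has more states and retransmits on timeout rather than giving up.

```c
/* States and events for a toy, client-side connection state machine.
 * This is a teaching simplification, not the real TCP state diagram. */
enum conn_state { ST_CLOSED, ST_SYN_SENT, ST_ESTABLISHED, ST_FIN_WAIT };

enum conn_event {
    EV_ACTIVE_OPEN,  /* application asks to connect (we send SYN) */
    EV_RX_SYN_ACK,   /* peer's SYN+ACK arrives (we send ACK) */
    EV_CLOSE,        /* application asks to close (we send FIN) */
    EV_RX_FIN_ACK,   /* peer acknowledges our FIN */
    EV_TIMEOUT       /* timer fired with no response from the peer */
};

/* Returns the next state. Events that are not valid in the current
 * state leave the state unchanged. */
enum conn_state conn_next_state(enum conn_state s, enum conn_event e)
{
    switch (s) {
    case ST_CLOSED:
        if (e == EV_ACTIVE_OPEN) return ST_SYN_SENT;
        break;
    case ST_SYN_SENT:
        if (e == EV_RX_SYN_ACK) return ST_ESTABLISHED;
        if (e == EV_TIMEOUT)    return ST_CLOSED;
        break;
    case ST_ESTABLISHED:
        if (e == EV_CLOSE) return ST_FIN_WAIT;
        break;
    case ST_FIN_WAIT:
        if (e == EV_RX_FIN_ACK || e == EV_TIMEOUT) return ST_CLOSED;
        break;
    }
    return s;
}
```

Packet arrivals (semantics) and `EV_TIMEOUT` (timing) both drive transitions, which is why the two together define the protocol's behavior.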
What Features Should/Could a Protocol Provide?
Synchronization
The goal of synchronization is to maintain consistent state information between the communicating entities in two systems. In other words, synchronization allows both entities to maintain the same finite state machine. TCP, for example, keeps both end hosts synchronized using sequence numbers and acknowledgements.
Connection Control
Connection control maintains a logical association between two communicating entities. It can include:
- connection establishment
- data transfer
- flow control
- guaranteed delivery
- in-order delivery
- connection termination
For example, IP only provides data transfer as a connection control mechanism. It delegates connection establishment and termination, flow control, and guaranteed and in-order delivery to other network protocols, such as TCP. TCP provides all connection control features listed above.
Transmission Services
The goal of transmission services is to provide those "nice to have" features of a network protocol that are not strictly required for basic data transfer. Think of them as options or packages when you buy a car: do you really need the M3 or will a 3 series do just fine? Transmission services can include:
- priority
- quality of service
- security
As an example, IP headers contain what is called the differentiated services code point (DSCP), formerly known as "type of service". DSCP allows communicating entities to classify packets for priority or special handling by network forwarding devices. Possible classifications include but are not limited to "best effort", "priority", and "critical", which signal the importance of the data in the packet. Real-time streaming video and voice can use the "critical" classification to encourage forwarding devices to prioritize their packets.
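In the IPv4 header, DSCP occupies the upper six bits of the byte that used to be called type of service; the lower two bits are used for explicit congestion notification (ECN). The sketch below shows how that byte can be split and rebuilt. The function names are hypothetical.

```c
#include <stdint.h>

/* The former "type of service" byte holds DSCP in its upper six bits
 * and ECN in its lower two bits. */
uint8_t ip_dscp(uint8_t tos_byte)
{
    return (uint8_t)(tos_byte >> 2);
}

uint8_t ip_ecn(uint8_t tos_byte)
{
    return (uint8_t)(tos_byte & 0x03u);
}

/* Builds the header byte from a 6-bit DSCP value and a 2-bit ECN value.
 * Out-of-range bits in either argument are masked off. */
uint8_t ip_make_tos(uint8_t dscp, uint8_t ecn)
{
    return (uint8_t)(((dscp & 0x3Fu) << 2) | (ecn & 0x03u));
}
```

For instance, DSCP value 46 (Expedited Forwarding, commonly used for voice) with ECN 0 is carried on the wire as the byte 0xB8.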
As for security, the transport layer security protocol or TLS (often referred to by the name of its predecessor, secure sockets layer or SSL) is a protocol often used to encrypt sensitive data, as well as to provide authentication prior to data transfer. It is common to find TLS used on top of TCP to provide a reliable and secure end-to-end connection.
Addressing and Address Resolution
Addressing, as the name implies, provides a way to specify with whom the protocol should communicate. On the other hand, address resolution is a mechanism that allows addresses to be determined or looked up based on other information about the communicating entities.
For example, when we visit websites, a URL is typically entered into the browser, such as www.clemson.edu. This name is resolved to an IP address (assume IPv4) using the domain name system or DNS (i.e. the IP address of the domain www.clemson.edu is looked up). At this point, the end host wishing to connect to www.clemson.edu must relay its message towards the IP address determined. Based on the routes installed on the end host, the next hop forwarding device's MAC (media access control) address is resolved using the address resolution protocol or ARP. Finally, the message to www.clemson.edu can be sent. In this example, the domain name www.clemson.edu is used to look up the IP address of the server running the website. The IP address of the server is used to match a route and then look up the next hop forwarding device's MAC address. Three different forms of addressing are used, and address resolution is employed twice. (Note this example has been simplified. In reality, more addressing and address resolution is likely required to actually send the message to the web server.)
Addresses can be permanent or temporary. Examples of permanent addresses include public IPs and MAC addresses. Temporary addresses can include IPs allocated using dynamic host configuration protocol (DHCP).
Some addressing schemes are flat, such as MAC addresses; others, such as IP addresses, are hierarchical. At Clemson, if you connect to an Ethernet jack, your IP address will be something like 130.127.W.X. If you connect over eduroam at Clemson, it will be 198.21.Y.Z. These are both public IP addresses that can be reached from anywhere in the world (assuming you've punched through the Clemson firewall already, which happens automatically when you use the network). Network forwarding devices "know" that any IP that looks like 130.127.W.X or 198.21.Y.Z should be sent towards Clemson. Other institutions or Internet service providers have different IP address ranges that together create an IP address hierarchy.
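The way a forwarding device "knows" that an address belongs to a range can be sketched as a prefix match. The example below assumes the range 130.127.W.X corresponds to a 16-bit prefix (130.127.0.0/16), which is inferred from the notation above rather than from any routing table; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Packs four dotted-quad octets into one 32-bit IPv4 address. */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Returns true when addr falls inside prefix/prefix_len. */
bool ipv4_in_prefix(uint32_t addr, uint32_t prefix, unsigned prefix_len)
{
    uint32_t mask;

    if (prefix_len > 32u) {
        return false;  /* not a valid IPv4 prefix length */
    }
    /* A shift by 32 is undefined in C, so handle /0 explicitly. */
    mask = (prefix_len == 0u)
         ? 0u
         : (uint32_t)(0xFFFFFFFFu << (32u - prefix_len));

    return (addr & mask) == (prefix & mask);
}
```

Only the leading `prefix_len` bits are compared, so a forwarding device needs one entry for the whole range instead of one entry per host. That is the practical benefit of a hierarchical scheme over a flat one.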
Multiplexing
The act of combining data from multiple sources onto a single connection, link, or path is referred to as multiplexing. The opposite of this, or the removal of data from a single connection, link, or path and forwarding it on separate connections, links, or paths is called demultiplexing. As an example, your home WiFi router accepts data from multiple wirelessly connected hosts (e.g. your phone, laptop, toaster) and multiplexes them onto a single uplink to your modem leading to your internet service provider.
Segmentation and Reassembly
Any physical network has a minimum and maximum data transfer size. For example, most IP connections over Ethernet networks default to a maximum transmission unit or MTU of 1500 bytes. If an end host wishes to send more than 1500 bytes, the data must be broken up into chunks in a process called segmentation. This results in multiple data chunks being sent in place of the single oversized chunk. On the receiving end host, the multiple chunks of data are recombined into a single chunk once again. This process is referred to as reassembly. Note that some network protocols guarantee in-order reassembly (e.g. TCP), while others do not (e.g. UDP).
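The arithmetic behind segmentation can be sketched in a few lines of C. The helpers below are hypothetical and only compute how many chunks are needed and how large each one is; they do not copy any data. In the usage note, 1460 bytes per chunk is an assumption: a 1500-byte MTU minus minimal 20-byte IPv4 and TCP headers.

```c
#include <stddef.h>

/* Number of chunks needed to carry total_len bytes when each chunk can
 * hold at most max_chunk bytes of data. Returns 0 if max_chunk is 0. */
size_t chunk_count(size_t total_len, size_t max_chunk)
{
    if (max_chunk == 0) {
        return 0;
    }
    /* Integer ceiling division without risking overflow. */
    return total_len / max_chunk + (total_len % max_chunk != 0);
}

/* Size in bytes of chunk number index (0-based). Every chunk is full
 * except possibly the last; an out-of-range index yields 0. */
size_t chunk_size(size_t total_len, size_t max_chunk, size_t index)
{
    size_t n = chunk_count(total_len, max_chunk);

    if (index >= n) {
        return 0;
    }
    if (index + 1 < n) {
        return max_chunk;
    }
    return total_len - index * max_chunk;
}
```

For 4000 bytes of data and 1460 bytes per chunk, this gives three chunks of 1460, 1460, and 1080 bytes. Reassembly on the receiver is the reverse: the chunks are concatenated in index order.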
A Vast Sea of Protocols
There are many network protocols designed for a multitude of purposes. For example, as has been mentioned already, MAC, ARP, IP, and DNS help a user connect to a website. Expanding upon that example even more, HTTP (yet another protocol) is used to encode and relay website data between the web server and your browser. We've also discussed the use of the TCP and TLS protocols to reliably and securely send data between two end hosts.
There are also protocols designed specifically for transferring files, such as the file transfer protocol (FTP). Within the network, many protocols are also at work behind the scenes, such as routing information protocol (RIP), open shortest path first (OSPF), and border gateway protocol (BGP), all of which assist in the synchronization and establishment of routes in the network. Diving even deeper into the network, protocols exist on single links, such as link layer discovery protocol (LLDP), which allows forwarding devices to detect and learn links. Lastly, we cannot forget about the lowest-level protocols responsible for physically transmitting the data, such as Manchester encoding over twisted copper pair and frequency shift keying over coaxial cable, just to name a couple.
With the vast array of available network protocols, there is a need for logical classification and organization.
Network Architecture and Layering of Protocols
The network architecture defines the entities that enable data communications, the tasks performed by each entity, and the relationship among the entities. Layering is the division of the network architecture into logical groups or "layers" of related tasks. As a result, layering provides a hierarchical structure of network protocols and the entities that utilize them. Furthermore, layering allows for a "black box" approach, where different network protocols can be interchanged or otherwise updated or modified without modification of the adjoining higher or lower layers.
From the bottom up, each layer provides a defined set of services and guarantees to the next higher layer. Each layer communicates with the peer entity operating at the same layer on the other system.
<layer and peering figure>
Protocol Layers
Physical Layer
- transmission of data symbols over physical medium
- includes bit synchronization between directly connected devices
- physical (electrical or mechanical) interface between devices
- data unit = bit
Data Link Layer
- transfer of blocks of data across communication link between directly connected devices
- includes block synchronization, link error control, and link flow control
- data unit = frame
Network Layer
- transfer of data through communications network
- multiple data link protocols may be used en route between source and destination
- determines best path between source and destination (i.e. routing)
- data unit = packet
Transport Layer
- reliable end-to-end data transfer between source and destination end hosts
- end-to-end error detection and recovery
- end-to-end flow control and congestion response
- segmentation and reassembly of user data blocks
- data unit = segment
Session Layer
- establish and terminate data transfer sessions between applications
Presentation Layer
- data format translation
- encryption/decryption
Application Layer
- user/application software
How can I remember all these layers??
- Please
- Do
- Not
- Throw
- Sausage
- Pizza
- Around
[Shout out to Dr. Russell for this hard-to-forget mnemonic]
Standardization of Layers
The OSI Model
In general, standardization improves the interoperability of products from different vendors by providing a common, "standard" baseline that every product claiming to follow the standard must satisfy. The International Organization for Standardization (ISO) Open Systems Interconnection (OSI) Reference Model (commonly referred to as the OSI model) formally defines the aforementioned layers in a network protocol "stack".
<osi-model-figure>
Over time, the OSI model has proven to have many shortcomings:
- very large number of options
- different layers have various degrees of complexity
- limited "real world" experience/tests prior to establishment
As a result, strict adherence to the OSI model has had limited success in the marketplace. However, many protocols still fit nicely into it and are layered by design.
The TCP/IP Protocol Suite
TCP/IP is a set of protocols that originated at the US Defense Advanced Research Projects Agency (DARPA) in the 1970s and was intended for use among researchers who wished to share computers across multiple universities. TCP/IP laid the groundwork for the Internet and is still the predominant set of protocols used in the Internet today. The success of TCP/IP is largely attributed to its standards being based on practical experience and being less rigid than those outlined in the OSI model. The TCP/IP standards were developed through the Internet Engineering Task Force (IETF) and are described in depth in Request for Comments (RFC) documents.
Below is a figure depicting how the TCP/IP protocol suite fits into the OSI model:
The following figure demonstrates the use of TCP/IP from application-to-application across a network:
Note how the web browser and TCP on the end hosts traverse the entire network, while IP and Ethernet are utilized twice – once on each side of the router. The TCP/IP protocol suite is often referred to as a "stack", where each layer is pushed or popped as necessary in order to traverse each network.
Data Link Layer
- access to physical medium
- error recovery and flow control between hosts on the same subnet
- various protocols e.g. Ethernet
Internetwork Layer
- routing and forwarding of packets
- various protocols e.g. IPv4, IPv6, BGP, RIP
Transport Layer
- segmentation and reassembly
- end-to-end connection management
- end-to-end flow control
- end-to-end error detection and recovery
- various protocols e.g. TCP and UDP
Application Layer
- user application data handling
- various protocols e.g. HTTP, FTP
IEEE LAN Standards
IEEE 802 standards focus on lower layer protocols of shared media LANs. There are a number of 802 standards, each serving a different purpose or designed for a different type of physical LAN.
IEEE 802 Standards
The following is a short list of some IEEE 802 standards you might recognize:
Standard | Description |
---|---|
802.1 | General LAN management |
802.2 | Logical Link Control (LLC) protocol |
802.3 | Ethernet |
802.5 | Token ring |
802.11 | Wireless LAN (i.e. WiFi) |
802.15.1 | Bluetooth |
802.15.4 | ZigBee |
802.16 | WiMAX |
These standards can be subdivided into Logical Link Control (LLC) and Media Access Control (MAC). The figure below shows how these categorizations fit into the OSI model:
Media Access Control (MAC)
The MAC sublayer controls access to a medium that is shared by several entities. It is designed to balance fairness and efficiency and includes link layer addressing and error detection (in the form of a checksum).
Logical Link Control (LLC)
The LLC sublayer is a point-to-point link layer protocol that resides above the MAC sublayer and the shared medium. LLC abstracts away from higher layers the contention that can be present when using a shared medium. It provides three types of service to higher layers:
- Unacknowledged connectionless mode. This mode allows an entity to send frames either to a single destination, to multiple destinations, or to all destinations on the LAN. The first is referred to as unicast, the second as multicast, and the last as broadcast.
- Connection mode. This mode uses sequence numbers to guarantee that frames (1) arrive at the destination and (2) are ordered correctly upon arrival.
- Acknowledged connectionless mode. This mode is for unicast only (otherwise there could be an undesirably large number of acknowledgements sent in return).
Placement of Layers
Each network protocol has both control information and data. The control information is typically implemented as a header with the data following this header. The data is often referred to as the payload of the header.
Note that some protocols also use a trailer at the end of the payload, such as Ethernet, which includes a trailing checksum.
When a network protocol belonging to a higher layer is placed within the data or payload of a lower layer network protocol, the higher layer protocol is said to be encapsulated within the lower layer. An encapsulated protocol therefore plays two roles: it is the payload of the layer below it, and it is itself a network protocol with its own header and data.
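Encapsulation can be illustrated by adding up sizes layer by layer. The sketch below assumes application data carried in TCP over IPv4 over Ethernet, with minimum-size headers (no IP or TCP options, no VLAN tag) and no minimum-frame padding. The function names are hypothetical.

```c
#include <stddef.h>

/* Minimum header and trailer sizes in bytes. */
enum {
    ETH_HEADER_LEN  = 14,  /* destination MAC, source MAC, EtherType */
    ETH_TRAILER_LEN = 4,   /* trailing checksum (frame check sequence) */
    IPV4_HEADER_LEN = 20,
    TCP_HEADER_LEN  = 20
};

/* Each unit is its own header (and trailer) plus its payload, and the
 * payload is the complete unit of the layer above. */
size_t tcp_segment_len(size_t app_data_len)
{
    return TCP_HEADER_LEN + app_data_len;
}

size_t ipv4_packet_len(size_t app_data_len)
{
    return IPV4_HEADER_LEN + tcp_segment_len(app_data_len);
}

size_t eth_frame_len(size_t app_data_len)
{
    return ETH_HEADER_LEN + ipv4_packet_len(app_data_len) + ETH_TRAILER_LEN;
}
```

With 1460 bytes of application data, the IPv4 packet is 1500 bytes and the Ethernet frame is 1518 bytes. The TCP segment is the payload of the IP packet, which in turn is the payload of the Ethernet frame.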