CCN_Unit2_Slides (3).pdf
Document Details
Tags
Full Transcript
COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer: Stop and Wait protocols Department of Electronics and Communication Engineering 2 COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer Introduction Pac...
COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer: Stop and Wait protocols Department of Electronics and Communication Engineering 2 COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer Introduction Packet loss is said to occur due to: Corrupt packet discarded at the receiver Packet was discarded at a router due to lack of buffer space Packets experience long queuing delays in a router Reliable data transfer (RDT) It is a fool-proof mechanism for overcoming packet loss It requires a connection oriented approach (i.e., sender and receiver must agree to some parameters for monitoring packets using some handshaking) Throughput and delay may be compromised By default, UDP does not guarantee reliable data transfer TCP offers reliable rata transfer (the only QoS guaranteed) 3 COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer System model Consider two hosts A (Sender) and B (Receiver) which wish to communicate over a network reliably Reliable data transfer between hosts A and B is achieved when they agree to monitor the packets exchanged and notify one another if packet losses are detected This is accomplished by some handshaking between A and B before data packets are exchanged We build the principles of reliable data transfer systematically We start with an ideal case and incrementally add complexity to the transport layer protocol Hereafter, the transport layer protocol is referred to as RDT protocol 4 COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer Overview of the stages of development of the RDT protocol Stop and wait RDT protocols: Host A sends one packet at a time Host A waits for an acknowledgement from the host B to transmit the next packet We will see four versions of the stop and wait RDT protocols with each version addressing one limitation of its predecessor Pipelining RDT protocols: Host A sends multiple packets to host B at a time Host A waits for the acknowledgements from host B within a fixed time interval (referred to as a Timeout) Upon learning successfully delivery of the packets in previous round, the next batch of packets are transmitted 5 COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer Pipelining RDT protocols (contd.): We will see two versions of the pipelining RDT protocols (GBN and SR). Compared to the stop and wait RDT protocols, these pipelining RDT protocols provide better link utilization Transmission control protocol (TCP): It is a hybrid of the above pipelining RDT protocols but with sophistication of its own. TCP is robust compared the above pipelining RDT protocols. TCP makes host A adaptive to network congestion and packet overflow problems that may occur at host B. 6 COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer Stop and wait RDT protocol – Version 1 7 COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer Stop and wait RDT protocol – Version 1 (RDT1.0): No bit errors and no packet delays What should be the role of the RDT protocol in this context? An application in host A generates a message. Host A segments the message into several packets. Transport layer in host A inserts the source and destination port numbers (i.e., encapsulation) and passes it to network layer. The network layer protocol delivers the datagram to host B. 8 COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer Stop and wait RDT protocol – Version 2 (RDT 2.0): Bit errors in packet transmissions from A to B but no packet delays. How should RDT1.0 be modified to handle the above issue? Host A introduces checksum (i.e., error detection code) into the packet before passing it to the network layer. Host B verifies if packet is corrupt or not. If packet is corrupt, then host B sends an NAK packet; otherwise host B sends an ACK packet. If host A receives NAK packet, it retransmits the old packet. If host A receives ACK packet, it retransmits next packet with new checksum. 9 COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer Stop and wait RDT protocol – Version 3 (RDT2.2): Bit errors in two way packet transmissions but no packet delays. Now, it is difficult to distinguish between old packets and new packets exchanged between A and B. Two types of packets are used by host B (ACK and NAK) which is unnecessary given that they may get corrupted as well How should RDT2.0 be modified to handle the above issues? Hosts introduce sequence numbers to identify packets (0 and 1 used alternatively) besides using the checksum to detect errors. Host B just uses ACK packets with sequence number (0 or 1) Example: Host A sent packet 0, then it deems the transmission successful only when an ACK packet with sequence number 0. Otherwise, Host A retransmits the packet 0. 10 COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer Stop and wait RDT protocol – Version 4 (RDT3.0): Alternating Bit Protocol Bit errors and packet delays occur in two-way packet transmissions. Now, host A may end up waiting endlessly hoping the packet is stuck in some intermediate router’s queue How should RDT2.2 be modified to handle the above issues? Host A starts a timer as soon as it transmitted a packet using a sequence number and checksum. Host B replies just as in RDT2.2 depending on whether the received packet is corrupt, new or old.In case ACK packet is lost/corrupt/old, host A retransmits after timer expires. Otherwise, host A sends the next packet and starts timer 11 COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer 12 COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer 13 COMPUTER COMMUNICATION NETWORKS Principles of reliable data transfer: Pipelining, GBN and SR Department of Electronics and Communication Engineering 2 COMPUTER COMMUNICATION NETWORKS Pipelining Rather than operate in a stop-and-wait manner, the sender is allowed to send multiple packets without waiting for acknowledgments. If the sender is allowed to transmit three packets before having to wait for acknowledgments, the utilization of the sender is essentially tripled. Since the many in-transit sender-to-receiver packets can be visualized as filling a pipeline, this technique is known as pipelining. 3 COMPUTER COMMUNICATION NETWORKS Pipelining 4 COMPUTER COMMUNICATION NETWORKS Pipelining Concept of pipelining Stop-and-Wait RDT 5 COMPUTER COMMUNICATION NETWORKS Pipelining Concept of pipelining 6 COMPUTER COMMUNICATION NETWORKS Pipelining Changes due to pipelining. Range of sequence numbers is increased. The number of packets in the pipeline is fixed. Use transmit and receive buffers. Timers are used for ascertaining packet loss Types of pipelining 1. Go-back N protocol. 2. Selective repeat protocol 7 COMPUTER COMMUNICATION NETWORKS GBN 8 COMPUTER COMMUNICATION NETWORKS GBN Uses k-bit sequence numbers The range of sequence numbers is given by [0, 2k-1] 9 COMPUTER COMMUNICATION NETWORKS Pipelining Concept of GBN 10 COMPUTER COMMUNICATION NETWORKS Pipelining Concept of GBN 11 COMPUTER COMMUNICATION NETWORKS 12 COMPUTER COMMUNICATION NETWORKS Pipelining Timing diagram of Go Back N and SR (seq. nos. 0-7 and transit at most 4 packets) Scenario 1: No packet loss GBN: Given figure For SR: Same figure with individual timers for each packet 13 COMPUTER COMMUNICATION NETWORKS Pipelining Timing diagram of Go Back N (seq. nos. 0-7 and transit at most 4 packets) Corrupt packets: Case 1: In-order ACK repeated Case 2: Corrupt ACK followed by correct ACK is taken as cumulative acknowledgement 14 COMPUTER COMMUNICATION NETWORKS Pipelining Concept of SR GBN fills the pipeline When the window size and bandwidth-delay product are both large, many packets can be in the pipeline. A single packet error can thus cause GBN to retransmit a large number of packets, many unnecessarily. As the probability of channel errors increases, the pipeline can become filled with retransmissions. Selective-Repeat protocols avoid unnecessary retransmissions by having the sender retransmit only those packets that it suspects were received in error (that is, were lost or corrupted) at the receiver. 15 COMPUTER COMMUNICATION NETWORKS 16 COMPUTER COMMUNICATION NETWORKS 17 COMPUTER COMMUNICATION NETWORKS 18 COMPUTER COMMUNICATION NETWORKS Pipelining Timing diagram of SR (seq. nos. 0-7 and transit at most 4 packets) Only some timers depicted Corrupt packets: Case 1: In-order ACK repeated Case 2: Corrupt ACK followed by correct ACK but no cumulative acknowledgement 19 COMPUTER COMMUNICATION NETWORKS Pipelining Summary: Features of pipelining protocol Sequence numbers are used to distinguish between old packets and new packets. Checksum is used for error detection. Sliding windows and buffers are maintained by sender and receiver to keep track of packets in transit. Sequence number of last correctly received (in-order) packet is maintained. Timers are used by sender to limit the occurrence of packet retransmissions. Choosing small value for the timer causes more retransmissions while a large value causes less throughput 20 COMPUTER COMMUNICATION NETWORKS Pipelining Summary: Comparison of Go-back N and Selective Repeat GBN takes cumulative acknowledgements whereas SR doesn’t. Suppose some ACKs were missing, GBN retransmits all packets under the sliding window whereas SR retransmits selectively. A receiver using GBN does not store out-of-order packets whereas it does store out- of-order when using SR. Sender and receiver sliding windows are in sync under GBN whereas under SR they can be out-of-sync. Both of them retransmit unacknowledged packets upon timeout 21 COMPUTER COMMUNICATION NETWORKS Pipelining Summary: Limitations of Go-back N and Selective Repeat Timer values are fixed arbitrarily. Transmission rate (i.e., sliding window length) is fixed. Due to the above two limitations these protocols can neither exploit low network congestion nor can prevent causing network congestion. Message segmentation is performed arbitrarily. Buffer overflow may occur at the receiver. Two-way data transmission cannot be implemented. Only one pair of processes between sender-receiver can communicate at any time. Hence, we need an adaptive protocol which overcomes the above limitations.. The answer is Transmission Control Protocol (TCP) 22 COMPUTER COMMUNICATION NETWORKS Pipelining 1 A. Consider using a stop and wait protocol between two hosts who are connected by a direct link of rate 1 Mbps. Let the one way propagation delay be 99.5 ms. For a packet of size 1000 bits, determine the link utilization (%) and the throughput of the sender (bits/s). Explain your result with appropriate formulae. 23 COMPUTER COMMUNICATION NETWORKS Pipelining Link utilization is expressed as transmission delay transmission delay + RTT 1000 106 = 6 −3 = 0.5 % 1000 10 + 2 × 99.5 × 10 Throughput of the sender is packet size = 5000 bps transmission delay + RTT 24 COMPUTER COMMUNICATION NETWORKS Pipelining 1B. Suppose a pipelining protocol is used instead of a stop and wait protocol in the question 1A. Calculate the number of packets to achieve 100% link utilization. What will be the number of bits required to represent the sequence numbers in this case? 25 COMPUTER COMMUNICATION NETWORKS Pipelining One packet achieves only 0.5% link utilization. To achieve 100% utilization, the number of packets to be transmitted is given by 100% × 1 0.5% = 200 The number of bits required to represent the sequence numbers is log 2 200 = 7.6439 ⟹ 8 bits 26 COMPUTER COMMUNICATION NETWORKS TCP Segment Format - details Department of Electronics and Communication Engineering 2 COMPUTER COMMUNICATION NETWORKS TCP Segment Format - details The maximum amount of data that can be grabbed and placed in a segment is limited by the maximum segment size (MSS). It defines the largest size of the packet that can be transmitted as a single entity in a network connection. The size of the MTU dictates the amount of data that can be transmitted in bytes over a network. MSS is set to ensure that a TCP segment (when encapsulated in an IP datagram) plus the TCP/IP header length (typically 40 bytes) will fit into a single link-layer frame. 3 COMPUTER COMMUNICATION NETWORKS TCP Segment Format - details MSS is the maximum amount of application-layer data in the segment, not the maximum size of the TCP segment including headers. 4 COMPUTER COMMUNICATION NETWORKS TCP Segment Format - details 5 COMPUTER COMMUNICATION NETWORKS TCP Segment Format - details 32 bits source port # dest port # segment seq #: counting ACK: seq # of next expected sequence number bytes of data into byte stream byte; A bit: this is an ACK (not segments!) acknowledgement number head not length (of TCP header) len usedC E U A P R S F receive window flow control: # bytes Urg data pointer receiver willing to accept options (variable length) C, E: congestion notification TCP options application data sent by RST, SYN, FIN: connection data application into management (variable length) TCP socket 6 COMPUTER COMMUNICATION NETWORKS TCP Segment Format - details The 32-bit sequence number field and the 32-bit acknowledgment number field are used by the TCP sender and receiver in implementing a reliable data transfer service. The 16-bit receive window field is used for flow control. The 4-bit header length field specifies the length of the TCP header in 32-bit words. The TCP header can be of variable length due to the TCP options field. (Typically, the options field is empty, so that the length of the typical TCP header is 20 bytes.) 7 COMPUTER COMMUNICATION NETWORKS TCP Segment Format - details The optional and variable-length options field is used when a sender and receiver negotiate the maximum segment size (MSS) or as a window scaling factor for use in high-speed networks. A timestamping option is also defined. (See RFC 854 and RFC 1323 for additional details) 8 COMPUTER COMMUNICATION NETWORKS TCP Segment Format - details The flag field contains 6 bits. ○ The ACK bit is used to indicate that the value carried in the acknowledgment field is valid; that is, the segment contains an acknowledgment for a segment that has been successfully received. ○ The RST,SYN, and FIN bits are used for connection setup and teardown. ○ The CWR and ECE bits are used in explicit congestion notification. ○ Setting the PSH bit indicates that the receiver should pass the data to the upper layer immediately. ○ The URG bit is used to indicate that there is data in this segment that the sending- side upper-layer entity has marked as “urgent.” The location of the last byte of this urgent data is indicated by the 16-bit urgent data pointer field. TCP must inform the receiving-side upper layer entity when urgent data exists and pass it a pointer to the end of the urgent data 9 COMPUTER COMMUNICATION NETWORKS TCP SEQUENCE Numbers and ACKs Host A Host B NOTE: one byte segment is being exchanged between sender and receiver User types‘C’ Seq=42, ACK=79, data = ‘C’ host ACKs receipt of‘C’, echoes back ‘C’ Seq=79, ACK=43, data = ‘C’ host ACKs receipt of echoed ‘C’ Seq=43, ACK=80 simple telnet scenario 10 COMPUTER COMMUNICATION NETWORKS TCP SEQUENCE Numbers and ACKs Sequence numbers: A message may be divided based on the MSS into TCP segments. The 1st byte in each segment is assigned a unique sequence number before being transmitted. For the 1st segment a random 32-bit number is assigned. The received segments are reordered using these numbers 11 COMPUTER COMMUNICATION NETWORKS TCP SEQUENCE Numbers and ACKs Acknowledgement numbers Used by receiver to inform the sender of missing segments. Receiver acknowledges each received segment. Receiver stores out-of-order segments. Replies with acknowledgement of the last correctly received in-sequence segment. Cumulative acknowledgements are accepted by the sender in case out-of-order acknowledgements are received 12 COMPUTER COMMUNICATION NETWORKS TCP SEQUENCE Numbers and ACKs Suppose that Host A has received all bytes numbered 0 through 535 from B and suppose that it is about to send a segment to Host B. Host A is waiting for byte 536 and all the subsequent bytes in Host B’s data stream. So Host A puts 536 in the acknowledgment number field of the segment it sends to B. 13 COMPUTER COMMUNICATION NETWORKS TCP SEQUENCE Numbers and ACKs Suppose that Host A has received one segment from Host B containing bytes 0 through 535 and another segment containing bytes 900 through 1,000. For some reason Host A has not yet received bytes 536 through 899. In this example, Host A is still waiting for byte 536 (and beyond) in order to re-create B’s data stream. Thus, A’s next segment to B will contain 536 in the acknowledgment number field. Because TCP only acknowledges bytes up to the first missing byte in the stream, TCP is said to provide cumulative acknowledgments 14 COMPUTER COMMUNICATION NETWORKS TCP SEQUENCE Numbers and ACKs Host A received the third segment (bytes 900 through 1,000) before receiving the second segment (bytes 536 through 899). Thus, the third segment arrived out of order. The subtle issue is: What does a host do when it receives out-of-order segments in a TCP connection? Depends upon the people programming a TCP implementation. Two choices: (1) the receiver immediately discards out-of-order segments (which, can simplify receiver design), or (2) the receiver keeps the out-of-order bytes and waits for the missing bytes to fill in the gaps. (which is more efficient in terms of network bandwidth, and is the approach taken in practice. 15 COMPUTER COMMUNICATION NETWORKS TCP SEQUENCE Numbers and ACKs Not zero 16 COMPUTER COMMUNICATION NETWORKS TCP SEQUENCE Numbers and ACKs The acknowledgment for client-to-server data is carried in a segment carrying server-to-client data; this acknowledgment is said to be piggybacked on the server-to-client data segment. The segment has 80 in the acknowledgment number field because the client has received the stream of bytes up through byte sequence number 79 and it is now waiting for bytes 80 onward. Because TCP has a sequence number field, the segment needs to have some sequence number. This segment has an empty data field (that is, the acknowledgment is not being piggybacked with any client-to-server 17 data). COMPUTER COMMUNICATION NETWORKS Round trip time estimation and timeout Department of Electronics and Communication Engineering 2 TCP: Reliable data transfer Using concepts like GBN and SR, TCP we incorporate aspects such as Pipelining, Timers and Cumulative acknowledgements Therefore, the important parameters for reliable data transfer include Sequence numbers, Acknowledgement numbers, Sliding window size and Timeout interval However, TCP’s retransmission strategy is based on network congestion Further, the timers and transmission rate (i.e., sliding window aka congestion window variation) are adaptive TCP also considers flow control, support for multi- process communication and two way data transfer between hosts 3 TCP: Reliable data transfer Timeout interval estimation: Timeout interval depends on round trip time (RTT) RTT refers to the time taken to get an acknowledgement since the segment was passed to network layer RTT is measured for one of the TCP segments at a time Instantaneous value of RTT is denoted as SampleRTT RTT is measured only for freshly transmitted segments Segments are chosen randomly Time averaged statistics are generated for the RTTs Mean value is denoted as EstimatedRTT and standard deviation is denoted as DevRTT Timeout interval is chosen randomly based on the time 4 average value and standard deviation of the RTTs TCP: Reliable data transfer Timeout interval estimation (contd.): Figure: RTT versus transmission rounds 5 TCP: Reliable data transfer Timeout interval estimation (contd.): Exponential weighted moving average model to update mean and standard deviation of RTT Timeout interval for next round, following successful transmissions in the present round, is updated as Recommended values of α and β are 0.125 and 0.25 6 respectively TCP: Reliable data transfer Procedure for updating Timeout interval Initialize the Timeout interval (i.e., timer expiry) to 1 second Select a fresh segment and measure its SampleRTT If all acknowledgement are received before timer expired, then Timeout interval is updated for next round using the equations in previous slide If timeout occurs (i.e., timer expired but at least one acknowledgement missing), the Timeout interval is doubled Packets without acknowledgements in previous round are retransmitted and timer is set to the new Timeout interval The above steps repeat as TCP segments with payload 7 are sent Practice problems Numerical #18 Assume five samples of RTT were obtained as 106ms, 120ms, 140ms, 90ms and 115ms. Assume the initial values of EstimatedRTT and DevRTT as 100ms and 5ms respectively. Calculate the Timeout interval for each transmission round using the corresponding values of SampleRTT given above. Consider two cases: a) No packet loss in any round and b) Packet loss occurs only in 3rd round. 8 TCP: Reliable data transfer Transmissions during a TCP connection (simple scenarios): A congestion window indicates the batch of segments ready for transmission Assign sequence numbers to the segments The segments are transmitted and one timer is initiated Each transmitted segment should be acknowledged (cumulative acknowledgement is allowed) Acknowledgement number assigned at the receiver 9 COMPUTER COMMUNICATION NETWORKS Reliable data transfer under TCP Department of Electronics and Communication Engineering 2 TCP: Reliable data transfer Suppose acknowledgement was lost timeout Retransmit the segment and double the timer compared to previous value 3 TCP: Reliable data transfer Suppose acknowledgements arrived after timeout Retransmit the old segment and double the timer compared to previous value No need to retransmit 4 segment with seq. no. TCP: Reliable data transfer Suppose acknowledgement for 92 was lost but acknowledgement for 100 arrived before timeout. No retransmission : Cumulative acknowledgement! Errors happen due to network congestion. We invoke the congestion control algorithm to adapt congestion window (covered 5 later) TCP: Reliable data transfer Use of duplicate ACKs: Segment can arrive out-of-order, receiver follows conservative approach when acknowledging Receiver waits (e.g., 500ms) before sending ACK Still, if out-of-order segments arrivals occur, receiver sends duplicate ACK The duplicate ACK corresponds to the last correctly received segment If number of duplicate ACKs received by sender within timeout equals 3 then sender terminates its timer abruptly Sender begins fast retransmit of the missing segment 6 TCP: Reliable data transfer Fast retransmit: Retransmitting the missing segment before the segment’s timer expires 7 COMPUTER COMMUNICATION NETWORKS Flow control and TCP connection management Department of Electronics and Communication Engineering 2 TCP: Flow control Each host allots a buffer to store the incoming segments The segments are accessed by the application layer When application reads the data slower than the rate at which segments were received into the buffer then packet overflow happens This flow of incoming segments can be notified to the sender using the receive buffer field in the TCP header Because TCP is full-duplex, the sender at each side of the connection maintains a distinct receive window Consider host A is sending a large file to host B and B has allocated a receive buffer of size RcvBuffer 3 TCP: Flow control LastByteRead: seq. no. of the last byte read from the receive buffer by the application process in B LastByteRcvd: seq. no. of the last byte in the data stream that arrived from the network and placed in the receive buffer at B 4 TCP: Flow control Condition to prevent overflow at the receiver buffer of B The receive window, denoted by rwnd, is given by 5 TCP: Flow control The receiver sends the available buffer space (i.e., receive window rwnd) to the sender in its acknowledgements The sender maintains the following information about the TCP connection, denoted as LastByteSent and LastByteAcked LastByteSent – LastByteAcked gives the amount of segments in transit as per host A (sender) So to ensure flow control (i.e., preventing buffer overflow at host B (receiver) the below condition must be maintained When receive buffer becomes 0, host A (sender) sends 1 byte segments which will ensure host B keeps acknowledging 6 TCP connection management First and second segments have a TCP header > 20 bytes The maximum segment size (e.g., MSS is 1460 bytes for Ethernet) an the window scaling are decided in this stage When client prepares the SYN segment, it chooses a random sequence number (denoted by client_isn) but sets acknowledgement number to 0 When server prepares the SYNACK segment, it chooses a random sequence number (server_isn) but sets acknowledgement number to client_isn+1 After receiving SYNACK segment, client allocates buffer After receiving the ACK segment, the server allocates buffer and parameters for sliding window 7 TCP connection management SYN flag set SYN flag set & ACK flag set SYN flag reset & ACK flag set Client allocates buffer Server allocates buffer TCP connection is ready 8 TCP connection management FIN flag set by client FIN flag set by server Server releases buffer and closes TCP connection Client releases buffer and closes TCP connection 9 TCP connection management States seen by client 10 TCP connection management States seen by server 11 COMPUTER COMMUNICATION NETWORKS Principles of congestion control Department of Electronics and Communication Engineering 2 Principles of congestion control Congestion refers to a condition where the network experiences flow rates exceeding their transmission rates In such a situation, the routers could experience queue saturations leading to some packets being dropped TCP implemented on hosts detects the packet loss using timeout events (i.e., missing acknowledgements) A higher transmission rate by a host could lead to higher packet loss (i.e., low throughput) when congestion occurs in the network Luckily, TCP implements congestion control algorithm to minimize the impact of network congestion on the application performance 3 Principles of congestion control Let’s see some illustrative examples Scenario 1: Two Senders, a Router with Infinite Buffers Hosts A and B transmit at a rate λin of each and link rate is R Router allocates link rate equally to the two packet flows 4 Principles of congestion control When the router’s buffer is infinite, the maximum rate achieved (i.e., throughput) by any packet flow is R/2 Queuing delay increases with λin and saturates when λin ≥ R/2 No packet loss but queuing delay grows with λin 5 Principles of congestion control Scenario 2: Two Senders and a Router with Finite Buffers The two flows and the link rate are same as in Scenario 1 λ'in is called offered rate 6 Principles of congestion control Case 1: Senders ensure λ’in = λin is always below R/2 Case 2: Suppose senders transmit at λ’in ≥ R/2 but throughput λout = R/3 Case 3: Suppose retransmission occur for every transmitted packet (i.e., λin = λ’in/2) No packet loss in case 1, Packets are lost in cases 2 & 3 when λ’in ≥ R/2 7 Principles of congestion control Scenario3: Multi-hop flows, four routers with finite buffer 8 Principles of congestion control Scenario3: Multi-hop flows, four routers with finite buffer Consider 4 packet flows AC, BD, CA and DB In a multihop scenario, when a packet is dropped by a router, the work done by all upstream routers up to that point is wasted For example, at router R2, flow BD traverses one hop whereas flow AC traverses two hops. Suppose the flow AC was successfully delivered by router R1 but dropped at R2 due to the high transmission rate of BD Thus, while AC was not the cause of congestion, it still suffers packet loss and retransmit. 9 Principles of congestion control Scenario3 (contd.) In the worst case, retransmissions for the flow AC may never get through because packets from B are always present at R2. Thus throughput of AC can approach zero! This could have been alleviated had one of the following occurred R1 delays the transmission of the flow AC due to the congestion at R2 Alternatively, A defers the packet transmission due to the congestion at R2 For either of the cases to happen, we need some form of congestion notification or detection mechanism 10 Principles of congestion control Approaches to Congestion control: Two broad approaches End-to-end congestion control: The end system must infer a congestion event (where congestion occurs is irrelevant). The end system reduces the rate of transmission by decreasing the sliding window TCP uses this approach where congestion is inferred from timeout events or triple duplicate ACK events Network assisted congestion control: A router explicitly sends a notification packet to the sender about the congestion (direct) Alternatively, the router sends a choke packet to the receiver and the receiver in turn notifies the sender (indirect) 11 Principles of congestion control The network assisted congestion control was used in the ATM network architecture, IBM’s SNA architecture and DECs DECnet architecture It is not popular in TCP/IP model used in the internet 12 TCP: Congestion control Transmission rate depends on congestion window and RTT Congestion window in TCP is adaptive (i.e., it increases and decreases based on network congestion after each round) Congestion window (aka sliding window and denoted by cwnd) was fixed arbitrarily under both GBN and SR cwnd is expressed in bytes (LastByteSent – LastByteAcked) TCP segment length is called maximum segment size (MSS) Note that MSS is negotiated by sender and receiver using SYN and SYNACK segments via Options field in the TCP header Packet length (L) transmitted = MSS + IP header + link layer header 13 COMPUTER COMMUNICATION NETWORKS Classic TCP congestion control Department of Electronics and Communication Engineering 2 TCP: Congestion control Congestion is detected by a sender based on two events Timer expires and some acknowledgements were not received Triple duplicate ACKs were received Congestion control algorithm modifies the congestion window size based on the presence or absence of packet loss When no packet losses are detected, then the sender increases the value of cwnd (i.e., sender’s rate increases) When packet losses are detected, then the sender decreases the value of cwnd (i.e., sender’s rate decreases) Classical TCP uses a conservative strategy in exploring network congestion (i.e., probing available bandwidth) 3 TCP: Congestion control The congestion control algorithm is described in RFC5681 Congestion control algorithm has three phases: Slow start (present in TCP Reno and TCP Tahoe) Congestion avoidance (present in TCP Reno and TCP Tahoe) Fast recovery (present only in TCP Reno) Slow start and congestion avoidance phases are implemented one after the other upon occurrence of time out events Fast recovery is implemented in TCP Reno upon detecting triple duplicate ACK event TCP Tahoe is oblivious to triple duplicate ACK events, hence it does not implement fast recovery phase 4 TCP: Congestion control Slow start phase Note that the RTT(i.e., timeout interval set by the sender) is based on the equations given slides 220- 221 As long as there is no time out event, the slow start phase aggressively increases the sender’s rate up to some threshold 5 TCP: Congestion control Slow start phase 1. Initial cwnd is 1 MSS 2. For every new acknowledgement received in the transmission round, cwnd increases by 1 MSS In other words, when all packets in a given transmission round are acknowledged, then the cwnd value of the next round is double the value of cwnd of the current round (i.e., binary exponential increase) 3. When packet loss is detected due to timeout, a slow start threshold (ssthresh) is first calculated and then slow start phase repeats (i.e., go to step 1) ssthresh = 0.5×cwnd and then cwnd = 1 MSS 6 TCP: Congestion control Slow start phase (contd.): 5. During the slow start phase following the timeout event, when cwnd equals ssthresh then transition into congestion avoidance phase takes place 6. Suppose a triple duplicate ACK event is detected during any transmission round, then TCP Reno implements fast recovery phase after the following calculation ssthresh=0.5×cwnd and then cwnd=ssthresh+3×MSS Note that the +3×MSS is included to account for the increment in cwnd corresponding to each duplicate acknowledgement 7 TCP: Congestion control Congestion avoidance phase: 1. The congestion avoidance (CA) phase is initiated when the cwnd value under slow start phase becomes greater than equal to the ssthresh value 2. During the CA phase, when the cwnd is incremented by 1/cwnd for every new acknowledgement received In other words, when all transmitted packets in the given round are acknowledged the cwnd value is incremented by 1 4. A next segment (if waiting in the buffer) is transmitted after receiving a new acknowledgement 5. When timeout event is detected, a new value of ssthresh is calculated (i.e., 0.5×cwnd) and slow start phase commences 8 TCP: Congestion control Congestion avoidance phase (contd.): 5. When a triple duplicate ACK event is detected during the congestion avoidance phase, TCP Reno implements the fast recovery phase after the following calculation ssthresh=0.5×cwnd and then cwnd=ssthresh+3×MSS 9 TCP: Congestion control Fast recovery phase (TCP Reno only): 1. When a triple duplicate ACK event is detected, the timer is terminated immediately and the oldest unacknowledged packet is transmitted immediately Refer to slide 188 (fast retransmit strategy) 2. Then the acknowledgement for this retransmitted packet is awaited by the sender 3. Suppose the transmission is successful, the algorithm transitions to congestion avoidance phase 4. Suppose triple duplicate ACK event is again detected, the algorithm commences slow start phase after the following calculation and retransmits the missing segment ssthresh = 0.5×cwnd and then cwnd = 1 MSS 10 11 TCP: Congestion control 12 TCP: Congestion control The behaviour of TCP can be analyzed in a simplified manner if we ignore the slow start phase (i.e., no timeout events) The sender’s rate increases linearly and decreases by a factor of two when triple duplicate ACK event occurs Hence, we can dub TCP’s congestion control as additive increase multiplicative decrease (AIMD) 13 Practice problems Numerical #19 Consider sending a large file from a host to another over a TCP connection that has no loss. Suppose TCP uses AIMD for its congestion control without slow start. Assuming cwnd increases by 1 MSS every time a batch of ACKs is received and assuming approximately constant round-trip times, how long does it take for cwnd increase from 6 MSS to 12 MSS (assuming no loss events)? How long did it take to transmit from the 6th segment to 12th segment? 14 Practice problems Numerical #20: Consider a 10Mbps link between the sender and the receiver. Suppose that this link is the only congested link between the sending and receiving hosts. Assume that using TCP, the sender has a huge file to send to the receiver, and the receiver’s receive buffer is much larger than the congestion window. We also make the following assumptions: each TCP segment size is 1500 bytes; the two-way propagation delay of this connection is 150 ms; and this TCP connection is always in congestion avoidance phase. What is the maximum window size (in segments) that this TCP connection can achieve? What is the mean congestion window size (in segments) and the mean throughput (in bps) before any congestion occurs? How long would it take for this TCP connection to reach its maximum window again after recovering from a packet loss? 15 Practice problems Numerical #21: Consider 4 packets, each having 100 bytes, are transmitted by host A to host B using TCP with only congestion avoidance. Assume TCP connection was already complete. Let the sequence numbers of host A and host B to be 1 and 1000 respectively, for subsequent communication. Suppose third packet is corrupted during transmission and the acknowledgement for second packet is lost. Draw a timing diagram including the retransmissions. Assume retransmissions are successful. Timing diagram must depict sequence number and acknowledgement number for each packet transaction. 16 Practice problems Numerical #21 (contd.): Consider 4 packets, each having 100 bytes, are transmitted by host A to host B using TCP Reno. Assume TCP connection was already complete. Let the sequence numbers of host A and host B to be 1 and 1000 respectively, for subsequent communication. Sequence numbers from B to A can be ignored. Suppose second packet is corrupted during transmission but others are received correctly by host B. Draw a timing diagram including the retransmissions. Assume retransmissions are successful. Timing diagram must depict sequence number and acknowledgement number for each packet transaction. 17 COMPUTER COMMUNICATION NETWORKS IPv4 Datagram format Department of Electronics and Communication Engineering 2 Network layer functions Host H1 sends transport layer segments to H2 via a path of routers. Routers implement up to Layer 3 3 Network layer functions Routers are oblivious of the transport layer services which support the application This is analogous to a user sending a letter using postal service. However, the postal department is not concerned about the contents of the letter or whether the recipient should reply Therefore, the transport layer in the source host passes the segments to its network layer Network layer prepares datagrams by inserting an IP header In other words, segments are encapsulated into datagrams Recall that there are further encapsulation and decapsulation The IP headers are sufficient for the routers to guide the datagrams until they reach the destination host 4 Network layer functions Functions at host level Obtaining IP address for host Converts segments to datagrams with the specified IP version Error detection of IP header Functions at network level Assigning address to hosts Forwarding datagrams Switching operation within a router Routing in the internet 5 IPv4 – Datagram format Specified in RFC791 6 IPv4 – Datagram format Version: 4-bit number specifying IP version 4. Routers identify the datagram type using this field. Header length: 4-bit field to give the size of the IP header. It is expressed as 32-bit words. When options field is (rarely) present the length exceeds 20 bytes. Type of service: 8-bit field for providing differentiated services (e.g., real-time and non-real-time) in networks Datagram length: 16-bit field for proving the length of the IP header plus payload. Identifier, flags, fragmentation offset: IPv4 routers fragment large datagrams based on the MTU supported by the links 7 IPv4 – Datagram format Identifier, flags, fragmentation offset: When fragmentation is performed, these fields help in reassembling the original datagram at the destination host. Identifier is a 16-bit field, flags occupy 3 bits and fragmentation offset is a 13-bit field For retransmitted TCP segments, the Identifier field is the same Time-to-live (TTL): 8-bit field which indicates the number of hops the packet can travel without being discarded. A router decrements this field by 1 upon processing Protocol: 8 bit field to describe the upper layer protocol for the datagram (e.g., 6-TCP, 17-UDP, 1- ICMP) 8 IPv4 – Datagram format Header checksum: 16-bit field for error detection. Only the header fields are used in the calculation. Whenever a datagram is forwarded or fragmented this field is updated Source and destination IP addresses: 32-bit fields each. Routers use the destination IP field for delivering the datagrams to the destination host. Give a usage of source IP address Options: This field is rarely used and has been removed in IPv6. Payload: Variable length field which holds the upper layer information encapsulated into the datagram. The max size of payload is 65,535 bytes, however, rarely this size occurs. Datagram size in Ethernet range is 46- 1500 bytes 9 IPv4 – Datagram format Wireshark packet capture – IP v4 datagram 10 COMPUTER COMMUNICATION NETWORKS IPv4 addressing Department of Electronics and Communication Engineering 2 Addressing in datagram networks Host/router connects with a communication link via an interface (e.g., Ethernet interface or wireless interface) Each interface of a host/router is assigned an IP address IP address is 32-bit number expressed in dotted- decimal format What is a subnet? Subnet or IP network represents a network in which the IP addresses have a common portion starting with the most significant bit. The common portion is called prefix Routers lie on the periphery of a subnet. Subnet can consist of switches, hubs, cables or wireless interfaces. 3 Addressing in datagram networks The network address is of the form a.b.c.d/x where a, b, c and d are 8 bit decimal numbers and x is prefix For example, 192.168.1.0/24 means that the first 24 bits are common among the IP addresses in the subnet. Remaining 32–x bits are variable i.e., all zeros to all ones of length 32–x The above format is called Classless Interdomain Routing (CIDR) defined in RFC4632 What is subnet mask? Subnet mask is an alternate way of representing the prefix in 32-bit dotted decimal format. The subnet mask is expressed as x ones and 32–x zeros For example, /24 is expressed as 255.255.255.0 4 Addressing in datagram networks How many subnets are here? 5 Addressing in datagram networks How many subnets are here? 6 Addressing in datagram networks ClDR: Classless Inter-domain Routing Has the format of a.b.c.d/x CIDR overcomes the previous classful addressing schemes Class A has 8 bit subnet address, Class B has 16 bit subnet address and Class C has 24 bit subnet address ICANN assigns address blocks to registrars who in turn distribute portions of the address blocks to the associated ISPs. Under CIDR, IP addresses are distributed hierarchically by the ISPs Whenever a block of IP addresses are distributed, the prefix increases 7 Addressing in datagram networks ClDR (contd.) Consider an address space 223.1.3.0/24 24 bits from the leading bit are common to all IP addresses in this address space The remaining 8 bits are free variables which can take a value from 0 (all zeros) to 255 (all ones) The first address in the address space (223.1.3.0) is assigned to the network while the last address (223.1.3.255) in the address space is used for network broadcast The remaining IP address (223.1.3.1 – 223.1.3.254) can be assigned to any IP interface 8 Addressing in datagram networks ClDR (contd.) As the prefix is 24 in the address space 223.1.3.0/24, the subnet mask in binary form is given by 24 ones followed 8 zeros In dotted decimal form the subnet mask becomes 255.255.255.0 Consider another example 223.1.3.64/28 As the leading 28 bits are common the last 4 bits are free variables (i.e., last 8 bits vary from 0100 0000 to 0100 1111) So the addresses vary from 223.1.3.64 (assigned to the network) to 223.1.3.79 (assigned for network broadcast) So we have the remaining 14 addresses for IP interfaces Subnet mask for 223.1.3.64/28 is 255.255.255.240 9 Addressing in datagram networks CIDR(contd.) Network prefix increases when a network is divided into smaller subnets The number of available IP addresses decreases as network prefix increases Subnetting example: Consider the network address 223.1.3.0/24. It is divided into 4 subnets of 64 (=26) IP addresses each as follows: For 4 (=22) unique subnets (each having their own network address), the prefix length is extended by 2 bits (i.e., 24+2) That results in the last 6 bits being free variables for a unique combination of the 25th and 26th bit 10 Addressing in datagram networks CIDR(contd.) Subnetting example: Note that the 1st to 24th bit is common across all subnets 25th bit 26th bit Network address # IP address Subnet mask 0 0 223.1.3.0/26 64 255.255.255.192 0 1 223.1.3.64/26 64 255.255.255.192 1 0 223.1.3.128/26 64 255.255.255.192 1 1 223.1.3.192/26 64 255.255.255.192 Suppose subnet 223.1.3.64/26 is to be further divided into 4 (=22) subnets of 16 (=24) IP addresses each. The last four bits are treated as free variables for a unique combination of the 27th and 28th bit. Prefix extended by 2 The resulting subnets are given in the next slide 11 Addressing in datagram networks CIDR(contd.) Subnetting example: Note that the 1st to 24th bit is common across all subnets 25th 26th 27th 28th Network # IP Subnet mask bit bit bit bit address address 0 0 Free Free 223.1.3.0/26 64 255.255.255.192 0 1 0 0 223.1.3.64/28 16 255.255.255.240 0 1 0 1 223.1.3.80/28 16 255.255.255.240 0 1 1 0 223.1.3.96/28 16 255.255.255.240 0 1 1 1 223.1.3.112/28 16 255.255.255.240 1 0 Free Free 223.1.3.128/26 64 255.255.255.192 1 1 Free Free 223.1.3.192/26 64 255.255.255.192 12 Problems Numerical #22: Consider a subnet with prefix 128.119.40.128/26. Give an example of one IP address (of form a.b.c.d) within the subnet. Give example of a network address in which this subnet resides. Suppose an ISP owns the block of addresses of the form 128.119.40.64/26. Suppose it wants to create four subnets from this block, with each block having the same number of IP addresses. What are the prefixes (of form a.b.c.d/x) for the four subnets? What are the four network addresses? 13 Problems Numerical #23: Consider a router that interconnects three subnets: Subnet 1, Subnet 2, and Subnet 3. Suppose all of the interfaces in each of these three subnets are required to have the prefix 223.1.17/24. Also suppose that Subnet 1 is required to support at least 60 interfaces, Subnet 2 is to support at least 90 interfaces, and Subnet 3 is to support at least 12 interfaces. Provide three network addresses (of the form a.b.c.d/x) that satisfy these constraints. 14 Problems What are the mistakes in the given figure? Assume each subnet can support a maximum of 254 IP interfaces. Provide suitable corrections. 15 Addressing in datagram networks Route summarization or route aggregation IP address space which is hierarchically divided can also be aggregated. The prefix is reduced in the process From the table in the slide 223, we can note that the four subnets 223.1.3.64/28, 223.1.3.80/28, 223.1.3.96/28 and 223.1.3.112/28 can be combined into 223.1.3.64/26 as up to the 26th bit is common in the addresses under these subnets Similarly, all the subnets in slide 223 can be combined into 223.1.3.0/24 (or simply 223.1.3/24) as up to the 24th bit is common in all these subnets 16 Addressing in datagram networks Hierarchical address assignment 17 Addressing in datagram networks Hierarchical address assignment 18 Addressing in datagram networks Hierarchical address assignment 19 Problems Write one network address which can be used by router R2 to advertise both 10.0.128.0/24 and 10.0.129.0/24. 20 COMPUTER COMMUNICATION NETWORKS IPv4 DHCP Department of Electronics and Communication Engineering 2 IPv4 DHCP Once an organization has obtained a block of addresses, it can assign individual IP addresses to the host and router interfaces in its organization. A network administrator can manually assign IP addresses to various interfaces or alternatively use the DHCP What would be the drawback of manual IP assignment? What should be assigned to newly arriving host? IP address Subnet mask Gateway address DNS server 3 IPv4 DHCP DHCP: Dynamic host configuration protocol Client server model for obtaining IP address Server listens for UDP messages on port 67 DHCP allows a host to obtain (be allocated) an IP address automatically Host communicates initially using broadcast address Broadcast address: 255.255.255.255 Host sends a DHCP discovery message DHCP servers replies with DHCP offer messages Host replies with DHCP request DHCP server acknowledges with the assigned IP address 4 Do we need DHCP server per subnet? 5 6 COMPUTER COMMUNICATION NETWORKS NAT Department of Electronics and Communication Engineering 2 Network address translation (NAT) IP addresses allotted by an ISP to a consumers (e.g., home networks, enterprise networks) could be insufficient In such cases, private IP networks are created and the public IPs, allotted by the ISPs, are shared Private IP address can be assigned manually or using DHCP Private hosts can access the public internet using NAT NAT maps the private IP addresses to the public IP addresses NAT table maintains such mapping Interfaces of NAT-enabled routers have some interfaces assigned to private IP addresses whereas some in interfaces assigned the public IP addresses given by the ISP 3 Network address translation (NAT) 4 Network address translation (NAT) Objections by purists Ports used for addressing processes not hosts Routers should perform only level 3 functions Violates end to end principle Slowing down adoption of IPv6 5 COMPUTER COMMUNICATION NETWORKS IPv4 Datagram format Department of Electronics and Communication Engineering 2 Internet Protocol version 6 Proposed by IETF (actually started to address the limitations of IPv4) Initiated in 1998 and came into effect on 6 June 2012 Provides 3.4 ×1038 IP addresses of 128 bits each Currently implemented by new ISPs and carriers About 22% of the IP traffic is due to IPv6 as of 2018 Fixed 40 byte header compared to the 20 byte (variable length) IPv4 header It is more robust to IP spoofing and attacks Besides unicast and multicast (IPv4 modes), supports anycast communications 3 Internet Protocol version 6 4 Internet Protocol version 6 Version: 4-bit field identifies the IP version number. Traffic class: 8-bit traffic class field, like the TOS field in IPv4. Used to give priority to certain datagrams within a flow, or it can be used to give priority to datagrams from certain applications Flow label: 20-bit field is used to identify a flow of datagrams Payload length: 16-bit value is treated as an unsigned integer giving the number of bytes in the IPv6 datagram following the fixed-length, 40-byte datagram header. 5 IPv6 – Tunnelling (RFC4213) 6