Computer Networks Lecture 02: Application Layer
Alexandria University
2024
Dr. Sahar M. Ghanem
Summary
These are lecture notes for a course on computer networks, covering application layer topics including client-server and peer-to-peer architectures, protocols, and communication methods. The notes are dated 2024 and provide an overview of network applications and their use in various scenarios.
Full Transcript
Computer Networks Lecture 02: Application Layer
Prof. Dr. Sahar M. Ghanem
Associate Professor, Computer and Systems Engineering Department
Faculty of Engineering, Alexandria University
Chapter 2: Application Layer
Computer Networks, 2024 (c) Dr. Sahar M. Ghanem

Outline
Principles of Network Applications
The Web and HTTP
Electronic Mail in the Internet
DNS: The Internet's Directory Service
Peer-to-Peer File Distribution
Video Streaming and Content Distribution Networks
Socket Programming: Creating Network Applications

Applications (1/2)
In the 1970s and 1980s: text e-mail, remote access to computers, file transfers, and newsgroups.
In the mid-1990s: the World Wide Web, encompassing Web surfing, search, and electronic commerce.
In the new millennium:
  voice over IP and video conferencing, such as Skype, FaceTime, and Google Hangouts
  user-generated video, such as YouTube
  movies on demand, such as Netflix
  multiplayer online games, such as Second Life and World of Warcraft
  social networking applications, such as Facebook, Instagram, and Twitter

Applications (2/2)
Smartphones and 4G/5G Internet access:
  location-based mobile apps, including popular check-in, dating, and road-traffic forecasting apps (such as Yelp, Tinder, and Waze)
  mobile payment apps (such as WeChat and Apple Pay)
  messaging apps (such as WeChat and WhatsApp)

Principles of Network Applications

Introduction
When developing a new application, you need to write software that will run on multiple end systems.
You do not need to write software that runs on network-core devices, such as routers or link-layer switches.
Two architectural paradigms predominate in modern network applications: the client-server architecture and the peer-to-peer (P2P) architecture.

Client-server architecture
In a client-server architecture, there is an always-on host, called the server, which services requests from many other hosts, called clients.
Clients do not directly communicate with each other.
The server has a fixed, well-known IP address.
A data center, housing a large number of hosts, is often used to create a powerful virtual server. A data center can have hundreds of thousands of servers.

P2P architecture
In a P2P architecture, there is minimal (or no) reliance on dedicated servers in data centers.
The application exploits direct communication between pairs of intermittently connected hosts, called peers.
An example of a popular P2P application is the file-sharing application BitTorrent.
Compelling features: self-scalability and cost effectiveness.
Challenges: security, performance, and reliability, due to the highly decentralized structure.

Process Communication
How do processes running on different hosts (with potentially different operating systems) communicate?
Processes on two different end systems communicate with each other by exchanging messages across the computer network.
Typically one of the two processes is labeled as the client and the other process as the server. (In P2P file sharing, a process can be both a client and a server.)
A process sends messages into, and receives messages from, the network through a software interface called a socket.

Transport Layer Services
Popular applications have been assigned specific port numbers. A list of well-known port numbers for all Internet standard protocols can be found at www.iana.org.
What services can a transport-layer protocol offer to the applications invoking it?
The possible services can be classified along four dimensions: reliable data transfer, throughput, timing, and security.
The Internet makes two transport protocols available to applications: UDP and TCP.
Neither TCP nor UDP offers throughput or timing guarantees; today's Internet transport protocols do not provide these services.

The Web and HTTP

WWW & HTTP (1/2)
The Web's application-layer protocol is the HyperText Transfer Protocol (HTTP).
The client program and server program, executing on different end systems, talk to each other by exchanging HTTP messages.
HTTP/1.0 dates back to the early 1990s (RFC 1945); it was followed by HTTP/1.1 (RFC 7230); increasingly, browsers and Web servers also support HTTP/2 (RFC 7540).
Most Web pages consist of a base HTML file and several referenced objects.
An object is simply a file that is addressable by a single URL, e.g. an HTML file, a JPEG image, a JavaScript file, a CSS style sheet file, or a video clip.

WWW & HTTP (2/2)
Each URL has two components: the hostname of the server that houses the object and the object's path name.
Web browsers (such as Internet Explorer and Chrome) implement the client side of HTTP.
Web servers, which implement the server side of HTTP, house Web objects. Popular Web servers include Apache and Microsoft Internet Information Server.
HTTP uses TCP as its underlying transport protocol.
The server sends requested files to clients without storing any state information about the client (i.e. HTTP is a stateless protocol).

TCP Connections (1/2)
Each request/response pair can be sent over a separate TCP connection (non-persistent connections), or all of the requests and their corresponding responses can be sent over the same TCP connection (persistent connections).
HTTP uses persistent connections in its default mode. However, HTTP clients and servers can be configured to use non-persistent connections instead, in which case each TCP connection transports exactly one request message and one response message.
Non-persistent connections place a significant burden on the Web server, because a new connection must be established and maintained for each requested object.
In addition, each object suffers a delivery delay of two RTTs (round-trip times): one RTT to establish the TCP connection and one RTT to request and receive the object.

TCP Connections (2/2)
With persistent connections, requests for objects can be made back-to-back, without waiting for replies to pending requests (pipelining).
Users can configure some browsers to control the degree of parallelism (the number of parallel TCP connections).
The HTTP server closes a connection when it has not been used for a certain time (a configurable timeout interval).

HTTP Request Message (1/3)
There are two types of HTTP messages: request messages and response messages.
The request message is written in ordinary ASCII text.
The first line of an HTTP request message is called the request line; the subsequent lines are called the header lines.
The request line has three fields: the method field, the URL field, and the HTTP version field.
The method field can take on several different values, including GET, POST, HEAD, PUT, and DELETE.
The great majority of HTTP request messages use the GET method.

HTTP Request Message (2/3)
The Host: … header line specifies the host on which the object resides.
The Connection: close header line tells the server not to bother with persistent connections.
The User-agent: … header line specifies the user agent, that is, the browser type that is making the request to the server.
The Accept-language: … header line indicates that the user prefers to receive a particular language version of the object, if such an object exists; otherwise, the server should send its default version.
The entity body is empty with the GET method, but is used with the POST method.

HTTP Request Message (3/3)
An HTTP client often uses the POST method when the user fills out a form, for example when a user provides search words to a search engine.
A request generated with a form can also use the GET method and include the entered data (the form fields) in the requested URL.
When a server receives a request with the HEAD method, it responds with an HTTP message but leaves out the requested object (useful, e.g., for debugging).
The PUT method is often used in conjunction with Web publishing tools.
The DELETE method allows a user, or an application, to delete an object on a Web server.

HTTP Response Message
The example response message has three sections: an initial status line, six header lines, and then the entity body.
The status line has three fields: the protocol version field, a status code, and a corresponding status message.
The Date: header line indicates the time and date when the HTTP response was created and sent by the server.
The Last-Modified: header line indicates the time and date when the object was created or last modified.
The Content-Type: header line indicates that the object in the entity body is HTML text.
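
The request and response formats described above can be made concrete with a short Python sketch. It assembles a GET request from a request line and header lines, then splits both a request and a response into their first-line fields, header lines, and entity body. The hostname, path, header values, and the helper names build_get_request and parse_message are illustrative assumptions, not part of the lecture.

```python
def build_get_request(host, path, user_agent="Mozilla/5.0", language="fr"):
    """Assemble a GET request: request line, header lines, then a blank line."""
    lines = [
        f"GET {path} HTTP/1.1",          # request line: method, URL, version
        f"Host: {host}",
        "Connection: close",
        f"User-agent: {user_agent}",
        f"Accept-language: {language}",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"   # empty entity body for GET


def parse_message(text):
    """Split an HTTP message into (first-line fields, headers, entity body)."""
    head, _, body = text.partition("\r\n\r\n")
    first_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Both the request line and the status line have three fields.
    return first_line.split(" ", 2), headers, body


# Hypothetical host and path, used only for illustration.
request = build_get_request("www.example.com", "/somedir/page.html")
(method, url, version), req_headers, req_body = parse_message(request)

# A hand-written response with a status line, two header lines, and a body.
response = (
    "HTTP/1.1 200 OK\r\n"
    "Connection: close\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html>hello</html>"
)
(resp_version, status_code, phrase), resp_headers, resp_body = parse_message(response)

print(method, url, version)
print(status_code, phrase, resp_headers["content-type"])
```

Header names are lower-cased in the parsed dictionary because HTTP header field names are case-insensitive.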

Response Message Status Codes
Common status codes and associated phrases include:
  200 OK: The request succeeded and the information is returned in the response.
  301 Moved Permanently: The requested object has been permanently moved; the new URL is specified in the Location: header of the response message.
  400 Bad Request: A generic error code indicating that the request could not be understood by the server.
  404 Not Found: The requested document does not exist on this server.
  505 HTTP Version Not Supported: The requested HTTP protocol version is not supported by the server.

User-Server Interaction: Cookies (1/3)
An HTTP server is stateless. This simplifies server design and has permitted engineers to develop high-performance Web servers that can handle thousands of simultaneous TCP connections.
However, it is often desirable for a Web site to identify users, either because the server wishes to restrict user access or because it wants to serve content as a function of the user identity.
For these purposes, HTTP uses cookies.

User-Server Interaction: Cookies (2/3)
Cookie technology has four components:
  (1) a cookie header line in the HTTP response message;
  (2) a cookie header line in the HTTP request message;
  (3) a cookie file kept on the user's end system and managed by the user's browser; and
  (4) a back-end database at the Web site.
On the user's first visit, the HTTP response includes a Set-cookie: header, which contains the identification number.
The browser appends a line to the special cookie file that it manages.
Each of the user's subsequent HTTP requests to the server includes a Cookie: header line carrying that identification number.

User-Server Interaction: Cookies (3/3)
Cookies can thus be used to create a user session layer on top of stateless HTTP.
Although cookies often simplify the Internet shopping experience for the user, they are controversial because they can also be considered an invasion of privacy.
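
The four cookie components listed above can be sketched as a toy exchange in Python. This is a minimal simulation under stated assumptions, not a real HTTP implementation: the class names CookieSite and Browser, the hostname shop.example, and the starting identification number 1678 are all hypothetical.

```python
import itertools


class CookieSite:
    """Toy Web site: issues an ID on first contact and recognises it afterwards."""

    def __init__(self):
        self._next_id = itertools.count(1678)   # arbitrary starting ID (assumption)
        self.database = {}                      # back-end database: ID -> requested paths

    def handle(self, path, request_headers):
        response_headers = {}
        cookie = request_headers.get("Cookie")
        if cookie is None or cookie not in self.database:
            # Unknown user: create an ID and send it in a Set-cookie: header.
            cookie = str(next(self._next_id))
            self.database[cookie] = []
            response_headers["Set-cookie"] = cookie
        self.database[cookie].append(path)      # the site can now track this user
        return response_headers


class Browser:
    """Toy browser that manages a cookie file keyed by hostname."""

    def __init__(self):
        self.cookie_file = {}

    def get(self, hostname, site, path):
        request_headers = {}
        if hostname in self.cookie_file:
            request_headers["Cookie"] = self.cookie_file[hostname]
        response_headers = site.handle(path, request_headers)
        if "Set-cookie" in response_headers:
            self.cookie_file[hostname] = response_headers["Set-cookie"]
        return request_headers, response_headers


site = CookieSite()
browser = Browser()
first_request, first_response = browser.get("shop.example", site, "/")
second_request, second_response = browser.get("shop.example", site, "/cart")

print(first_response)    # carries Set-cookie on the first visit
print(second_request)    # carries Cookie on the next request
print(site.database)     # the site's view of this user's activity
```

The database entry is what turns a stateless protocol into a tracked session: the server itself keeps no per-connection state, yet the site can associate both requests with the same identification number.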
As we just saw, using a combination of cookies and user-supplied account information, a Web site can learn a lot about a user and potentially sell this information to a third party.

Web Caching (1/5)
A Web cache, also called a proxy server, is a network entity that satisfies HTTP requests on behalf of an origin Web server.
A user's browser can be configured so that all of the user's HTTP requests are first directed to the Web cache.
A cache is both a server and a client at the same time.
Typically a Web cache is purchased and installed by an ISP.
First, a Web cache can substantially reduce the response time for a client request.
Second, Web caches can substantially reduce traffic on an institution's access link to the Internet, and can substantially reduce Web traffic in the Internet as a whole, thereby improving performance for all applications.

Web Caching (2/5)
Example:
A router in the institutional network and a router in the Internet are connected by a 15 Mbps access link.
The institutional LAN is taken to run at 100 Mbps, the rate used in the LAN calculation.
The average object size is 1 Mbit.
The average request rate from the institution's browsers to the origin servers is 15 requests per second.
The HTTP request messages are negligibly small.
Internet delay: the amount of time from when the Internet-side router forwards an HTTP request until it receives the response is two seconds on average.
Response time = LAN delay + access delay (between the two routers) + Internet delay

Web Caching (3/5)
LAN traffic intensity = (15 requests/sec) * (1 Mbit/request) / (100 Mbps) = 0.15 (at most tens of milliseconds of delay)
Access link traffic intensity = (15 requests/sec) * (1 Mbit/request) / (15 Mbps) = 1 (the delay becomes very large and grows without bound)
Increasing the access rate from 15 Mbps to, say, 100 Mbps is a costly proposition.
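
The two traffic intensities above, and the effect of raising the access rate to 100 Mbps, can be reproduced with a few lines of Python. This is only an arithmetic check of the slide's numbers; the function names traffic_intensity and describe are hypothetical, and the qualitative labels simply restate the slide's remarks.

```python
def traffic_intensity(requests_per_sec, object_size_mbits, link_rate_mbps):
    """Average arrival rate of bits divided by the link's transmission rate."""
    return requests_per_sec * object_size_mbits / link_rate_mbps


def describe(intensity):
    """Restate the slide's qualitative reading of a traffic intensity value."""
    if intensity >= 1:
        return "delay grows without bound"
    return "delay stays bounded"


lan = traffic_intensity(15, 1, 100)              # institutional LAN
access = traffic_intensity(15, 1, 15)            # 15 Mbps access link
upgraded_access = traffic_intensity(15, 1, 100)  # access link raised to 100 Mbps

for name, value in [("LAN", lan), ("access link", access),
                    ("upgraded access link", upgraded_access)]:
    print(f"{name}: {value:.2f} ({describe(value)})")
```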
With a 100 Mbps access link, the response time would be roughly two seconds (the Internet delay).
Installing a Web cache in the institutional network has a lower cost.
Assume the hit rate is 0.4. The traffic intensity on the access link is then reduced from 1.0 to 0.6.
The average delay = 0.4 * (0.01 seconds) + 0.6 * (2.01 seconds) = 1.21 seconds, or roughly 1.2 seconds.

Web Caching (4/5)
Through the use of Content Distribution Networks (CDNs), Web caches are playing an increasingly important role in the Internet.
There are shared CDNs (such as Akamai and Limelight) and dedicated CDNs (such as Google and Netflix).

Web Caching (5/5)
Although caching can reduce user-perceived response times, it introduces a new problem: the copy of an object residing in the cache may be stale (the object may have been modified since the copy was cached).
An HTTP request message is a conditional GET message if it uses the GET method and includes an If-Modified-Since: header line.
If the object has not been modified, the Web server still sends a response message (304 Not Modified) but does not include the requested object in it.

HTTP/2 (1/4)
In 2020, over 40% of the top 10 million websites supported HTTP/2.
The primary goals of HTTP/2 are to reduce perceived latency by enabling request and response multiplexing over a single TCP connection, to provide request prioritization and server push, and to provide efficient compression of HTTP header fields.
Developers of Web browsers discovered that sending all the objects in a Web page over a single TCP connection has a Head of Line (HOL) blocking problem.
For example, using a single TCP connection, a video clip will take a long time to pass through the bottleneck link, while small objects are delayed as they wait behind that video clip.

HTTP/2 (2/4)
HTTP/1.1 browsers typically work around this problem by opening multiple parallel TCP connections, so that objects in the same Web page are sent to the browser in parallel.
TCP congestion control aims to give each TCP connection sharing a bottleneck link an equal share of the available bandwidth of that link.
By opening multiple parallel (up to six) TCP connections to transport a single Web page, the browser can "cheat" and grab a larger portion of the link bandwidth.
One of the primary goals of HTTP/2 is to get rid of (or at least reduce the number of) parallel TCP connections for transporting a single Web page.

HTTP/2 (3/4)
The HTTP/2 solution for HOL blocking is to break each message into small frames, and to interleave the request and response messages on the same TCP connection.
The header field of the response becomes one frame, and the body of the message is broken down into one or more additional frames.
The frames of the response are then interleaved by the framing sublayer in the server with the frames of other responses and sent over the single persistent TCP connection.
A client's HTTP requests are likewise broken into frames and interleaved.

HTTP/2 (4/4)
The framing sublayer also binary-encodes the frames. Binary protocols are more efficient to parse, lead to slightly smaller frames, and are less error-prone.
When a client sends concurrent requests to a server, it can prioritize the responses it is requesting by assigning a weight between 1 and 256 to each message.
Using these weights, the server can send first the frames for the responses with the highest priority.
In addition to the response to the original request, the server can push additional objects to the client, without the client having to request each one.
HTTP/3, which runs over QUIC rather than TCP, was standardized as RFC 9114 in 2022.
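
To see why framing and interleaving help with HOL blocking, the following Python sketch compares two sending orders over one connection: each object sent completely before the next, and one frame from each unfinished object in turn. The object sizes (one 1000-frame video clip and eight 2-frame small objects) and the function names are illustrative assumptions; real HTTP/2 scheduling also takes priorities into account, which this sketch ignores.

```python
from collections import deque


def sequential(objects):
    """Send each object completely before starting the next one."""
    finished, sent = {}, 0
    for name, frames in objects:
        sent += frames
        finished[name] = sent        # total frames sent when this object completes
    return finished


def interleaved(objects):
    """Send one frame from each unfinished object in turn (round robin).

    Assumes every object has at least one frame.
    """
    queue = deque(objects)
    finished, sent = {}, 0
    while queue:
        name, remaining = queue.popleft()
        sent += 1
        if remaining == 1:
            finished[name] = sent
        else:
            queue.append((name, remaining - 1))
    return finished


# Assumed workload: a large video clip requested first, then eight small objects.
objects = [("video", 1000)] + [(f"small{i}", 2) for i in range(1, 9)]

seq = sequential(objects)
inter = interleaved(objects)

print("sequential: last small object done after", seq["small8"], "frames")
print("interleaved: last small object done after", inter["small8"], "frames")
print("video done after", seq["video"], "vs", inter["video"], "frames")
```

Under these assumptions the small objects no longer wait behind the video clip, while the video clip itself finishes only slightly later than it would have when sent first.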

DNS: The Internet's Directory Service

Host Identifier
We human beings can be identified in many ways: names, social security numbers, driver's license numbers.
Within a given context, one identifier may be more appropriate than another.
One identifier for an Internet host is its hostname, which is mnemonic and therefore appreciated by humans, e.g. www.facebook.com; www.google.com
An Internet host is also identified by its IP address, which consists of four bytes and has a rigid hierarchical structure, e.g. 121.7.106.83
As we scan the IP address from left to right, we obtain more and more specific information about where the host is located in the Internet. This is similar to scanning a postal address from bottom to top.

Network Security 2024, (c) Sahar M. Ghanem

Services Provided by DNS
The Internet's domain name system (DNS) is a directory service that translates hostnames to IP addresses.
DNS is a distributed database implemented in a hierarchy of DNS servers, together with an application-layer protocol that allows hosts to query the distributed database.
The DNS protocol runs over UDP and uses port 53. It is specified in RFC 1034 and RFC 1035.
The DNS servers are often UNIX machines running the Berkeley Internet Name Domain (BIND) software.
DNS is employed by other application-layer protocols (e.g. HTTP, SMTP, …) to translate user-supplied hostnames to IP addresses.

DNS Services
DNS provides other important services:
  Host aliasing: alias hostnames, when present, are more mnemonic than canonical hostnames, e.g. canonical: relay1.west-coast.enterprise.com; alias: www.enterprise.com
  Mail server aliasing: the MX record permits a company's mail server and Web server to have identical (aliased) hostnames.
  Load distribution: among replicated servers, each having a different IP address. The DNS server rotates the ordering of the addresses within each reply.

Overview of How DNS Works
Consider hostname-to-IP-address translation.
On UNIX-based machines, gethostbyname() is the function call that an application calls in order to perform the DNS translation.
A simple design for DNS would have one DNS server that contains all the mappings, but this design doesn't scale.
Problems: a single point of failure; traffic volume; a distant centralized database; maintenance.
Instead, the mappings are distributed across the DNS servers.
There are three classes of DNS servers organized in a hierarchy: root, top-level domain (TLD), and authoritative.

Classes of DNS servers
Root DNS servers. There are more than 1000 root server instances scattered all over the world that provide the IP addresses of the TLD servers. They are copies of 13 different root servers, coordinated through the Internet Assigned Numbers Authority (IANA).
Top-level domain (TLD) servers. There are TLD servers for each of the top-level domains (com, org, net, edu, and gov, …) and all of the country top-level domains (uk, fr, ca, jp, …). They provide the IP addresses of authoritative DNS servers.
Authoritative DNS servers. Every organization with publicly accessible hosts must provide DNS records that map the names of those hosts to IP addresses; an authoritative DNS server houses these records.

Local DNS Server
Each ISP has one or more local DNS servers and provides the host with the IP address of a local DNS server (through DHCP).
You can find the address of your local DNS server by checking the network status windows on your host.
When a host makes a DNS query, the query is sent to the local DNS server, which acts as a proxy, forwarding the query into the DNS server hierarchy.
Any DNS query can be iterative or recursive.
Usually, the query from the requesting host to the local DNS server is recursive, and the remaining queries are iterative.

DNS Caching
DNS extensively exploits caching in order to improve the delay performance and to reduce the number of DNS messages ricocheting around the Internet.
When a DNS server receives a DNS reply, it can cache the mapping in its local memory and provide that mapping to later queries, even if it is not authoritative for the hostname.
DNS servers discard cached information after a period of time (often set to two days).
Because of caching, root servers are bypassed for all but a very small fraction of DNS queries.

DNS Records and Messages
The DNS servers that implement the distributed database store resource records (RRs).
A resource record is a four-tuple that contains the following fields: (Name, Value, Type, TTL), where TTL is the time to live of the record.
  If Type=A, then Name is a hostname and Value is the IP address for the hostname.
  If Type=NS, then Name is a domain (such as foo.com) and Value is the hostname of an authoritative DNS server.
  If Type=CNAME, then Value is a canonical hostname for the alias hostname Name.
  If Type=MX, then Value is the canonical name of a mail server that has an alias hostname Name.
  …

DNS Message Format (1/2)
Both query and reply messages have the same format.
The first 12 bytes are the header section, which has a number of fields:
  a 16-bit number that identifies the query
  a 1-bit query/reply flag (query (0); reply (1))
  a 1-bit authoritative flag (set when the DNS server is an authoritative server for the queried name)
  a 1-bit recursion-desired flag
  a 1-bit recursion-available flag
  four number-of fields that indicate the number of occurrences of the four types of data sections that follow the header

DNS Message Format (2/2)
The question section contains information about the query. It includes a name field that contains the name being queried, and a type field that indicates the type of question being asked (e.g. Type A or Type MX).
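
The header fields and question section described above can be made concrete by assembling a query message by hand. The Python sketch below packs the 12-byte header and a single question, then decodes the header flags again. The bit positions, the length-prefixed label encoding of the name, and the class field (1 for Internet) follow RFC 1035 and are not spelled out on the slides; the function names, the query ID, and the hostname are hypothetical. Nothing is sent over the network.

```python
import struct

TYPE_A, TYPE_MX = 1, 15     # record type codes from RFC 1035


def build_query(hostname, query_id=0x1234, qtype=TYPE_A, recursion_desired=True):
    """Build a DNS query: 12-byte header followed by one question."""
    flags = 0x0100 if recursion_desired else 0      # query/reply bit stays 0 (query)
    # Header: ID, flags, then the four number-of fields
    # (questions, answer RRs, authority RRs, additional RRs).
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # Name field: each label is preceded by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # type, class (1 = Internet)
    return header + question


def parse_header(message):
    """Decode the 12-byte header into the fields listed on the slide."""
    query_id, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", message[:12])
    return {
        "id": query_id,
        "is_reply": bool(flags & 0x8000),
        "authoritative": bool(flags & 0x0400),
        "recursion_desired": bool(flags & 0x0100),
        "recursion_available": bool(flags & 0x0080),
        "counts": (qd, an, ns, ar),
    }


query = build_query("www.example.com")
header = parse_header(query)
print(len(query), header)
```

A reply from a server would reuse the same layout with the query/reply bit set and non-zero counts for the sections that follow the question.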
In a reply from a DNS server, the answer section contains the resource records for the name that was originally queried.
The authority section contains records of other authoritative servers.
The additional section contains other helpful records.

nslookup
The nslookup program, available on most Windows and UNIX platforms, allows you to send a DNS query to any DNS server.
Many Web sites also allow you to run nslookup remotely.

ICANN
How do records get into the database?
A registrar is a commercial entity that verifies the uniqueness of the domain name, enters the domain name into the DNS database, and collects a small fee for its services. ICANN accredits the various registrars. (http://www.internic.net)
For example, suppose you have created a new startup company:
  Register the domain name at a registrar.
  Provide the registrar with the names and IP addresses of the primary and secondary authoritative DNS servers.
  The registrar then makes sure that a Type NS and a Type A record are entered into the TLD servers.

Socket Programming

Introduction (1/2)
A typical network application consists of a pair of programs, a client program and a server program, residing in two different end systems.
When these two programs are executed, a client process and a server process are created, and these processes communicate with each other by reading from, and writing to, sockets.
There are two types of network applications.
One type is an implementation whose operation is specified in a protocol standard, such as an RFC or some other standards document.
The other type is a proprietary network application.

Introduction (2/2)
One of the first decisions the developer must make is whether the application is to run over TCP or over UDP.
When developing a proprietary application, the developer must be careful to avoid using the well-known port numbers assigned to standard protocols.

Socket Programming with UDP
Before the sending process can push a packet of data out the socket door, when using UDP, it must first attach a destination address to the packet.
The destination address consists of the destination host's IP address and the destination socket's port number.
The sender's source address, consisting of the IP address of the source host and the port number of the source socket, is also attached to the packet.
Attaching the source address to the packet is done automatically by the underlying operating system.

Example UDP Application
1. The client reads a line of characters (data) from its keyboard and sends the data to the server.
2. The server receives the data and converts the characters to uppercase.
3. The server sends the modified data to the client.
4. The client receives the modified data and displays the line on its screen.
In order for the server to be able to receive and reply to the client's message, it must be ready and running; that is, it must be running as a process before the client sends its message.

Socket Programming with TCP (1/2)
Using UDP, the server must attach a destination address to the packet before dropping it into the socket.
TCP is a connection-oriented protocol: before the client and server can start to send data to each other, they first need to handshake and establish a TCP connection.
Using TCP, when one side wants to send data to the other side, it just drops the data into the TCP connection via its socket.
As in the case of UDP, the TCP server must be running as a process before the client attempts to initiate contact.
The server program must have a special socket that welcomes some initial contact from a client process running on an arbitrary host.

Socket Programming with TCP (2/2)
When the client creates its TCP socket, it specifies the address of the welcoming socket in the server (serverSocket).
When the server "hears" the knocking, it creates a new socket that is dedicated to that particular client (connectionSocket).
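
The welcoming socket and the per-client connection socket can be sketched with Python's standard socket module. The example reuses the uppercase-conversion exchange from the UDP example, carried over TCP instead. To keep it self-contained, the server runs in a thread on the loopback address, the operating system picks a free port (port 0), the sentence is passed as an argument rather than read from the keyboard, and messages are assumed to be short enough to arrive in a single recv call. The function names are hypothetical, and this is a minimal sketch rather than the lecture's own code.

```python
import socket
import threading


def serve_once(server_socket):
    """Accept one client on the welcoming socket, reply in uppercase, close."""
    # accept() returns a new socket dedicated to this particular client.
    connection_socket, _client_address = server_socket.accept()
    with connection_socket:
        sentence = connection_socket.recv(1024).decode()
        connection_socket.sendall(sentence.upper().encode())


def uppercase_via_tcp(sentence):
    """Run a one-shot TCP server and client on the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_socket:
        server_socket.bind(("127.0.0.1", 0))     # port 0: let the OS choose (demo only)
        server_socket.listen(1)                  # server is ready before the client starts
        server_address = server_socket.getsockname()

        server = threading.Thread(target=serve_once, args=(server_socket,))
        server.start()

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client_socket:
            client_socket.connect(server_address)    # triggers the TCP handshake
            client_socket.sendall(sentence.encode()) # no destination address attached
            modified = client_socket.recv(1024).decode()

        server.join()
    return modified


result = uppercase_via_tcp("hello network")
print(result)
```

Unlike the UDP case, the client attaches no destination address to the data it sends: the address is given once, when the connection is established, and both sides then simply write into and read from the connection.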