Distributed Programming with Sockets
AAIT
2019
Wondimagegn D.
Summary
This document is a set of lecture notes for a course on distributed programming with sockets. The notes cover topics such as a brief introduction to computer networks, addressing and name resolution, TCP and UDP protocols, I/O multiplexing and server structures. The document was created on September 29, 2019.
Full Transcript
Distributed Programming with Sockets
Wondimagegn D.
September 29, 2019

Wondimagegn D. (AAIT) Distributed System Programming September 29, 2019 1 / 36

Overview

1. A Very Brief Introduction to Computer Networks
2. IP Addressing and Name Resolution
3. TCP
4. I/O Multiplexing
5. Server Structures

A Very Brief Introduction to Computer Networks
Introduction

- There is one way computers can communicate with each other
  - By sending network messages to each other
  - All other kinds of communication are built from messages
- There is one way programs can send/receive network messages
  - Through sockets
  - All other communication paradigms are built from sockets

A Very Brief Introduction to Computer Networks
Two Different Kinds of Networks

- Circuit switching
  - One electrical circuit assigned per communication
  - Example: the (analog) phone network
  - Advantage: guaranteed constant quality of service
  - Drawbacks: waste of resources (periods of silence), poor fault tolerance
- Packet switching
  - Messages are split into packets, which are transmitted independently
  - Packets can take different routes
  - Network infrastructures are shared among users
  - Example: the Internet, and most computer networks
  - Advantages: good resource usage, fault tolerance
  - Drawbacks: variable quality of service (QoS), packets may be delivered in the wrong order

A Very Brief Introduction to Computer Networks
Internet Protocol

- Most computer networks use the Internet Protocol
- The base protocol: IP (Internet Protocol)
  - Sends packets of limited size
    - Up to 65,535 bytes
    - But if the MTU (Maximum Transmission Unit) of some link on the path is lower, the packet will be fragmented (IPv4) or dropped (IPv6)
    - Every IPv4 host must accept packets of at least 576 bytes (the minimum IPv6 MTU is 1280 bytes); in practice, MTUs are higher nowadays
  - Each packet is sent to an IP address
    - Example: 10.5.55.87
- IP offers no guarantee:
  - Packets may get lost
  - Packets may be delivered twice
  - Packets may be delivered in the wrong order
  - Packets may be corrupted during transfer
- Usually, programs do not use IP directly
  - All other Internet protocols are built over IP

A Very Brief Introduction to Computer Networks
UDP: User Datagram Protocol

- UDP is very similar to IP
  - Send/receive packets
  - No guarantee
- In UDP, packets are called datagrams
- Each datagram is sent to an IP address and a port number
  - Example: 10.5.55.87 port=1234
  - Ports make it possible to distinguish between several programs running simultaneously on the same machine
    - Program A uses port 1234
    - Program B uses port 1235
    - When a datagram is received, the OS knows which program it should be delivered to

A Very Brief Introduction to Computer Networks
TCP: Transmission Control Protocol

- TCP establishes connections between pairs of machines
  - To communicate with a remote host, we must first connect to it
- TCP provides the illusion of a reliable data flow to the users
  - Flows are split into packets, but the users don't see them
- TCP guarantees that the data sent will not be lost, reordered, corrupted, etc.
  - The sender numbers the packets so that the receiver can reorder them
  - The receiver acknowledges received packets so that the sender can retransmit lost packets
- Communication is bi-directional
  - The same connection can be used to send data in the two directions
  - E.g., a request and its response

A Very Brief Introduction to Computer Networks
A Very Simplified View

[Figure]

IP Addressing and Name Resolution
IP Address Conversion

- IP addresses
  - 32-bit integers: 2183468070 (good for computers!)
  - Dotted strings: 130.37.20.38 (good for humans!)
  - DNS name: www.aait.edu.et (even better for humans!)
- You can convert between integer and dotted string:

    #include <arpa/inet.h>

    in_addr_t inet_addr(const char *dotted);     /* Dotted to Network */
    char *inet_ntoa(struct in_addr network);     /* Network to Dotted */

- in_addr_t is an unsigned 32-bit integer
- struct in_addr is a structure containing an in_addr_t

IP Addressing and Name Resolution
Big/Little-endian, Network Ordering

- Computers represent numbers in different orderings:
  - Big-endian: 0x12345678 => 0x12 0x34 0x56 0x78
    - E.g., PowerPC, Sparc, UltraSparc
  - Little-endian: 0x12345678 => 0x78 0x56 0x34 0x12
    - E.g., Alpha, i386, AMD64
- Network ordering is big-endian
- To convert numbers: host <-> network ordering

    #include <arpa/inet.h>

    uint16_t htons(uint16_t value);   /* Host to Network, 16 bits */
    uint32_t htonl(uint32_t value);   /* Host to Network, 32 bits */
    uint16_t ntohs(uint16_t value);   /* Network to Host, 16 bits */
    uint32_t ntohl(uint32_t value);   /* Network to Host, 32 bits */

IP Addressing and Name Resolution
sockaddr_in: Unix Network Addresses

- Unix represents network addresses with a struct sockaddr
  - This structure is generic for all kinds of networks
  - For Internet addresses, we use sockaddr_in

    struct sockaddr_in {
        sa_family_t    sin_family;   /* set to AF_INET */
        in_port_t      sin_port;     /* Port number */
        struct in_addr sin_addr;     /* Contains the IP address */
    };

    struct in_addr {
        in_addr_t s_addr;            /* IP address in network ordering */
    };

- sin_family: indicates which type of address. Always set to AF_INET.
- sin_port: port number, in network byte order
- sin_addr.s_addr: IP address, in network byte order. To represent an unspecified IP address, set it to htonl(INADDR_ANY).
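
The conversion functions and the sockaddr_in fields above can be combined into a small helper. This is only a sketch: the name make_ipv4_addr is hypothetical, and it uses inet_pton() rather than the inet_addr() shown on the slide, on the assumption that a distinct error return is preferable (inet_addr() reports errors with a value that is also a valid address).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the slides): fill *out with an IPv4
 * address and port. Returns 0 on success, -1 if `dotted` is not a
 * valid dotted string. */
int make_ipv4_addr(const char *dotted, uint16_t port, struct sockaddr_in *out)
{
    memset(out, 0, sizeof(*out));      /* clear every field first */
    out->sin_family = AF_INET;
    out->sin_port = htons(port);       /* host -> network ordering, 16 bits */

    /* inet_pton() returns 1 on success and stores the address
     * directly in network byte order. */
    if (inet_pton(AF_INET, dotted, &out->sin_addr) != 1)
        return -1;
    return 0;
}
```

Converting the stored address back with ntohl() gives the integer form used on the slide: 130.37.20.38 corresponds to 2183468070.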

IP Addressing and Name Resolution
Domain Names

- Internet protocols are all based on IP addresses
- But IP addresses are hard for humans to remember
  - Our web server: http://10.5.10.3
  - Better: http://www.aait.edu.et
- Using domain names
  - Domain names cannot be used directly by network protocols
  - Network protocols only use IP addresses
  - But you can convert domain names into IP addresses thanks to DNS
- Domain Name System (DNS): handles domain name resolution
  - Hundreds of thousands of servers around the world cooperate to resolve addresses
  - To learn more on how this works, go to the Distributed Systems book page 14!

IP Addressing and Name Resolution
Converting Domain Names to IP

- Conversion is done by gethostbyname()

    #include <netdb.h>

    struct hostent *gethostbyname(const char *name);

- ...where struct hostent is as follows

    struct hostent {
        char  *h_name;        /* official name of host */
        char **h_aliases;     /* alias list */
        int    h_addrtype;    /* host address type */
        int    h_length;      /* length of address */
        char **h_addr_list;   /* list of addresses */
    };

- h_addr_list: a null-terminated array of network addresses for the host
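
Since h_addr_list is a null-terminated array, a host with several addresses can be handled by walking the list. The following is a minimal sketch under stated assumptions: resolve_ipv4 is a hypothetical name, and only IPv4 results (h_addrtype equal to AF_INET) are handled.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the slides): resolve `name` and copy up
 * to `max` IPv4 addresses, as dotted strings, into `out`. Returns the
 * number of addresses stored, or -1 if the name cannot be resolved. */
int resolve_ipv4(const char *name, char out[][INET_ADDRSTRLEN], int max)
{
    struct hostent *h = gethostbyname(name);
    if (h == NULL || h->h_addrtype != AF_INET)
        return -1;

    int n = 0;
    /* Each entry points to h_length bytes holding one address in
     * network byte order; a NULL entry ends the list. */
    for (char **p = h->h_addr_list; *p != NULL && n < max; p++) {
        struct in_addr addr;
        memcpy(&addr, *p, sizeof(addr));
        if (inet_ntop(AF_INET, &addr, out[n], INET_ADDRSTRLEN) != NULL)
            n++;
    }
    return n;
}
```

Note that gethostbyname() returns a pointer to static storage, so the results are copied out before any further call.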

IP Addressing and Name Resolution
gethostbyname() Example

    #include <arpa/inet.h>
    #include <netdb.h>
    #include <stdio.h>

    int print_resolv(const char *name)
    {
        struct hostent *resolv;
        struct in_addr *addr;

        resolv = gethostbyname(name);
        if (resolv == NULL) {
            printf("Address not found for %s\n", name);
            return -1;
        } else {
            /* Use the first address in the list */
            addr = (struct in_addr *) resolv->h_addr_list[0];
            printf("The IP address of %s is %s\n", name, inet_ntoa(*addr));
            return 0;
        }
    }

TCP
TCP Sockets

- Defined in RFC 793
- Popular TCP-based protocols
  - TELNET
  - FTP (File Transfer Protocol)
  - SMTP (Simple Mail Transfer Protocol)
  - HTTP (Hyper Text Transfer Protocol)
  - SSH (Secure Shell)

TCP
TCP Socket Functions

[Figure]

TCP
The TCP three-way handshake

[Figure]

TCP
Creating a Socket

- Some functions are the same as in UDP
- socket(): creates a socket

    sockfd = socket(AF_INET, SOCK_STREAM, 0);

- bind(): to specify the address of a socket
  - Only useful for server sockets
  - Exactly like UDP sockets
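
The socket() and bind() steps can be sketched together as follows. The helper name make_bound_socket is an assumption made for illustration; it binds to INADDR_ANY, so the socket accepts traffic on every local address.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from the slides): create a TCP socket bound
 * to `port` on all local addresses. Returns the descriptor, or -1 on
 * error. */
int make_bound_socket(uint16_t port)
{
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);               /* network byte order */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* unspecified address */

    if (bind(sockfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(sockfd);                         /* do not leak the descriptor */
        return -1;
    }
    return sockfd;
}
```

Passing port 0 asks the OS to pick an unused port, which can then be read back with getsockname().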

TCP
listen(): Setting a Server Socket

- By default, TCP sockets are created as client sockets
  - A client socket cannot receive incoming connections
- Server sockets need to maintain more state
  - TCP establishes connections thanks to the three-way handshake
  - Server sockets must allocate resources for handling connections
- To convert a client socket to a server socket, use listen()
  - And indicate how many not-yet-accepted connections can be supported in parallel
  - If this number is exceeded, the server will refuse connections

TCP
listen()

- The interface is simple

    #include <sys/socket.h>

    int listen(int sockfd, int backlog);

- sockfd: the socket descriptor
- backlog: the size of the queue of pending connections (often set to 5)
- Return value: 0 for success, -1 for error
- Note: backlog is not a limit on the number of connections established in parallel!
  - It only limits the number of pending connections (i.e., connections that have not yet been accepted)

TCP
Initiating a TCP connection

- Clients initiate connections to servers thanks to connect():

    #include <sys/types.h>
    #include <sys/socket.h>

    int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);

- sockfd: the socket descriptor
- serv_addr: a pointer to a struct sockaddr_in containing the address to connect to
  - You must specify the destination IP address and port number
- addrlen: sizeof(struct sockaddr_in)
- Return value: 0 for success, -1 for error
- If the socket has not been bound yet, connect() binds it to an unused port chosen by the OS
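
To make the two roles concrete, here is a sketch of a server-side helper that calls listen() and a client-side helper that calls connect(). Both names (open_listener, connect_to) are hypothetical, and the listener is bound to the loopback address only, an assumption chosen so the example stays on the local machine.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: server socket on 127.0.0.1:`port` with the given
 * backlog. Returns the descriptor, or -1 on error. */
int open_listener(uint16_t port, int backlog)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {       /* turn it into a server socket */
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: client socket connected to `ip`:`port`.
 * Returns the descriptor, or -1 on error. */
int connect_to(const char *ip, uint16_t port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* No bind() here: the OS picks the client's local port. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Because the kernel completes the handshake for queued connections, connect_to() can succeed before the server has called accept(), as long as the backlog is not full.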

TCP
Waiting for an Incoming Connection

- accept() blocks the process until an incoming connection is received
  - When a connection is received, accept() creates a new socket dedicated to this connection
  - The new socket is used to communicate with the client
  - The original socket is immediately ready to wait for other connections
- accept():

    #include <sys/types.h>
    #include <sys/socket.h>

    int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);

- sockfd: the socket descriptor
- addr: a pointer to a sockaddr_in structure where the address of the client will be copied
- addrlen: a pointer to a socklen_t containing the size of addr; on return it holds the size of the address actually stored
- Return value: the descriptor of the newly created socket, or -1 for error

TCP
Example use of accept

    int sock, newsock, res;
    struct sockaddr_in client_addr;
    socklen_t addrlen;

    /* (the socket sock is created and bound) */

    res = listen(sock, 5);
    if (res < 0) { /* ... */ }

    addrlen = sizeof(struct sockaddr_in);
    newsock = accept(sock, (struct sockaddr *) &client_addr, &addrlen);
    if (newsock < 0) { /* ... */ }
    else {
        printf("Connection from %s!\n", inet_ntoa(client_addr.sin_addr));
    }

TCP
Writing data to a socket

- write() works the same for sending data to a TCP socket or writing to a file

    #include <unistd.h>

    ssize_t write(int sockfd, const void *buf, size_t count);

- sockfd: socket descriptor
- buf: buffer to be sent
- count: size of buffer
- Return value: number of bytes sent, or -1 for error
- Attention: when writing to a socket, write() may send fewer bytes than requested
  - Due to limits in internal kernel buffer space
  - Always check the return value of write(), and resend the non-transmitted data

TCP
Reading data from a socket

- read() blocks the process until data is received from the socket

    #include <unistd.h>

    ssize_t read(int sockfd, void *buf, size_t count);

- sockfd: socket descriptor
- buf: buffer where the received data is written
- count: size of buffer
- Return value: number of bytes read, or -1 for error
- Attention: when reading from a socket, read() may read fewer bytes than requested
  - It delivers the data that has been received
  - This does not mean that the stream of data is finished; there may be more to come
  - The end-of-file (EOF) is signalled to the reader by read() returning 0

TCP
Closing a TCP socket

- To stop sending data to a socket, use close():

    #include <unistd.h>

    int close(int sockfd);

- Anyone can call this, either the client or the server
- This sends an EOF message to the other party
  - When receiving an EOF, read() returns 0 bytes
- Subsequent reads and writes on the closed descriptor will return errors
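
The warnings about short writes and short reads above translate into two loops. This is a sketch, not a definitive implementation: write_all and read_full are hypothetical names, and retrying on EINTR is an added assumption about how interrupted calls should be treated.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all `count` bytes are
 * sent. Returns `count` on success, -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    size_t left = count;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                      /* skip the bytes already sent */
        left -= (size_t)n;
    }
    return (ssize_t)count;
}

/* Hypothetical helper: keep calling read() until `count` bytes have
 * arrived or the peer signals EOF. Returns the number of bytes read
 * (less than `count` only at EOF), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t got = 0;

    while (got < count) {
        ssize_t n = read(fd, p + got, count - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;                   /* EOF: the peer closed its side */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

A caller that knows the message length in advance can use read_full() with that length; a caller that does not has to read until EOF or rely on some framing convention of its own.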

TCP
Asymmetric Disconnection

- Sometimes you may want to tell the other party that you are finished, but let it finish before closing the connection

    #include <sys/socket.h>

    int shutdown(int sockfd, int how);

- how: SHUT_WR for stopping writing, SHUT_RD for stopping reading
- When one party has shut down its writing side, the other can still write data (and then close the connection as well)
- To initiate a disconnection:
  - shutdown(fd, SHUT_WR)
  - Keep on reading the last data
  - Until receiving an EOF
  - Then close() the socket
- To receive a disconnection:
  - read() receives an EOF
  - Keep on writing the last data
  - close() the socket

I/O Multiplexing

- How can a program handle multiple file descriptors simultaneously?
  - accept() and read() block programs until something is received
  - How can you wait for connections/data from multiple sockets?
- Several methods:
  - Use multiple processes
    - Resource consuming, hard to program
  - Use non-blocking I/O
    - read() and accept() return immediately instead of blocking, but the program must then poll the descriptors in a loop, which wastes CPU time
  - select() monitors multiple file descriptors
    - It blocks the program until one of them is ready for reading or writing
  - poll() is similar to select()
    - With additional information about streams

Server Structures

- Often, a server accepts connections to one (TCP) socket
- But it wants to process several requests simultaneously
  - Better use of the server's resources
  - Incoming requests can start being processed immediately after reception
  - Depending on its nature, a server can receive between 0 and tens of thousands of requests per second
- Several server structures can be used:
  - Iterative (i.e., not concurrent)
  - One child per client
  - Prefork
  - Select loop
  - Many other variants...
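
As an illustration of the select() approach that a select-loop server builds on, the sketch below waits until one of several descriptors becomes readable. The helper name wait_readable and the millisecond timeout parameter are assumptions for this example; every descriptor must be smaller than FD_SETSIZE.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: block until one of fds[0..nfds-1] is readable or
 * `timeout_ms` milliseconds have passed. Returns the index of the first
 * readable descriptor, or -1 on timeout or error. */
int wait_readable(const int *fds, int nfds, int timeout_ms)
{
    fd_set readset;
    int maxfd = -1;

    FD_ZERO(&readset);
    for (int i = 0; i < nfds; i++) {
        FD_SET(fds[i], &readset);    /* fds[i] must be < FD_SETSIZE */
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* First argument: highest descriptor plus one. select() rewrites
     * readset so that only the ready descriptors remain set. */
    int ready = select(maxfd + 1, &readset, NULL, NULL, &tv);
    if (ready <= 0)
        return -1;

    for (int i = 0; i < nfds; i++)
        if (FD_ISSET(fds[i], &readset))
            return i;
    return -1;
}
```

A listening socket counts as readable when a connection is waiting to be accepted, so the same call can watch the server socket together with the sockets of already connected clients.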

Server Structures
Iterative Servers

- An iterative server treats one request after the other

    int fd, newfd;

    while (1) {
        newfd = accept(fd, ...);
        treat_request(newfd);
        close(newfd);
    }

- Simple
- Potentially low resource utilization
  - If treat_request() does not utilize all the CPU, resources are wasted
- Potentially long queue of incoming connections waiting to be accept()ed
  - Increased request treatment latency
  - If the queue grows, the server may start rejecting incoming connections

Server Structures
One Child Per Client

- A new process is created to handle each connection

    void sig_chld(int signo)
    {
        /* Reap every child that has already terminated */
        while (waitpid(0, NULL, WNOHANG) > 0) {}
        signal(SIGCHLD, sig_chld);
    }

    int main()
    {
        int fd, newfd, pid;

        signal(SIGCHLD, sig_chld);
        while (1) {
            newfd = accept(fd, ...);
            if (newfd