Week5_Internet and Web based system.pdf

Full Transcript


E-BUSINESS
PROF. MAMATA JENAMANI
DEPARTMENT OF INDUSTRIAL AND SYSTEMS ENGINEERING
IIT KHARAGPUR

Week 5: Lecture 1
COMPONENTS OF E-BUSINESS INFRASTRUCTURE

We are going to learn
  Goals for quality of information services
  Components of the e-business technology infrastructure

Goals for quality of information services
  Performance
  Scalability
  Availability and maintainability
  Source: Menasce, D.A. and Almeida, V.A., 2000. Scaling for E-Business: Technologies, Models, Performance, and Capacity Planning.

Goals for quality of information services (details)
  Performance
    – Response time
    – Delays may be caused by the ISP, network, servers, applications, or third-party services
  Scalability
    – Handling waves of demand
    – Scaling up (larger server) or scaling out (more servers)
  Availability and maintainability
    – Identification of the single points of failure
    – Minimum configuration needed
    – Self-repairing capability
    – Availability of diagnostics and alert information
    – Emergency procedures
    – MTTF (mean time to failure) and MTTR (mean time to repair)

Technology Platform for e-business
  Software Solutions
  Web Languages
  Packaged Solutions for E-Business
  Server Platforms
  Data Infrastructure
  Networking Infrastructure
  Networking overview
  Communication Protocols
  Network Security
  Digital Payment Systems

Web System Architecture
  [Figure: Web client, Internet, Web server and application server, database server]

Web Server Elements
  HTTP server
  TCP/IP
  Operating system
  Hardware: processor, disks, network interfaces, etc.

Characteristics of a Web server
  Also known as an HTTP server or HTTP daemon
  Continuously listens for client requests and returns the requested file
  Handles more than one request at a time
    – Forking / multithreading

Performance metrics for the Web server
  Throughput
    – The rate at which HTTP requests are serviced
    – Measured in HTTP operations/second or megabits per second (Mbps)
  Latency
    – The time required to complete a request
    – Average latency is the average time for handling requests.
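
As a rough illustration of the throughput and latency metrics above, the sketch below computes HTTP operations per second, megabits per second, and average latency from a small request log. The log entries, the `Request` record, and the `summarize` function are hypothetical and chosen only for illustration; they do not come from the lecture.

```python
from dataclasses import dataclass

@dataclass
class Request:
    start: float      # seconds, time the request arrived
    end: float        # seconds, time the response was completed
    size_bytes: int   # bytes returned to the client

def summarize(requests):
    """Return (ops_per_second, mbps, average_latency_seconds) for a log."""
    # Observation window: first arrival to last completion.
    window = max(r.end for r in requests) - min(r.start for r in requests)
    ops_per_second = len(requests) / window
    mbps = sum(r.size_bytes for r in requests) * 8 / 1_000_000 / window
    avg_latency = sum(r.end - r.start for r in requests) / len(requests)
    return ops_per_second, mbps, avg_latency

# Hypothetical log: four requests served within a 2-second window.
log = [
    Request(0.0, 0.5, 250_000),
    Request(0.5, 1.0, 250_000),
    Request(1.0, 1.5, 250_000),
    Request(1.5, 2.0, 250_000),
]
ops, mbps, latency = summarize(log)
print(f"{ops:.1f} ops/s, {mbps:.1f} Mbps, {latency:.2f} s average latency")
```

The same log yields both throughput figures: operations per second counts requests, while Mbps counts the bits delivered, so a server handling few large files and one handling many small files can differ sharply on one metric and not the other.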
Dynamic Load Balancing
  Splitting the traffic across the servers
  Mirroring the site
  Methods
    – DNS based
        Mapping to a cluster of servers in a round-robin fashion during address translation
    – Dispatcher based
        The address of a special TCP router is used as the address of the Web server
        The router diverts the request to the server with less load
    – Server based
        Address redirection
        Increase in client response time

Application Server
  Handles all the transactions between the Web server and the backend database
  Supports different programming languages and/or scripting languages

Database Server
  Database management system
  Structured Query Language
  Database connectivity

Other important components and concepts in E-business infrastructure
  Mainframe and legacy systems
    – Integration technologies
  Proxies
    – Network traffic reduction
    – Privacy and security (firewalls)
    – Load balancing
  Caches
    – Traffic reduction
    – Levels of caches
    – Dedicated community proxy servers
  Third-party services
    – Security services, ad servers, trust services, escrow services
    – A source of additional delay in the Web server's response time

Other data resources
  Data warehouses and data marts
  Online Analytical Processing (OLAP) queries
  Business Intelligence

Week 5: Lecture 2
INTERNET AND THE WEB

We are going to learn
  Features of the Internet
  Infrastructure for connecting to the Internet
  Domain name system
  HTTP protocol and webpage generation

The Internet
  Originated in the 1960s as a result of research supported by the Advanced Research Projects Agency (ARPA) of the US DoD: ARPANET
  A collection of networks
  Basic features
    – Data centric
    – Separation of communication from data processing
    – Packet switching

Features of a packet switched Network
  Network consists of two types of nodes
    – Hosts: originators and destinations of data packets
    – Routers: responsible for routing the packets
  A connectionless system
    – No fixed routing scheme between the hosts
    – Routing tables change based on network state (congestion or link failure)
    – Packets may arrive out of sequence
  A "best-effort" delivery network
    – In case of congestion or link failure the packets are discarded
    – Recognizing the failure and taking corrective action is the task of the host computer.

Connecting to the Internet
  To connect a computer to the Internet, it must be connected to a router that is a part of the Internet
  Routers are sponsored by universities, research centers, or commercial companies (ISPs).
  ISPs operate at many levels
    – Local ISPs
        Lease connections from the national or regional ISPs
        Provide dial-up access to the users and charge them
    – National or regional ISPs
        Have their own backbone to carry traffic
        Charge local ISPs for providing connections

Some pieces of the Internet
  [Figure] Source: Kurose, J.F. and Ross, K.W., 2010. Computer networking: a top-down approach (Vol. 5). Reading: Addison-Wesley.

Domain Name System
  Translates human-readable domain names into IP addresses
  An application on which many other application-level protocols rely
  Includes a distributed database system responsible for storing domain names

How DNS works
  The client enters a domain name (www.domainname.com) into the browser.
  The browser contacts the client's ISP for the IP address of the domain name.
  The ISP first tries to answer by itself using "cached" data. If the answer is found, it is returned. Since the ISP isn't in charge of the DNS and is just acting as a "DNS relay", the answer is marked "non-authoritative".
  If the answer isn't found, or it's too old, then the ISP's DNS server contacts the nameservers for the domain directly for the answer.
  If the nameservers are not known, the ISP looks for the information at the "root servers" or "registry servers".

Getting a domain name
  ICANN (Internet Corporation for Assigned Names and Numbers) is the private (non-government) non-profit corporation with responsibility for IP address space allocation, protocol parameter assignment, domain name system management, and root server system management functions.
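
The lookup sequence described under "How DNS works" can be sketched as a toy model. This is only an illustration under stated assumptions: the dictionaries `ROOT_SERVERS` and `AUTHORITATIVE`, the class `IspResolver`, the nameserver name, the 300-second lifetime, and the address `192.0.2.10` (a documentation-only address) are all hypothetical, and no real DNS traffic is involved.

```python
import time

# Hypothetical data standing in for real DNS infrastructure.
ROOT_SERVERS = {"domainname.com": "ns1.domainname.com"}   # domain -> nameserver
AUTHORITATIVE = {"ns1.domainname.com": {"www.domainname.com": "192.0.2.10"}}

class IspResolver:
    """Toy model of an ISP's caching DNS relay (not a real resolver)."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.cache = {}          # name -> (ip, time stored)
        self.nameservers = {}    # domain -> nameserver learned so far

    def resolve(self, name, now=None):
        now = time.time() if now is None else now
        cached = self.cache.get(name)
        if cached and now - cached[1] < self.ttl:
            # Answered from cache: the ISP is only relaying the data.
            return cached[0], "non-authoritative"
        domain = ".".join(name.split(".")[-2:])
        if domain not in self.nameservers:
            # Nameserver unknown: ask the root/registry servers.
            self.nameservers[domain] = ROOT_SERVERS[domain]
        # Ask the domain's own nameserver and refresh the cache.
        ip = AUTHORITATIVE[self.nameservers[domain]][name]
        self.cache[name] = (ip, now)
        return ip, "authoritative"

resolver = IspResolver()
first = resolver.resolve("www.domainname.com", now=0)
second = resolver.resolve("www.domainname.com", now=10)
third = resolver.resolve("www.domainname.com", now=1000)  # cache entry too old
print(first, second, third)
```

The first and third lookups go to the domain's nameserver, while the second is served from the cache and is therefore labelled non-authoritative.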
  Source: https://whois.icann.org/en/domain-name-registration-process

Uniform Resource Locator
  Unique address of an Internet resource
  protocol://domain-name:port/directory/resource
  Example: http://www.accd.edu/sac/lrc/john/wwwtest2.htm
  The port number can be omitted if the standard port is used.

HTTP Protocol
  An application-level protocol
  A client issues a request to a server and the server returns a response
    – Request is in ASCII format
    – Response is in MIME (Multipurpose Internet Mail Extensions) format
        Text: HTML
        Image: JPEG/GIF
  A stateless protocol

HTTP request response model
  [Figure: client/server timeline from t0 to t6 showing the request for Page A, the server sending Page A, and the request for Page B]
  1. Web client makes a TCP connection to the server (at port 80).
  2. Sends HTTP request (header + data).
  3. Server returns HTTP response (status, header, requested resource).

Static Web page generation
  HTML tags
  Browser

Dynamic Webpage Generation
  Server side programming
    – Database connectivity
    – Passing additional data to the Web server
    – Java: Servlets, JSP
    – Microsoft: ASP
    – PHP, CGI scripts
  Client side programming
    – JavaScript

Cookies
  To cope with the stateless nature of HTTP
  Tracking a client
  Supporting applications like shopping carts
  Privacy issues
  The server sets a cookie by sending a Set-Cookie header in the HTTP response
    – Set-Cookie: Name=Value
  Whenever required by the server, the client includes the cookie in the request header
    – Cookie: Name=Value

Week 5: Lecture 3
NETWORKING RESOURCES

We are going to learn
  ISO-OSI reference model
  TCP/IP protocol stack

Computer Network
  A set of communicating computing devices
  Consisting of the following building blocks
    – The framework
        Standard organizations
        ISO-OSI Reference Model
        Addressing
    – Protocols
        Protocol suite
        Applications
    – Hardware
    – Physical connectivity
  Source: Kurose, J.F. and Ross, K.W., 2010. Computer networking: a top-down approach (Vol. 5). Reading: Addison-Wesley.

Standard Organizations
  ISO (International Organization for Standardization)
  IAB (Internet Architecture Board)
  IEEE (Institute of Electrical and Electronics Engineers)

The ISO-OSI Model
  Reference model for computer networks
  Why do we need such a model?
  Originally intended as the benchmark for the international standardization of computer networking protocols.
  A divide-and-conquer approach
  Layers are used to isolate groups of related functions so that development and flexibility are promoted through the use of well-defined interfaces.
  Each layer is insulated from the addressing details used by the layer below.
  Networking protocols and protocol suites can be designed and compared in the framework of this model.
  Today TCP/IP is the most important protocol suite

TCP/IP – A Layered Model
  Application Layer: provides a specific application
  Transport Layer: provides end-to-end transport service between two hosts
  Network Layer: forwards the packets across the network
  Link Layer: provides interface or access to the network

TCP/IP and the OSI Model in context
  7. Application Layer, 6. Presentation Layer, 5. Session Layer: FTP, HTTP, Telnet, SMTP
  4. Transport Layer: TCP, UDP
  3. Network Layer: IP, ARP
  2. Data Link Layer: LLC (Logical Link Control), MAC (Medium Access Control)
  1. Physical Layer: Physical

Processing at Each Layer
  Application Layer (stream): Application Header | Application Data
  TCP Layer (TCP segment): TCP Header | Application Header | Application Data
  IP Layer (IP datagram): IP Header | TCP Header | Application Header | Application Data
  Link Layer (frame): Link Header | IP Header | TCP Header | Application Header | Application Data

Transfer of Packet
  [Figure: application data moving between Host A and Host B through the Application, TCP, IP, and Link layers]

Link Layer
  Provides access to the network
  Addresses physical characteristics
  Handles many access control protocols for each physical network standard
  Functions
    – Encapsulation of IP datagrams into frames
    – Mapping of IP addresses to the physical addresses used by the network

Network Layer
  Internet Protocol
    – Defining the datagram
    – Defining the Internet addressing scheme
    – Moving data between Network layer and Transport layer
    – Routing datagrams
    – Performing fragmentation and reassembly of datagrams

IP Addresses
  IPv4: 32-bit addresses
  IPv6: 128-bit addresses

IPv4 Header Format (each row spans bits 0 to 31)
  Other control fields
  Other control fields
  TTL | PID (protocol ID) | Checksum
  Source Address
  Destination Address
  Options and Padding

Representation of IP Addresses
  Dotted decimal format
    – Ex. 128.0.0.1
    – Binary equivalent of the above is 10000000.00000000.00000000.00000001
  Consists of two parts
    – Network number
    – Host number (within the network)

Transport Layer
  TCP and UDP
  TCP (Transmission Control Protocol)
    – Connection oriented
    – Handshaking
    – Source port, destination port, sequence number and acknowledgement
    – Sliding window mechanism
  UDP (User Datagram Protocol)
    – Connectionless
    – No handshaking
    – Source port and destination port
    – No acknowledgement
    – No retransmission

UDP Header Format (each row spans bits 0 to 31)
  Source Port | Destination Port
  Length | Checksum

TCP Header Format (each row spans bits 0 to 31)
  Source Port | Destination Port
  Sequence Number
  Acknowledgement Number
  Offset | Flags | Window
  Checksum | Urgent Pointer
  Options and Padding

Application Layer
  Includes all the processes that use the transport layer protocol to deliver data.
  Examples:
    – HTTP: Hypertext Transfer Protocol
    – FTP
    – Telnet
    – SMTP

Protocol, Port and Socket
  Data multiplexing and demultiplexing
    – Combining data from many sources for delivery to the network
    – Dividing the data for delivery to multiple destinations
  Protocol number: identifies the transport protocol
  Port number: identifies the application
    – May be dynamically allocated by the system
  Socket: the combination of IP address and port number
    – Uniquely identifies a network process within the entire Internet

Week 5: Lecture 4
HARDWARE AND SOFTWARE RESOURCES

We are going to learn
  Networking hardware
  Computing hardware
  Storage options
  Software resources

Networking hardware in Context
  7. Application Layer, 6. Presentation Layer, 5. Session Layer, 4. Transport Layer: Gateways
  3. Network Layer: Routers
  Between layers 3 and 2: Bridge-routers
  2. Data Link Layer: Switches, Bridges
  1. Physical Layer: Repeaters, Transceivers

Transceivers (Media Attachment Units)
  Provide the means for encoding data into purely electrical or light signals ready for transmission onto the physical media.
  Also responsible for converting the signal back into data at the receiving station.
  Ex. Network adapter card

Repeaters
  Used to extend the LAN
  Regeneration of the frames
  Must be compliant with the maximum acceptable delay in the network (bit-budget delay)
  Mostly dumb
  Some are semi-intelligent
    – Memory
    – Inhibit regeneration of error frames and collision frames
  Ex: 10/100 Base T (Ethernet)

Bridges
  Offer filtering and forwarding capability based on Layer 2 fields, independent of Layer 3 protocols.
  Filtering and forwarding on Layer 2 fields increases backbone efficiency.
  Traffic management capability at the link level
    – Associating node MAC addresses with particular interfaces and forwarding frames accordingly
  Responsible for preserving network topology integrity by stopping the formation of loops
    – Using protocols such as spanning tree or its variants

Switches
  Used when there is a need for higher bandwidth in a shared-access LAN
  High-speed bridges
  Replacing the old bridges and repeaters

Routers
  A special-purpose Layer 3 device used instead of a host.
  Forward network traffic based on IP addresses rather than MAC addresses
  Communicate with one another, learning neighbors, routes, costs, and addresses, and select the best routes for individual packets.
  Scalable and can support very large internetworks in terms of both load and addressing
  Require skilled support and maintenance staff

Gateways
  A generic term
  Any network device capable of protocol translation
  Transport relay devices
  Older literature refers to routers as gateways

Computer hardware platforms
    – Client machines: desktop PCs; mobile devices (PDAs, laptops)
    – Servers: blade servers (ultrathin computers stored in racks)
    – Mainframes: an IBM mainframe is equivalent to thousands of blade servers
    – Top chip producers: AMD, Intel, IBM
    – Top firms: IBM, HP, Dell, Sun Microsystems

Server
  A computer that provides services to other computers, or the software that runs on it
  Examples:
    – Application server, a server dedicated to running certain software applications
    – Communications server, a carrier-grade computing platform for communications networks
    – Database server, provides database services
    – Fax server, provides fax services for clients
    – File server, provides file services
    – Game server, a server that video game clients connect to in order to play online together
    – Standalone server, an emulator for client-server (web-based) programs
    – Web server, a server that HTTP clients connect to in order to send commands and receive responses along with data contents.

Factors that influence server selection
  Listed on the slide along a scale from most important (top) to least important (bottom):
    – Applications support
    – Cost
    – Ease of administration
    – Familiarity
    – Homogeneity
    – Interoperability
    – Reliability (mean time between failures)
    – Scalability
    – Security
    – Vendor support
  (Source: A study by Advisory Council)

How do we define execution time?
  Response time
    – Also known as elapsed time, wall-clock time, execution time, or latency to complete a task
    – Includes disk access time, memory accesses, input-output activities, and operating system overhead
  CPU time
    – The time when the CPU is computing (not including the time spent waiting for I/O or running other programs)
    – User CPU time
    – System CPU time
  System performance refers to elapsed time on the unloaded system.
  CPU performance refers to user CPU time on the unloaded system.
  The response time of a computer system depends not only on the CPU time but also on the I/O time.
    – For example, suppose I/O accounts for 10% of the response time (the difference between response time and CPU time) and we speed up the CPU by a factor of 10 while neglecting I/O. The new response time is 0.10 + 0.90/10 = 0.19 of the original, a speedup of only about 5 times, with half the potential of the CPU wasted.
    – Similarly, making the CPU 100 times faster without improving the I/O gives 0.10 + 0.90/100 = 0.109 of the original, a speedup of only about 10 times (more precisely, about 9), squandering roughly 90% of the potential.
  Thus poor I/O performance can cancel out much of the gain from a faster CPU.
  While making a purchasing decision, generally the cost is held constant.
    – The cost target is determined by either system or commercial requirements.
  Speed and storage capacity are adjusted to meet the cost target.

The Concept of Memory Hierarchy
  A simple axiom of hardware design: smaller is faster.
    – In high-speed machines, signal propagation is a major cause of delay; larger memories have more signal delay and require more levels to decode addresses.
    – In most technologies we can obtain smaller memories that are faster than larger memories. This is primarily because the designer can use more power per memory cell in a smaller design.
    – The fastest memories are generally available in smaller numbers of bits per chip at any point in time, and they cost substantially more per byte.
  The principle of locality

Levels in a typical memory hierarchy
  [Figure: CPU registers, cache, memory (via the memory bus), and I/O devices such as disk (via the I/O bus); register reference, cache reference, memory reference, and disk memory reference, ordered from faster to slower]

The cache
  A cache is a small, fast memory located close to the CPU that holds the most recently accessed code or data.
    – Cache hit
    – Cache miss
    – Temporal locality
    – Spatial locality
  The time required to service a cache miss depends on both the latency of the memory and its bandwidth, which determines the time to retrieve the entire block.
  A cache miss, which is handled by hardware, usually causes the CPU to pause, or stall, until the data are available.

Main Memory
  Not all objects referenced by a program need to reside in main memory.
    – Virtual memory
    – Pages
    – Page fault
  The CPU usually switches to some other task while the disk access occurs.

Types of Storage Devices
  Magnetic storage
  Semiconductor storage
  Optical disc storage

Magnetic storage
  Non-volatile in nature
  Magnetic storage uses different patterns of magnetization on a magnetically coated surface to store information.
  The information is accessed using one or more read/write heads.
  Since the read/write head only covers a part of the surface, magnetic storage is sequential access and must seek, cycle or both.
  Examples include
    – Magnetic disk: floppy disk, hard disk
    – Magnetic tape, used for tertiary and off-line storage

Semiconductor memory
  Semiconductor memory uses semiconductor-based integrated circuits to store information.
  A semiconductor memory chip may contain millions of tiny transistors or capacitors.
  Both volatile and non-volatile forms of semiconductor memory exist.
  In modern computers, primary storage almost exclusively consists of dynamic volatile semiconductor memory, or dynamic random access memory.
  A type of non-volatile semiconductor memory known as flash memory is used as off-line storage for home computers.
  Non-volatile semiconductor memory is also used for secondary storage in various advanced electronic devices and specialized computers.

Optical disc storage
  Optical disc storage uses tiny pits etched on the surface of a circular disc to store information, and reads this information by illuminating the surface with a laser diode and observing the reflection.
  Optical disc storage is non-volatile and sequential access.
  The following forms are currently in common use:
    – CD, CD-ROM, DVD: read-only storage, used for mass distribution of digital information (music, video, computer programs)
    – CD-R, DVD-R, DVD+R: write-once storage, used for off-line storage
    – CD-RW, DVD-RW, DVD+RW, DVD-RAM: slow-write, fast-read storage, used for off-line storage

Network storage
  Network storage is any type of computer storage that involves accessing information over a computer network.
  Network storage arguably allows an organization to centralize information management and to reduce the duplication of information.
  Examples
    – Direct-Attached Storage (DAS)
    – Network-Attached Storage (NAS)
    – Storage Area Networks (SAN)

Direct-attached storage (DAS)
  Direct-attached storage, or DAS, is the most basic level of storage, in which storage devices are part of the host computer, as with drives, or directly connected to a single server, as with RAID arrays or tape libraries.
  Network workstations must therefore access the server in order to connect to the storage device.
  This is in contrast to networked storage such as NAS and SAN, which are connected to workstations and servers over a network.
  As the first widely popular storage model, DAS products still comprise a large majority of the installed base of storage systems in today's IT infrastructures.

Network-Attached Storage (NAS)
  NAS systems are generally computing-storage devices that can be accessed over a computer network (usually TCP/IP), rather than being directly connected to the computer (via a computer bus such as SCSI).
  This enables multiple computers to share the same storage space at once, which minimizes overhead by centrally managing hard disks.
  NAS systems usually contain one or more hard disks, often arranged into logical, redundant storage containers or RAID arrays.
  Almost any machine that can connect to the LAN (or is interconnected to the LAN through a WAN) can use NFS (Network File System), CIFS (Common Internet File System) or the HTTP protocol to connect to a NAS and share files.
  A NAS allows greater sharing of information, especially between disparate operating systems such as Unix and Windows NT.

Storage Area Networks (SAN)
  A storage area network (SAN) is very similar to NAS, except that it uses a block-based protocol and generally runs over an independent, specialized storage network.
  Only server-class devices with SCSI Fibre Channel can connect to the SAN.
  SAN file sharing is operating system dependent and does not exist in many operating systems.
  The Fibre Channel of the SAN has a limit of around 10 km at best.
  However, data transfer is much faster with SANs.

NAS
  Scalable
  Data can be transferred over a long distance
  Slow, problem of congestion
  Inefficient data backup and recovery (depends on DAS devices)

SAN
  Efficient data integrity, backup and recovery
  Faster, no congestion
  Not easily scalable
  Data cannot be transferred over a long distance

Software Resources

The Operating System
  The operating system is a collection of programs designed to manage the system's resources, namely memory, processors, devices and information (programs and data).
  The operating system keeps track of the resources, deciding which process is to get a resource (how much and when), allocating it, and reclaiming it if necessary.

Functions of Operating Systems

Week 5: Lecture 5
DATA RESOURCES

We are going to learn
  Types of data resources

Logical data elements in information systems

The concept of entities and relationships

Entity Relationship Diagram

Relational database structure

Logical User Views
    – Data elements and relationships (the subschemas) needed for checking, savings, or instalment loan processing
    – Data elements and relationships (the schema) needed for the support of all bank services
Software Interface
    – The DBMS provides access to the bank's databases
Physical Data Views
    – Organization and location of data on the storage media

The concept of a database management system

The concept of Structured Query Language

Major types of databases used by organizations and end users

Components of a complete data warehouse system

Data mining process

Multidimensional view of the data

Online analytical processing (OLAP)
  Consolidation. Consolidation involves the aggregation of data, which can involve simple roll-ups or complex groupings involving interrelated data. For example, data about sales offices can be rolled up to the district level, and the district-level data can be rolled up to provide a regional-level perspective.
  Drill-down. OLAP can also go in the reverse direction and automatically display the detailed data that make up consolidated data. This process is called drill-down. For example, the sales by individual products or sales reps that make up a region's sales totals could be easily accessed.
  Slicing and dicing. Slicing and dicing refers to the ability to look at the database from different viewpoints. One slice of the sales database might show all sales of a product type within regions. Another slice might show all sales by sales channel within each product type. Slicing and dicing is often performed along a time axis to analyse trends and find time-based patterns in the data.
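
The three OLAP operations above can be illustrated with a small in-memory sketch. The sales records, the dimension names in `FIELDS`, and the helper functions `roll_up` and `slice_rows` are hypothetical and exist only for this example; a real OLAP tool would run such queries against a data warehouse rather than a Python list.

```python
from collections import defaultdict

# Hypothetical sales facts:
# (region, district, office, product_type, channel, amount)
SALES = [
    ("East", "D1", "Office A", "Laptops", "Online", 100),
    ("East", "D1", "Office B", "Phones",  "Retail", 50),
    ("East", "D2", "Office C", "Laptops", "Retail", 70),
    ("West", "D3", "Office D", "Phones",  "Online", 30),
]
FIELDS = ("region", "district", "office", "product_type", "channel")

def roll_up(rows, *dims):
    """Consolidation: total the amount for each combination of the given dimensions."""
    idx = [FIELDS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)

def slice_rows(rows, **fixed):
    """Slicing: keep only rows whose dimensions match the fixed values."""
    return [r for r in rows
            if all(r[FIELDS.index(k)] == v for k, v in fixed.items())]

# Consolidation: office-level sales rolled up to the regional level.
by_region = roll_up(SALES, "region")
# Drill-down: the detailed rows that make up the East region's total.
east_detail = roll_up(slice_rows(SALES, region="East"), "district", "office")
# Another viewpoint on the same data: product type within regions.
by_product = roll_up(SALES, "product_type", "region")

print(by_region)
print(east_detail)
print(by_product)
```

Consolidation and drill-down are the same aggregation at different levels of detail, while slicing fixes one dimension and regroups the remaining ones.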
