CIS 9340 Chapter 29 - Web Transaction and DBMS PDF
Summary
This document provides an introduction to web technology and database management systems (DBMS). It covers topics such as the internet, web, intranets, extranets, e-commerce, and e-business. The document also includes various stages of internet evolution in a business context.
Full Transcript
CIS 9340 CHAPTER 29
WEB TECHNOLOGY AND DBMS

Introduction to the Internet and the Web

The Internet is made up of many separate but interconnected networks belonging to commercial, educational, and government organizations and Internet Service Providers (ISPs). The services offered on the Internet include electronic mail, conferencing, and collaboration services, as well as the ability to access remote computers and send and receive files.

Internet: A worldwide collection of interconnected computer networks.

The Internet began with funding from the NSF as a means to allow American universities to share the resources of five national supercomputing centers. Its number of users grew quickly as access became cheap enough for domestic users to have their own links on PCs. By the early 1990s, the wealth of information made freely available on this network had increased so much that a host of indexing and search services sprang up to answer user demand, such as Archie, Gopher, Veronica, and WAIS (Wide Area Information Service), which provided services through a menu-based interface. In contrast, the Web uses hypertext to allow browsing, and a number of Web-based search engines were created, such as Google, Yahoo!, and MSN. From initially connecting a handful of nodes with ARPANET, the Internet was estimated to have over 100 million users in January 1997.

Intranet: A Web site or group of sites belonging to an organization, accessible only by the members of the organization.

Internet standards for exchanging email and publishing Web pages are becoming increasingly popular for business use within closed networks called intranets. Typically, an intranet is connected to the wider public Internet through a firewall, with restrictions imposed on the types of information that can pass into and out of the intranet.

Extranet: An intranet that is partially accessible to authorized outsiders.

Whereas an intranet resides behind a firewall and is accessible only to people who are members of the same organization, an extranet provides various levels of accessibility to outsiders.

e-Commerce and e-Business

There is considerable discussion currently about the opportunities the Internet provides for electronic commerce (e-commerce) and electronic business (e-business). As with many emerging developments of this nature, there is some debate over the actual definitions of these two terms. Cisco Systems, now one of the largest organizations in the world, defined five incremental stages to the Internet evolution of a business, which include definitions of these terms.

Stage 1: Email
As well as communicating and exchanging files across an internal network, businesses at this stage are beginning to communicate with suppliers and customers by using the Internet as an external communication medium. This delivers an immediate boost to the business’s efficiency and simplifies global communication.

Stage 2: Website
Businesses at this stage have developed a Web site, which acts as a shop window to the world for their business products. The Web site also allows customers to communicate with the business at any time, from anywhere, which gives even the smallest business a global presence.

Stage 3: e-Commerce
e-Commerce: Customers can place and pay for orders via the business’s Web site.
Businesses at this stage are not only using their Web site as a dynamic brochure but also allowing customers to make procurements from the Web site, and may even be providing service and support online as well. This allows the business to trade 24 hours a day, every day of the year, thereby increasing sales opportunities, reducing the cost of sales and service, and achieving improved customer satisfaction.

Stage 4: e-Business
e-Business: Complete integration of Internet technology into the economic infrastructure of the business.
Businesses at this stage have embraced Internet technology through many parts of their business. Internal and external processes are managed through intranets and extranets; sales, service, and promotion are all based around the Web. Among the potential advantages are that the business achieves faster communication, streamlined and more efficient processes, and improved productivity.

Stage 5: Ecosystem
In this stage, the entire business process is automated via the Internet. Customers, suppliers, key alliance partners, and the corporate infrastructure are integrated into a seamless system. It is argued that this provides lower costs, higher productivity, and significant competitive advantage.

The Web

The Web: A hypermedia-based system that provides a means of browsing information on the Internet in a nonsequential way using hyperlinks.

The World Wide Web (the Web for short) provides a simple “point and click” means of exploring the immense volume of pages of information residing on the Internet (Berners-Lee, 1992; Berners-Lee et al., 1994). Information on the Web is presented on Web pages, which appear as a collection of text, graphics, pictures, sound, and video. In addition, a Web page can contain hyperlinks to other Web pages, which allow users to navigate in a nonsequential way through information. Much of the Web’s success is due to the simplicity with which it allows users to provide, use, and refer to information distributed geographically around the world.

The Web consists of a network of computers that can act in two roles: as servers, providing information; and as clients, usually referred to as browsers, requesting information.
Examples of Web servers are Apache HTTP Server, Microsoft Internet Information Server (IIS), and Google Web Server (GWS), and examples of Web browsers are Microsoft Internet Explorer, Firefox, Opera, and Safari. Much of the information on the Web is stored in documents using a language called HTML (HyperText Markup Language), and browsers must understand and interpret HTML to display these documents. The protocol that governs the exchange of information between the Web server and the browser is called HTTP (HyperText Transfer Protocol).

HTTP

HTTP: The protocol used to transfer Web pages through the Internet.

The HyperText Transfer Protocol (HTTP) defines how clients and servers communicate. HTTP is a generic, object-oriented, stateless protocol to transmit information between servers and clients (Berners-Lee, 1992). HTTP/0.9 was used during the early development of the Web. HTTP/1.0, which was released in 1995 as informational RFC 1945, reflected common usage of the protocol (Berners-Lee et al., 1996). The most recent release, HTTP/1.1, provides more functionality and support for allowing multiple transactions to occur between client and server over the same connection.

HTTP is based on a request-response paradigm. An HTTP transaction consists of the following stages:
• Connection: The client establishes a connection with the Web server.
• Request: The client sends a request message to the Web server.
• Response: The Web server sends a response (for example, an HTML document) to the client.
• Close: The connection is closed by the Web server.

HTML

HTML: The document formatting language used to design most Web pages.

The HyperText Markup Language (HTML) is a system for marking up, or tagging, a document so that it can be published on the Web. HTML defines what is generally transmitted between nodes in the network. It is a simple, yet powerful, platform-independent document language (Berners-Lee and Connolly, 1993).

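An HTML document of this kind is what a server typically returns in the Response stage of the HTTP transaction described above. The following is a minimal Python sketch of the four stages (Connection, Request, Response, Close), using only the standard library. The handler class, the page content, and the path `/index.html` are illustrative assumptions, not part of the original material, and the throwaway local server exists only so that the client has something to talk to.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content used only for this demonstration.
PAGE = b"<html><body><h1>Hello, Web</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Answers every GET request with the same small HTML document."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the demo quiet


def fetch_once():
    """Run one HTTP transaction against a local server and return its result."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.connect()                       # Stage 1: Connection
        conn.request("GET", "/index.html")   # Stage 2: Request
        resp = conn.getresponse()            # Stage 3: Response
        status = resp.status
        content_type = resp.getheader("Content-Type")
        body = resp.read()
        conn.close()                         # Stage 4: Close (client side)
    finally:
        server.shutdown()
        server.server_close()
    return status, content_type, body
```

Calling `fetch_once()` returns the status code, the content type, and the HTML bytes. In this sketch the client also closes its end of the connection explicitly; the text describes the close as being performed by the Web server.
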
HTML was originally developed by Tim Berners-Lee while he was at CERN but was standardized in November 1995 as the IETF (Internet Engineering Task Force) RFC 1866, commonly referred to as HTML version 2. The language has evolved and the World Wide Web Consortium (W3C) currently recommends use of HTML 4.01, which has mechanisms for frames, stylesheets, scripting, and embedded objects (W3C, 1999a). In early 2000, W3C produced XHTML 1.0 (eXtensible HyperText Markup Language) as a reformulation of HTML 4 in XML (eXtensible Markup Language) (W3C, 2000a).

Uniform Resource Locators

URL: A string of alphanumeric characters that represents the location or address of a resource on the Internet and how that resource should be accessed.

URLs define uniquely where documents (resources) can be found on the Internet. Other related terms that may be encountered are URIs and URNs. Uniform Resource Identifiers (URIs) are the generic set of all names/addresses that refer to Internet resources. Uniform Resource Names (URNs) also designate a resource on the Internet, but do so using a persistent, location-independent name. URNs are very general and rely on name lookup services and are therefore dependent on additional services that are not always generally available (Sollins and Masinter, 1994). URLs, on the other hand, identify a resource on the Internet using a scheme based on the resource’s location. URLs are the most commonly used identification scheme and are the basis for HTTP and the Web.

Static and Dynamic Web Pages

An HTML document stored in a file is an example of a static Web page: the content of the document does not change unless the file itself is changed. On the other hand, the content of a dynamic Web page is generated each time it is accessed. As a result, a dynamic Web page can have features that are not found in static pages, such as:
• It can respond to user input from the browser.
For example, it can return data requested through the completion of a form or the results of a database query.
• It can be customized by and for each user. For example, once a user has specified some preferences when accessing a particular site or page (such as area of interest or level of expertise), this information can be retained and information returned appropriate to these preferences.

When the documents to be published are dynamic, such as those resulting from queries to databases, the hypertext needs to be generated by the server. To achieve this, we can write scripts that perform conversions from different data formats into HTML on the fly. These scripts also need to understand the queries performed by clients through HTML forms and the results generated by the applications owning the data (for example, the DBMS).

Web Services

In recent years Web services have been established as an important paradigm for building applications and business processes and for integrating heterogeneous applications. Web services are based on open standards and focus on communication and collaboration among people and applications. Unlike other Web-based applications, Web services have no user interface and are not aimed at browsers. Instead, they consist of reusable software components designed to be consumed by other applications, such as traditional client applications, Web-based applications, or other Web services.

Requirements for Web–DBMS Integration

Although many DBMS vendors are working to provide proprietary database connectivity solutions for the Web, most organizations require a more general solution to prevent them from being tied into one technology. In this section, we briefly list some of the most important requirements for the integration of database applications with the Web.
These requirements are ideals and not fully achievable at the present time, and some may need to be traded off against others.

The requirements are as follows:
• The ability to access valuable corporate data in a secure manner.
• Data and vendor-independent connectivity to allow freedom of choice in the selection of the DBMS now and in the future.
• The ability to interface to the database independent of any proprietary Web browser or Web server.
• A connectivity solution that takes advantage of all the features of an organization’s DBMS.
• An open-architecture approach to allow interoperability with a variety of systems and technologies; for example, support for:
  – different Web servers;
  – Microsoft’s .NET framework;
  – CORBA/IIOP (Internet Inter-ORB protocol);
  – Java/RMI (Remote Method Invocation);
  – XML;
  – Web services (SOAP, WSDL, and UDDI; RESTful).
• A cost-effective solution that allows for scalability, growth, and changes in strategic directions, and helps reduce the costs of developing and maintaining applications.
• Support for transactions that span multiple HTTP requests.
• Support for session- and application-based authentication.
• Acceptable performance.
• Minimal administration overhead.
• A set of high-level productivity tools to allow applications to be developed, maintained, and deployed with relative ease and speed.

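The requirement for vendor-independent connectivity can be illustrated with a small Python sketch. It assumes Python's DB-API 2.0 convention (PEP 249), under which drivers for different DBMSs expose the same `cursor()`, `execute()`, and `fetchmany()` calls; this plays a role loosely comparable to the ODBC and JDBC interfaces discussed in this chapter. The `customer` table, its columns, and the function names are hypothetical, and SQLite stands in for whatever DBMS an organization actually uses.

```python
import sqlite3


def top_customers(conn, limit):
    """Query through any DB-API 2.0 connection, using no vendor-specific calls."""
    cur = conn.cursor()
    cur.execute("SELECT name, total FROM customer ORDER BY total DESC")
    rows = cur.fetchmany(limit)
    cur.close()
    return rows


def demo():
    """Only this setup code knows that the DBMS happens to be SQLite."""
    conn = sqlite3.connect(":memory:")
    try:
        # conn.execute / conn.executemany are sqlite3 conveniences, used for setup only.
        conn.execute("CREATE TABLE customer (name TEXT, total REAL)")
        conn.executemany(
            "INSERT INTO customer VALUES (?, ?)",
            [("Ann", 120.0), ("Bob", 75.5), ("Cy", 300.0)],
        )
        conn.commit()
        return top_customers(conn, 2)
    finally:
        conn.close()
```

Here `top_customers` would work unchanged with a connection from another DB-API driver, provided the SQL itself is portable. One caveat: the placeholder style for query parameters (`?` in SQLite) differs between drivers, so full independence usually needs a thin abstraction layer on top.
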
Approaches to Integrating the Web and DBMSs

Some of the current approaches to integrating databases into the Web environment:
• scripting languages such as JavaScript and VBScript;
• Common Gateway Interface (CGI), an early, and possibly one of the most widely used, techniques;
• HTTP cookies;
• extensions to the Web server, such as the Netscape API (NSAPI) and Microsoft’s Internet Information Server API (ISAPI);
• Java, JEE, JDBC, SQLJ, JDO, JPA, Servlets, and JavaServer Pages (JSP);
• Microsoft’s Web Solution Platform: .NET, Active Server Pages (ASP), and ActiveX Data Objects (ADO);
• Oracle’s Internet Platform.

Scripting Languages

Scripting languages allow the creation of functions embedded within HTML code. This allows various processes to be automated and objects to be accessed and manipulated. Programs can be written with standard programming logic such as loops, conditional statements, and mathematical operations. Some scripting languages can also create HTML on the fly, allowing a script to create a custom HTML page based on user selections or input, without requiring a script stored on the Web server to construct the necessary page.

Most of the hype in this area focuses on Java, which we discuss in Section 29.7. However, the important day-to-day functionality will probably be supplied by scripting engines, such as JavaScript, VBScript, Perl, and PHP, providing the key functions needed to retain a “thin” client application and promote rapid application development. These languages are interpreted, not compiled, making it easy to create small applications.

JavaScript and JScript

JavaScript and JScript are virtually identical interpreted scripting languages from Netscape and Microsoft, respectively. Microsoft’s JScript is a clone of the earlier and widely used JavaScript.
Both languages are interpreted directly from the source code and permit scripting within an HTML document. The scripts may be executed within the browser or at the server before the document is sent to the browser. The constructs are the same, except that the server side has additional functionality, for example, for database connectivity.

JavaScript is an object-based scripting language that has its roots in a joint development program between Netscape and Sun, and has become Netscape’s Web scripting language. It is a very simple programming language that allows HTML pages to include functions and scripts that can recognize and respond to user events such as mouse clicks, user input, and page navigation. These scripts can help implement complex Web page behavior with a relatively small amount of programming effort.

VBScript

VBScript is a Microsoft proprietary interpreted scripting language whose goals and operation are virtually identical to those of JavaScript/JScript, although it is unsupported by browsers such as Firefox and Opera. VBScript, however, has syntax more like Visual Basic than Java. It is interpreted directly from source code and permits scripting within an HTML document. As with JavaScript/JScript, VBScript can be executed from within the browser or at the server before the document is sent to the browser. VBScript is a procedural language and so uses subroutines as the basic unit.

VBScript grew out of Visual Basic, a programming language that has been around for years. Visual Basic is the basis for scripting languages in the Microsoft Office packages (Word, Access, Excel, and PowerPoint). Visual Basic is component-based: a Visual Basic program is built by placing components onto a form and then using the Visual Basic language to link them together.

Common Gateway Interface (CGI)

CGI: A specification for transferring information between a Web server and a CGI program.

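To make the CGI definition concrete, here is a hedged Python sketch of what a CGI program that queries a database might do. Under the CGI convention, the server passes the query part of the URL to the program in the `QUERY_STRING` environment variable, and the program writes a header block, a blank line, and then the document to standard output. The `branch` table, the `city` parameter, and the function names are hypothetical; a real CGI script would read `os.environ` and print the result, whereas this sketch takes the environment as an argument so that it can be run without a Web server.

```python
import html
import sqlite3
from urllib.parse import parse_qs


def cgi_response(environ, conn):
    """Build the text a CGI program would write to standard output."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    city = params.get("city", [""])[0]
    # A parameterized query keeps user input out of the SQL text.
    rows = conn.execute(
        "SELECT name FROM branch WHERE city = ? ORDER BY name", (city,)
    ).fetchall()
    items = "".join(f"<li>{html.escape(name)}</li>" for (name,) in rows)
    body = (
        f"<html><body><h1>Branches in {html.escape(city)}</h1>"
        f"<ul>{items}</ul></body></html>"
    )
    # Header block, blank line, then the generated HTML document.
    return "Content-Type: text/html\r\n\r\n" + body


def demo():
    """Simulate a request for ?city=Glasgow against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE branch (name TEXT, city TEXT)")
        conn.executemany(
            "INSERT INTO branch VALUES (?, ?)",
            [("B003", "Glasgow"), ("B005", "London"), ("B007", "Glasgow")],
        )
        return cgi_response({"QUERY_STRING": "city=Glasgow"}, conn)
    finally:
        conn.close()
```

The page is generated afresh on every call, which is what makes it a dynamic Web page; escaping with `html.escape` is a design choice to keep database values and user input from being interpreted as markup.
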
A Web browser does not need to know much about the documents it requests. After submitting the required URL, the browser finds out what it is getting when the answer comes back. The Web server supplies certain codes, using the MIME specifications, to allow the browser to differentiate between components. This allows a browser to display a graphics file, but to save a .zip file to disk, if necessary.

By itself, the Web server is only intelligent enough to send documents and to tell the browser what kind of documents it is sending. However, the server also knows how to launch other programs. When a server recognizes that a URL points to a file, it sends back the contents of that file. On the other hand, when the URL points to a program (or script), it executes the script and then sends back the script’s output to the browser as if it were a file.

HTTP Cookies

One way to make CGI scripts more interactive is to use cookies. A cookie is a piece of information that the client stores on behalf of the server. The information that is stored in the cookie comes from the server as part of the server’s response to an HTTP request. A client may have many cookies stored at any given time, each one associated with a particular Web site or Web page. Each time the client visits that site/page, the browser packages the cookie with the HTTP request. The Web server can then use the information in the cookie to identify the user and, depending on the nature of the information collected, possibly personalize the appearance of the Web page. The Web server can also add or change the information within the cookie before returning it.

All cookies have an expiration date. If a cookie’s expiration date is explicitly set to some time in the future, the browser will automatically save the cookie to the client’s hard drive. Cookies that do not have an explicit expiration date are deleted from the computer’s memory when the browser closes.

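The difference between a cookie with an explicit expiration date and one without can be sketched with the `http.cookies` module from the Python standard library. The cookie names `user_id` and `theme`, the 30-day lifetime, and the function names are assumptions chosen for illustration only.

```python
from http.cookies import SimpleCookie

THIRTY_DAYS = 30 * 24 * 3600  # lifetime in seconds, an arbitrary choice


def build_set_cookie_values(user_id):
    """Return the values a server would send in two Set-Cookie headers."""
    jar = SimpleCookie()
    # Explicit expiration in the future: the browser may save it to disk.
    jar["user_id"] = user_id
    jar["user_id"]["expires"] = THIRTY_DAYS  # integer offset, rendered as a date
    jar["user_id"]["path"] = "/"
    # No expiration: the cookie is discarded when the browser closes.
    jar["theme"] = "dark"
    return [morsel.OutputString() for morsel in jar.values()]


def read_user_id(cookie_header):
    """Extract the user identifier from the Cookie header of a later request."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get("user_id")
    return morsel.value if morsel is not None else None
```

A server would emit each returned string after `Set-Cookie:` in its response, and on later requests could call `read_user_id` on the incoming `Cookie` header to recognize the user.
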
Because cookies are sent back to the server with each new request, they become a useful mechanism for identifying a series of requests that come from the same user. When a request is received from a known user, the unique identifier can be extracted from the cookie and used to retrieve additional information from a user database.

Extending the Web Server

CGI is a standard, portable, and modular method for supporting application-specific functionality by allowing scripts to be activated by the server to handle client requests. Despite its many advantages, the CGI approach has its limitations. Most of these limitations are related to performance and the handling of shared resources, which stem from the fact that the specification requires the server to execute a gateway program and communicate with it using some Inter-Process Communication (IPC) mechanism. The fact that each request causes an additional system process to be created places a heavy burden on the server.

To overcome these limitations, many servers provide an Application Programming Interface (API), which adds functionality to the server or even changes server behavior and customizes it. Such additions are called non-CGI gateways. Two of the main APIs are Microsoft’s Internet Information Server API (ISAPI) and the Apache Web Server API.

Java

Java is a proprietary language developed by Sun Microsystems (which has since merged into Oracle Corporation). Originally intended as a programming language suitable for supporting an environment of networked machines and embedded systems, Java did not really fulfill its potential until the Internet and the Web started to become popular. Now, Java is one of the most popular programming languages for Web computing.

JDBC

The most prominent and mature approach for accessing relational DBMSs from Java appears to be JDBC (Hamilton and Cattell, 1996).
Modeled after the Open Database Connectivity (ODBC) specification (see Appendix I.3), the JDBC package defines a database access API that supports basic SQL functionality and enables access to a wide range of relational DBMS products. With JDBC, Java can be used as the host language for writing database applications.

Container-Managed Persistence (CMP)

The EJB 2.0 specification not only defined Container-Managed Persistence (CMP) but also Container-Managed Relationships (CMR) and the EJB Query Language (EJB-QL). We discuss these three components in this section but start with a brief overview of EJBs.

The three types of EJBs (session, entity, and message-driven) have three elements in common: an indirection mechanism, a bean implementation, and a deployment description. With the indirection mechanism, clients do not invoke EJB methods directly (with message-driven beans (MDBs), clients do not invoke methods at all but place messages in a queue for the MDB to process). Session and entity beans provide access to their operations via interfaces. The home interface defines a set of methods that manage the life cycle of a bean. The corresponding server-side implementation classes are generated at deployment time. To provide access to other operations, a bean can expose a local interface (if the client and bean are colocated), a remote interface, or both a local and remote interface. Local interfaces expose methods to clients running in the same container or JVM. Remote interfaces make methods available to clients no matter where they are deployed.

Java Servlets

Servlets are programs that run on a Java-enabled Web server and build Web pages, analogous to CGI programming. However, servlets have a number of advantages over CGI, such as:
• Improved performance. With CGI, a separate process is created for each request. In contrast, with servlets a lightweight thread inside the JVM handles each request.
In addition, a servlet stays in memory between requests whereas a CGI program (and probably also an extensive runtime system or interpreter) needs to be loaded and started for each CGI request. As the number of requests increases, servlets achieve better performance than CGI.
• Portability. Java servlets adhere to the “write once, run anywhere” philosophy of Java. On the other hand, CGI tends to be less portable, tied to a specific Web server.
• Extensibility. Java is a robust, fully object-oriented language. Java servlets can utilize Java code from any source and can access the large set of APIs available for the Java platform, covering database access using JDBC, email, directory servers, CORBA, RMI, and Enterprise JavaBeans.
• Simpler session management. A typical CGI program uses cookies on either the client or server (or both) to maintain some sense of state or session. Cookies, however, do not solve the problem of keeping the connection “alive” between the CGI program and the database; each client session is still required to reestablish or maintain a connection.
• Improved security and reliability. Servlets have the added advantage of benefiting from the built-in Java security model and the inherent Java type safety, making the servlet more reliable.

Microsoft’s Web Platform

Microsoft’s latest Web Platform, Microsoft .NET, is a vision for the third generation of the Internet where “software is delivered as a service, accessible by any device, any time, any place, and is fully programmable and personalizable.” To help understand this platform, we first discuss the composition of Microsoft’s technology, comprising OLE, COM, DCOM, and now .NET.

Universal Data Access

The Microsoft Open Database Connectivity (ODBC) technology provides a common interface for accessing heterogeneous SQL databases (see Appendix I.3). ODBC is based on SQL as a standard for accessing data.
This interface (built on the C language) provides a high degree of interoperability: a single application can access different SQL DBMSs through a common set of code. This enables a developer to build and distribute a client–server application without targeting a specific DBMS. Although ODBC is considered a good interface for supplying data, it has many limitations when used as a programming interface. Many attempts have been made to disguise this difficult-to-use interface with wrappers.

Microsoft .NET

Although the Microsoft Web Solution Platform was a significant step forward, there were a number of limitations with the approach:
• a number of programming languages were supported with different programming models (as opposed to JEE, which is composed solely of Java);
• no automatic state management;
• relatively simple user interfaces for the Web compared with traditional Windows user interfaces;
• the need to abstract the operating system (it was recognized that the Windows API was difficult to program for a variety of reasons).

As a result, the next, and current, evolution in Microsoft’s Web solution strategy was the development of Microsoft .NET.

Oracle Internet Platform

Oracle has a different approach to a Web-centric computing model, provided by Oracle Fusion Middleware. The platform provides multiple services, including Java EE and developer tools, integration services, business intelligence, collaboration, and content management. Figure 29.28 provides a simplified overview of the Oracle Fusion Middleware architecture. It is an n-tier architecture based on industry standards such as:
• HTTP and HTML/XML for Web enablement.
• Java, JEE, Enterprise JavaBeans (EJB), JDBC and SQLJ for database connectivity, Java servlets, and JavaServer Pages (JSP), as discussed in Section 29.7. It also supports Java Messaging Service (JMS), Java Naming and Directory Interface (JNDI), and it allows stored procedures to be written in Java.