CSCI571-Notes.pdf
Uploaded by AwedBalalaika
CSCI 571 Lecture Notes
Lec1 Course Introduction
  Course objectives; Sample Web Sites; Web server farms; Cloud Computing; Example: Amazon's Elastic Compute Cloud; Example: Google Cloud Platform; Serverless Architecture; Web Browsers Use Standard Layout Engines; Evolution of Web Sites
Lec2 Internet Trends and Web Basics
  Internet Trends; IoT; IoT Protocols; IoT platforms; Domain Name System (DNS); Internet Domain Names; Top Level Domain Names (TLDs); World Wide Web; Major Technology Components; WWW server; Uniform Resource Locator (URL); Markup Languages
Lec3 HTML
  What is HTML?; Version of HTML; HTML General Structure
Lec4 HTML: Style Sheets
  Use inline style attribute; Use <style> element; Composite Styles; DOCTYPE directive; Style Sheet Media Types; Pseudo Elements and Classes; Properties of Style Setting (1. Inheriting Style Properties, 2. Precedence (specificity), 3. Cascade); Box Model; CSS Vendor Prefixes; Reset CSS
Lec5 JavaScript Basics
  Event Handlers; What JavaScript can do?; JavaScript Basics of the Language; Literals; Variables; Arrays; Object; Popup Boxes; Common mistakes; ECMAScript
Lec6 JavaScript Object Notation (JSON)
  What is JSON?; Brief History; How to use the JSON format?; JSON Basic Data Types; Rules for JSON Parsers; Same Origin Policy; JSON: the Cross-Domain Hack; XMLHttpRequest Compared to the Dynamic Script Tag; Arguments against JSON; JSONP
Lec7 Python Flask
Lec8 Document Object Model (DOM)
  Useful DOM Functions; XMLHTTPRequest Object
Lec9 Forms and Common Gateway Interface Mechanism
  Forms; Form Control Group: fieldset; Common Gateway Interface (CGI); Purpose of CGI; CGI Script Environment Variables
Lec12 HTTP Protocol
  What does WWW server do?; How does a Web server communicate?;
  HTTP History; MIME Media types; HTTP Scenario; An HTTP 1.0 "default" Scenario; A more Complicated HTTP Scenario; Caching Proxies; Gateways; Tunnels; The Most General HTTP Scenario; Connections; Persistent Connections; Example of a GET request; Client HTTP request; HTTP Headers; Byte Range Headers; Entity Tags; HTTP Status Codes; HTTP Authentication; META HTTP-EQUIV (meta tag); X-Frame-Options: sameorigin; HTTP Strict-Transport-Security (HSTS); Cross-origin resource sharing (CORS)
Lec14 Secure Web Communication & Web Server Performance
  Secure Web Communication; Public & private key encryption; Digital Certificates & Certifying Authorities; Secure Sockets Layer Protocol (SSL) and https; Web Server Performance; Popular platforms; Web Server Farms; Load Balancing; Switches; DNS redirection; Web Server Performance Testing; Benchmarking; Improving Apache Web Server Performance; Web Server as Proxy Server; HTML Meta Tags vs. HTTP Headers; Using Apache as a proxy server
Lec15 Web Service and REST
  REST Service Introduction; REST (Representational State Transfer); REST vs Other Approaches; REST as Lightweight Web Services; Cloud Service
Lec16 Ajax: Asynchronous JavaScript + XML
  Traditional vs. Ajax Websites; Ajax Engine Role; Security Issues; Ajax Cross Domain Security; Cross-domain solutions; Fetch API
Lec17 Responsive Web Design
  The Need: Mobile Growth; Design for Mobile Web; Why not use mobile.mycompany.com webs?;
  What is Responsive Web Design; Major Technology Features; Media queries; Fluid grids; Scalable images; Bootstrap
Lec18 JS Frameworks
  Node.JS; AngularJS; Basic functionality; Goals; Features of Angular 4; RxJS
Lec19 jQuery
  jQuery Basic Selectors; jQuery Functions; jQuery & AJAX; jQuery Event
Lec20 High Performance Websites
Lec24 Serverless Application
  Overview of Serverless; Serverless Architectures; Features of Serverless Architectures; FaaS; BaaS; AWS Lambda; AWS Lambda Example Architecture; Overview of Containers; Container Architectures: Docker; Where we go from here; AWS Lambda + AWS API Gateway; GCP Functions
Lec25 HTML5: the Next Generation
  Major New Elements in HTML5
Lec26 Cookies and Privacy
  What is a Cookie?; Elements of a Cookie; Cookie Scope; Cookie Types and Taxonomy; Cookie Processing Algorithm; Additional Facts about Cookies; Client-Side Cookies; Cookie-based marketing; Opt-Out
Lec27 Web Security - Hacking the Web
  General Introduction; Common ways that websites get infected; How users get the malware planted by hackers?; The Damage Caused by Hackers; Authentication Attacks; Brute-forcing Attacks; Insufficient Authentication; Weak Password Recovery Validation; PassPhrases; Client-Side Attacks; Cross-site Scripting (XSS); Browser and plugin vulnerabilities; Clickjacking; Injection Attacks; Recent Attacks; Privacy Tools; TOR

Lec1 Course Introduction
Course objectives
  Core technologies:
    HTML and CSS
    HTTP
    Web servers
    Server-side programming using JavaScript and Python
    Client-side programming using JavaScript and JS Frameworks
    Ajax Development Style
  New technologies:
    Responsive Website Design (Bootstrap, etc.)
    JS Frameworks (Angular, React and Node.js)
    Web Services (REST)
    Web security, TOR, Dark web
    Native Mobile frameworks (Java / Android and Swift / iOS)
    React (native)
    Cloud computing (AWS, GCP, Azure)
    Serverless Applications, Containers, Docker
    AWS Lambda, Google Cloud Functions, Azure Functions
Sample Web Sites
1.
Modest Size: www.fogdog.com: Online sale of sporting goods
  Solution: Commodity hardware; Linux server running Apache 2.0 web servers; MySQL database
  Move to www.ebay.com/str/fogdog: F5 BIG-IP OS, Apache 2.0.64 web server
2. Medium Size: www.autobytel.com: New/used car sale (now AutoWeb)
  Original Microsoft solution: Microsoft Windows Server; Microsoft IIS 7.5 web server; Microsoft SQL Server database; Akamai CDN
  Today: Windows Server, Microsoft IIS/7.5 web server
3. Large Size: www.etrade.com: online investing services and resources
  Solution: IBM 90 xSeries running Linux/Citrix Netscaler, Apache and Tomcat web servers, AWS Route 53 (DNS); hardware facility for load balancing and redundancy; Oracle database system; proprietary programming systems
Web server farms
  Nowadays all serious web sites are maintained using web server farms:
    A group of computers acting as servers and housed in a single location
    Internet Service Providers (ISPs) provide web hosting services using a web server farm
    Hardware and software are used to load balance requests across the machines
  Other issues addressed:
    Redundancy: eliminate single points of failure; backup and failover strategy
    Security: secure areas behind firewalls which monitor web traffic, network address translation, port translation, SSL
  Popular Web Hosting Services:
    For individuals and small business: 1&1, GoDaddy.com, Yahoo
    For companies willing to pay MUCH higher costs: Rackspace, Network Solutions
    Reviews and price comparisons:
Cloud Computing
  Cloud computing is Internet-based computing: shared resources, software, and information are provided to computers and other devices on demand, like the electricity grid
  The user does not need to be an expert in the infrastructure
  Cloud computing providers deliver applications online that are accessed from another Web service or software, while the software and data are stored on servers
  Major cloud service providers: Amazon, Google, Microsoft, Salesforce, Skytap, HP, IBM, Apple iCloud
Example: Amazon's Elastic Compute Cloud
  A
web service providing resizable compute capacity
  elastic: the service instantly scales to meet demand with no up-front investment
  the user needs to create an Amazon Machine Image (AMI)
  Amazon's Simple Storage Service (S3): large-scale, persistent storage
Example: Google Cloud Platform
  Basic compute, storage, big data services, massively scalable gaming solutions, mobile application backend, and Apache Hadoop
  App Engine: a platform for building scalable web applications and mobile backends; scales automatically with the amount of traffic it receives
  Compute Engine: offers predefined virtual machine configurations
  Google uses software-defined networking technology to route packets across the globe and enable fast edge-caching so that data is where it needs to be to serve users
Serverless Architecture
  Internet-based systems where application development does not use the usual server process
  They rely on a combination of:
    Third-party services, or Backend as a Service (BaaS)
    Client-side logic
    Service-hosted remote procedure calls, or Function as a Service (FaaS)
  AWS Lambda is an implementation of FaaS
Web Browsers Use Standard Layout Engines
  WebKit: used to render web pages; open source; used by Chrome and Safari web browsers
  Gecko: layout engine of the Firefox web browser
    used to display web pages and the application's user interface
    provides a rich programming API
    originated with Netscape Communications Corporation
  Some layout engines and the browsers that use them:
    Gecko-based: Firefox (Mozilla), Flock, Netscape
    Trident-shells: Internet Explorer (Microsoft)
    EdgeHTML: Edge (Microsoft), fork of Trident 7; moved to Chromium in Jan 2020
    WebKit-based: Chrome and Android (Google), Midori, Safari and Mobile Safari (Apple), Symbian^3 (Nokia) and many others
    Chromium: Chrome
    Presto-based: Opera, Nintendo DS, Opera Mini, Opera Mobile
    Java-based: HotJava, Lobo
Web Browsers can:
1. Mouse-driven graphical user interface
2.
Display of Hypertext documents (HTML standard)
  Text with fonts/styles/point size
  Foreign-language character sets (ISO-8859)
  Forms composed of edit boxes, check boxes, radio boxes, lists, text areas
  Graphics in different formats
3. Invoke helper applications and plug-ins (obsoleted in HTML5):
  Adobe Acrobat (pdf files)
  Windows Media Player (digital sound files)
  Adobe Flash Player (video; retired in 2020)
4. Communicate over a secure channel (SSL)
5. Maintain/Exchange digital certificates
6. Run scripts in JavaScript
7. Run Java applets and ActiveX components (also obsoleted in HTML5)
Browser rank: Chrome > Firefox > Edge/IE > Safari > Opera
85% of browsers use WebKit!
Internet Explorer Browser Caching:
  History: Links/URLs accessed before
  Disk cache: Temporary internet files
  Memory cache: Session-based information that is cached during the session
  Offline content: Web content is downloaded when online and viewed offline
Evolution of Web Sites
  1st gen (1991): Client-centric; Static HTML, Scripts, CGI
  2nd gen (1997): Server Applications, Databases, Dynamic web pages; ODBC, JDBC; ASP, Applets, ActiveX
  3rd gen (2000): Web services; Multiple layers, Business and service Integration; XML, WML, SQL, .NET, COM+, Beans
  4th gen (2005): Service Oriented Arch (SOA), Client-centric; Ajax, Web 2.0, JSON
  5th gen (2008): Multi-platform (desktop, tablet, phone), Client-centric; HTML5, CSS3, JS, gesture navigation
  6th gen (2014): IoT, Wearables, Cloud computing, Serverless Arch (BaaS, FaaS); JS Frameworks, AWS, GCP, Azure, Microservices, containers

Lec2 Internet Trends and Web Basics
Internet Trends
  Internet: a global digital infrastructure that connects computers
  WWW: a mechanism that unifies the retrieval and display of a subset of data on the Internet
  Intranet: a local/global information structure that connects an organization internally.
  (intranets also use Web technologies now)
  Extranet: a private network that uses the public telecommunication system to securely share part of a business's information/operations
  Recent trends in Internet Development:
    Growth: number of users connected; smartphone use (iOS/Android); digital data (photo/video); social media; Internet use from mobile/tablet over desktop/laptop; use of cloud
    Decrease: dominance of Microsoft Windows
    Host counts in 2019 > 1,012 million
IoT
  IoT: the Internet of Things
IoT Protocols
  Device/thing to Gateway:
    ZigBee: Wireless sensors
    BLE: Wireless sensors
    ModBus (Serial or TCP)
  Gateway to Server:
    ModBus TCP: common
    OPC: common for industrial assets
    HTTP: JSON over HTTP
    MQTT: Consumer oriented, promising
IoT platforms
  Amazon IoT: Physical/Shadow Device (Persisted JSON State); MQTT Endpoint; Rules; AWS Connectivity
  GE Predix 2.0 (PaaS): CloudFoundry, HDP; Asset Model, Machine Connectivity, Time Series DB, Analytics Plugin (BPMN)
  PTC ThingWorx: Originally HMI for TCP-connected devices
  Xively: Device connectivity, time series database, connectivity to applications; popular with Arduino developers
Domain Name System (DNS)
  DNS resolution: when you visit a website, the computer needs to perform a DNS lookup
  Complex pages require multiple DNS lookups before loading
  DNS latency comes mainly from:
    the round-trip time to make the request and get the response, which grows with network congestion, overloaded servers, and denial-of-service attacks
    cache misses, which cause recursive querying of other name servers
  Google has introduced Google Public DNS
    use 8.8.8.8 and 8.8.4.4
    handles more than 70 billion requests a day!
    Google also has IPv6 addresses
  Another alternative is opendns.com
    a global network of DNS resolvers to speed resolution
    Free for basic service, but upgrades cost
Internet Domain Names
  DNS maps domain names to IP addresses and back
  Defined in RFC 1034, 1035
  13 top-level root name servers
  Founded in 1998, ICANN is the organization in charge of maintaining the DNS system
Top Level Domain Names (TLDs)
  In 1984, originally divided into 6 logical categories: com, edu, gov, mil, net, org
  In 2001 new top level domains added: biz, info, name, museum, coop, aero, pro, xxx
  In 2009 ICANN agreed to accept internationalized domain names, encoded as Unicode
  In 2011 ICANN announced expansion of TLDs, giving requirements for anyone wanting to establish one
  In 2019, .com and .net are the most popular top-level domains.
World Wide Web
  Definition:
    A wide-area hypertext, multimedia information retrieval system that provides access to a large universe of documents
    A uniform way of accessing and viewing some information on the Internet
  WWW subsumes the capabilities of ftp, gopher, WAIS, and news
Major Technology Components
  Client/server architecture: client programs interact with web servers
  Network protocol: HTTP, understood by browsers and web servers
  Addressing system (Uniform Resource Locators)
  Markup Language: supports HyperText and multimedia
WWW server
  Web browsers/servers communicate according to a protocol (HTTP); current HTTP is version 1.1
  The Web server is a software system running on a machine, which is itself often called the Web server
  A web server can:
    receive/reply to HTTP requests
    retrieve documents from specified directories
    run programs in specified directories
    handle limited forms of security
  A web server does not know about the contents of a document, links in a document, images in a document or whether a particular file, e.g.
a *.gif file, is in the correct format
Uniform Resource Locator (URL)
  A mechanism whereby an Internet resource can be specified in a single line of ASCII text
  RFC 1738
  General description of URL:
    1. Scheme: http:, ftp:, news:, wais:
    2. Double slash: //
    3. Internet domain name: usc.edu
    4. Port number (optional)
    5. Path
Markup Languages
  HTML: hypertext markup language; specifies document layout and the specification of hypertext links to text, graphics and other objects
  Browsers display text and graphics using the markup as guidance
  HyperText: regular text, with the additional feature of links to related documents

Lec3 HTML
What is HTML?
  The hypertext markup language (HTML) can describe:
    The display and format of text
    The display of graphics
    Pointers to other html files
    Pointers to files containing graphics, digitized video and sound
    Forms that capture information from the viewer
  HTML: created by Tim Berners-Lee of CERN around 1990; understood by WWW browsers
Version of HTML
  1990 V0: original one
  V1: highlighting & images
  1995 V2: V0 + V1 + forms
  1997 V3.2: released by W3C, tables
  1999 HTML4.01
  2014 HTML5: vocabulary & APIs
  2017 HTML5.2
  2019: HTML Living Standard, W3C & WHATWG agreement
HTML General Structure
  HTML documents have a head and body
  A leading line indicates the version of HTML
  comments in HTML: <!-- ... -->, cannot be nested
  IE/Firefox are tolerant browsers: they do not insist that the HTML document begin and end with <html>, and the <head> and/or <body> tags are not required
  HTML character set
    HTML uses the Universal Character Set (UCS), defined in ISO10646
    Character references: numeric, character entity
  HTML anchor
    designates a link to another document or to a specific place in the same document
    anchor name: unique & string matching
    anchor using id attribute: use href=#id where id is from another tag
    id and name attributes share the same name space (they cannot use the same value as each other)
  Universal Resource Identifier (URI):
    scheme of the mechanism used to access the resource
    name of the machine hosting the resource
    name
of the resource itself, given as a path
  Fragment identifiers are URIs that refer to a location within a resource, e.g. http://www.usc.edu/dept/cs/index.html#section2
  <link> element in the <head> part: provides a variety of information to search engines
    Links to alternate versions of a document, written in another human language
    Links to alternate versions of a document, designed for different media
    Links to the starting page of a collection of documents
    Links to style sheets and “media queries” used in Responsive Web Design
  Create graphic:
    image source: digital camera/phone, graphic editor, scanner
    image format:
      x-pixelmaps: 256 colors
      GIF
      JPEG: includes image compression; for photographic images
      PNG (portable network graphics): lossless compression; patent-free compared with GIF & TIFF
  why use the alt attribute in the <img> tag? It replaces an image with text if the image is unavailable or a text browser is used
  active image: has a border around it, and the cursor changes shape when passed over it
  usemap attribute in the <img> tag
  <meta> element: insert Name/Value pairs describing document properties & robotic exclusion:
    index: whether the search engine can index the page
    follow: whether the web crawler can follow links contained in the page
  Why validate HTML?
    Browsers display HTML differently
    Browsers treat HTML errors differently

Lec4 HTML: Style Sheets
  Style sheets start from HTML4.x
  style sheets specify:
    the amount of white space between text or between lines
    the amount lines are indented
    the colors for text/backgrounds
    font size and text style
    the precise position of text/graphics
  Style sheet languages: CSS, XSL
  express style within HTML: <style> element and style attribute
  <link> element to point to external style sheets
  combining style information from multiple sources is called cascading
  There is a defined order of precedence where the definitions of a style element conflict
  Pre-defined color names:
    Black="#000000" Silver="#C0C0C0" Gray="#808080" White="#FFFFFF"
    Maroon="#800000" Red="#FF0000" Purple="#800080" Fuchsia="#FF00FF"
    Green="#008000" Lime="#00FF00" Olive="#808000" Yellow="#FFFF00"
    Navy="#000080" Blue="#0000FF" Teal="#008080" Aqua="#00FFFF"
Use inline style attribute
  Setting Body Attributes
  The nine planets of the solar system...
Use <style> element
  The Solar System
  BODY {text-align: center}
  The nine planets of the solar system are Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune and Pluto. The very nearest star is about 7,000 times farther away than Pluto is to our Sun.
  ID attribute can only be used once in the entire document
  class rule is preceded by a period (.)
  and can be applied to multiple elements
  Values assigned to ID and class are case sensitive
Composite Styles
  font-family: Verdana, Arial, Helvetica, sans-serif; font-size:small; font-style:normal; font-variant:small-caps; font-weight:bold; line-height:2em;
  is equal to
  font: normal small-caps bold small/2em Verdana, Arial, Helvetica, sans-serif;
DOCTYPE directive
  Instructs modern browsers to work in ‘standards compliant mode’
  Your web page will look the same in all browsers:
    Browsers turn off their proprietary extensions
    Fonts are rendered in the same way; for example, font-size: small is rendered the same size on all browsers
  HOWEVER, if you do not specify a !DOCTYPE, browsers work in Quirks mode:
    Internet Explorer will display fonts larger than in standards mode
    IE uses the ‘broken box model’: the width and height of a box are measured to the outer edge of the border (padding and border included), not just the content area as in standards mode
Style Sheet Media Types
  Enable authors to create documents for different media types:
    H1 {color:blue}
    H1 {text-align:center}
  Used in CSS3 for media queries:
    @media all and (min-width:500px) {... }
    @media (min-width:500px) {... }
  recognized media types: all, braille, embossed, handheld, print, projection, screen, speech, tty, tv, 3d-glasses
Pseudo Elements and Classes
  pseudo-classes:
    :link - a normal, unvisited link
    :visited - a link the user has visited
    :hover - a link when the user mouses over it
    :active - a link the moment it is clicked
    :lang - selects every element with a lang attribute
    :focus - selects the input element which has the focus
    :first-child - selects every element that is the first child of its parent
  pseudo-elements:
    :first-line - adds a special style to the first line of a text
    :first-letter - adds a special style to the first letter of a text
    :before - inserts some content before the content of an element
    :after - inserts some content after the content of an element
Properties of Style Setting
1.
Inheriting Style Properties
  Some CSS property values set on parent elements are inherited by their child elements, and some aren’t.
  <div> and <span> tags have no initial presentation properties
    exception: line break before and after a <div> tag
    <span> applies to inline elements (example: )
    <div> applies to block elements (example: )
  With CSS, properties such as text-align are “inherited” from the parent element
2. Precedence (specificity)
  Specificity is how the browser decides which rule applies if multiple rules have different selectors but could still apply to the same element.
  The more precise a specification is, the higher the precedence:
    a style for tag.class has higher precedence than one for .class, which has higher precedence than a style for the tag itself
    styles defined using a style attribute (inline) have highest precedence
    styles defined using the <style> element have next highest precedence
    styles defined in a separate file, e.g. special.css, have lowest precedence
3. Cascade
  At a very simple level this means that the order of CSS rules matters; when two rules apply that have equal specificity, the one that comes last in the CSS is the one that will be used.
Box Model
  Each box has a content area (e.g., text, an image, etc.) and optional surrounding padding, border, and margin areas.
  margin: 10px 5px 15px 20px; means:
    top margin is 10px
    right margin is 5px
    bottom margin is 15px
    left margin is 20px
CSS Vendor Prefixes
  The CSS browser prefixes are:
    Android: -webkit-
    Chrome: -webkit-
    Firefox: -moz-
    Internet Explorer: -ms-
    iOS: -webkit-
    Opera: -o-
    Safari: -webkit-
Reset CSS
  A CSS Reset is a short, often compressed (minified) set of CSS rules that resets the styling of all HTML elements to a consistent baseline.
  The goal of a reset stylesheet is to reduce browser inconsistencies in things like default line heights, margins and font sizes of headings, and so on.
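A minimal Python sketch of how the margin shorthand from the Box Model notes above maps onto the four sides. The helper name expand_margin is hypothetical (it is not part of CSS or of any library), and the one-, two- and three-value forms follow standard CSS shorthand behaviour that the notes themselves do not spell out.

```python
def expand_margin(shorthand: str) -> dict[str, str]:
    """Expand a CSS margin shorthand value into its four sides.

    Hypothetical helper for illustration only; values are kept as strings
    and are not validated as CSS lengths.
    """
    values = shorthand.split()
    if len(values) == 1:
        # One value: applies to all four sides.
        top = right = bottom = left = values[0]
    elif len(values) == 2:
        # Two values: top/bottom, then left/right.
        top = bottom = values[0]
        right = left = values[1]
    elif len(values) == 3:
        # Three values: top, left/right, bottom.
        top, right, bottom = values
        left = right
    elif len(values) == 4:
        # Four values: clockwise from the top (top, right, bottom, left).
        top, right, bottom, left = values
    else:
        raise ValueError("margin shorthand takes 1 to 4 values")
    return {"top": top, "right": right, "bottom": bottom, "left": left}


if __name__ == "__main__":
    # The example from the notes: margin: 10px 5px 15px 20px;
    print(expand_margin("10px 5px 15px 20px"))
```

The four-value case goes clockwise starting from the top, which matches the order listed in the notes.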

Lec5 JavaScript Basics
  JavaScript has 2 distinct systems:
    server-side JavaScript runs on Web servers
    client-side JavaScript runs on Web browsers
  JavaScript syntax resembles C, C++, and Java
  Developed in 10 days by Brendan Eich, in May 1995; originally named Mocha, renamed LiveScript, then JavaScript
  JavaScript is embedded in HTML:
    in the body: document.write("Last updated on " + document.lastModified + ". ")
    in the <head> as a deferred script: // the JavaScript here creates functions for later use
Event Handlers
  Mouse events: onclick, ondblclick, onmouseover, onmouseout
  Keyboard events: onkeydown, onkeyup
  Object events: onload, onunload, onresize, onscroll
What JavaScript can do?
  Designed for manipulating web pages, but can also be used as a general-purpose language.
  Control Web page appearance and content (intended)
  Control the Web browser, open windows, test for browser properties
  Interact with document content
  Retrieve and manipulate all hyperlinks
  Interact with the user, sensing mouse clicks, mouse moves, keyboard actions
  Read/write client state with cookies
  Limitations of Client-side JavaScript:
    It was difficult to draw graphics (dramatically improved in the latest versions)
    No access to the underlying file system or operating system
    Unable to open and use arbitrary network connections
    No support for multithreading
    It was not suitable for computationally intensive applications (improved in the latest versions)
JavaScript Basics of the Language
  case-sensitive (HTML is not case-sensitive)
  ignores spaces, tabs, newlines (can be minified)
  Semicolon is optional
  C and C++ style comments are supported
Literals
  numbers
  boolean
  strings: immutable (cannot be changed after being created)
  string properties: str.length, str.toLowerCase, str.toUpperCase, str.indexOf, str.charAt, str.substring
Variables
  scope:
    Any variable outside a function is a global variable and can be referenced by any statement in the document
    Variables declared in a function with “var” are local to the function
    if var is
omitted, the variable becomes global
Arrays
  array properties: 1 dimensional, indexed from zero; arr.length
  Arrays are sparse: most elements are not allocated after initialization
  loop: for (i=0; i
UA --> A --> B --> C --> O
  A, B and C are three intermediaries between the user agent and origin server. A request or response message that travels the whole chain will pass through four separate connections
  UA stands for User Agent, typically a browser
  O stands for the origin server; the server that actually delivers the document
Connections
Persistent Connections
  In the original HTTP protocol each request was made over a new connection
    an HTML page with n distinct graphic elements produced n+1 requests
  TCP uses a three-way handshake when establishing a connection:
    client sends SYN
    server replies SYN/ACK
    client responds with ACK
  HTTP 1.0 introduced a keep-alive feature
    the connection between client and server is maintained for a period of time, allowing for multiple requests and responses
    a.k.a. persistent connection
  Persistent connections are now the default
  request header to set timeout (in sec.) and max.
number of requests before closing: Keep-Alive: timeout=5, max=1000
  client and server must explicitly say they do NOT want persistence, using the header Connection: close
  HTTP permits multiple connections in parallel, but generally browsers severely limit multiple connections, and servers do as well
Example of a GET request
  Suppose the user clicks on the link: click here
  The request from the client may contain the following lines:
    GET /html/file.html HTTP/1.1
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20100101 Firefox/15.0.1
    Referer: http://www.usc.edu/html/prevfile.html
    If-Modified-Since: Wed, 11 Feb 2009 13:14:15 GMT
    {there is a blank line here which terminates the input}
  The server responds with the following:
    HTTP/1.1 200 OK
    Date: Monday, 29-May-09 12:02:12 GMT
    Server: Apache/2.0
    MIME-version: 1.0
    Content-Type: text/html
    Last-modified: Sun, 28-May-09 15:36:13 GMT
    Content-Length: 145
    {a blank line goes here}
    {the contents of file.html go here}
Client HTTP request
  The general form of an HTTP request has four fields:
  HTTP_Method: the operation to be done to the object specified in the URL; some possibilities include GET, HEAD, and POST
    GET: retrieve whatever information is identified by the request URL
    HEAD: identical to GET, except the server does not return the body in the response
    POST: instructs the server that the request includes a block of data in the message body, which is typically used as input to a server-side application
    PUT: used to modify existing resources or create new ones, contained in the message body
    DELETE: used to remove existing resources
    TRACE: traces the requests in a chain of web proxy servers; used primarily for diagnostics
    OPTIONS: allows requests for info about the server's capabilities
  identifier: the URL of the resource or the body
  HTTP_version: the current HTTP version, e.g.
HTTP/1.1
  Body: optional text
HTTP Headers
  HTTP/1.1 divides headers into four categories:
    general: present in requests or responses
    request: present only in requests
    response: present only in responses
    entity: describe the content of a body
Byte Range Headers
  Requests:
    If-Range: entity-tag
    Range: bytes=1-512,2046-4096 (used to request a byte range)
  Responses:
    Accept-Ranges: bytes (indicates the server can respond to range requests)
  Entity:
    Content-Range: bytes 0-399/2000 (response to a byte range request giving the byte ranges actually returned, e.g. the first 400 bytes of a 2000 byte document)
  HTTP/1.1 introduces Vary: accept-language, user-agent
    the header specifies acceptable languages and browsers
    if a French version is requested and cached, then a new request may fail to retrieve the English version
    request:
      GET http://www.myco.com/ HTTP/1.1
      User-agent: Mozilla/4.5
      Accept-language: en
    response:
      HTTP/1.1 200 OK
      Vary: Accept-language
      Content-type: text/html
      Content-language: en
  Warning response header codes:
    10 Response is stale
    11 Revalidation failed
    12 Disconnected operation
    13 Heuristic expiration
    14 Transformation applied
    99 Miscellaneous warning
Entity Tags
  used for web cache validation; allow a client to make conditional requests
  assigned by a web server to a specific version of a resource found at a URL
  If the resource content at that URL ever changes, a new/different ETag is assigned
  ETags are similar to fingerprints, and they can be compared to determine whether two versions of a resource are the same
  An ETag is a serial number or a checksum that uniquely identifies the file
    caches use the If-None-Match conditional header to get a new copy if the entity tag has changed
    if the tags match, then a 304 Not Modified is returned
  ETag is determined by the server and sent in the response
HTTP Status Codes
  Informational
    100: Continue, the client may continue with its request; used for a PUT before a large document is sent
    101: Switching Protocols, switching either
the version or the actual protocol
  Successful
    200: OK, request succeeded
    201: Created, result is newly created
    202: Accepted, the resource will be created later
    203: Non-authoritative information, information returned is from a cached copy and may be wrong
    204: No content, response is intentionally blank, so the client should not change the page
    205: Reset Content, notifies the client to reset the current document, e.g. clear a form field
    206: Partial content, e.g. a byte range response
  Redirection
  Client Error
  Server Error
HTTP Authentication
  The web server can maintain secure directories and request authentication when someone tries to access them
  Procedure:
    web server receives a request without proper authorization
    web server responds with 401 Unauthorized (authentication required)
    client prompts for username and password and returns the information to the web server
META HTTP-EQUIV (meta tag)
  a mechanism for authors of HTML documents to set HTTP headers, in particular HTTP responses
  Two common uses:
    set the expiration time of a document
    cause a refresh of a document
X-Frame-Options: sameorigin
  Indicates whether or not a browser should be allowed to render a page in a <frame> or <iframe>.
  Sites can use this to avoid clickjacking attacks, by ensuring that their content is not embedded into other sites
  Values: deny; sameorigin; allow-from uri
HTTP Strict-Transport-Security (HSTS)
  HSTS is a security feature that lets a web site tell browsers that it should only be communicated with using HTTPS, instead of using HTTP
  Strict-Transport-Security: max-age=expireTime [; includeSubDomains]
Cross-origin resource sharing (CORS)
  CORS allows many resources (e.g., fonts, JavaScript, etc.) on a web page to be requested across domains
  AJAX calls can use XMLHttpRequest across domains
  If the server does not allow the CORS request, the browser will deliver an error instead of the requested URL response.

Lec14 Secure Web Communication & Web Server