Lecture 24: Serverless Applications (AWS Lambda/GCP Functions) PDF
Uploaded by AttentivePink
Summary
This document discusses serverless applications, comparing them with virtual machines (VMs) and containers. It explains the need for VMs due to the complexity of deploying server applications with varied requirements, and then explores serverless architecture's advantages, including cost-effectiveness, event-driven operation, and improved developer productivity.
Lecture 24: Serverless Applications (AWS Lambda)/(GCP Functions) Notes
● Need for VMs
  ○ The Issue
    ■ Deploying server applications is getting complicated because software can have many different kinds of requirements.
  ○ The Solution
    ■ Run each application on a separate virtual machine (one on a NodeJS VM, one on a PHP VM, one on a Java VM).
  ○ Virtualization
    ■ Offers a hardware abstraction layer that can adjust to the specific CPU, memory, storage, and network needs of applications on a per-server basis.
    ■ While VMs virtualize the hardware, containers virtualize the operating system. Containers are therefore about running an application and its dependencies, while VMs are about running full-fledged operating systems.
  ○ VMs allowed multiple operating systems to run concurrently on a single physical machine, which was revolutionary because it isolated different computing tasks and applications from each other and so improved the utilization of expensive mainframe hardware.
  ○ They became a way to maximize server utilization, allowing one server to do the job of multiple servers by sharing its resources across multiple environments.
● VMs are expensive
  ○ The Problems with Virtual Machines
    ■ Money: You need to predict the instance size you need. You are charged for every CPU cycle, even when the system is “twiddling its thumbs”.
    ■ Time: Many operations related to virtual machines are slow.
      ● VMs can take a significant amount of time to start because they need to boot an entire operating system.
  ○ The Solution
    ■ Serverless Architectures
      ● No Server Management: With serverless architectures, developers do not need to manage servers or containers at all. The cloud provider dynamically manages the allocation of machine resources.
        ○ Feature 1: No compute resource to manage
      ● Automatic Scaling: Serverless applications automatically scale with the number of requests, from a few requests a day to thousands per second.
        ○ Provisioning and scaling are handled by the service itself
      ● Cost-Effectiveness: Serverless computing is often billed based on the exact amount of resources an application consumes, down to the individual function call, rather than on pre-purchased units of capacity.
      ● Event-driven and Instantaneous: Serverless architectures are naturally event-driven, running code only in response to events, which makes them highly efficient for intermittent or sporadic workloads.
      ● Developer Productivity: Developers can focus on writing code rather than managing and operating servers or runtime environments, which accelerates the deployment cycle.
        ○ You write code and the execution environment is provided by the service
        ○ The best-known vendor host is currently AWS Lambda (announced at re:Invent 2014; the first of its kind to completely abstract the execution environment from the code)
      ● Feature 4: Core functionality (e.g., database, authentication, and authorization) is provided by at-scale web services
● Containers
  ○ Efficient Resource Utilization: Containers share the host system’s kernel and only include the application and its dependencies. They are lighter and use fewer resources than VMs.
  ○ Fast Start-up: Containers start almost instantly, allowing for quicker scaling up and down in response to demand.
  ○ Portability: Containers encapsulate everything needed to run an application, so they can easily be moved between environments, such as from development to production or from on-premises to cloud.
  ○ Reduced Cost: Containers can lead to cost savings because they allow for higher density and utilization of the underlying resources, reducing the number of servers needed.
  ○ Key Features
    ■ Lightweight
      ● All containers running on the same host share a single Linux kernel
      ● Container images don’t require a full OS install like a virtual machine image
    ■ Portable
      ● The execution environment abstracts the underlying host from the container
      ● No dependency on a specific virtual machine technology
      ● Container images can be shared using GitHub-like repositories, such as Docker Hub (hub.docker.com)
● Containers
  ○ Operating-system-level virtualization: a lightweight approach to virtualization that provides only the bare minimum an application requires to run and function as intended.
  ○ Containers were designed to provide a lightweight form of virtualization. The idea was to package applications and their dependencies into a single object that could be moved between environments (development, test, and production) without change. This solves the problem of software running in one environment but not another, often summarized as "it works on my machine".
  ○ Unlike VMs, containers don't require a full operating system for each instance. Instead, they share the host system's kernel and isolate the application processes from each other within user space. This makes containers much more efficient, fast, and lightweight compared to VMs.
● Docker
  ○ Docker is a tool that allows developers, sys-admins, etc. to easily deploy their applications in a sandbox (called a container) that runs on the host operating system, i.e., Linux.
  ○ Key Benefit: Allows users to package an application with all its dependencies into a standardized unit for software development.
  ○ Unlike virtual machines, containers do not have high overhead and hence enable more efficient usage of the underlying system and resources.
  ○ Allows extremely efficient sharing of resources
  ○ Provides a standard for, and minimizes, software packaging
  ○ Decouples software from the underlying host, with no hypervisor
  ○ Container Issues
    ■ Security
    ■ Less flexibility in operating systems and networking
    ■ Managing Docker and containers in production is a challenge
● Origins of Functions as a Service / serverless architecture
  ○ AWS Lambda: Announced at re:Invent 2014. First web service of its kind that completely abstracted the execution environment from the code.
  ○ API Gateway: Launched in mid-2015. Critical ingredient for building service endpoints with Lambda.
  ○ Combined with existing back-plane services like DynamoDB, CloudFormation, and S3, “serverless” development was born.
● BaaS - Backend as a Service
  ○ Data Stores
    ■ NoSQL Databases
    ■ BLOB Storage
    ■ Cache (CDN)
  ○ Analytics
    ■ Query
    ■ Search
    ■ IoT (Internet of Things)
    ■ Stream Processing
  ○ AI
    ■ Machine Learning
    ■ Image Recognition
    ■ Natural Language Processing/Understanding
    ■ Speech to Text/Text to Speech
● Serverless - Where we go from here
  ○ Backend-as-a-Service
    ■ AI
      ● Fraud detection
      ● Latent semantic analysis
    ■ Geospatial
      ● Satellite imagery
      ● Hyper-Locality
    ■ Analytics
      ● Query
      ● Search
      ● Stream Processing
    ■ Database
      ● Graph
    ■ HPC (High Performance Computing)
  ○ Function-as-a-Service
    ■ Polyglot language support (each function written in a different language)
    ■ Stateful endpoints (WebSockets)
      ● AWS Lambda implements WebSockets by integrating the Fanout service
    ■ Remote Debugging
    ■ Enhanced Monitoring
    ■ Evolution of CI/CD Patterns (Continuous Integration / Continuous Deployment)
    ■ IDEs
    ■ See “Ten Attributes of Serverless Computing Platforms”: https://thenewstack.io/ten-attributes-serverless-computing-platforms/
  ○ Polyglot Platform
    ■ Though JavaScript seems to be the lowest common denominator for serverless, supporting other languages is important.
  ○ Support for Sync and Async Invocation
    ■ Functions deployed in FaaS may be synchronous or asynchronous. A certain class of applications demands an immediate response, while others may prefer asynchronous invocation. For example, the data generated by sensors needs to be processed and analyzed immediately, while images uploaded to object storage may be converted to thumbnails by a batch process.
  ○ API Gateway Integration
    ■ The value of an API Gateway integrated with the serverless platform cannot be emphasized enough. Though the functions deployed in serverless environments are typically triggered by external event sources such as stream processors and databases, it is the API Gateway that lights up the functions. It adds the logical routes for mapping the standard HTTP verbs to the respective functions.
  ○ Developer Productivity
    ■ Most of the IDEs that developers use today are not designed for modern DevOps processes. Support for source code control systems, build automation, CI/CD, and A/B testing came through plugins and third-party add-ons. It will take a long time for traditional IDE vendors to support FaaS.
  ○ Support for DevOps and Tooling
    ■ There is a misconception that FaaS magically reduces the need for DevOps and tooling. Serverless platforms should have tight integration with source code control systems and build automation tools. They should support automated and repeatable deployment patterns. IDE support and integration with the existing DevOps pipeline are major factors to consider when choosing a FaaS platform.
  ○ Responsiveness and Performance
    ■ Responsiveness plays a critical role in designing microservices-based applications on FaaS. Poorly designed platforms will introduce startup latency and delay the invocation process, which would become obvious to end users. Lightweight, interpreted languages such as JavaScript and Python respond faster than Java and .NET.
      If there is a considerable gap between invocations, the delay becomes noticeable. Customers must benchmark the turnaround window for each language and runtime before deploying their microservices solution.
  ○ Logging and Monitoring
    ■ The only way to keep both under control is through a powerful dashboard that shows the current state. FaaS platforms should have extensive support for logging and monitoring. Everything that is written to stdout and stderr should be logged to separate streams. This is essential to understanding the current health of an application and debugging individual functions.
  ○ REST Endpoints and Automation
    ■ Like most cloud-based delivery models, FaaS must be fully automated. This is only possible when the platform supports an API for performing all the operations done through the portal or the CLI. This feature enables developers and operators to efficiently automate the workflow of deploying and managing the microservices.
  ○ Support for Long-Running Jobs and Batch Processing
    ■ Mature serverless platforms have built-in support for long-running, scheduled jobs. A function that is deployed in FaaS may be periodically invoked to perform an ETL job.
  ○ Extensibility and Integration
    ■ The real value of a serverless platform lies in its broad integration and extensibility. For example, the platform must support a variety of security schemes, including OAuth and custom LDAP-based authentication. It should support HTTPS endpoints out of the box for secure transport. The platform should have enough hooks for easy integration with a variety of event sources. Support for custom environment variables is another example of extensibility. With this, customers can pass parameters other than those included in the request.
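The points above about custom environment variables and stdout/stderr logging can be sketched with a small function. This is only an illustration, assuming a Python runtime and the Lambda-style `handler(event, context)` entry point. The environment variable `GREETING_PREFIX`, the `name` field in the event, and the shape of the response are hypothetical names chosen for the example, not something the lecture specifies.

```python
import json
import os
import sys


def handler(event, context=None):
    """Lambda-style entry point: one invocation handles one event."""
    # Configuration comes from a custom environment variable (hypothetical name),
    # i.e. a parameter that is not part of the request itself.
    prefix = os.environ.get("GREETING_PREFIX", "Hello")

    # Request parameters arrive in the event payload (hypothetical "name" field).
    name = event.get("name", "world")

    # Output on stdout and stderr is what a FaaS platform would capture as log streams.
    print(f"handling request for name={name!r}")
    if "name" not in event:
        print("no name supplied, using default", file=sys.stderr)

    body = json.dumps({"message": f"{prefix}, {name}!"})
    return {"statusCode": 200, "body": body}


if __name__ == "__main__":
    # Local invocation with a sample event; no cloud account is needed.
    os.environ["GREETING_PREFIX"] = "Welcome"
    print(handler({"name": "Ada"}))
```

Because the function keeps no state between calls and reads its configuration from the environment, the same code can be invoked synchronously (for example behind an API gateway) or asynchronously (for example from a storage event) without changes.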
● Containers - Where we go from here
  ○ Networking
    ■ Overlay networks between containers running across separate hosts
  ○ Stateful Containers
    ■ Support for container architectures that read and write persistent data
  ○ Monitoring and Logging
    ■ Evolution of design patterns for capturing telemetry and log data from running containers
  ○ Debugging
    ■ Attach to running containers and debug code
  ○ Security
    ■ Better isolation at the kernel level between containers running on the same host
    ■ Secret/key management: transparently pass sensitive configuration
● AWS Lambda
  ○ Compute service using Amazon's infrastructure
  ○ Code === function
  ○ Supported languages: Java, Python, C#, and Node.js (i.e., JavaScript)
  ○ Can be thought of as Docker under the covers
  ○ A system that uses Linux containers
  ○ Pay only for the compute time you use
  ○ Triggered by events or called from HTTP
  ○ It still has servers, but we do not care about them
  ○ Functions are the unit of deployment and scaling
  ○ No machines, VMs, or containers are visible in the programming model
  ○ Never pay for idle
  ○ Auto-scaling and always available; adapts to the rate of incoming requests
  ○ Using AWS Lambda
    ■ No servers to manage
    ■ Continuous scaling
    ■ Subsecond metering
    ■ Bring your own code
    ■ Simple resource model
    ■ Flexible authorization and use
    ■ Stateless, but you can connect to other services to store state
    ■ Authoring functions
    ■ Makes it easy to
      ● Perform real-time data processing
      ● Build scalable backend services
      ● Glue and choreograph systems
● Google Cloud Functions
  ○ Google Cloud Functions + API Management
  ○ Scalable pay-as-you-go Functions-as-a-Service (FaaS) to run your code with zero server management.
    ■ No servers to provision, manage, or upgrade
    ■ Automatically scales based on the load
    ■ Integrated monitoring, logging, and debugging capability
    ■ Built-in security at the role and per-function level, based on the principle of least privilege
    ■ Key networking capabilities for hybrid and multi-cloud scenarios
  ○ Cloud Functions are ephemeral, spinning up on demand and back down in response to events in the environment. You pay only while your function is executing, metered to the nearest 100 milliseconds, and pay nothing after your function finishes.
  ○ Cloud Functions are written in JavaScript and execute in a standard Node.js runtime environment. Python and Go are also supported.
  ○ Price
    ■ 2 million invocations per month are free
    ■ $0.40 per million invocations
Lecture 25: Cookies and Privacy Notes
● What is a Cookie? Basic Facts
  ○ What is a Cookie?
    ■ Short pieces of text generated during web activity and stored on the user’s local machine by the user’s web browser for future reference
    ■ Cookies are created by website authors who write software for reading and writing cookies
    ■ Cookies were initially used so websites would remember that a user had visited before, allowing customization of sites without the need to repeat preferences
      ● All the steps in the shopping cart…
    ■ Until cookies were introduced, it was difficult to connect one page to another: HTTP is stateless, so one page request has no relation to the others
  ○ Elements of a Cookie
    ■ A cookie is associated with a website’s domain and contains a name, value, path, and expiration date
      ● Utma is the cookie name
    ■ Such cookies are sometimes referred to as HTTP cookies because they are sent by servers and placed there using the HTTP protocol as the delivery mechanism
  ○ What can cookies do?
    ■ Store and manipulate any information you explicitly provide to a site
    ■ Track your interaction with the site, such as pages visited, time of visits, and number of visits
    ■ Use any information available to the web server, including your IP address, operating system, and browser type
  ○ What can't they do?
    ■ Identify who you are: cookies do not have automatic access to Personally Identifiable Information (PII) such as name, address, or email
    ■ Read or write data to disk
      ● Cookies are data that is eventually stored in the browser's cookie database
    ■ Read or write information in cookies placed by other sites
      ● If Google sends a cookie, only Google can read or write it, not Microsoft
    ■ Run programs on your computer
    ■ As a result, they
      ● Cannot carry viruses
      ● Cannot install malware on the host computer
  ○ Finding Cookies
    ■ Browsers store their cookies in their own format and files.
    ■ Cookies are specific to the browser that stored them; they are not shared between browsers
    ■ Safari lets you see individual cookies instead of just the number of cookies
  ○ Other Google Property Cookies
    ■ DoubleClick Cookies and YouTube Cookies
      ● DoubleClick and YouTube are two companies owned and operated by Google; DoubleClick cookies are referred to as third-party cookies because the user never actually visits the DoubleClick site
  ○ Cookie Types and Taxonomy
    ■ By Lifespan
      ● Session Cookies (stored in RAM): deleted when the tab/browser is closed
      ● Persistent Cookies (stored on disk): have an expiration date
    ■ By Read-Write Mechanism
      ● Server-Side Cookies (included in HTTP headers): sent by servers and only read by servers
      ● Client-Side Cookies (manipulated with JavaScript): read and written by JavaScript
    ■ By Structure
      ● Simple Cookies
      ● Array Cookies
    ■ Session cookies exist only while the user is reading and navigating the website; browsers normally delete session cookies when the user exits the browser
    ■ Persistent cookies, also known as tracking cookies, have an expiration date
    ■ Secure (“protected”) cookies have the secure attribute enabled and are only used via HTTPS, so the cookie is always encrypted in transit
      ● They can only be read and written over HTTPS. On unprotected Wi-Fi, people can steal unprotected cookies and impersonate you, accessing what you were accessing
    ■ Third-party cookies are not from the “visited” site; they are used for advertising and for tracking you
      ● Third-party cookies are cookies set with a different domain than the one in the browser’s address bar
        ○ These cookies may be placed by an advertisement or an image on the page
        ○ RFC 6265 allows browsers to implement whatever policy they wish regarding third-party cookies
      ● Advertisers use third-party cookies to track a user across multiple sites
        ○ A user visits www.site1.com, which sets a cookie with domain ad.adtracking.com; later the same user visits www.site2.com, which also sets a cookie with domain ad.adtracking.com. Eventually both cookies will be sent to the advertiser, which will match them
  ○ Cookie Processing Algorithm
    ■ 1. A URL is requested (either by entering one into the address field or by clicking on a link)
    ■ 2. The browser scans its cookie database for any cookies whose domain and path match the requested URL
    ■ 3. If any are found, all of them are sent along with the request as part of the HTTP headers (value of Cookie)
      ● The browser looks in its cookie database; if it finds any matches, it sends them all
    ■ 4. The server-side program may or may not make use of cookies from the client to determine what page to return
    ■ 5. The server-side program may generate one (or more) cookies and send them along with the requested page; cookies are included in the HTTP headers returned to the browser (value of Set-Cookie)
    ■ 6. The browser stores any new cookies in its database; cookies can be accessed on the client using the document.cookie object in JavaScript
  ○ Additional facts about cookies
    ■ Scope: by default, cookie scope is limited to all URLs on the current host name.
      Scope may be limited with the path= parameter, which specifies a path prefix to which the cookie should be sent, or broadened to a group of DNS names, rather than a single host only, with domain=.
    ■ Time to live: by default, each cookie has a lifetime limited to the duration of the current browser session. Alternatively, an expires= parameter may be included to specify the date at which the cookie should be dropped
    ■ Overwriting cookies: if a new cookie with the same NAME, domain, and path as an existing cookie is encountered, the old cookie is discarded
    ■ Deleting cookies: there is no specific mechanism for deleting cookies, although a common hack is to overwrite a cookie with a bogus value as outlined above, plus a backdated or short-lived expires=
      ● Reset the expiration to a date in the past; to the browser, an expired cookie is as if it doesn't exist
    ■ "Protected" cookies: as a security feature, some cookies may be marked with a special secure keyword, which causes them to be sent over HTTPS only
  ○ Client-Side Cookies: setting and sending cookies via JavaScript
    ■ JavaScript has a property of the document object named cookie: document.cookie
      ● Holds the list of all cookies you can read and write
      ● Returns all cookies for the current domain
    ■ This is a string variable that can be read and written using the JavaScript string functions
    ■ A cookie can be removed from the cookie database either because it expires or because the cookie file gets too large
      ● Browsers need not store more than 300 cookies, nor more than 20 cookies per web server, nor more than 4K per cookie
    ■ Setting document.cookie creates a new cookie for the web page
    ■ Reading document.cookie retrieves all defined cookies as a single string of name=value pairs separated by semicolons
  ○ escape(s) and unescape(s)
    ■ Cookie values should not contain white space, brackets, parentheses, equals signs, commas, double quotes, slashes, question marks, at signs, colons, or semicolons
      ● Cookie values are encoded into their hex equivalents
    ■ escape(s) returns a new version of string 's' that is encoded
      ● All spaces, punctuation, accented characters, and any other non-ASCII characters are converted to %xx format (ISO-8859-1)
    ■ unescape(s) returns a new version of string 's' that is decoded
      ● All %xx sequences are replaced by their character equivalents
      ● You get back the original input
    ■ If cookie values contain any of those symbols, they need to be escaped
    ■ Only escape values, not cookie names
  ○ Create, delete, get cookies
● Cookie-Based Marketing
  ○ How Advertising on the Web Works
    ■ Advertisers use cookies to target you with specific ads; through third-party cookies they track you and match you across cookies set on different sites
    ■ An online advertising network, or ad network, is a company that connects advertisers to websites that want to host advertisements.
      ● The key function of an ad network is to place advertisements on the websites of web publishers who wish to sell advertising space.
    ■ There are four key players involved in an ad network’s delivery of ads to users.
      ● First, there are the advertisers that wish to place the ads.
      ● Second, there are the website owners who wish to make money by selling ad space on their websites.
      ● Third, there is the ad network that signs up advertisers and places their ads on the web pages of website owners.
      ● Fourth, there are the visitors who view the web pages that contain the ads.
    ■ When a visitor requests a web page, the ad network is notified, and it supplies one ad from its inventory to appear on the requested page. The advertiser pays the ad network for placing its ads, and the ad network returns a portion of that fee to the website owner.
  ○ Cookie-Based Marketing
    ■ What is it?
      ● A user-customized online advertising and marketing system that uses cookies and databases to create, maintain, and utilize consumer profiles and monitor their activity
    ■ How does it work?
      ● Ad serving companies make agreements with website owners; website owners agree to send cookies from ad serving companies to their clients
      ● When a user visits another such site, it sends the data placed in the user's cookies to the ad serving company, which retrieves marketing information about the user from its database, enabling it to customize the resulting ad
      ● Result: one person may see ads for sporting goods and another for baby clothes
  ○ DoubleClick ads
    ■ DoubleClick is an ad network, purchased by Google
    ■ How DoubleClick works: when a user invokes a web page, a tag on the page signals DoubleClick's server to delve into its inventory of advertisements to find one that matches the marketer's needs with the user's profile.
  ○ Google Analytics
    ■ Another way websites can track users
    ■ Returns more than just cookies
    ■ Google Analytics is Google’s free web analytics tool that helps website owners understand how their visitors engage with their website.
    ■ Analytics uses its own set of cookies to track visitor interactions. These cookies are used to store information such as the time of the current visit, previous visits, and the referring site.
    ■ A different set of cookies is used for each website, and visitors are not tracked across sites.
    ■ To disable this cookie, you can install the Google Analytics Opt-out Add-on in your browser, which prevents Google Analytics from collecting information about your website visits.
  ○ Google uses Cookies for Conversion Tracking
    ■ Google uses cookies to help businesses that buy ads from Google determine how many people who click their ads end up purchasing their products.
    ■ The conversion tracking cookie is set on your browser only when you click an ad delivered by Google where the advertiser has “opted in” to conversion tracking.
    ■ If you want to disable conversion tracking cookies, you can set your browser to block cookies from the googleadservices.com domain.
  ○ Other types of cookies
    ■ Evercookie is a JavaScript API that produces extremely persistent cookies in a browser
  ○ 6 ways to opt out of cookies
    ■ Select “Do Not Track” in your browser settings
      ● Do Not Track is only a request; in most cases it has no effect because most sites do not honor it
    ■ Download opt-out cookies: this usually involves clicking a button to download the opt-out cookie
    ■ Use the cookie management tools in your web browser
      ● Delete all the cookies in the management tools
      ● In most web browsers, you can set your browser to accept only session cookies, or to turn all cookies into session cookies. Session cookies are generally harmless.
    ■ View current cookies and delete what you don't need
    ■ Check your account preferences on registration sites
    ■ Use browser add-ons: Ghostery will turn off cookies, but some sites will not let you view them if you have a cookie blocker
● Cookies, Privacy & Legislation
● Conclusion