Web Development Book PDF
Summary
This textbook explains the fundamental concepts of web development. It covers the history of the internet, the role of HTML and HTTP, how various processes work, and how to access websites using URLs. This is a good starting point for beginners in web development.
Full Transcript
The Web's Early Days

The World Wide Web began in the early 1990s, when Tim Berners-Lee developed the tools and framework for it. Within a few months, he created:

- the first web browser, called WORLDWIDEWEB (now we can choose between Chrome, Firefox, Edge and other browsers)
- HYPERTEXT MARKUP LANGUAGE (HTML), the language that structures a website
- HYPERTEXT TRANSFER PROTOCOL (HTTP), which handles communication between the computers that provide the website and the browser

SURFING THE WEB

Internet and World Wide Web

The INTERNET is a worldwide system of interconnected computer networks. You can use it to connect, surf and explore websites.

The WORLD WIDE WEB, or web, is a collection of websites that are linked together through the internet. The web uses the internet to let you access these sites.

Imagine the WWW as a giant spiderweb, with websites as the points on the web. Most websites also have links that let you hop from one site to another (the links are the strings of the web).

[Diagram: the websites youtube.com, instagram.com and crazygames.com, connected through the internet]

When you open a website on your computer, you're actually downloading a temporary copy of the HTML FILES that make up that site. HTML stands for "Hypertext Markup Language", the language used to structure and create websites. The "hypertext" part means HTML is designed to link websites together, allowing you to jump from one site to another via links.

Every website is stored on a computer called a HOST. When you open a website, you are requesting the host computer to send you the files needed to display the site.

You: "May I view youtube.com?"
Host: "Sure, here are the files"

A host can be a regular computer, but most of the time, websites are stored on specialized computers called SERVERS. These are equipped with large amounts of storage, fast internet connections, and software designed to efficiently handle web traffic.
This allows them to store and manage all the data required to run websites and deliver content to the user (the CLIENT) quickly.

Have you ever noticed that "http" or "https" is at the beginning of every website address? That's because your computer uses HTTP(S) to download the website.

HTTP: Hypertext Transfer Protocol (HTTP) is used by web browsers to communicate with servers.

HTTP and HTTPS (the secure version of HTTP) are the main protocols used to send websites over the internet, ensuring your browser can request and display them. HYPERTEXT TRANSFER PROTOCOL SECURE (HTTPS) works just like HTTP but adds an extra layer of security by encrypting all the data being transferred.

FOR EXAMPLE, part of HTTP consists of status codes that the server sends to the browser to indicate the status of a request. An example of a status code is "404 Not Found", which tells the browser that the server could not find the page that the browser asked for.

THE JOURNEY OF ACCESSING A WEBSITE

Domains and DNS

Almost every website on the internet has its own DOMAIN, which is the unique name that identifies the website. Examples: facebook.com, ebay.de

A domain makes it easy for users to access websites without needing to remember the server's IP address. Remember, your computer can't send a packet directly to a domain name; it needs the actual numerical IP address to know where to send the data. This is where a DOMAIN NAME SYSTEM (DNS) server comes in. It acts like a translator, converting the domain name you typed into the corresponding IP address of the website.

Computer: "What is the IP of google.com?"
DNS server: "It is 142.251.214.142"

URLs

Every website on the internet has a UNIFORM RESOURCE LOCATOR (URL), or address.
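As a rough illustration, Python's standard urllib.parse module can split an address such as https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=60#somewhere into its components. The helper name split_url is hypothetical, and splitting the host name on dots is a simplifying assumption that only works for hosts shaped like subdomain.name.tld (it would mislabel multi-part extensions such as .co.uk).

```python
from urllib.parse import urlsplit, parse_qsl


def split_url(url: str) -> dict:
    """Split a URL into named parts (hypothetical helper, naive host split)."""
    parts = urlsplit(url)
    # Assumption: the host looks like "subdomain.name.tld".
    labels = (parts.hostname or "").split(".")
    return {
        "protocol": parts.scheme,                      # e.g. "https"
        "subdomain": ".".join(labels[:-2]),            # e.g. "www"
        "root_domain": labels[-2] if len(labels) >= 2 else "",
        "top_level_domain": labels[-1],                # e.g. "com"
        "path": parts.path,                            # e.g. "/watch"
        "parameters": dict(parse_qsl(parts.query)),    # key/value pairs split on "&"
        "anchor": parts.fragment,                      # text after "#"
    }


if __name__ == "__main__":
    example = "https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=60#somewhere"
    for name, value in split_url(example).items():
        print(f"{name}: {value}")
```

The anchor is never sent to the server; the browser keeps it and uses it to scroll within the page it received.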
These are the parts of a URL:

https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=60#somewhere

Protocol (https): indicates the protocol the browser must use to request the resource.
Subdomain (www): additional information added to the beginning of a website's domain name, used to organize the website's content ("blog." or "shop.").
Root domain (youtube): the main part of your website's domain name.
Top-level domain (.com): think of it as a domain extension with different meanings (.com = commercial, .de = German website).
Path (/watch): the path to the resource on the web server.
Parameters (?v=dQw4w9WgXcQ&t=60): extra information provided to the web server, given as a list of key/value pairs separated with the & symbol.
Anchor (#somewhere): a "bookmark" to a part of the resource itself. On an HTML document, for example, the browser will scroll to the point where the anchor is defined.

PUTTING THE PIECES TOGETHER

1. Computer to DNS server: "What is the IP of google.com?"
2. DNS server to computer: "It is 142.251.214.142"
3. Computer to the web server of Google (IP address 142.251.214.142): "Can I view google.com?"
4. Web server to computer: "Sure! Here you go!"

GETTING OUR HANDS DIRTY

CHECKING IF A WEBSITE IS ALIVE

One of the easiest ways to check if a website is online, or "alive," is by using the PING COMMAND in your operating system's terminal. A ping uses ICMP to do this check. When you ping a server, you send an ICMP Echo REQUEST packet to its IP address, asking for a response. The server replies with an ICMP Echo REPLY, confirming it's reachable and providing the round-trip time.

ICMP: Internet Control Message Protocol (ICMP) is used by network devices to diagnose network communication issues.

Command structure: “ping ”

In the output, the address in parentheses is the IP address of the web server.

> ping google.com
PING google.com (172.217.18.14)...
64 bytes from 172.217.18.14: icmp_seq=1 ttl=119 time=15ms
64 bytes from 172.217.18.14: icmp_seq=2 ttl=119 time=17ms

icmp_seq: sequence number
ttl: TTL
time: time it took from your computer to the server and back

Tracing the Path

TRACEROUTE is a command-line network diagnostic tool that tracks the path packets take to a destination, showing each hop and its time. By default, traceroute sends ICMP echo packets on Windows and UDP packets on Linux.

HOP: a computer, router, or any device that comes in between the source and the destination.

Each IP packet that we send on the internet has a field called TTL. TTL (time-to-live) is not measured in seconds, but in hops. It's the maximum number of hops that a packet can travel before it's discarded. If there were no TTL, a packet that cannot reach its destination could flow endlessly from one router to another.

FOR EXAMPLE, if we need to reach the IP address 8.8.8.8 and the default TTL is 30 hops, the packet can travel up to that many hops before it's discarded. Each router along the path reduces the TTL by 1 before forwarding the packet. When the TTL reaches 0, the router discards the packet and informs the sender that the TTL limit was exceeded.

Let's take a look at an example of the process:

Probe 1 (TTL: 1): reaches 2.2.2.2, which replies TTL EXCEEDED.
Probe 2 (TTL: 2): passes 2.2.2.2 (TTL: 1) and reaches 4.4.4.4, which replies TTL EXCEEDED.

Each time the TTL expires, the router sends a "TTL exceeded" message, and traceroute logs the router's IP. Traceroute repeats this with a TTL that is one higher each time, until it either hits the destination host or reaches its maximum hop count (usually 30). In the end, we have a list of all the hops the packet passed through.

We can run a traceroute on Windows by opening a terminal and typing “tracert ”.
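The TTL mechanism described above can be sketched as a small simulation. It sends no real packets: the route is a hypothetical list of addresses (2.2.2.2 and 4.4.4.4 from the example, with 8.8.8.8 assumed as the destination), and the function names send_probe and traceroute are made up for illustration.

```python
def send_probe(path, ttl):
    """Simulate one packet sent along `path` (destination is the last entry).

    Returns the address that answered and the kind of answer.
    """
    for index, address in enumerate(path):
        if index == len(path) - 1:
            return address, "reply"          # the destination itself answers
        ttl -= 1                             # each router decrements the TTL
        if ttl <= 0:
            return address, "TTL exceeded"   # router discards and reports back


def traceroute(path, max_hops=30):
    """Collect the hops by sending probes with TTL 1, 2, 3, ..."""
    hops = []
    for ttl in range(1, max_hops + 1):
        address, status = send_probe(path, ttl)
        hops.append(address)
        if status == "reply":                # destination reached, stop probing
            break
    return hops


if __name__ == "__main__":
    route = ["2.2.2.2", "4.4.4.4", "8.8.8.8"]  # hypothetical route
    for number, hop in enumerate(traceroute(route), start=1):
        print(number, hop)
```

A real traceroute also measures the time of each probe and usually sends several probes per TTL value; both are left out here to keep the sketch short.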
We can use Google's public DNS server 8.8.8.8 as the destination:

> tracert 8.8.8.8
Tracing route to dns.google [8.8.8.8]
over a maximum of 30 hops:
  1

> nslookup -type=A google.com ns1.google.com
Server: router.local
Address: 192.168.1.1
Name: google.com
Address: 172.217.16.174

No more "Non-authoritative answer"

Did you know we can also find a domain name by its IP address? This process is called REVERSE DNS LOOKUP. A reverse DNS lookup queries DNS servers for a PTR (pointer) record. If the server does not have a PTR record, it cannot resolve a reverse lookup. PTR records store IP addresses with their segments reversed and with ".in-addr.arpa" appended. For example, if a domain has an IP address of 19.102.0.24, the PTR record will store it as "24.0.102.19.in-addr.arpa".

> nslookup -type=PTR 8.8.8.8
Server: router.local
Address: 192.168.1.1
Non-authoritative answer:
8.8.8.8.in-addr.arpa name = dns.google

A CLOSER LOOK AT RESOLVING DOMAINS

When you want to visit a website, a "DNS lookup" process for the website's domain is initiated:

1. If the requested domain name is not present in the browser's cache, the browser makes a call to your operating system (OS) and asks whether it has the address in its cache.
2. If the address is not in the OS cache either, the request goes to the RESOLVING NAME SERVER, usually run by your ISP, which checks its own cache for the requested address.
3. If the resolving name server doesn't find it in its cache, it asks the ROOT NAME SERVER for the location of the top-level domain (TLD) server, such as the one for .com or .org.
4. The TOP-LEVEL DOMAIN SERVER responds with the IP address of the AUTHORITATIVE NAME SERVER for the domain.
5. The resolving name server asks the authoritative name server for the IP address of the requested domain. Once that IP address is available, the browser sends a request to the server at that address to retrieve the webpage.

[Diagram: Client, Resolving Name Server (ISP), Root Name Server, Top-Level Domain Server, Authoritative Name Server]
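The lookup order above can be sketched as a toy simulation. Nothing here touches the network: the three server tables and their names ("tld-com", "ns-google") are hypothetical stand-ins, the caches are plain dictionaries, and the IP address is the one used for google.com earlier in this text.

```python
# Hypothetical stand-ins for the real name servers.
ROOT_SERVER = {"com": "tld-com"}                        # TLD -> TLD server
TLD_SERVERS = {"tld-com": {"google.com": "ns-google"}}  # domain -> authoritative server
AUTHORITATIVE_SERVERS = {"ns-google": {"google.com": "142.251.214.142"}}


def resolve(domain, browser_cache, os_cache, resolver_cache):
    """Return (ip, source) following the cache-first lookup order."""
    caches = (
        ("browser cache", browser_cache),
        ("OS cache", os_cache),
        ("resolver cache", resolver_cache),
    )
    # Steps 1-2: check each cache in turn before asking any name server.
    for source, cache in caches:
        if domain in cache:
            return cache[domain], source

    # Step 3: the root name server points to the TLD server.
    tld = domain.rsplit(".", 1)[-1]
    tld_server = ROOT_SERVER[tld]
    # Step 4: the TLD server points to the authoritative name server.
    authoritative = TLD_SERVERS[tld_server][domain]
    # Step 5: the authoritative name server knows the IP address.
    ip = AUTHORITATIVE_SERVERS[authoritative][domain]

    # Remember the answer so the next lookup is served from a cache.
    for _, cache in caches:
        cache[domain] = ip
    return ip, "authoritative name server"


if __name__ == "__main__":
    browser, operating_system, resolver = {}, {}, {}
    print(resolve("google.com", browser, operating_system, resolver))
    print(resolve("google.com", browser, operating_system, resolver))
```

The second call is answered from the browser cache, which is why repeat visits to a site usually skip the full lookup. Real caches also expire entries after a while, which this sketch ignores.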