OS U4 (1).pdf
Document Details
Uploaded by PunctualClover
Tags
Full Transcript
OPERATING SYSTEMS Storage Management Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS File System Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS Slides Credits for all the PPTs of this course The slides/diagrams in this course are an adaptation, combina...
OPERATING SYSTEMS Storage Management Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS File System Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS Slides Credits for all the PPTs of this course The slides/diagrams in this course are an adaptation, combination, and enhancement of material from the following resources and persons: 1. Slides of Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne - 9th edition 2013 and some slides from 10th edition 2018 2. Some conceptual text and diagram from Operating Systems - Internals and Design Principles, William Stallings, 9th edition 2018 3. Some presentation transcripts from A. Frank – P. Weisberg 4. Some conceptual text from Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau, Andrea Arpaci Dusseau OPERATING SYSTEMS File Concept Computers can store information on various storage media, such as magnetic disks, magnetic tapes, and optical disks. File is logical storage where as disks are physical storage unit What is a File? collection of related information Data cannot to be written to disk unless they are in files Data can be Numeric, alphabetic, alphanumeric, binary A file is a sequence of bits, bytes, lines, or records, the meaning of which is defined by the file’s creator and user Contents of file can be source or executable programs, numeric or text data, photos, music, video, and so on. OPERATING SYSTEMS File Attributes Files are named and are referred by their names by the human users. Attributes of files Name – only information kept in human-readable form Identifier – unique tag (number) identifies file within file system Type – needed for systems that support different types Location – pointer to file location on device Size – current file size Protection – controls who can do reading, writing, executing Time, date, and user identification – data for protection, security, and usage monitoring Information about files are kept in the directory structure, which is maintained on the disk. OPERATING SYSTEMS File Attributes o The information about all files is kept in the directory structure, which also resides on secondary storage. o Directory entry consists of the file’s name and its uniqueand its unique identifier. o Identifier in turn locates the file attributes. File: g.c Size: 218 Blocks: 0 IO Block: 4096 regular file Device: 2h/2d Inode: 3377699720577240 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root) Access: 2020-09-15 15:18:57.338585500 +0530 Modify: 2021-03-20 10:40:37.416576700 +0530 Change: 2021-03-20 10:40:37.416576700 +0530 OPERATING SYSTEMS File Operations File is an abstract data type. File Operations Create Write – Use the system call to write to a file. A write pointer points to the location in the file, where the next write is to take place. Read – – Use the system call to write to a file. A read pointer points to the location in the file from where the next read is to take place Reposition within file - Repositioning within a file need not involve any actual I/O. It is done with system call lseek. Repositioning is also called seek to location in a file Delete: Use the system call to delete/remove a file. File to be deleted is searched in the directory. Release the memory allocated after deletion OPERATING SYSTEMS File Operations Truncate: The user may want to erase the contents of a file but keep its attributes. Instead of deleting the file and then recreating it, use the function that sets the size of file to zero. Appending to the end of a existing file Open a file – use a system call open() Open-file table: tracks open files File pointer: pointer to last read/write location, per process that has the file open File-open count: counter of number of times a file is open – to allow removal of data from open-file table when last processes closes it Close: When a file is actively not used close it. OPERATING SYSTEMS File Operations Several pieces of information are associated with an open file. File pointer File-open cou nt. Disk location of th e file. Access rights OPERATING SYSTEMS Open Files Provided by some operating systems and file systems Similar to reader-writer locks Shared lock similar to reader lock – several processes can acquire concurrently Exclusive lock similar to writer lock OS mediates access to a file Mandatory or advisory: Mandatory – access is denied depending on locks held and requested Advisory – processes can find status of locks and decide what to do OPERATING SYSTEMS File Types – Name, Extension OPERATING SYSTEMS File Structure None - sequence of words, bytes Simple record structure Lines Fixed length Variable length Complex Structures Formatted document Relocatable load file Can simulate last two with first method by inserting appropriate control characters Who decides: Operating system Program OPERATING SYSTEMS Access Methods OPERATING SYSTEMS Sequential-access File Sequential Access read next write next reset Direct Access (or relative access) – file is fixed length logical records read n write n position to n read next write next n = relative block number (i.e. index relative to the beginning of the file) Relative block numbers allow OS to decide where file should be placed OPERATING SYSTEMS Simulation of Sequential Access on Direct-access File cp = current position Some systems allow only sequential file access; others allow only direct access OPERATING SYSTEMS Other Access Methods Can be built on top of base methods General involve creation of an index for the file Keep index in memory for fast determination of location of data to be operated on (consider UPC code plus record of data about that item with associated prices) If index file becomes too large, create index for the index file Primary index file contains pointers to the secondary index files, which point to the actual data items IBM indexed sequential-access method (ISAM) Small master index, points to disk blocks of secondary index (which points to the actual file blocks) File kept sorted on a defined key Binary search of the master index provides the block number of the secondary index and another binary search to find the block containing the desired record THANK YOU Suresh Jamadagni Department of Computer Science Engineering [email protected] OPERATING SYSTEMS File Management Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS File System Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS Slides Credits for all the PPTs of this course The slides/diagrams in this course are an adaptation, combination, and enhancement of material from the following resources and persons: 1. Slides of Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne - 9th edition 2013 and some slides from 10th edition 2018 2. Some conceptual text and diagram from Operating Systems - Internals and Design Principles, William Stallings, 9th edition 2018 3. Some presentation transcripts from A. Frank – P. Weisberg 4. Some conceptual text from Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau, Andrea Arpaci Dusseau OPERATING SYSTEMS Directory Structure n A collection of nodes containing information about all files Directory Files F2 F4 F1 F3 Fn Both the directory structure and the files reside on disk OPERATING SYSTEMS A Typical File-system Organization OPERATING SYSTEMS Operations Performed on Directory n Search for a file n Create a file n Delete a file n List a directory n Rename a file n Traverse the file system OPERATING SYSTEMS Single-Level Directory n A single directory for all users OPERATING SYSTEMS Two-Level Directory n Separate directory for each user OPERATING SYSTEMS Tree-Structured Directories OPERATING SYSTEMS Tree-Structured Directories (Cont.) n Absolute or relative path name n Creating a new file is done in current directory n Delete a file rm n Creating a new subdirectory is done in current directory mkdir q Example: if in current directory /mail mkdir count Deleting “mail” deleting the entire subtree rooted by “mail” OPERATING SYSTEMS Acyclic-Graph Directories OPERATING SYSTEMS Acyclic-Graph Directories (Cont.) n Two different names (aliasing) n If dict deletes list dangling pointer Solutions: l Backpointers, so we can delete all pointers Variable size records a problem l Backpointers using a daisy chain organization l Entry-hold-count solution n New directory entry type l Link – another name (pointer) to an existing file l Resolve the link – follow pointer to locate the file OPERATING SYSTEMS General Graph Directory Advantages: It allows cycles within a dir structure. Multiple directories can be derived from more than one parent dir. Disadvantages: It is more costly than others. It needs garbage collection (traversing the entire file system, marking everything that can be accessed) OPERATING SYSTEMS General Graph Directory (Contd.) n How do we guarantee no cycles? l Allow only links to file not subdirectories l Garbage collection l Every time a new link is added use a cycle detection algorithm to determine whether it is OK THANK YOU Suresh Jamadagni Department of Computer Science Engineering [email protected] OPERATING SYSTEMS File Management Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS File System – File-System Mounting, File Sharing and File Protection Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS Slides Credits for all the PPTs of this course The slides/diagrams in this course are an adaptation, combination, and enhancement of material from the following resources and persons: 1. Slides of Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne - 9th edition 2013 and some slides from 10th edition 2018 2. Some conceptual text and diagram from Operating Systems - Internals and Design Principles, William Stallings, 9th edition 2018 3. Some presentation transcripts from A. Frank – P. Weisberg 4. Some conceptual text from Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau, Andrea Arpaci Dusseau OPERATING SYSTEMS File System Mounting A file system must be mounted before it can be accessed Fig (a) shows Existing File System A unmounted file system (i.e., Fig. (b)) is mounted at a mount point Fig (c) shows the effect of mounting (c) OPERATING SYSTEMS File Sharing Sharing of files on multi-user systems is desirable Sharing may be done through a protection scheme On distributed systems, files may be shared across a network Network File System (NFS) is a common distributed file-sharing method If multi-user system User IDs identify users, allowing permissions and protections to be per-user Group IDs allow users to be in groups, permitting group access rights Owner of a file / directory Group of a file / directory OPERATING SYSTEMS File Sharing – Remote File Systems Uses networking to allow file system access between systems Manually via programs like FTP Automatically, seamlessly using distributed file systems Semi automatically via the world wide web OPERATING SYSTEMS File Sharing – Remote File Systems (Cont.) OPERATING SYSTEMS Protection File owner/creator should be able to control: what can be done by whom Types of access Read Write Execute Append Delete List OPERATING SYSTEMS Access Lists and Groups Mode of access: read, write, execute Three classes of users on Unix / Linux RWX a) owner access 7 111 RWX b) group access 6 110 RWX c) public access 1 001 Attach a group to a file chgrp G game OPERATING SYSTEMS Windows Access-Control List Management THANK YOU Suresh Jamadagni Department of Computer Science and Engineering [email protected] OPERATING SYSTEMS File Management Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS Files and Directories – System calls Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS Slides Credits for all the PPTs of this course The slides/diagrams in this course are an adaptation, combination, and enhancement of material from the following resources and persons: 1. Slides of Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne - 9th edition 2013 and some slides from 10th edition 2018 2. Some conceptual text and diagram from Operating Systems - Internals and Design Principles, William Stallings, 9th edition 2018 3. Some presentation transcripts from A. Frank – P. Weisberg 4. Some conceptual text from Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau, Andrea Arpaci Dusseau OPERATING SYSTEMS Properties of a File stat function returns a structure of information about the named file The fstat function obtains information about the file that is already open on the descriptor fd. The lstat function returns information about the symbolic link, not the file referenced by the symbolic link The fstatat function returns the file statistics for a pathname relative to an open directory represented by the fd argument. OPERATING SYSTEMS Structure that represents properties of a File The timespec structure type defines time in terms of seconds and nanoseconds. It includes at least the following fields: time_t tv_sec; long tv_nsec; OPERATING SYSTEMS File type macros File type macros in OPERATING SYSTEMS File operations - creat, open and close A new file can be created by calling the creat function. creat function is equivalent to open (path, O_WRONLY | O_CREAT | O_TRUNC, mode); A file is opened or created by calling either the open function or the openat function An open file is closed by calling the close function OPERATING SYSTEMS File operations - lseek Every open file has an associated ‘‘current file offset,’’ normally a non-negative integer that measures the number of bytes from the beginning of the file Read and write operations normally start at the current file offset and cause the offset to be incremented by the number of bytes read or written. By default, this offset is initialized to 0 when a file is opened, unless the O_APPEND option is specified. If whence is SEEK_SET, the file’s offset is set to offset bytes from the beginning of the file. If whence is SEEK_CUR, the file’s offset is set to its current value plus the offset. The offset can be positive or negative. If whence is SEEK_END, the file’s offset is set to the size of the file plus the offset. The offset can be positive or negative OPERATING SYSTEMS File operations - read Data is read from an open file with the read function If the read is successful, the number of bytes read is returned. If the end of file is encountered, 0 is returned. The read operation starts at the file’s current offset. Before a successful return, the offset is incremented by the number of bytes actually read. OPERATING SYSTEMS File operations - write Data is written to an open file with the write function. The return value is usually equal to the nbytes argument; otherwise, an error has occurred A common cause for a write error is either filling up a disk or exceeding the file size limit for a given process For a regular file, the write operation starts at the file’s current offset. If the O_APPEND option was specified when the file was opened, the file’s offset is set to the current end of file before each write operation. After a successful write, the file’s offset is incremented by the number of bytes actually written OPERATING SYSTEMS Directories Directories are created with the mkdir and mkdirat functions, and deleted with the rmdir function. The specified file access permissions, mode, are modified by the file mode creation mask of the process OPERATING SYSTEMS Reading Directories Directories can be read by anyone who has access permission to read the directory. But only the kernel can write to a directory, to preserve file system sanity fdopendir provides a way to convert an open file descriptor into a DIR structure for use by the directory handling functions. The dirent structure defined in is implementation dependent. OPERATING SYSTEMS Hard links link function or the linkat function can be used to create a link to an existing file. These functions create a new directory entry, newpath, that references the existing file existingpath. If the newpath already exists, an error is returned. Only the last component of the newpath is created. The rest of the path must already exist Use the unlink function to remove an existing link. If there are other links to the file, the data in the file is still accessible through the other links OPERATING SYSTEMS Symbolic links A symbolic link is created with either the symlink or symlinkat function A new directory entry, sympath, is created that points to actualpath. It is not required that actualpath exist when the symbolic link is created The symlinkat function is similar to symlink, but the sympath argument is evaluated relative to the directory referenced by the open file descriptor for that directory (specified by the fd argument) THANK YOU Suresh Jamadagni Department of Computer Science and Engineering [email protected] OPERATING SYSTEMS File Management Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS Implementing File-Systems Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS Slides Credits for all the PPTs of this course The slides/diagrams in this course are an adaptation, combination, and enhancement of material from the following resources and persons: 1. Slides of Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne - 9th edition 2013 and some slides from 10th edition 2018 2. Some conceptual text and diagram from Operating Systems - Internals and Design Principles, William Stallings, 9th edition 2018 3. Some presentation transcripts from A. Frank – P. Weisberg 4. Some conceptual text from Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau, Andrea Arpaci Dusseau OPERATING SYSTEMS Layered File System Manages metadata information, and the dir structure to provide the file-org module. Also responsible for protection Translates logical to physical block addresses Issues generic commands to the device driver, manages mem buffers Device drivers and interrupt handlers OPERATING SYSTEMS File-System Implementation On-disk structures o Boot Control Block o Volume Control Block o Directory Structure o File Control Block In-Memory structures o In memory mount table o In memory Directory Structure o System wide Open File Table o Per process Open File Table OPERATING SYSTEMS File-System Implementation OPERATING SYSTEMS In-Memory File System Structures OPERATING SYSTEMS Partitions and Mounting Partition can be a volume containing a file system (“cooked”) or raw – just a sequence of blocks with no file system Boot block can point to boot volume or boot loader set of blocks that contain enough code to know how to load the kernel from the file system Or a boot management program for multi-os booting Root partition contains the OS, other partitions can hold other OS’s, other file systems, or be raw Mounted at boot time Other partitions can mount automatically or manually At mount time, file system consistency checked Is all metadata correct? If not, fix it, try again If yes, add to mount table, allow access OPERATING SYSTEMS Virtual File Systems Virtual File Systems (VFS) on Unix provide an object-oriented way of implementing file systems VFS allows the same system call interface (the API) to be used for different types of file systems Separates file-system generic operations from implementation details Implementation can be one of many file systems types, or network file system 4 Implements vnodes which hold inodes or network file details Then dispatches operation to appropriate file system implementation routines OPERATING SYSTEMS Virtual File Systems The API is to the VFS interface, rather than any specific type of file system (FS) VFS provides a single set of commands for the kernel and developers to access all types of FSs VFS software calls the specific device driver required to interface to various types of FSs Device driver interprets the standard set of FS commands to a specific type of FS on the partition or logical volume OPERATING SYSTEMS Virtual File Systems For example, Linux has four object types: Inode object, file object, superblock object, dentry object VFS defines set of operations on the objects that must be implemented Every object has a pointer to a function table Function table has addresses of routines to implement that function on that object For example: int open(...)—Open a file int close(...)—Close an already-open file ssize t read(...)—Read from a file ssize t write(...)—Write to a file int mmap(...)—Memory-map a file THANK YOU Suresh Jamadagni Department of Computer Science and Engineering [email protected] OPERATING SYSTEMS File Management Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS Implementing File-Systems – Disk Space Allocation Methods Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS Slides Credits for all the PPTs of this course The slides/diagrams in this course are an adaptation, combination, and enhancement of material from the following resources and persons: 1. Slides of Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne - 9th edition 2013 and some slides from 10th edition 2018 2. Some conceptual text and diagram from Operating Systems - Internals and Design Principles, William Stallings, 9th edition 2018 3. Some presentation transcripts from A. Frank – P. Weisberg 4. Some conceptual text from Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau, Andrea Arpaci Dusseau OPERATING SYSTEMS Allocation Methods - Contiguous n An allocation method refers to how disk blocks are allocated for files: n Contiguous allocation – each file occupies set of contiguous blocks l Best performance in most cases l Simple – only starting location (block #) and length (number of blocks) are required l Supports both sequential and direct access l Problems include finding space for file, knowing file size, external fragmentation, need for compaction off-line (downtime) or on-line OPERATING SYSTEMS Contiguous Allocation OPERATING SYSTEMS Linked Allocation OPERATING SYSTEMS Example of Indexed Allocation OPERATING SYSTEMS Example of Indexed Allocation In this scheme, a special block known as the Index block contains the pointers to all the blocks occupied by a file. Each file has its own index block. The ith entry in the index block contains the disk address of the ith file block. The directory entry contains the address of the index block as shown in the figure. OPERATING SYSTEMS Allocation Methods – Indexed (Cont.) Advantages: Ø This supports direct access to the blocks occupied by the file and therefore provides fast access to the file blocks. Ø It overcomes the problem of external fragmentation. Disadvantages: Ø The pointer overhead for indexed allocation is greater than linked allocation. Ø For very small files, say files that expand only 2-3 blocks, the indexed allocation would keep one entire block (index block) for the pointers which is inefficient in terms of memory utilization. (Note: In linked allocation we lose the space of only 1 pointer per block.) OPERATING SYSTEMS Indexed Allocation – Mapping (Cont.) Ø For files that are very large, single index block may not be able to hold all the pointers. Ø Other Schemes such as Linked scheme, Multilevel index and Combined Scheme are used. OPERATING SYSTEMS Combined Scheme: UNIX UFS 4K bytes per block, 32-bit addresses More index blocks than can be addressed with 32-bit file pointer OPERATING SYSTEMS Performance n Best method depends on file access type l Contiguous great for sequential and random n Linked good for sequential, not random n Declare access type at creation -> select either contiguous or linked n Indexed more complex l Single block access could require 2 index block reads then data block read l Clustering can help improve throughput, reduce CPU overhead THANK YOU Suresh Jamadagni Department of Computer Science Engineering [email protected] OPERATING SYSTEMS File Management Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS File Management – Free space management Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS Slides Credits for all the PPTs of this course The slides/diagrams in this course are an adaptation, combination, and enhancement of material from the following resources and persons: 1. Slides of Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne - 9th edition 2013 and some slides from 10th edition 2018 2. Some conceptual text and diagram from Operating Systems - Internals and Design Principles, William Stallings, 9th edition 2018 3. Some presentation transcripts from A. Frank – P. Weisberg 4. Some conceptual text from Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau, Andrea Arpaci Dusseau OPERATING SYSTEMS Free-Space Management Need to reuse the disk space from deleted files for new files To keep track of free disk space, the system maintains a free-space list The free-space list records all free disk blocks When a file is created, the free-space list is searched for the required amount of space and allocate that space to the new file. This space is then removed from the free-space list. When a file is deleted, its disk space is added to the free-space list OPERATING SYSTEMS Bit Vector The free-space list is implemented as a bit map or bit vector. Each block is represented by 1 bit. If the block is free, the bit is 1; if the block is allocated, the bit is 0 For example, consider a disk where blocks 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, and 27 are free and the rest of the blocks are allocated. The free-space bit map would be 001111001111110001100000011100000... The main advantage of this approach is its relative simplicity and its efficiency in finding the first free block or n consecutive free blocks on the disk OPERATING SYSTEMS Linked List OPERATING SYSTEMS Linked List Link together all the free disk blocks, keeping a pointer to the first free block in a special location on the disk and caching it in memory. The first free block contains a pointer to the next free disk block, and so on Example: Consider blocks 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, and 27 are free and the rest of the blocks are allocated. The pointer would point to block 2 as the first free block. Block 2 would contain a pointer to block 3, which would point to block 4 and so on This scheme is not efficient; to traverse the list, one must read each block, which requires substantial I/O time. OPERATING SYSTEMS Grouping A modification of the free-list approach stores the addresses of n free blocks in the first free block. The first n−1 of these blocks are actually free. The last block contains the addresses of another n free blocks, and so on. The addresses of a large number of free blocks can now be found quickly, unlike the situation when the standard linked-list approach is used. OPERATING SYSTEMS Counting Generally, several contiguous blocks may be allocated or freed simultaneously, particularly when space is allocated with the contiguous-allocation algorithm or through clustering. Rather than keeping a list of n free disk addresses, we can keep the address of the first free block and the number (n) of free contiguous blocks that follow the first block Each entry in the free-space list then consists of a disk address and a count. The entries can be stored in a balanced tree, rather than a linked list, for efficient lookup, insertion, and deletion. THANK YOU Suresh Jamadagni Department of Computer Science Engineering [email protected] OPERATING SYSTEMS File Management Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS File Management – Efficiency and Performance Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS Slides Credits for all the PPTs of this course The slides/diagrams in this course are an adaptation, combination, and enhancement of material from the following resources and persons: 1. Slides of Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne - 9th edition 2013 and some slides from 10th edition 2018 2. Some conceptual text and diagram from Operating Systems - Internals and Design Principles, William Stallings, 9th edition 2018 3. Some presentation transcripts from A. Frank – P. Weisberg 4. Some conceptual text from Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau, Andrea Arpaci Dusseau OPERATING SYSTEMS Efficiency Efficiency dependent on: Disk allocation and directory algorithms Types of data kept in file’s directory entry Pre-allocation or as-needed allocation of metadata structures Fixed-size or varying-size data structures OPERATING SYSTEMS Performance Performance dependent on: Keeping data and metadata close together Buffer cache – separate section of main memory for frequently used blocks Synchronous writes sometimes requested by apps or needed by OS No buffering / caching – writes must hit disk before acknowledgement Asynchronous writes more common, buffer-able, faster Free-behind and read-ahead – techniques to optimize sequential access Reads frequently slower than writes OPERATING SYSTEMS Page cache Some systems maintain a separate section of main memory for a buffer cache, where blocks are kept under the assumption that they will be used again shortly. Other systems cache file data using a page cache. The page cache uses virtual memory techniques to cache file data as pages rather than as file-system-oriented blocks. Caching file data using virtual addresses is far more efficient than caching through physical disk blocks, as accesses interface with virtual memory rather than the file system. Several systems—including Solaris, Linux, and Windows —use page caching to cache both process pages and file data. This is known as unified virtual memory. OPERATING SYSTEMS I/O without a unified buffer cache OPERATING SYSTEMS I/O using a unified buffer cache OPERATING SYSTEMS Synchronous and Asynchronous writes Another issue that can affect the performance of I/O is whether writes to the file system occur synchronously or asynchronously. Synchronous writes occur in the order in which the disk subsystem receives them, and the writes are not buffered. Thus, the calling routine must wait for the data to reach the disk drive before it can proceed. In an asynchronous write, the data are stored in the cache, and control returns to the caller. Most writes are asynchronous, metadata writes can be synchronous OPERATING SYSTEMS Page replacement Sequential access can be optimized by techniques known as free-behind and read-ahead. Free-behind removes a page from the buffer as soon as the next page is requested. The previous pages are not likely to be used again and waste buffer space. With read-ahead, a requested page and several subsequent pages are read and cached. These pages are likely to be requested after the current page is processed. Retrieving these data from the disk in one transfer and caching them saves a considerable amount of time OPERATING SYSTEMS Disk cache When data is written to a disk file, the pages are buffered in the disk cache, and the disk driver sorts its output queue according to disk address to minimize disk-head seeks Unless synchronous writes are required, a process writing to disk simply writes into the cache, and the system asynchronously writes the data to disk when convenient The user process sees very fast writes. When data are read from a disk file, the block I/O system does some read-ahead Writes are much more nearly asynchronous than are reads. Thus, output to the disk through the file system is often faster than is input for large transfers, counter to intuition THANK YOU Suresh Jamadagni Department of Computer Science Engineering [email protected] OPERATING SYSTEMS Storage Management Chandravva Hebbi Department of Computer Science OPERATING SYSTEMS Mass-Storage Structure Chandravva Hebbi Department of Computer Science OPERATING SYSTEMS Slides Credits for all PPTs of this course The slides/diagrams in this course are an adaptation, combination, and enhancement of material from the following resources and persons: 1. Slides of Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne - 9th edition 2013 and some slides from 10th edition 2018 2. Some conceptual text and diagram from Operating Systems - Internals and Design Principles, William Stallings, 9th edition 2018 3. Some presentation transcripts from A. Frank – P. Weisberg 4. Some conceptual text from Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau, Andrea Arpaci Dusseau OPERATING SYSTEMS Overview of Mass Storage Structure Magnetic disks provide bulk of secondary storage of modern computers. Each disk platter has a flat circular shape, like a CD. Common platter diameters range from 1.8 to 3.5 inches. The two surfaces of a platter are covered with a magnetic material. The heads are attached to a disk arm that moves all the heads as a unit. The surface of a platter is logically divided into circular tracks, which are subdivided into sectors. The set of tracks that are at one arm position makes up a cylinder. OPERATING SYSTEMS Moving-head Disk Mechanism OPERATING SYSTEMS Moving-head Disk Mechanism OPERATING SYSTEMS Overview of Mass Storage Structure When the disk is in use, a drive motor spins it at high speed. Most drives rotate 60 to 250 times per second. Disk speed has two parts. The transfer rate is the rate at which data flow between the drive and the computer. The positioning time, or random-access time, consists of two parts The time necessary to move the disk arm to the desired cylinder, called the seek time. the time necessary for the desired sector to rotate to the disk head, called the rotational latency. Disk Latency = Seek time + Rotational latency + Transfer time OPERATING SYSTEMS Magnetic Disk Specifications – Reference: www.seagate.com OPERATING SYSTEMS Overview of Mass Storage Structure Disk head flies on an extremely thin cushion of air. head will sometimes damage the magnetic surface when it makes contact with it. Head crash Not recoverable A disk can be removable. Removable magnetic disks generally consist of one platte Examples CDs, DVDs, and Blu-ray discs as well as removable flash-memory OPERATING SYSTEMS Mass Storage Structure (Cont.) Drive attached to computer via I/O bus Busses vary, including EIDE, ATA, SATA, USB, Fiber Channel, SCSI, SAS, Firewire § SCSI is a set of parallel interface standards developed by ANSI § allows more hard disks per computer compared to IDE § SCSI is harder to configure compared to SATA and IDE § EIDE has a 133 megabytes per second speed rate, while SATA has up to a 150 megabytes per second speed rate. Host controller in computer uses bus to talk to disk controller built into drive or storage array OPERATING SYSTEMS Solid-State Disks Nonvolatile memory used like a hard drive Many technology variations Can be more reliable than HDDs More expensive per MB Maybe have shorter life span Less capacity But much faster Standard Bus interfaces can be too slow -> connect directly to the system bus (PCI, for example) No moving parts, so no seek time or rotational latency OPERATING SYSTEMS Magnetic Tape Was early secondary-storage medium Evolved from open spools to cartridges Relatively permanent and holds large quantities of data Access time slow Random access ~1000 times slower than disk Mainly used for backup, storage of infrequently-used data, transfer medium between systems OPERATING SYSTEMS Magnetic Tape (Cont.) Kept in spool and wound or rewound past read-write head Moving to the correct spot on a tape can take minutes, but once positioned, tape drives can write data at speeds comparable to disk drives. Tape capacities vary greatly, depending on the particular kind of tape drive, with current capacities exceeding several terabytes. Some tapes have built-in compression that can more than double the effective storage. Tapes and their drivers are usually categorized by width, including 4, 8, and 19 millimeters and 1/4 and 1/2 inch. Some are named according to technology, such as LTO-5 and SDLT. THANK YOU Chandravva Hebbi Department of Computer Science Engineering [email protected] OPERATING SYSTEMS Storage Management Chandravva Hebbi Department of Computer Science OPERATING SYSTEMS Mass-Storage Structure – Disk Scheduling FCFS, SSTF, SCAN, C-SCAN, LOOK Chandravva Hebbi Department of Computer Science OPERATING SYSTEMS Slides Credits for all PPTs of this course The slides/diagrams in this course are an adaptation, combination, and enhancement of material from the following resources and persons: 1. Slides of Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne - 9th edition 2013 and some slides from 10th edition 2018 2. Some conceptual text and diagram from Operating Systems - Internals and Design Principles, William Stallings, 9th edition 2018 3. Some presentation transcripts from A. Frank – P. Weisberg 4. Some conceptual text from Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau, Andrea Arpaci Dusseau OPERATING SYSTEMS Disk Scheduling n The operating system is responsible for using hardware efficiently — for the disk drives, this means having a fast access time and disk bandwidth n Minimize seek time n Seek time seek distance n Disk bandwidth is the total number of bytes transferred, divided by the total time between the first request for service and the completion of the last transfer OPERATING SYSTEMS Disk Scheduling (Cont.) n There are many sources of disk I/O request l OS l System processes l Users processes n I/O request includes input or output mode, disk address, memory address, number of sectors to transfer n OS maintains queue of requests, per disk or device n Idle disk can immediately work on I/O request, busy disk means work must queue l Optimization algorithms only make sense when a queue exists OPERATING SYSTEMS Disk Scheduling (Cont.) n Note that drive controllers have small buffers and can manage a queue of I/O requests (of varying “depth”) n Several algorithms exist to schedule the servicing of disk I/O requests n The analysis is true for one or many platters n We illustrate scheduling algorithms with a request queue (0-199) 98, 183, 37, 122, 14, 124, 65, 67 Head pointer 53 OPERATING SYSTEMS FCFS Illustration shows total head movement of 640 cylinders OPERATING SYSTEMS FCFS Illustration shows total head movement of 640 cylinders 45+85+146+85+108+110+59 +2 = 640 cylinders OPERATING SYSTEMS SSTF n Shortest Seek Time First selects the request with the minimum seek time from the current head position n SSTF scheduling is a form of SJF scheduling; may cause starvation of some requests n Illustration shows total head movement of 236 cylinders OPERATING SYSTEMS SSTF n Shortest Seek Time First selects the request with the minimum seek time from the current head position n SSTF scheduling is a form of SJF scheduling; may cause starvation of some requests n Illustration shows total head movement of 236 cylinders 12+2+30+23+84+24+2+59 = 236 cylinders OPERATING SYSTEMS SCAN n The disk arm starts at one end of the disk, and moves toward the other end, servicing requests until it gets to the other end of the disk, where the head movement is reversed and servicing continues. n SCAN algorithm Sometimes called the elevator algorithm n But note that if requests are uniformly dense, largest density at other end of disk and those wait the longest OPERATING SYSTEMS SCAN (Cont.) OPERATING SYSTEMS SCAN (Cont.) Number of cylinder moves= 16+23+79+2+31+24+2+59=236 OPERATING SYSTEMS C-SCAN n Provides a more uniform wait time than SCAN n The head moves from one end of the disk to the other, servicing requests as it goes l When it reaches the other end, however, it immediately returns to the beginning of the disk, without servicing any requests on the return trip n Treats the cylinders as a circular list that wraps around from the last cylinder to the first one n Total number of cylinders? OPERATING SYSTEMS C-SCAN (Cont.) OPERATING SYSTEMS C-SCAN (Cont.) = (65-53)+(67-65)+(98-67)+(122-98)+(124-122)+(183-124)+(199-183)+(199-0)+(14-0)+(37-14) =12+2+31+24+2+59+16+199+14+23 = 382 OPERATING SYSTEMS LOOK & C-LOOK n LOOK a version of SCAN, C-LOOK a version of C-SCAN n Arm only goes as far as the last request in each direction, then reverses direction immediately, without first going all the way to the end of the disk n Total number of cylinders? OPERATING SYSTEMS C-LOOK (Cont.) Total head movements incurred while servicing these requests = (65 – 53) + (67 – 65) + (98 – 67) + (122 – 98) + (124 – 122) + (183 – 124) + (183 – 14) + (37 – 14) = 12 + 2 + 31 + 24 + 2 + 59 + 169 + 23 = 322 OPERATING SYSTEMS C-LOOK (Cont.) OPERATING SYSTEMS Selecting a Disk-Scheduling Algorithm n SSTF is common and has a natural appeal n SCAN and C-SCAN perform better for systems that place a heavy load on the disk l Less starvation n Performance depends on the number and types of requests n Requests for disk service can be influenced by the file-allocation method l And metadata layout n The disk-scheduling algorithm should be written as a separate module of the operating system, allowing it to be replaced with a different algorithm if necessary n Either SSTF or LOOK is a reasonable choice for the default algorithm OPERATING SYSTEMS Selecting a Disk-Scheduling Algorithm (Cont.) n What about rotational latency? l Difficult for OS to calculate n The rotational latency can be nearly as large as the average seek time. n It is difficult for the operating system to schedule for improved rotational latency, though, because modern disks do not disclose the physical location of logical blocks. n Disk manufacturers have been alleviating this problem by implementing disk-scheduling algorithms in the controller hardware built into the disk drive. n How does disk-based queueing effect OS queue ordering efforts? l If the OS sends a batch of requests to the controller, the controller can queue them and then schedule them to improve both the seek time and the rotational latency. THANK YOU Chandravva Hebbi Department of Computer Science Engineering [email protected] OPERATING SYSTEMS Storage Management Chandravva Hebbi Department of Computer Science OPERATING SYSTEMS Mass-Storage Structure – Swap Space and RAID Chandravva Hebbi Department of Computer Science OPERATING SYSTEMS Slides Credits for all PPTs of this course The slides/diagrams in this course are an adaptation, combination, and enhancement of material from the following resources and persons: 1. Slides of Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne - 9th edition 2013 and some slides from 10th edition 2018 2. Some conceptual text and diagram from Operating Systems - Internals and Design Principles, William Stallings, 9th edition 2018 3. Some presentation transcripts from A. Frank – P. Weisberg 4. Some conceptual text from Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau, Andrea Arpaci Dusseau OPERATING SYSTEMS Swap-Space Management n Swap-space — Virtual memory uses disk space as an extension of main memory l Less common now due to memory capacity increases n Swap-space can be carved out of the normal file system, or, more commonly, it can be in a separate disk partition (raw) n Swap-space management l 4.3BSD allocates swap space when process starts; holds text segment (the program) and data segment l Kernel uses swap maps to track swap-space use l Some systems allow the use of multiple swap spaces – both files and dedicated swap partitions OPERATING SYSTEMS Swap-Space Management (Cont.) l Solaris 2 allocates swap space only when a dirty page is forced out of physical memory, not when the virtual memory page is first created 4File data written to swap space until write to file system requested 4Other dirty pages go to swap space due to no other home 4Text segment pages thrown out and reread from the file system as needed n What if a system runs out of swap space? n Some systems allow multiple swap spaces OPERATING SYSTEMS Data Structures for Swapping on Linux Systems q Linux allows one or more swap areas to be established. q A swap area may be in either a swap file on a regular file system or a dedicated swap partition. q Each swap area consists of a series of 4-KB page slots, which are used to hold swapped pages. q Associated with each swap area is a swap map—an array of integer counters, each corresponding to a page slot in the swap area. q 0 => page slot is available q 3 => swapped page is mapped to 3 different processes OPERATING SYSTEMS RAID Basics q RAID is a Disk Organization technique Ø Addresses performance and reliability issues Ø Multiple disk drives provide (or improve) reliability via redundancy q Many systems today need to store many terabytes of data q Don’t want to use single, large disk Ø too expensive Ø failures could be catastrophic q Would prefer to use many smaller disks q Redundant Array of Independent (inexpensive) Disks OPERATING SYSTEMS RAID Basics (Cont.) q Basic idea is to connect multiple disks together to provide qlarge storage capacity qfaster access to reading data qredundant data q Many different levels of RAID systems qdiffering levels of redundancy, error checking, capacity, and cost OPERATING SYSTEMS Striping Take file data and map it to different disks Allows for reading data in parallel Transfer rate is improved by striping data across the disks Ø Bit-level striping and block-level striping (most common) file data block 0 block 1 block 2 block 3 Disk 0 Disk 1 Disk 2 Disk 3 OPERATING SYSTEMS Block-level Striping qWith n disks, block i of a file goes to disk (i mod n) + 1 § Requests for different blocks can run in parallel if the blocks reside on different disks § A request for a long sequence of blocks can utilize all disks in parallel OPERATING SYSTEMS Mirroring Keep two copies of data on two separate disks Gives good error recovery – if some data is lost, get it from the other source Expensive – requires twice as many disks Write performance can be slow – have to write data to two different spots Read performance is enhanced – can read data from file in parallel OPERATING SYSTEMS RAID Structure n Mean time to failure n If MTTF of a single disk is 100,000 hours, MTTF of some disk in an array of 100 disks = 100000/100 = 1000 hours or 41.66 days n Mean time to repair – time taken to replace a failed disk and to restore the data on it n Mean time to data loss based on above factors n If mirrored disks fail independently (i.e., not related to power failures and natural disasters), consider disk with MTTF of 100,000 hours and MTTR is 10 hours l Mean time to data loss of a mirrored disk system is 100, 0002 / (2 10) = 500 106 hours, or 57,000 years! OPERATING SYSTEMS RAID (Cont.) n Frequently combined with NVRAM to improve write performance n Several improvements in disk-use techniques involve the use of multiple disks working cooperatively n Disk striping uses a group of disks as one storage unit n RAID is arranged into six different levels n RAID schemes improve performance and improve the reliability of the storage system by storing redundant data l Mirroring or shadowing (RAID 1) keeps duplicate of each disk l Striped mirrors (RAID 1+0) or mirrored stripes (RAID 0+1) provides high performance and high reliability l Block interleaved parity (RAID 4, 5, 6) uses much less redundancy OPERATING SYSTEMS RAID (Cont.) n RAID within a storage array can still fail if the array fails, so automatic replication of the data between arrays is common n Frequently, a small number of hot-spare disks are left unallocated, automatically replacing a failed disk and having data rebuilt onto them OPERATING SYSTEMS RAID Levels OPERATING SYSTEMS RAID Levels P indicates error-correcting bits and C indicates a second copy of the data P + Q redundancy scheme uses 2 parity values P and Q OPERATING SYSTEMS RAID Levels 0 and 1 q RAID level 0 refers to disk arrays with striping at the level of blocks Ø No redundancy (such as mirroring or parity bits) Ø lots of disks means low Mean Time To Failure (MTTF) q RAID level 1 refers to disk mirroring. Ø A complete file is stored on a single disk Ø A second disk contains an exact copy of the file Ø Provides complete redundancy of data Ø Read performance can be improved Ø file data can be read in parallel Ø Write performance suffers Ø must write the data out twice Ø Most expensive RAID implementation Ø requires twice as much storage space OPERATING SYSTEMS RAID Level-2 q RAID level 2 is also known as memory-style error correcting code (ECC) organization. q Stripes data across disks similar to Level-0 Ø difference is data is bit interleaved instead of block interleaved Ø For ex, the first bit of each byte can be stored in disk 1, the second bit in disk 2, and so on until the eighth bit is stored in disk 8; the error-correction bits are stored in further disks. q Uses ECC to monitor correctness of information on disk Ø Multiple disks record the ECC information to determine which disk is in fault Ø A parity disk is then used to reconstruct corrupted or lost data q RAID level 2 requires only 3 disks’ overhead for 4 disks of data, unlike RAID level 1, which requires 4 disks’ overhead. OPERATING SYSTEMS RAID Level-3 q RAID level 3, or bit-interleaved parity organization; q One big problem with Level-2 is the number of extra disks needed to detect which disk had an error q Modern disks can already determine if there is an error Ø using ECC codes with each sector q So just need to include a parity disk Ø if a sector is bad, the disk itself tells us, and use the parity disk to correct it q Transfer rate for reading or writing a single block is faster than RAID level 1. q But supports fewer I/Os per second, since every disk has to participate in every I/O request. q Has performance problem due to the expense of computing and writing the parity. OPERATING SYSTEMS RAID Level-4 q RAID level 4 interleaves file blocks Ø allows multiple small I/O’s to be done at once q Consists of block-level striping with dedicated parity. q Still use a single disk for parity q Now the parity is calculated over data from multiple blocks Ø Level-2,3 calculate it over a single block q If an error detected, need to read other blocks on other disks to reconstruct data Doing multiple small reads is now faster than before However, writes are still very slow – this is because of calculating and writing the parity blocks Also, only one write is allowed at a time – all writes must access the check disk so other writes have to wait OPERATING SYSTEMS RAID Level-5 q RAID level 5 stripes file data and checks data over all the disks Ø no longer a single check disk Ø no more write bottleneck q Consists of block-level striping with distributed parity. Ø Unlike RAID 4, parity information is distributed among the drives q Drastically improves the performance of multiple writes Ø they can now be done in parallel q Slightly improves reads Ø one more disk to use for reading q read and write performance close to that of RAID Level-1 q requires as much disk space as Levels-3,4 OPERATING SYSTEMS RAID Level-6 q RAID 6 is like RAID 5, but the parity data are written to two drives. Ø That means it requires at least 4 drives and can withstand 2 drives dying simultaneously. q Write data transactions are slower than RAID 5 due to the additional parity data that have to be calculated. OPERATING SYSTEMS RAID level 10 - combining RAID 1 and RAID 0 q Provides security by mirroring all data on secondary drives while using striping across each set of drives to speed up data transfers. q Rebuild time is very fast q Half of the storage capacity goes to mirroring Ø so compared to large RAID 5 or RAID 6 arrays, this is an expensive way to have redundancy. OPERATING SYSTEMS RAID (0 + 1) and (1 + 0) Disks are striped and then the stripe is mirrored to another, equivalent stripe. If a single disk fails, an entire stripe is inaccessible. Disks are mirrored in pairs and then the resulting mirrored pairs are striped. This is better than 0+1 when a single disk fails OPERATING SYSTEMS Selecting a RAID level n One consideration is rebuild performance. l If a disk fails, the time needed to rebuild its data can be significant. l This may be an important factor if a continuous supply of data is required, as it is in high-performance or interactive database systems. l Furthermore, rebuild performance influences the mean time to failure. l Rebuild performance varies with the RAID level used. 4 Rebuilding is easiest for RAID level 1, since data can be copied from another disk. OPERATING SYSTEMS Selecting a RAID level (Cont.) 4For the other levels, we need to access all the other disks in the array to rebuild data in a failed disk. 4Rebuild times can be hours for RAID 5 rebuilds of large disk sets. n RAID level 0 is used in high-performance applications where data loss is not critical. n RAID level 1 is popular for applications that require high reliability with fast recovery. n RAID 0 + 1 and 1 + 0 are used where both performance and reliability are important—for example, for small databases. n Due to RAID 1’s high space overhead, RAID 5 is often preferred for storing large volumes of data. OPERATING SYSTEMS Selecting a RAID level (Cont.) n RAID system designers and administrators of storage have to make several other decisions as well. l How many disks should be in a given RAID set? l How many bits should be protected by each parity bit? l If more disks are in an array, data-transfer rates are higher, but the system is more expensive. l If more bits are protected by a parity bit, the space overhead due to parity bits is lower, but the chance that a second disk will fail before the first failed disk is repaired is greater, and that will result in data loss. THANK YOU Chandravva Hebbi Department of Computer Science Engineering [email protected] OPERATING SYSTEMS I/O Management, System Protection and Security Arya S S Department of Computer Science OPERATING SYSTEMS System Protection - Goals, Principles and Domain of Protection Arya S S Department of Computer Science OPERATING SYSTEMS Slides Credits for all PPTs of this course The slides/diagrams in this course are an adaptation, combination, and enhancement of material from the following resources and persons: 1. Slides of Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne - 9th edition 2013 and some slides from 10th edition 2018 2. Some conceptual text and diagram from Operating Systems - Internals and Design Principles, William Stallings, 9th edition 2018 3. Some presentation transcripts from A. Frank – P. Weisberg 4. Some conceptual text from Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau, Andrea Arpaci Dusseau OPERATING SYSTEMS Goals of Protection In one protection model, computer consists of a collection of objects, hardware or software Each object has a unique name and can be accessed through a well-defined set of operations Protection problem - ensure that each object is accessed correctly and only by those processes that are allowed to do so Protection provides a mechanism for the enforcement of the policies governing resource use Some policies are fixed in the design of the system, while others are formulated by the management of a system and by the individual users as well (to protect their own files and programs). OPERATING SYSTEMS Principles of Protection Guiding principle – principle of least privilege Programs, users and systems should be given just enough privileges to perform their tasks Failure of a component does the minimum damage and allows the minimum damage to be done. OS follows this principle for its features, programs, system calls and data structures Can be static (during life of system, during life of process) Or dynamic (changed by process as needed) – domain switching, privilege escalation OPERATING SYSTEMS Principles of Protection (Cont.) Must consider “grain” aspect Rough-grained privilege management easier, simpler, but least privilege now done in large chunks 4 For example, traditional Unix processes either have abilities of the associated user, or of root Fine-grained management more complex, more overhead, but more protective 4 Provide/disable privileges as needed 4 Create audit trail for privileged function access 4 File ACL lists, RBAC OPERATING SYSTEMS Domain of Protection A computer system is a collection of processes and objects. Hardware objects (CPU, memory segments, printers, disks, and tape drives) Software objects (files, programs, and semaphores) Each object has a unique name and can be accessed only through well-defined and meaningful operations. A process should be allowed to access only those resources for which it has authorization. A process should be able to access only those resources that it currently requires to complete its task.(need-to-know principle) Need-to-know principle, is useful in limiting the amount of damage a faulty process can cause in the system OPERATING SYSTEMS Domain Structure A process operates within a protection domain, which specifies the resources that the process may access. Each domain defines a set of objects and the types of operations that may be invoked on each object. The ability to execute an operation on an object is an access right Access-right= where rights-set is a subset of all valid operations that can be performed on the object. System with 3 protected domains OPERATING SYSTEMS Domain Structure(Cont.) The association between a process and a domain may be either static or dynamic. Static: the set of resources available to the process is fixed throughout the process’s lifetime. But a process may execute in two different phases, for eg ,need read access in one phase and write access in another phase. So the fixed domain should contain both read and write access for that process. This violates need to know principle. So we must allow the contents of domain to be modified during each phase of the process. If the association is dynamic, a mechanism is available to allow domain switching. OPERATING SYSTEMS Domain Structure (Cont.) A domain can be realized in a variety of ways: User Process Procedure Standard dual-mode (monitor-user mode) model of operating-system execution is followed in system. But in a multi programming environment these two protection domains are insufficient, since users want to be protected from one another. So a more elaborate scheme is used in UNIX and MULTICS. THANK YOU Arya S S Department of Computer Science Engineering [email protected] OPERATING SYSTEMS I/O Management, System Protection and Security Arya S S Department of Computer Science OPERATING SYSTEMS Domain of Protection: Unix, MULTICS examples Arya S S Department of Computer Science OPERATING SYSTEMS Slides Credits for all PPTs of this course The slides/diagrams in this course are an adaptation, combination, and enhancement of material from the following resources and persons: 1. Slides of Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne - 9th edition 2013 and some slides from 10th edition 2018 2. Some conceptual text and diagram from Operating Systems - Internals and Design Principles, William Stallings, 9th edition 2018 3. Some presentation transcripts from A. Frank – P. Weisberg 4. Some conceptual text from Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau, Andrea Arpaci Dusseau OPERATING SYSTEMS Domain Implementation (UNIX) Domain = user-id Domain switch accomplished via file system 4 Each file has associated with it a domain bit (setuid bit) 4 When file is executed and setuid = on, then user-id is set to owner of the file being executed 4 When execution completes user-id is reset Domain switch accomplished via passwords su command temporarily switches to another user’s domain when other domain’s password provided Domain switching via commands sudo command prefix executes specified command in another domain (if original domain has privilege or password given) OPERATING SYSTEMS Domain Implementation (MULTICS) The protection domains are organized hierarchically into a ring structure. Each ring corresponds to a single domain Multiplexed Information and Computing Service - was a cooperative project led by MIT along The rings are numbered from 0 to 7. with General Electric and Bell Labs. -was started in 1964 and has influenced all modern A process executing in domain Do has the most operating systems from microcomputers to mainframes --was the first major OS to be designed as a secure privileges. system -- had hardware support for ring-oriented security MULTICS has a segmented address space. Each segment is a file, and each segment is associated with one of the rings. It includes three access bits to control reading, writing, and execution OPERATING SYSTEMS Multics ring Structure Let Di and Dj be any two domain rings. If j < i ⇒ Di ⊆ Dj OPERATING SYSTEMS Multics Domain Implementation A current-ring-number counter is associated with each process, identifying the ring in which the process is executing currently. When a process is executing in ring i, it cannot access a segment associated, with ring j (j < i). It can access a segment associated with ring k (k > i). Domain switching in MULTICS occurs when a process crosses from one ring to another by calling a procedure in a different ring. OPERATING SYSTEMS Multics Domain Switching Ring field of the segment descriptor include the following: 1.Access bracket. A pair of integers, b1 and b2, such that b1 b2. 3. List of gates. Identifies the entry points (or gates) at which the segments may be called. If a process operating in ring i calls a segment whose bracket is such that b1 4 with M ∈ R k 4 If triple found, operation is allowed to continue; otherwise an exception or condition is raised But table could be large -> won’t fit in main memory Virtual memory techniques can be used for managing this table Difficult to group objects consider an object that all domains can read, this object must have a separate entry in every domain) OPERATING SYSTEMS Implementation of Access Matrix (Cont.) Option 2 – Access lists for objects Each column implemented as an access list for one object i.e. specifying user names and the types of access allowed for each user (empty entries can be discarded) Resulting per-object list consists of ordered pairs defining all domains with non-empty set of access rights for the object Easily extended to contain default set -> If M ∈ default set, also allow access For efficiency, check the default set first and then search the access list OPERATING SYSTEMS Implementation of Access Matrix (Cont.) Each column = Access-control list for one object Defines who can perform what operation Domain 1 = Read, Write Domain 2 = Read Domain 3 = Read Each Row = Capability List (like a key) For each domain, what operations allowed on what objects Object F1 – Read Object F4 – Read, Write, Execute Object F5 – Read, Write, Delete, Copy OPERATING SYSTEMS Implementation of Access Matrix (Cont.) Option 3 – Capability list for domains Instead of object-based (i.e column wise), list is domain based (i.e row wise) Capability list for domain is list of objects together with operations allowed on them Object represented by its name or address, called a capability Execute operation M on object Oj, process requests operation and specifies capability as parameter 4 Possession of capability means access is allowed Capability list associated with domain but never directly accessible to a process executing in that domain 4 Rather, protected object, maintained by OS and accessed by the user indirectly 4 Like a “secure pointer” 4 Idea can be extended up to the application level OPERATING SYSTEMS Implementation of Access Matrix (Cont.) Option 4 – Lock-key Compromise between access lists and capability lists Each object has a list of unique bit patterns, called locks Each domain has a list of unique bit patterns called keys (managed by the OS) Process in a domain can only access object if domain has a key that matches one of the locks OPERATING SYSTEMS Comparison of Implementations Many trade-offs to consider Global table is simple, but can be large Access lists correspond to needs of users 4 Access rights for a particular domain is non-localized, so difficult to determine the set of access rights for each domain 4 Every access to an object must be checked – Many objects and access rights -> slow (i.e not suitable for large system with long access lists) Capability lists useful for localizing information for a given process 4 But revocation capabilities can be inefficient Lock-key effective and flexible, keys can be passed freely from domain to domain, easy revocation OPERATING SYSTEMS Comparison of Implementations (Cont.) Most systems use combination of access lists and capabilities First access to an object -> access list searched 4 If allowed, capability created and attached to process – Additional accesses need not be checked 4 After last access, capability destroyed 4 Consider file system with ACLs per file recorded in a new entry in a file table (file table maintained by the OS such as UNIX and protection is ensured) OPERATING SYSTEMS Access Control Protection can be applied to non-file resources Oracle Solaris 10 provides role-based access control (RBAC) to implement least privilege Privilege is right to execute system call or use an option (ex: write access for a file) within a system call Can be assigned to processes Users assigned roles granting access to privileges and programs 4 Enable role via password to gain its privileges Similar to access matrix OPERATING SYSTEMS Revocation of Access Rights Various options to remove the access right of a domain to an object Immediate vs. delayed (i.e. when revocation will occur) Selective vs. general (i.e. select group of users or all the users) Partial vs. total (i.e. subset of the rights or all the rights) Temporary vs. permanent (can access right be revoked and obtained later?) Access List – Delete access rights from access list Simple – search access list and remove entry, revocation is easy Immediate, general or selective, total or partial, permanent or temporary OPERATING SYSTEMS Revocation of Access Rights (Cont.) Capability List – Scheme required to locate capability in the system before capability can be revoked Reacquisition – periodic delete from each domain, with reacquire and denial if revoked by a process Back-pointers – set of pointers from each object to all capabilities of that object , follow these pointers for revocation (adopted in Multics) Indirection – capability points to global table entry which in turn points to object – delete entry from global table, selective revocation not allowed Keys – unique bit pattern associated with a capability, generated when capability is created 4 Master key associated with object, key matches master key for access 4 Revocation – create new master key (with a new value) 4 Policy decision of who can create and modify keys – object owner or others? THANK YOU Arya S S Department of Computer Science Engineering [email protected] OPERATING SYSTEMS Storage Management Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS Storage Management – System calls Suresh Jamadagni Department of Computer Science OPERATING SYSTEMS Slides Credits for all PPTs of this course The slides/diagrams in this course are an adaptation, combination, and enhancement of material from the following resources and persons: 1. Slides of Operating System Concepts, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne - 9th edition 2013 and some slides from 10th edition 2018 2. Some conceptual text and diagram from Operating Systems - Internals and Design Principles, William Stallings, 9th edition 2018 3. Some presentation transcripts from A. Frank – P. Weisberg 4. Some conceptual text from Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau, Andrea Arpaci Dusseau OPERATING SYSTEMS File access permission There are nine permission bits for each file, divided into three categories The term user in the first three rows refers to the owner of the file The first rule is that to open any type of file by name, user must have execute permission in each directory mentioned in the name, including the current directory. The execute permission bit for a directory is often called the search bit For example, to open the file /usr/include/stdio.h, user would need execute permission in the directory /, execute permission in the directory /usr, and execute permission in the directory /usr/include and appropriate permission for the file stdio.h OPERATING SYSTEMS File access permission The read permission for a file determines whether user can open an existing file for reading: the O_RDONLY and O_RDWR flags for the open function. The write permission for a file determines whether user can open an existing file for writing: the O_WRONLY and O_RDWR flags for the open function. User must have write permission for a file to specify the O_TRUNC flag in the open function. User cannot create a new file in a directory unless they have write permission and execute permission in the directory. To delete an existing file, user needs write permission and execute permission in the directory containing the file. Users do not need read permission or write permission for the file itself OPERATING SYSTEMS Set-User-ID and Set-Group-ID Every process has six or more IDs associated with it The real user ID and real group ID identify who the user really is. These two fields are taken from an entry in the password file when the user logs in The effective user ID, effective group ID, and supplementary group IDs determine file access permissions for the user The saved set-user-ID and saved set-group-ID contain copies of the effective user ID and the effective group ID, respectively, when a program is executed. OPERATING SYSTEMS Set-User-ID and Set-Group-ID We can set the real user ID and effective user ID with the setuid function. We can set the real group ID and the effective group ID with the setgid function. If the process has superuser privileges, the setuid function sets the real user ID, effective user ID, and saved set-user-ID to uid. If the process does not have superuser privileges, but uid equals either the real user ID or the saved set- user-ID, setuid sets only the effective user ID to uid. The real user ID and the saved set-user-ID are not changed. If neither of these two conditions is true, errno is set to EPERM and −1 is returned OPERATING SYSTEMS access and faccessat When a user tries to open a file, the kernel performs access tests based on the effective user and group IDs Sometimes, however, a process wants to test accessibility based on the real user and group IDs. This is useful when a process is running as someone else, using either the set-user-ID or the set-group-ID feature. The access and faccessat functions base their tests on the real user and group IDs. OPERATING SYSTEMS umask The umask function sets the file mode creation mask for the process and returns the previous value The cmask argument is formed as the bitwise OR of any of the nine constants The file mode creation mask is used whenever the process creates a new file or a new directory. OPERATING SYSTEMS chmod, fchmod, and fchmodat The chmod, fchmod, and fchmodat functions allow us to change the file access permissions for an existing file The chmod function operates on the specified file, whereas the fchmod function operates on a file that has already been opened. The fchmodat function behaves like chmod when the pathname argument is absolute or when the fd argument has the value AT_FDCWD and the pathname argument is relative To change the permission bits of a file, the effective user ID of the process must be equal to the owner ID of the file, or the process must have superuser permissions OPERATING SYSTEMS chown, fchown, fchownat, and lchown The chown functions allow users to change a file’s user ID and group ID, but if either of the arguments owner or group is −1, the corresponding ID is left unchanged The fchown function changes the ownership of the open file referenced by the fd argument. lchown and fchownat (with the AT_SYMLINK_NOFOLLOW flag set) change the owners of the symbolic link itself, not the file pointed to by the symbolic link OPERATING SYSTEMS File truncation truncate and ftruncate functions truncate an existing file to length bytes. If the previous size of the file was greater than length, the data beyond length is no longer accessible. If the previous size was less than length, the file size will increase and the data between the old end of file and the new end of file will read as 0 (i.e., a hole is probably created in the file) THANK YOU Suresh Jamadagni Department of Computer Science Engineering [email protected]