Intro to UNIX Commands
San José State University
Slides are adapted from slides created by Dr. Andreopoulos / Dr. Genya

Course Goals
• Offer you essential skills for data science
• Become an expert at skills for high-performance computing
• Acquire better coding skills for data wrangling

History
• Ken Thompson, working at Bell Labs in 1969, wanted a small MULTICS
• He wrote UNIX, which was initially written in assembly and could handle only one user at a time
• Dennis Ritchie and Ken Thompson ported an enhanced UNIX to a PDP-11/20 in 1970
• Thompson cut the language BCPL down to fit on UNIX in 1970, calling the result “B”
• In 1973 Ritchie and Thompson rewrote UNIX in “C” and enhanced it some more
• Since then it has been enhanced and enhanced and enhanced and …

What is UNIX good for?
• Offers a generic interface to many types of computing hardware
• Multi-user and multi-tasking: supports many users running many programs at the same time, all sharing (transparently) the same computer system
• Open source: promotes information sharing
• Geared for high programmer productivity. “Expert friendly”
• Generic framework allows flexible tailoring of the environment for each user
• Services include: file system, security, process/job scheduling, network services

What is UNIX good for?
- Stability and Reliability
- Portability - Diverse Hardware
- Powerful Command-Line Interface (CLI)
- Multiuser and Multitasking (Pioneered)
- Security (Privacy - Open Source)
- Open Standards
- Geared for high programmer productivity. “Expert friendly”
- Large Software Ecosystem
- Community and Collaboration
- Variety of Flavors: Linux (e.g. Ubuntu, Raspberry Pi OS), macOS
- Longevity, Scalability: small embedded devices to large server clusters

What is UNIX good for?
Interested in system administration, perhaps? This is the course for you.
● Data scientists: The command line is what makes UNIX useful for a data scientist
● Web servers
● Some surveys show nearly 70% of web servers run on Unix and Unix-like OSs

OS history
[Figure: OS history]

Optional readings
5th Internet edition available online: https://linuxcommand.org/tlcl.php/
Ch. 1-3, 6-8, 11-12
Ch. 1-4
Ch. 1-5

Course Advanced Topics
• UNIX for software package management (yum, deb)
• UNIX for software development (make, cmake, compilers, debuggers)
• Unix systems programming (system call interface, Unix kernel)
• Concurrent programming (threads, process synchronization)

Remote Computing Server and ssh
● SSH or Secure Shell
  ○ A cryptographic network protocol for operating network services securely over an unsecured network

IBM Virtual Machine (VM)
Every student will have one "dedicated" IBM account on a VM for doing data analysis.
The prof will create your account. Then you must log in and change your password.
Notes:
Do not reuse a password that you use elsewhere; pick something else.
Do not store sensitive material on these servers, as they will be deleted after the course.
IBM Virtual Machine (VM) with Red Hat Linux
Windows: → Download PuTTY, which uses ssh to log into servers
Mac: → Open a terminal window and then run ssh IP_address

Git and GitHub
Used for submitting assignments and worksheets
How assignment / worksheet submissions work, since you will be working on the IBM server:
● Create a GitHub repo “cs131”
  ○ In the repo, create a directory for each assignment and worksheet
● From the IBM server, push your code to your GitHub repo
● Submit on Canvas the exact git command for the grader to pull your repo
● Make sure your repo is public

Git and GitHub
[Figure: shared folder]

Git and GitHub
[Figure: a remote repo exchanging commands with local repo A and local repo B]

Virtual Private Network (VPN)
[Figure: a “physical” private network, where network providers supply a dedicated private link between Company A Branch 1 and Branch 2, next to a virtual private network, where a virtual link joins the branches through their ISPs over the Internet]

Analogy: Organizational Memo
[Figure: a memo addressed “To: Bob (Room 214)” goes from Room #210 to Room #214]

Analogy: Organizational Memo
[Figure: the memo “To: Bob (Room 214)” goes from Room #1110 in Building X to Room #214 in Building Y]

Analogy: Organizational Memo
[Figure: the memo “To: Bob (Room 214)” is wrapped in an envelope “To: Building Y” and passes through the shipping divisions of Building X and Building Y on its way from Room #1110 to Room #214]

Basic UNIX Structure and OS Concepts

Multics → Unix
Flashback of a few points:
→ Over-engineered: an ambitious project at the time
→ Complexity: difficult to develop, maintain, and use. Unix aimed for simplicity and minimalism
→ Modularity: Multics was built for mainframe computing architecture
How do we make it lean? Why do we want a leaner version of the OS?

Welcome to OS!
Operating System (OS): An operating system is like the boss of your computer. It manages all the tasks, like running programs, handling files, and talking to hardware (like your screen and keyboard). It makes sure everything runs smoothly.
Kernel: Think of the kernel as the core of the operating system. It's like the brain of the boss. It talks directly to the hardware, controls how programs use the computer's resources, and keeps everything organized and safe.
The interface to the kernel is a layer of software called the system call interface. Libraries of common functions are built on top of the system call interface, but applications are free to use both. The shell is a special application that provides an interface for running other applications.

Polling question
Operating system
a: is a collection of programs
b: provides user interface
c: is a resource manager
d: all of the above
e: none of the above

Polling question
The kernel layer manages all the hardware dependent functions
1) True
2) False

UNIX Kernel
• A large C program that implements a general interface to a computer to be used for writing programs:
    fd = open("/dev/tty", O_WRONLY);
    write(fd, "Hello world!", 12);
[Figure: layers, top to bottom: Applications Programs / UNIX system services / UNIX kernel in C / computer]

C and libc
[Figure: layers, top to bottom: C Application Programs / libc - C Interface to UNIX system services / UNIX system services / UNIX kernel in C / computer]
Difference? "C" refers to the C programming language used to write the Unix operating system itself, while "libc" is the C Standard Library that provides a standardized set of functions and routines used by C programs.

What is a shell?
● A shell is a computer program that presents a command line interface
● Allows you to control your computer using commands entered with a keyboard
  ○ instead of controlling graphical user interfaces (GUIs) with a mouse/keyboard combination
● On a Mac or Linux machine, you can access a shell through a program called Terminal
● The shell allows you to manipulate thousands of files (search, edit, track) and store your history (reproducibility)

Shell
• The shell (sh) is a program (written in C) that interprets commands typed to it, and carries out the desired actions.
• The shell is the part of Unix that most users see. Therefore there is a mistaken belief that sh is Unix.
• sh is an application program running under Unix
• Other shells exist (ksh, csh, tcsh, bash)
[Figure: layers, top to bottom: Shell / UNIX system services / UNIX kernel in C / computer]

What is a “Shell”?
▪ The “Shell” is simply another program on top of the kernel which provides a basic human-OS interface.
  • It is a command interpreter
    ⬥ Built on top of the kernel
    ⬥ Enables users to run services provided by the UNIX OS
  • A script or Shell program is a series of commands in a file
    ⬥ Saves having to retype commands to perform common tasks
▪ How to know what shell you use
    echo $SHELL

UNIX Shells
▪ sh    Bourne Shell (Original Shell) (Stephen Bourne of AT&T)
▪ bash  Bourne Again Shell (GNU Improved Bourne Shell)
▪ csh   C-Shell (C-like Syntax) (Bill Joy of Univ. of California)
▪ ksh   Korn-Shell (Bourne + some C-shell) (David Korn of AT&T)
▪ tcsh  TENEX C-Shell (More User Friendly C-Shell)

Which Shell should you Use?
▪ Most Linux distributions use the Bash shell (Bourne Again Shell) as the default shell
  • Bash/Bourne/ksh/sh prompt: $
  • All UNIX systems include the C shell and its predecessor, the Bourne shell.
  • Csh/tcsh prompt: %
  • sh (Bourne shell) was considered better for programming
  • csh (C-Shell) was considered better for interactive work.
  • tcsh and korn were improvements on c-shell and bourne shell respectively.
  • bash is largely compatible with sh and also has many of the nice features of the other shells

Which one do you have currently?
The shell is a fundamental component of the operating system and is responsible for executing user commands, managing processes, handling input/output redirection, and providing features like scripting and automation.
▪ To check shell:
  • $ echo $SHELL (SHELL is a predefined variable that holds your login shell)
▪ To switch shell:
  • $ exec shellname (e.g., $ exec bash, or simply type $ bash to start bash as a child of the current shell)
[Figure: layers, top to bottom: Tools and Applications (vi cat more date gcc gdb …) / SH / UNIX system services / UNIX kernel in C / computer]

Questions?
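The check-and-switch steps above can be tried directly. This is a minimal sketch, assuming bash and ps are installed; the variable names current_shell and child_output are made up for illustration. Note that $SHELL names the login shell of the account, which is not always the shell that is running right now.

```shell
# Print the login shell recorded for this account
# (falls back to "unknown" if the variable is unset).
echo "Login shell: ${SHELL:-unknown}"

# Print the name of the process that is actually interpreting these lines.
# $$ expands to the process ID of the current shell.
current_shell=$(ps -p $$ -o comm=)
echo "Current shell: $current_shell"

# Start a child bash, run one command in it, and come back.
# Unlike "exec bash", this does not replace the current shell.
child_output=$(bash -c 'echo "hello from bash"')
echo "$child_output"
```

Typing bash at the prompt works the same way as the last step, except that the child shell stays open until you type exit.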
Section #2
UNIX File Abstraction and File System Organization

What is a File?
• A file is the most basic entity in a UNIX system.
• Several different kinds of files:
  – Regular File
  – Directory
  – Character Special
  – Block Special
  – Socket
  – Symbolic Link
  – Hard Link
• They are accessed through a common interface (i.e. you need only learn how to use one set of system calls to be able to access any sort of file.)

Regular Files
• Regular files are used to store:
  – English Text
  – Numerical Results
  – Program Text
  – Compiled Machine Code
  – Executable Programs
  – Databases
  – Bit-mapped Images
  – etc...

Regular Files
• A regular file is a named, variable length, sequence of bytes.
• UNIX itself assumes no special structure to a regular file beyond this.
• Most UNIX utility programs, however, do assume the files have a certain structure.
• E.g. a regular text file:
    $ cat file
    hello world!
    $ ls -l file
    -rw-r--r-- 1 bill 13 May 8 16:44 file

Directories & Filenames
[Figure: directory tree with the nodes homes, bill, file, tmp, cs131, accounts]
• Directories are special kinds of files that contain references to other files and directories.
• Directory files can be read like a regular file, but UNIX does not let you write to them.
• There are two ways of specifying a filename
  – absolute: /homes/bill/file
  – relative: cs131/accounts
• With an absolute pathname the search for the file starts at the root directory.

Relative Pathnames
• With a relative pathname the search for the file starts at the current working directory.
• Every process under UNIX has a CWD (current working directory). This can be changed by means of a system call.
• e.g.
    $ pwd
    /homes/bill
    $ cd cs131
    $ pwd
    /homes/bill/cs131
    $ cd /
    $ pwd
    /

Commands that will “grow” on you
man ls cd pwd mkdir rmdir rm

Commands that will “grow” on you
grep egrep fgrep cut tr join sed awk

Polling question
Which command lists your files?
1) ls -latr
2) ls
3) ls -lat
4) All of these commands

Wildcards on the command line and in regexes
?  matches exactly one character
*  matches zero or more characters
(These are the shell wildcard meanings; in regular expressions ? and * are quantifiers that apply to the preceding item.)
Try ls with wildcards
    ls *.txt
    ls ?.txt

Using wildcards vs. find and xargs
Wildcards, e.g.: ls ./*.txt
These get expanded by the Bash shell before ls runs.
As a result, wildcards can fail in directories that hold very many files: the expanded command line can exceed the system limit on argument length ("Argument list too long").
find, on the other hand, does the matching inside the program itself.
Thus you can search directories with very many files:
    find ./ -type f -name "*.txt"

How to tar files
List files:      tar tvf tarred_file.tar | grep xxx
Extract tar.gz:  tar zxvf tarred_file.tar.gz
Extract tar:     tar xvf tarred_file.tar
Create archive:  tar cvf tarred_file.tar directory
                 tar cvf tarred_file.tar --exclude "*pattern*" directory
(tar cvf only bundles files into an archive; add z, as in tar zcvf tarred_file.tar.gz directory, to compress it with gzip.)

man command for help manual
Try in terminal (Mac or Ubuntu):
    $ man tar
Command options often differ between Linux distributions and between Linux and macOS, so it is a good idea to check the man/help page.
    $ tar --help
    $ tar --usage
(Many commands also accept -h for help, but in GNU tar -h means --dereference.)
Or, use Stack Overflow and Google

Sockets and Pipes
• Pipes are special files used to pass bytes between two processes.
[Figure: Process A writes into a pipe, Process B reads from it]
• Sockets are similar, but are used to connect two processes on different machines across a network.

CS 131.01 Processing Big Data: Tools and Techniques
2. Dealing with Files
San José State University
Slides are adapted from slides created by Dr. Andreopoulos

Bash shortcut tip of the day
Pressing <Tab> auto-completes your line

Viewing and Editing Files
● Different ways to display the contents of files.
● How to use the nano and vi editors.

Displaying the Contents of Files
cat file    Display the contents of file.
more file   Pager: browse through a text file.
less file   Pager, uses less memory.
head file   Output the beginning (or top) portion of file.
tail file   Output the ending (or bottom) portion of file.
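The viewing commands above can be tried on a throwaway file. This is a minimal sketch, assuming a hypothetical file sample.txt created in a temporary directory just for the demonstration; the variable names are illustrative only.

```shell
# Work in a temporary directory so no real files are touched.
workdir=$(mktemp -d)
sample="$workdir/sample.txt"   # hypothetical file name, for illustration

# Write 20 numbered lines into the sample file.
i=1
while [ "$i" -le 20 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$sample"

# head shows the top of the file, tail shows the bottom.
first_three=$(head -n 3 "$sample")
last_two=$(tail -n 2 "$sample")

# With no -n option, head prints 10 lines.
# tr strips the padding that some wc implementations add.
default_count=$(head "$sample" | wc -l | tr -d ' ')

echo "$first_three"
echo "$last_two"
echo "head default line count: $default_count"

# Clean up the temporary directory.
rm -r "$workdir"
```

cat, more, and less take the file name in the same way; more and less are interactive pagers, so they are left out of the script.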
Head and Tail
● Displays only 10 lines by default
● Change this behavior with head -n
  ○ n = number of lines
  ○ tail -15 file.txt

Polling question
What does this do?
    head -n 10 filename
A) Print the first 10 lines of filename
B) Print the last 10 lines of filename
C) None of these

Polling question
What does this do?
    tail -100 filename
A) Print the first 100 lines of filename
B) Print the last 100 lines of filename
C) None of these

Viewing Files in Real Time
tail -f file   Follow the file. Displays data as it is being written to the file.

Demo

Nano Editor
● Nano is a simple editor.
● Easy to learn.
● Not as advanced as vi or emacs.
● If nano isn't available, look for pico.

Demo - Nano

Summary
● There are various commands that display the contents of files.
● The Nano editor is easy to use and learn.

Editing Files with Vi

What You Will Learn
● How to use the vi editor.

The Vi Editor
● Has advanced and powerful features
● Not intuitive
● Harder to learn than nano
● Requires a time investment

The Vi Editor
vi [file]     Edit file.
vim [file]    Same as vi, but more features.
view [file]   Starts vim in read-only mode.

Vi Command Mode and Navigation
k   Up one line.
j   Down one line.
h   Left one character.
l   Right one character.
w   Right one word.
b   Left one word.
^   Go to the beginning of the line.
$   Go to the end of the line.

Vi Navigation Keys
[Figure: vi navigation keys]

Vi Insert Mode (to exit you type Escape)
i   Insert at the cursor position.
I   Insert at the beginning of the line.
a   Append after the cursor position.
A   Append at the end of the line.
o   Insert on a new line right after the current line.
O   Insert on a new line right before the current line.

Vi Line Mode
:w    Writes (saves) the file.
:w!   Forces the file to be saved.
:q    Quit and display a message if the file was modified.
:q!   Quit without saving changes.
:wq   Write and quit.
:x    Same as :wq.

Vi Modes: default is navigation/command mode
When you are in Command mode (Esc):
Mode               Key
Command            Esc
Insert             i I a A o O
Line (enter cmd)   :
Search             / or ?
‘n’ is find next

Polling question
/ searches forward, ? searches backward in vi
A) True
B) False
Find next is ‘n’ (same as / without parameters)

Vi - Deleting Text
x    Delete a character.
dw   Delete a word.
dd   Delete a line.
D    Delete from the current position to the end of the line.

Vi - Copying and Pasting
yy        Yank (copy) the current line.
<num>yy   Yanks that many lines, starting from the current line.
p         Paste the most recently deleted or yanked text below the current line.
P         Paste above the current line.

Vi - Undo / Redo
u        Undo
Ctrl-R   Redo

Vi - Searching
/<pattern>   Start a forward search.
?<pattern>   Start a reverse search.

Vi - Search and replace
:%s/<pattern>/replace/g   Global search/replace.
:%s/^M//g                 Remove carriage returns left over from Windows line endings (type ^M as Ctrl-V then Ctrl-M).

Summary
● More advanced than nano
● Vi has four modes
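Outside of vi, the two substitutions from the search-and-replace slide can be approximated without an editor by using tr and sed, both of which appear earlier in the list of commands that will "grow" on you. This is a minimal sketch, assuming hypothetical file names dos.txt and unix.txt in a temporary directory; it is not part of the original slides.

```shell
# Work in a temporary directory so no real files are touched.
workdir=$(mktemp -d)
dos_file="$workdir/dos.txt"     # hypothetical name: file with Windows line endings
unix_file="$workdir/unix.txt"   # hypothetical name: cleaned copy

# Create a two-line file whose lines end in carriage return + newline.
printf 'alpha\r\nbeta\r\n' > "$dos_file"

# Delete every carriage return; the newlines stay in place.
# Rough equivalent of :%s/^M//g in vi.
tr -d '\r' < "$dos_file" > "$unix_file"

# Byte counts before and after (tr strips padding some wc versions add).
before=$(wc -c < "$dos_file" | tr -d ' ')
after=$(wc -c < "$unix_file" | tr -d ' ')
echo "bytes before: $before, after: $after"

# Replace every match on every line.
# Rough equivalent of :%s/alpha/ALPHA/g in vi.
replaced=$(sed 's/alpha/ALPHA/g' "$unix_file")
echo "$replaced"

# Clean up the temporary directory.
rm -r "$workdir"
```

The byte count drops by exactly one per line, which is a quick way to confirm that only the carriage returns were removed.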