COMP2401 Chapter 1 - Systems Programming and C Basics
Summary
This document is an introduction to systems programming and C programming. It describes the difference between applications programming and systems programming and provides examples of systems programs.
Chapter 1
Systems Programming and C Basics

What is in This Chapter?

This first chapter of the course explains what Systems Programming is all about and how it is closely linked to the operating system. A few basic tools are explained for use with the gcc compiler under a Linux Ubuntu environment running within a VirtualBox application. The chapter then introduces the C programming language in terms of its basic syntax as compared to JAVA syntax. A few simple programs are created to show how to display information, compute simple math calculations, deal properly with random numbers, get user input, use arrays and call functions.

    #include <stdio.h>

    int main() {
        printf("Hello world\n");
        return 0;
    }

COMP2401 - Chapter 1 – Systems Programming and C Basics Fall 2020

1.1 Systems Programming and Operating Systems

In COMP1405 and COMP1406, you developed various programs and applications. The goal was to write programs that “accomplished” something interesting by providing a service for the user … usually resulting in an “app” that interacted with the user. Examples of common applications are internet browsers, word processors, games, database access programs, spreadsheets, etc. So, what you have been doing in your courses has been:

Applications Programming is the programming of software to provide services for the user directly.

Systems Programming, on the other hand, has a different focus … and can be defined as follows:

Systems Programming is the programming of software that provides services for other software … or for the underlying computer system.

So, when you are doing systems programming, you are writing software that does not typically have a front-end GUI that interacts with the user. The software often runs “behind-the-scenes” … sometimes as a process/thread running in the background. Some examples of systems programs are:

1. Firmware (e.g., PC BIOS and UEFI).
2. Operating systems (e.g., Windows, Mac OSX, GNU/Linux, BSD, etc.).
3. Game Engines (e.g., Unreal Engine 4, Unity 3D, Torque3D).
4. Assemblers (e.g., GNU AS, NASM, FASM, etc.).
5. Macro Processors (e.g., GNU M4).
6. Linkers and Loaders (e.g., GNU ld, which is part of GNU binutils).
7. Compilers and Interpreters (e.g., gcc, python, Java VM).
8. Debuggers (e.g., gdb).
9. Text editors (e.g., vim).
10. Operating system shell (e.g., bash).
11. Device Drivers (e.g., for Bluetooth, network cards, etc.).

Systems software involves writing code at a much lower level than typical application software. It is often closely tied to the actual hardware of the machine that it is running on. In fact, it often uses the operating system directly through system calls. So, when you write systems software, it is important to have a good understanding of the machine that it will be running on.

Applications programming is at a higher level than systems programming, and so it is closer to the way we think as humans. It is more natural and can make use of high level programming languages and specialized libraries.

System software is the layer between the hardware and application software. It can deal directly with the hardware and usually controls it. It is therefore more naturally written in a lower level programming language. In general, it provides “efficient” services to applications.

The goal of writing systems software is to make efficient use of resources (e.g., computer memory, disk space, CPU time, etc.). In some cases, performance may be critical (e.g., a fast game engine). In fact, it is often the case that small improvements in efficiency can save a company a lot of money. Therefore, in this course, we will be concerned with writing efficient code … something we didn’t focus on too much in COMP1405/1406.
In this course, we will also be trying to get a better grasp of the computer’s operating system.

An Operating System is system software that manages computer hardware and software resources and provides common services for computer programs.

The operating system is the layer of software that sits between the computer’s hardware and the user applications. It is considered the “boss” of the computer … as it manages everything that is going on visibly, as well as behind the scenes. Some operating systems provide time-sharing features that schedule tasks at various times depending on how the computer’s resources (e.g., memory, disk storage, printers and devices) are allocated at any given time.

There are various operating systems out there that you may have heard of:

- Windows
- Mac OSX
- Unix
- Linux
- Android
- Chrome OS

The operating system acts as the intermediary between the applications and the computer’s resources. Applications perform system calls to gain access to the resources. So, in a sense, the operating system provides services for the applications. It provides some core functionality to users directly as well as to programs.
Here is some of the functionality that is provided by the operating system, although there is a lot more than this:

File I/O
- provides file system organization and structure
- allows access to read and write files
- provides a measure of security by allowing access permissions to be set

Device I/O
- allows communication with devices through their drivers
    o e.g., mouse, keyboard, printer, bluetooth, network card, game controllers …
- manages shared access (e.g., printer queue)

Process Management
- allows the starting/stopping/alteration of running executables
- can support multitasking
    o i.e., multiple processes running concurrently
- allocates memory for each process
    o needed for program instructions, variables & keeping track of function calls/returns

Virtual memory
- provides memory to every process
    o dedicated address space, allocated when the process starts up
- appears (to the process) as a large amount of contiguous memory, but in reality some is fragmented in main memory & some may even be on disk
- programs use virtual memory addresses as opposed to physical memory addresses

Scheduling
- allows applications to share CPU time as well as device access

Operating systems vary from one to another. The Windows operating system:

- is expensive when compared to some others
- has limitations that make it harder to work with
- is a closed system that has very restricted access to OS functions
- was designed to make the computer simpler to use … primarily for business users and the average home user who are not computer savvy. Hence, the system was designed to allow access at a higher level.
Unix-based operating systems:

- are free … open source
- are more open systems that allow broad access to OS functions
    o "root" or super-user can do anything (extremely dangerous)
- are a family of options (Linux, Solaris, BSD, Mac OS X, many others)
- are the OS of choice for complex or scientific application development

The language of choice for programming Unix-based operating systems is C. It is:

- closer to the hardware
- used to write the Unix OS and device drivers
- very fast, with a negligible runtime footprint

1.2 Tools for Systems Programming

We will now discuss four tools that are essential for systems programming:

Shells

A Shell is a command line user interface that allows access to an operating system’s services.

A shell allows the user to type in various commands to run other programs. It also serves as a command line interpreter. Multiple shells can be run at the same time (in their own separate windows). In Unix, there are three major shells:

- sh - Bourne shell
- bash - Bourne-again shell (default shell for Linux)
- csh - C shell

The shells differ with respect to their command line shortcuts as well as in how environment variables are set. A shell allows you to run programs with command line arguments (i.e., parameters that you can provide when you start the program in the shell). Parameters that act as options are usually preceded by a dash (-) character.

There are some common shell commands that you can make use of within a shell:
e.g., alias, cd, pwd, set, which

There are some common system programs that you can make use of within a shell:
e.g., grep, ls, more, time, sort

You can even get some help by accessing a kind of “user manual” known as the man pages, using the man command. We will make use of various shell commands and programs throughout the course.

Text Editors

A Text Editor is a program that allows you to produce a text-based file.
There are a LOT of text editors out there. They are basic. Some are built into standard Operating System packages. There is a big advantage to knowing how to use one of these common editors … as they are available on almost any machine. It is good to choose one and “stick with it”, as you will likely develop an expertise with it. There are some common ones such as: vi/vim, emacs and gedit. You will need to use one for writing your programs in this course and for building make files (discussed later).

There is a bit of a learning curve for these editors, as they all require you to use various “hot keys” and commands in order to be quick and efficient at editing. The commands allow you to write programs without the use of a mouse … which is sometimes necessary on systems where the device drivers are not working or not installed.

Compilers

A Compiler is computer software that transforms source code written in one programming language into another target programming language.

In this course, we will make use of the GNU compiler. GNU is a recursive acronym for "GNU's Not Unix!". It was chosen because GNU's design is Unix-like, but differs from Unix by being free software and containing no Unix code.

The command for using the compiler is gcc. There are many options that you can provide when you run the compiler. For example, -o allows you to specify the output file and -c allows you to create the object code without linking. You will learn how to use these command options (and others) as the course goes on.

Compilers produce code that is meant to be run on a specific machine. Therefore, your code MUST be compiled for the same platform that it will run on. Linux-compiled code, for example, will not run on Windows, Unix or MacOS machines.

Debuggers

A Debugger is a program that is used to test and debug other programs.
There are two main advantages of using a debugger:

- It allows you to control the running (i.e., execution) of your code.
    o can start/stop/pause your program
    o good for slowing things down in time-critical and resource-sharing scenarios
- It allows you to investigate what is happening in your program.
    o can view your variables in the midst of your program
    o can observe the control flow of your program to find out whether certain functions are being called and in what order

The goal is always to debug … to find out where your program is going wrong or crashing.

The command for using the debugger is gdb. You use it to run the program … after it has been compiled. However, for the debugger to be able to show your source lines and variables, the program must have been compiled with the -g option. When using the debugger, there are various commands that you can apply such as run, break, display, step, next and continue.

1.3 Writing Your First C Program

The process of writing and using a C program is as follows:

1. Writing: write your programs as .c files.
2. Compiling: send these .c files to the gcc compiler, which will produce .o object files.
3. Linking: the .o files are then linked with various libraries to produce an executable file.
4. Running: run your executable file.

Getting your C programs to run requires a little more work than getting a JAVA program to run. As with JAVA, your source code must be compiled. Instead of producing .class files, the C compiler will produce .o files, which are called object files. These object files are “linked” together, along with various library files, to produce a runnable program which is called an executable file. This executable file is in machine code that is meant to be run on a specific computer platform. It is not portable to other platforms.

Our First Program

The first step in using any new programming language is to understand how to write a simple program.
By convention, the most common program to begin with is the "hello world" program which, when run, should output the words "Hello World" to the computer screen. We will describe how to do this now.

In this course, you will use either emacs, vim or gedit to write your programs. I will be using examples that make use of gedit. Once you have your Terminal window open (you will learn how to do this in the first tutorial), you start up your editor by specifying the editor name followed by the name of the file that you want to write … in this case it will be helloWorld.c. You use the & symbol at the end of the command line to indicate that you want to run the editor in the background. This allows you to keep the editor open while you are compiling and running/testing your code. If you don’t use the & character, then you must close the gedit editor in order to continue to work again in the Terminal window. Of course, you can always work with a second Terminal window if you’d like, but it is easiest to simply run the editor in the background.

    student@COMPBase:~$ gedit helloWorld.c &

[Figure: the gedit editor window with some code in it]

Below is our first C program, followed by the equivalent JAVA program. When you compare the two, you will notice some similarities as well as some differences.

C program:

    #include <stdio.h>

    int main() {
        printf("Hello world\n");
        return 0;
    }

JAVA program:

    public class HelloWorldProgram {
        public static void main(String[] args) {
            System.out.println("Hello World");
        }
    }

Here are a few points of interest in regard to writing C programs:

1. The #include <stdio.h> on the first line tells the compiler to include a header file.
A Header File is a file containing C declarations and macro definitions to be shared between several source files.
In this case, stdio.h is the name of the header file.
This is a standard file that defines three variable types, several macros, and various functions for performing input and output. We need it here because we will be printing something out to the screen using printf.

2. Unlike JAVA, we don’t need to define the name of a class. So we begin right away with the main() function. There are no public/private/protected access modifiers in C, so we leave those out. You will notice that the main() function returns an int, whereas in JAVA it was void. Also, we are not required in C to specify that there will be command-line arguments … so we do not need to declare that as a parameter to the main() function.

3. The procedure for printing is simply printf(), where we supply a string to be printed … and also some other parameters as options (more on this later). If we want to ensure that additional text will appear on a new line, we must make sure that we include the \n character inside the string.

4. The main() function should return an integer. By convention, 0 should be returned when all went well and a non-zero value (e.g., -1) when there was an error in the program. By allowing a variety of integers to be returned, we can use various error codes to indicate more precisely what went wrong.

5. Finally, notice that C uses braces and semi-colons in the same way that JAVA does.

So … to summarize, our C programs will have the following basic format:

    #include <...>
    #include <...>
    #include <...>

    int main() {
        ...;
        ...;
        ...;
    }

You should ALWAYS line up ALL of your brackets using the Tab key on the keyboard.

Now that the program has been written, you can compile it in the same Terminal window by using the gcc -c command as follows:

    student@COMPBase:~$ gedit helloWorld.c &
    student@COMPBase:~$ gcc -c helloWorld.c
    student@COMPBase:~$

This will produce an object file called helloWorld.o.
You can view the file by using the ls command:

    student@COMPBase:~$ gedit helloWorld.c &
    student@COMPBase:~$ gcc -c helloWorld.c
    student@COMPBase:~$ ls
    helloWorld.c helloWorld.o
    student@COMPBase:~$

Then, we need to link it (with our other files and standard library files) to produce an executable (i.e., runnable) file. We do this with the gcc command as well, using the -o option as follows:

    student@COMPBase:~$ gedit helloWorld.c &
    student@COMPBase:~$ gcc -c helloWorld.c
    student@COMPBase:~$ ls
    helloWorld.c helloWorld.o
    student@COMPBase:~$ gcc -o helloWorld helloWorld.o
    student@COMPBase:~$

After the -o is the name of the file that we want to be the runnable file. We follow it with a list of all object files that we want to join together. In this case, there is just one object file. This will produce an executable file called helloWorld. You can view the file by using the ls command:

    student@COMPBase:~$ gedit helloWorld.c &
    student@COMPBase:~$ gcc -c helloWorld.c
    student@COMPBase:~$ ls
    helloWorld.c helloWorld.o
    student@COMPBase:~$ gcc -o helloWorld helloWorld.o
    student@COMPBase:~$ ls
    helloWorld helloWorld.c helloWorld.o
    student@COMPBase:~$

You can then run the helloWorld file directly from the command line, but we need to tell the shell to look for it in the current directory by using ./ in front of the file name:

    student@COMPBase:~$ gedit helloWorld.c &
    student@COMPBase:~$ gcc -c helloWorld.c
    student@COMPBase:~$ ls
    helloWorld.c helloWorld.o
    student@COMPBase:~$ gcc -o helloWorld helloWorld.o
    student@COMPBase:~$ ls
    helloWorld helloWorld.c helloWorld.o
    student@COMPBase:~$ ./helloWorld
    Hello world
    student@COMPBase:~$

Notice that when you run your program, any output from the program will appear in the shell window from which it has been run.

As a side point, you can produce the executable without a separate compile step. When gcc is given a .c file together with -o, it compiles the file first and then links it:

    student@COMPBase:~$ gcc -o helloWorld helloWorld.c
    student@COMPBase:~$

1.4 C vs. Java

C code is very similar to JAVA code with respect to syntax. Provided here is a brief explanation of a few of the similarities & differences between the two languages. To begin, note that commenting code is the same in C as it is in JAVA:

Commenting in C and JAVA:

    // single line comment

Displaying Information to the System Console:

JAVA:

    System.out.print("The avg is " + avg);
    System.out.println(" hours");
    System.out.println(String.format("Mark is %d years old and weighs %f pounds", a, w));

C:

    printf("The avg is %d", avg);
    printf(" hours\n");
    printf("Mark is %d years old and weighs %f pounds\n", a, w);

Notice that the print statement is easier to use in C. In JAVA, things like integers and objects could be appended to strings with a + operator. We cannot do that in C. Instead, we do something similar to the String.format() function in JAVA by supplying a list of parameters for the string to be printed. Inside the string we use the % character to indicate that a parameter is to be inserted there. There are many possible flags that can be used in the format string for the printf. You will want to look them up. The general format for each parameter is:

    %[flags][width][.precision][length]type

Here is a table showing what the various values may be for type:

    type  Description           Example                        Output
    %d    integer               printf("%d", 256)              256
                                printf("%d", -256)             -256
    %u    unsigned integer      printf("%u", 256)              256
                                printf("%u", -256)             4294967040
    %f    floating point        printf("%f", 3.14159265359)    3.141593
          (6-dec precision)     printf("%f", 314159265.359)    314159265.359000
    %g    floating point        printf("%g", 3.14159265359)    3.14159
          (exp. precision)      printf("%g", 314159265.359)    3.14159e+08
    %c    character             printf("%c", 65)               A
    %s    string                printf("%s", "Hello")          Hello
    %x    hexadecimal           printf("%x", 250)              fa
    %X                          printf("%X", 250)              FA
    %o    octal                 printf("%o", 250)              372

The width parameter allows us to specify the minimum number of “spaces” that the output will take up.
We can use this width parameter to have things lined up in a table. If the width is too small, and the number has more digits than the specified width … the width parameter will not affect the output. Here are some examples (in the comments throughout this section, the output begins immediately after the //, so any spaces shown there are part of the output):

    printf("%5d", 256)               //  256
    printf("%5f", 3.14159265359)     //3.141593   ... No effect
    printf("%5c", 65)                //    A
    printf("%5s", "Hello")           //Hello
    printf("%10d", 256)              //       256
    printf("%10f", 3.14159265359)    //  3.141593
    printf("%10c", 65)               //         A
    printf("%10s", "Hello")          //     Hello

The precision parameter works differently depending on the type being used. When used with floating point type f, it allows us to specify how many digits we want to appear after the decimal place. When used with floating point type g, it allows us to specify how many digits in total we want to be used in the output (including the ones to the left of the decimal):

    printf("%2.3f\n", 3.14159265359);   //3.142   ... rounds up
    printf("%2.3g\n", 3.14159265359);   //3.14
    printf("%2.3f\n", 3141592.65359);   //3141592.654
    printf("%2.3g\n", 3141592.65359);   //3.14e+06

When used with string type s, the precision parameter allows us to indicate how many characters will be displayed from the string:

    printf("%2.1s\n", "Hello");    // H
    printf("%2.3s\n", "Hello");    //Hel
    printf("%2.5s\n", "Hello");    //Hello
    printf("%10.1s\n", "Hello");   //         H
    printf("%10.3s\n", "Hello");   //       Hel
    printf("%10.5s\n", "Hello");   //     Hello

When used with integer, unsigned integer, octal and hexadecimal types d, u, o, x, X, the precision parameter specifies the minimum number of digits to display, with leading zeros added as needed:

    printf("%6.1X\n", 250);    //    FA
    printf("%6.2X\n", 250);    //    FA
    printf("%6.3X\n", 250);    //   0FA
    printf("%6.4X\n", 250);    //  00FA
    printf("%6.5X\n", 250);    // 000FA
    printf("%6.1d\n", 250);    //   250
    printf("%6.2d\n", 250);    //   250
    printf("%6.3d\n", 250);    //   250
    printf("%6.4d\n", 250);    //  0250
    printf("%6.5d\n", 250);    // 00250
    printf("%6.5d\n", -250);   //-00250

The flags parameter can also be used to display leading zeros. When used with numbers, the 0 flag fills the width with leading zeros and the + flag inserts a plus sign in front of positive numbers (normally not shown):

    printf("%6d\n", 250);     //   250
    printf("%06d\n", 250);    //000250
    printf("%+6d\n", 250);    //  +250

When used with numbers or strings, the - flag allows everything to be left-aligned:

    printf("%-6d\n", 250);        //250
    printf("%-+6d\n", 250);       //+250
    printf("%-.1s\n", "Hello");   //H
    printf("%-.3s\n", "Hello");   //Hel
    printf("%-.5s\n", "Hello");   //Hello

There are more options than this, but we will not discuss them any further. You may google for more information.

In C, there are 4 main primitive variable types (but int has 3 variations). Variables are declared the same way as in JAVA (except literal float values do not need the ‘f’ character).

    Variables in C                       Variables in JAVA
    int days = 15;                       int days = 15;
    char gender = 'M';                   char gender = 'M';
    float amount = 21.3;                 float amount = 21.3f;
    double weight = 165.23;              double weight = 165.23;
    char age = 19;                       byte age = 19;
    short int years = 3467;              short years = 3467;
    long int seconds = 17102397834;      long seconds = 17102397834L;
    char hungry = 1;                     boolean hungry = true;

Interestingly, there are no booleans in the basic C language. Instead, any value which is non-zero is considered to be true when used in a boolean expression (more on this below).

Also, char is used differently since it is based on ASCII code, not the UNICODE character set. So the valid range of a char is -128 to +127 (more on this later).

Fixed values/numbers are defined using the #define directive in C as opposed to final. Also, the = sign is not used to assign the value, and there is no semi-colon at the end of the line. Typically, these fixed values are defined near the top of the program file.
You should ALWAYS name them using uppercase letters with multiple words separated by underscore characters.

    Fixed Values in C:             Fixed Values in JAVA:
    #define DAYS_IN_YEAR 365       final int DAYS_IN_YEAR = 365;
    #define RATE 4.923             final float RATE = 4.923f;
    #define NEWLINE '\n'           final char NEWLINE = '\n';

In C, both IF and SWITCH statements work “almost” the same way as in JAVA:

IF statements:

    if ((grade >= 80) && (grade < …
        …
    … (grade >= 50) {
        printf("%d", grade);
        printf(" Passed!\n");
    }
    else
        printf("Grade too low.\n");

SWITCH statements:

    switch (…) {
        …
        case 2: printf("Case2"); break;
        case 3: printf("Case3"); break;
        case 4: printf("Case4"); break;
        default: printf("Default");
    }

However, there are differences, since there are no boolean types in C. We need to fake it by using integers (or one-byte characters, i.e., char). Consider this example in C and JAVA:

Booleans “faked” in C:

    char tired = 1;
    char sick = 0;
    if (sick && tired)
        printf("I give up\n");

Booleans in JAVA:

    boolean tired = true;
    boolean sick = false;
    if (sick && tired)
        System.out.println("I give up");

In the C example, tired is considered to be true since it is non-zero, whereas sick is considered to be false. When the && is used, the result is always either 1 or 0, indicating true or false. So, if both sick and tired were set to 1, then (sick && tired) would result in 1, not 2. The same holds true for the || operator which is used for the OR operation. The boolean negation operator ! will change a non-zero value to 0 and a zero value to 1.
So …

    char tired = 5;
    char sick = 0;
    tired = !tired;    // tired will now be 0
    sick = !sick;      // sick will now be 1

If it makes your code easier to read, you can always define fixed values for TRUE and FALSE that can be used in your programs:

    #define TRUE 1
    #define FALSE 0

Then you can do things like this:

    char tired = TRUE;
    char sick = FALSE;

There is also another kind of conditional operator called the ternary/conditional operator, which uses the ? and : characters in sequence. It is a short form of an IF statement:

    tired ? printf("tired\n") : printf("not tired\n");

It does the same as this:

    if (tired)
        printf("tired\n");
    else
        printf("not tired\n");

However, the ?: is often used to provide a returned value that can be used in a calculation:

    int hoursWorked = 45;
    int bonus = (hoursWorked > 40) ? 25 : 0;
    printf("%d\n", bonus);

In C, the FOR and WHILE loops work the same way as in JAVA:

FOR loops: WHILE loops: int total = 0; int speed = 0; for (int i=1; i