The Pentium Microprocessor PDF

THE PENTIUM MICROPROCESSOR

James L. Antonakos
Broome Community College

Prentice Hall
Upper Saddle River, New Jersey
Columbus, Ohio

Library of Congress Cataloging-in-Publication Data

Antonakos, James L.
The pentium microprocessor / James L. Antonakos.
p. cm.
Includes index.
ISBN 0-02-303614-1
1. Pentium (Microprocessor) I. Title.
QA76.8.P46A64 1997
004.165—dc20
96-43995 CIP

Editor: Charles E. Stewart, Jr.
Production Editor: Mary Harlan
Editorial/Production Supervision: Custom Editorial Productions, Inc.
Design Coordinator: Julia Zonneveld Van Hook
Cover photo: © Uniphoto
Cover Designer: Rod Harris
Production Manager: Pamela D. Bennett
Marketing Manager: Debbie Yarnell
Illustrations: Custom Editorial Productions, Inc.

This book was set in Times Roman by Custom Editorial Productions, Inc. and was printed and bound by R.R. Donnelley & Sons Company. The cover was printed by Phoenix Color Corp.

© 1997 by Prentice-Hall, Inc.
Simon & Schuster/A Viacom Company
Upper Saddle River, New Jersey 07458

All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

Pentium is a trademark of Intel Corporation.

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1

ISBN 0-02-303614-1

Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada Inc., Toronto
Prentice-Hall Hispanoamericana, S. A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Simon & Schuster Asia Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro

To my son, Turner James Antonakos. Daddy loves you!

PREFACE

The rapid advancement of microprocessors into our everyday affairs has both simplified and complicated our lives.
Whether we use a computer in our job, or come in contact with one elsewhere, most of us have used a computer at one time or another. Most people know that a microprocessor is lurking somewhere inside the machinery, but what a microprocessor is, and what it does, remains a mystery.

PURPOSES OF THIS BOOK

This book is intended to help remove the mystery concerning the Pentium microprocessor through detailed coverage of its hardware and software and examples of many different applications. Some of the more elaborate applications are visible to us. A large collection of personal computers use Pentium-based architecture.

The book is intended for 2- or 4-year electrical engineering, engineering technology, and computer science students. Professional people, such as engineers and technicians, will also find it a handy reference. The material is intended for a one-semester course in microprocessors.

Prior knowledge of digital electronics, including combinational and sequential logic, decoders, memories, Boolean algebra, and operations on binary numbers, is helpful. This presumes knowledge of standard computer-related terms, such as RAM, EPROM, TTL, and so forth. Appendix C is included as a reference on binary numbers and arithmetic for those students who would like a quick review.

OUTLINE OF COVERAGE

For those individuals who have no prior knowledge of microprocessors, Chapter 1, Microprocessor-Based Systems, is a good introduction to the microprocessor, how it functions internally, and how it is used in a small system. Chapter 1 is a study of the overall operation of a microprocessor-based system.

Chapter 2, An Introduction to the Pentium Microprocessor, highlights the main features of the Pentium. Data types, addressing modes, and instructions are surveyed. Other processors in the 80x86 family are examined as well.
Chapter 3, Pentium Instructions, Part 1, and Chapter 4, Pentium Instructions, Part 2, introduce the entire real-mode instruction set of the Pentium. These instructions include data transfer, string, arithmetic, logical, bit manipulation, program transfer, and processor control instructions. Addressing modes, flags, and the structure of a source file are also covered.

Chapter 5, Interrupt Processing, covers the basic sequence of an interrupt, as well as multiple interrupts, special interrupts, and interrupt service routines. The instructor may choose to cover this chapter after Chapter 6 to get right to the programming examples.

Chapter 6, An Introduction to Programming the Pentium, contains the first real programming efforts. Numerous programming examples show how the Pentium performs routine functions involving binary and BCD mathematics, string operations, data-table manipulation, and number conversions. The concept of a software driver program is developed.

Chapter 7, Programming with DOS and BIOS Function Calls, opens the power of the personal computer to the student. DOS interrupt 21H and other interrupts are included to show how a Pentium program running on a personal computer accesses DOS and the hardware of the machine. The keyboard, video display, speaker, and printer are all used in applications.

Chapter 8, Advanced Programming Applications, introduces the student to many advanced concepts, such as linking multiple object files, instruction execution time, interrupt handling, memory management, coprocessor programming, and macro usage.

Chapter 9, Using Disks and Files, introduces the student to the operation of the disk system and the organization of a diskette. The different structures used with all disks, such as the boot sector and FAT, are covered, as well as the many different interrupt functions that support disk access.
A number of example programs are included to illustrate how data can be read from and written to a diskette.

The hardware operation of the Pentium is covered in Chapter 10, Hardware Details of the Pentium. Descriptions of each CPU pin are included, as are explanations of the various methods employed by the Pentium to access data over its buses. The operation of the Pentium’s superscalar architecture, internal pipelining, branch prediction, instruction and data caches, and floating-point unit are discussed.

Finally, Chapter 11, Protected Mode Operation, presents the details associated with protected mode operation. The virtual memory techniques made possible by the use of segments and paging are explained. Important issues such as protection, exceptions, multitasking, and input/output are also discussed. Virtual-8086 mode is also covered.

USES OF THIS BOOK

Due to the information presented, some chapters are much longer than others. Even so, it is possible to cover certain sections of selected chapters out of sequence, or to pick and choose sections from various chapters. Chapters 3 and 4 could be covered in this way, with emphasis placed on additional addressing modes or groups of instructions at a rate deemed appropriate by the instructor. Some instructors may wish to cover hardware (Chapter 10) before programming (Chapters 3 through 9). There is no reason this cannot be done.

To aid the instructor, answers to selected odd-numbered end-of-chapter study questions are included in the text and are also provided in a detailed solutions manual. The solutions manual is designed in such a way that solutions to all odd-numbered questions are grouped together, followed by solutions to all even-numbered questions. This allows the instructor to release selected odd-only or even-only answers to students, while retaining others for testing purposes.
The appendixes to this text present a full list of Pentium instructions, their allowed addressing modes, flag usage, and instruction times. In addition, references on DEBUG, Codeview, and the assembler are included, as is detailed information on DOS and BIOS interrupts.

In summary, over 180 illustrations and 50 different applications are used to give the student sufficient exposure to the Pentium. Furthermore, even though this book deals only with the Pentium, the serious microprocessor student should also be exposed to other CPUs. But to try to cover two or more different microprocessors in one text does not do either microprocessor justice. For this reason, all attention is paid to the Pentium, and not to other CPUs.

THE COMPANION DISKETTE

The diskette included with the book contains all of the source files presented in the book. The files are stored in separate directories related to their specific chapters. The majority of the programs are designed for real mode operation. The diskette contains an executable file called README.COM, which explains the diskette contents in detail.

ACKNOWLEDGMENTS

I would like to thank my editor, Charles Stewart, and his assistant, Kate Linsner, for their help while I was putting this book together. I would also like to thank my Ph.D. advisor, Kanad Ghose, for teaching me so much about computer architecture. In addition, I would like to thank Sharon Rudd, who managed the book through production.

James L. Antonakos
antonakos_j@sunybroome.edu

DEBUG < SPEED.SCR will cause DEBUG to read the commands from the SPEED.SCR file instead.
Either way, the final register display from DEBUG looks like this:

    AX=01F4  BX=D690  CX=0000  DX=1140  SP=FFEE  BP=0000  SI=0000  DI=0000
    DS=229B  ES=229B  SS=229B  CS=229B  IP=0112   NV UP EI NG NZ NA PO NC
    229B:0112 C6F70A        MOV     BH,0A

When the div instruction executes, the value 058B1140 is divided by 2D690, giving a final result in AX of 01F4 (which indicates a time of 500 seconds, or 8 minutes and 20 seconds). Try running DEBUG with the db 66 statements removed. The results are drastically different.

Flag Register

Figure 2.4 shows the eleven flag assignments within the lower 16 bits of the flag register. The flags are divided into two groups: control flags and status flags. The control flags are IF (interrupt enable flag), DF (direction flag), and TF (trap flag). The status flags are CF (carry flag), PF (parity flag), AF (auxiliary carry flag), ZF (zero flag), SF (sign flag), OF (overflow flag), NT (nested task), and IOPL (input/output privilege level). Most of the instructions that require the use of the ALU affect the flags. Remember that the flags allow ALU instructions to be followed by conditional instructions.

    Bit:   15  14  13-12  11  10   9   8   7   6   5   4   3   2   1   0
    Flag:   -  NT  IOPL   OF  DF  IF  TF  SF  ZF   -  AF   -  PF   -  CF

FIGURE 2.4 Lower word of flag register

The content/operation of each flag is as follows:

    CF:    Contains carry out of MSB of result
    PF:    Indicates if result has even parity
    AF:    Contains carry out of bit 3 in AL
    ZF:    Indicates if result equals zero
    SF:    Indicates if result is negative
    OF:    Indicates that an overflow occurred in result
    IF:    Enables/Disables interrupts
    DF:    Controls pointer updates during string operations
    TF:    Provides single-step capability for debugging
    IOPL:  I/O privilege level of current task
    NT:    Indicates if current task is nested

The upper 16 bits of the flag register are used for protected mode operation. See Chapter 11 for details.
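As a rough illustration of the layout in Figure 2.4, the following sketch decodes a 16-bit flag word into its named fields. Python is used here only as a convenient notation; the helper name decode_flags and the sample value are assumptions made for illustration and do not come from the text. The bit positions are the standard 80x86 assignments, which agree with the left-to-right order of Figure 2.4.

```python
# Bit positions of the single-bit flags in the lower word of the flag register
# (standard 80x86 layout, matching the order shown in Figure 2.4).
FLAG_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,
    "TF": 8, "IF": 9, "DF": 10, "OF": 11, "NT": 14,
}

def decode_flags(word):
    """Return a dict of flag values for a 16-bit flag word (hypothetical helper)."""
    flags = {name: (word >> bit) & 1 for name, bit in FLAG_BITS.items()}
    flags["IOPL"] = (word >> 12) & 0b11  # two-bit field in bits 12 and 13
    return flags

# Illustrative sample: only IF (bit 9) and SF (bit 7) are set, which is the
# combination reported as "EI" and "NG" in the DEBUG display above.
sample = (1 << 9) | (1 << 7)
print(decode_flags(sample))
```

The sample deliberately sets nothing but IF and SF; a flag word read from a real processor may also have reserved bits set, which this sketch simply ignores.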
2.6 PENTIUM DATA ORGANIZATION

The Pentium microprocessor has the capability of performing operations on many different types of data. In this section, we will examine what some of the more common data types are and how they are represented and used by the processor.

Bits, Bytes, and Words

The Pentium contains instructions that directly manipulate single bits and other instructions that use 8-, 16-, and even 32-bit numbers. By common practice, 8-bit binary numbers are referred to as bytes. Processor register halves AL, BH, and CL are examples of where bytes might be stored and used. Sixteen-bit numbers are known as words and require an entire processor register for storage. Registers DX, BP, and SP are used to hold word data types. In register DX, DH will contain the upper 8 bits of the number, and DL will hold the lower 8 bits.

FIGURE 2.5 Storing different data types in registers

    Byte value 3F             AH = 3F or AL = 3F (either half of AX)
    Word value 143E           BX: BH = 14, BL = 3E
    Double-word 789A345B      DX:AX: DH = 78, DL = 9A, AH = 34, AL = 5B
                              or EAX = 78 9A 34 5B

Some instructions (particularly multiply and divide) allow the use of 32-bit numbers. These data types are called double-words (or long-words). In this case, the 32-bit number is stored in registers DX and AX, with DX holding the upper 16 bits of the number. Even though an extended register, such as EAX, may be used to store a 32-bit number, it is difficult to work with extended registers in real mode. These data types are illustrated in Figure 2.5.

It is important to keep track of the data type being used in an instruction, because incorrect or undefined types may lead to incorrect program assembly or execution.

One of the differences between the Intel line of microprocessors and those made by other manufacturers is Intel’s way of storing 16-bit numbers in memory.
A method that began with the 8-bit 8080 and has been used on all upgrades from the 8085 to the Pentium is a technique called byte-swapping. This technique is sometimes confusing for those unfamiliar with it, but becomes clear after a little exposure. When a 16-bit number must be written into the system’s byte-wide memory, the low-order 8 bits are written into the first memory location and the high-order 8 bits are written into the second location. Figure 2.6 shows how the 2 bytes that make up the 16-bit hexadecimal number 2055 are written into locations 18000 and 18001, with the low-order 8 bits (55) going into the first location (18000). This is what is known as byte-swapping. The lower byte is always written first, followed by the high byte. Byte-swapping is one of the most significant differences between Intel processors and other machines, such as Motorola’s 68000™ (which does not swap bytes).

Reading the 16-bit number out of memory is performed automatically by the processor with the aid of certain instructions. The Pentium knows that it is reading the lower byte first and puts it in the correct place. Programmers who manipulate data in memory must remember to use the proper practice of byte-swapping or discover that their programs do not give the correct, or expected, results. It should be easy to remember the mechanics of byte-swapping because the lower byte is always read/written to the lower memory address.

FIGURE 2.6 Word storage using Intel byte-swapping (low byte of word in the first location, high byte of word in the second)

Assembler Directives DB, DW, DUP, and EQU

Representing data types in a source file requires the use of special assembler directives designed to create the required data and perform all appropriate byte-swapping.
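The byte ordering that these directives take care of can be mimicked in a few lines of Python, using the value 2055 and location 18000 from the discussion of Figure 2.6. This is only an illustrative sketch: the dictionary named memory stands in for byte-wide memory, and the helpers write_word and read_word are hypothetical names, not part of any assembler.

```python
memory = {}  # hypothetical byte-wide memory: address -> byte value

def write_word(addr, value):
    """Store a 16-bit value low byte first, as the text describes."""
    memory[addr] = value & 0xFF              # low-order 8 bits -> first location
    memory[addr + 1] = (value >> 8) & 0xFF   # high-order 8 bits -> second location

def read_word(addr):
    """Rebuild the 16-bit value, taking the low byte from the lower address."""
    return memory[addr] | (memory[addr + 1] << 8)

write_word(0x18000, 0x2055)
print(f"{memory[0x18000]:02X} {memory[0x18001]:02X}")  # prints "55 20"
print(f"{read_word(0x18000):04X}")                     # prints "2055"

# Python's own 'little' byte order follows the same convention.
assert (0x2055).to_bytes(2, "little") == bytes([0x55, 0x20])
```

Reading the two bytes back in the opposite order would yield 5520 instead of 2055, which is the kind of unexpected result the text warns about.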
Consider the following sample portion of a list file:

    0000                         .DATA
    0000  03                     NUM1    DB   3
    0001  64                     NUM2    DB   100
    0002  00                     NUM3    DB   ?
    0003  0F 30 C8 3A CE         NUMS    DB   15,48,200,3AH,0CEH
    0008  48 65 6C 6C 6F 24      MSG     DB   'Hello$'
    000E  0006                   WX      DW   6
    0010  03E8                   WY      DW   1000
    0012  1234 ABCD              WZ      DW   1234H,0ABCDH
    0016  0000                   TEMP    DW   ?
    0018  000A                   SCORES  DB   10 DUP(0)
    0022  0007                   TIMES   DW   7 DUP(?)
     = 000D                      TOP     EQU  13
     = 157C                      MORE    EQU  5500

In this example, byte and word data types are defined through the use of the DB (define byte) and DW (define word) assembler directives. As you can see, DB and DW both allow one or more numbers in their data field or no numbers at all, as is the case with the ? in the data field. DB and DW will convert any numbers in their data fields into hexadecimal and store the numbers in the appropriate place within the object file. If an ASCII character string is found in the data field (as in 'Hello$') the ASCII byte associated with each character is generated.

Because each DB or DW statement in the source file may generate different lengths of data, the assembler keeps track of a program counter to indicate the starting address of each data group. For example, the bytes for the fourth DB statement begin at address 0003 within the data segment, and the word for the second DW statement is located at address 0010 within the data segment.

In many cases, we supply a label with a particular DB or DW statement to use with instructions found elsewhere in the source file. The labels are assigned the value of the starting address in each DB or DW statement. Examine the sample data again. Do you see why the address associated with MSG is 0008?

DB and DW statements can be used to simply reserve byte or word space by using ? in the data field. When many reserved bytes or words are needed, the DUP (duplicate) assembler directive is used. Although it is legal to use DB ?,?,?,?,? in a source file, a simple DB 5 DUP(?) should be used instead.
This saves the programmer the effort of having to count the number of question marks entered. DUP has an additional advantage. Suppose that 2000 reserved bytes are needed. The statement DB 2000 DUP(?) is clearly more desirable than two hundred DB ?,?,?,?,?,?,?,?,?,? statements. In our example, DUP is used to create both byte (SCORES) and word (TIMES) data tables.

One last assembler directive is very useful because it defines a value that can be used in other source statements but does not generate any code. This is the EQU (equate) directive. Notice how EQU is used to assign the value 13 to TOP and the value 5500 to MORE. Any instruction in the source program that uses TOP and MORE will automatically use the values 13 and 5500, respectively. From a practical viewpoint, suppose you have written a 5000-line source program that contains the instruction MOV AL,77 in many different places. If, for some reason, you had to change all the 77s to 88s, you might be in for a long editing session. A simple solution would be to define the value 77 with an EQU statement, as in

    VAL  EQU  77

then use the EQU label in each instruction, like so:

    MOV  AL,VAL

If it is necessary to change 77 to 88, only the EQU statement has to be edited.

This brief introduction to data types should help us when we examine the Pentium’s instruction set.

2.7 PENTIUM INSTRUCTION TYPES

The Pentium’s instruction set is composed of six main groups of instructions. A discussion of instruction specifics will be postponed until Chapter 3. Examining the instructions briefly here, however, will give a good overall picture of the capabilities of the processor. All instructions are part of the original 8086 instruction set, unless otherwise indicated.

Data Transfer Instructions

Data transfer instructions are used to move data among registers, memory, and the outside world.
Also, some instructions directly manipulate the stack, while others may be used to alter the flags. The data transfer instructions are:

    IN        Input byte or word from port
    LAHF      Load AH from flags
    LDS       Load pointer using data segment
    LEA       Load effective address
    LES       Load pointer using extra segment
    MOV       Move to/from register/memory
    OUT       Output byte or word to port
    POP       Pop word off stack
    POPF      Pop flags off stack
    PUSH      Push word onto stack
    PUSHF     Push flags onto stack
    SAHF      Store AH into flags
    XCHG      Exchange byte or word
    XLAT      Translate byte

Additional 80286 instructions:

    INS       Input string from port
    OUTS      Output string to port
    POPA      Pop all registers
    PUSHA     Push all registers

Additional 80386 instructions:

    LFS       Load pointer using FS
    LGS       Load pointer using GS
    LSS       Load pointer using SS
    MOVSX     Move with sign extended
    MOVZX     Move with zero extended
    POPAD     Pop all double (32-bit) registers
    POPD      Pop double register
    POPFD     Pop double flag register
    PUSHAD    Push all double registers
    PUSHD     Push double register
    PUSHFD    Push double flag register

Additional 80486 instruction:

    BSWAP     Byte swap

New Pentium instruction:

    MOV       Move to/from control register

Arithmetic Instructions

These instructions make up the arithmetic group. Byte and word operations are available on almost all instructions. A welcome addition is the set of instructions that multiply and divide. Previous 8-bit microprocessors did not include these instructions, forcing the programmer to write subroutines to perform multiplication and division when needed. Addition and subtraction of both binary and BCD operands are also allowed.
The arithmetic instructions are:

    AAA       ASCII adjust for addition
    AAD       ASCII adjust for division
    AAM       ASCII adjust for multiply
    AAS       ASCII adjust for subtraction
    ADC       Add byte or word plus carry
    ADD       Add byte or word
    CBW       Convert byte to word
    CMP       Compare byte or word
    CWD       Convert word to double-word
    DAA       Decimal adjust for addition
    DAS       Decimal adjust for subtraction
    DEC       Decrement byte or word by one
    DIV       Divide byte or word (unsigned)
    IDIV      Integer divide byte or word
    IMUL      Integer multiply byte or word
    INC       Increment byte or word by one
    MUL       Multiply byte or word (unsigned)
    NEG       Negate byte or word
    SBB       Subtract byte or word and carry
    SUB       Subtract byte or word

Additional 80386 instructions:

    CDQ       Convert double-word to quad-word
    CWDE      Convert word to double-word

Additional 80486 instructions:

    CMPXCHG   Compare and exchange
    XADD      Exchange and add

New Pentium instruction:

    CMPXCHG8B Compare and exchange 8 bytes

Bit Manipulation Instructions

Instructions capable of performing logical, shift, and rotate operations are contained in this group. Many common Boolean operations (AND, OR, NOT) are available in the logical instructions. These, as well as the shift and rotate instructions, operate on bytes or words. Single-bit operations are available on processors from the 80386 and up.
The bit manipulation instructions are:

    AND       Logical AND of byte or word
    NOT       Logical NOT of byte or word
    OR        Logical OR of byte or word
    RCL       Rotate left through carry byte or word
    RCR       Rotate right through carry byte or word
    ROL       Rotate left byte or word
    ROR       Rotate right byte or word
    SAL       Arithmetic shift left byte or word
    SAR       Arithmetic shift right byte or word
    SHR       Logical shift right byte or word
    SHL       Logical shift left byte or word
    TEST      Test byte or word
    XOR       Logical exclusive-OR of byte or word

Additional 80386 instructions:

    BSF       Bit scan forward
    BSR       Bit scan reverse
    BT        Bit test
    BTC       Bit test and complement
    BTR       Bit test and reset
    BTS       Bit test and set
    SHLD      Shift left double precision
    SHRD      Shift right double precision
    SETcc     Set byte on condition

String Instructions

String operations simplify programming whenever a program must interact with a user. User commands and responses are usually saved as ASCII strings of characters, which may be processed by the proper choice of string instruction. The string instructions are:

    CMPS            Compare byte or word string
    LODS            Load byte or word string
    MOVS            Move byte or word string
    MOVSB (MOVSW)   Move byte string (word string)
    REP             Repeat
    REPE (REPZ)     Repeat while equal (zero)
    REPNE (REPNZ)   Repeat while not equal (not zero)
    SCAS            Scan byte or word string
    STOS            Store byte or word string

Program Transfer Instructions

This group of instructions contains all jumps, loops, and subroutine (called procedure) and interrupt operations. The great majority of jumps are conditional, testing the processor flags before execution.
The program transfer instructions are:

    CALL              Call procedure (subroutine)
    INT               Interrupt
    INTO              Interrupt if overflow
    IRET              Return from interrupt
    JA (JNBE)         Jump if above (not below nor equal)
    JAE (JNB)         Jump if above or equal (not below)
    JB (JNAE)         Jump if below (not above nor equal)
    JBE (JNA)         Jump if below or equal (not above)
    JC                Jump if carry set
    JCXZ              Jump if CX equals zero
    JE (JZ)           Jump if equal (zero)
    JG (JNLE)         Jump if greater (not less nor equal)
    JGE (JNL)         Jump if greater or equal (not less)
    JL (JNGE)         Jump if less (not greater nor equal)
    JLE (JNG)         Jump if less or equal (not greater)
    JMP               Unconditional jump
    JNC               Jump if no carry
    JNE (JNZ)         Jump if not equal (not zero)
    JNO               Jump if no overflow
    JNP (JPO)         Jump if no parity (parity odd)
    JNS               Jump if no sign
    JO                Jump if overflow
    JP (JPE)          Jump if parity (parity even)
    JS                Jump if sign
    LOOP              Loop unconditional
    LOOPE (LOOPZ)     Loop if equal (zero)
    LOOPNE (LOOPNZ)   Loop if not equal (not zero)
    RET               Return from procedure (subroutine)

Additional 80286 instructions:

    BOUND             Check index against array bounds
    ENTER             Enter a procedure
    LEAVE             Leave a procedure

Additional 80386 instructions:

    IRETD             Interrupt return
    JECXZ             Jump if ECX is zero

Processor Control Instructions

This last group of instructions performs small tasks that sometimes have profound effects on the operation of the processor. Many of the instructions manipulate the flags.
The processor control instructions are:

    CLC       Clear carry flag
    CLD       Clear direction flag
    CLI       Clear interrupt enable flag
    CMC       Complement carry flag
    ESC       Escape to external processor
    HLT       Halt processor
    LOCK      Lock bus during next instruction
    NOP       No operation
    STC       Set carry flag
    STD       Set direction flag
    STI       Set interrupt enable flag
    WAIT      Wait for TEST pin activity

Additional 80286 instructions (protected mode only):

    ARPL      Adjust requested privilege level
    CLTS      Clear task switched flag
    LAR       Load access rights
    LGDT      Load global descriptor table
    LIDT      Load interrupt descriptor table
    LLDT      Load local descriptor table
    LMSW      Load machine status word
    LSL       Load segment limit
    LTR       Load task register
    SGDT      Store global descriptor table
    SIDT      Store interrupt descriptor table
    SLDT      Store local descriptor table
    SMSW      Store machine status word
    STR       Store task register
    VERR      Verify segment for reading
    VERW      Verify segment for writing

Additional 80486 instructions:

    INVD      Invalidate cache
    INVLPG    Invalidate TLB entry
    WBINVD    Write back and invalidate cache

New Pentium instructions:

    CPUID     CPU identification
    RDMSR     Read from model-specific register
    RDTSC     Read from time stamp counter
    RSM       Resume from system management mode
    WRMSR     Write to model-specific register

In Chapter 3 we will begin examining each of the Pentium’s instructions in detail, and see many examples of how they are used.

2.8 PENTIUM ADDRESSING MODES

The Pentium offers the programmer a wide range of choices when referring to a memory location. Many people believe that the number of addressing modes contained in a microprocessor is a measure of its power. If that is so, the Pentium should be counted among the most powerful processors.

Many of the addressing modes are used to generate a physical address in memory. Recall from Figure 2.3 that a 20-bit address is formed by the sum of two 16-bit address values.
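As a reminder of how that sum works, here is a minimal sketch, with Python serving only as notation. It assumes the usual real mode rule that the segment value is shifted left four bits (multiplied by 16) before the offset is added; the helper name physical_address is hypothetical. The sample values are the CS and IP contents, 229B and 0112, from the DEBUG register display shown earlier in this chapter.

```python
def physical_address(segment, offset):
    """Real mode address: 16-bit segment shifted left 4 bits plus 16-bit offset."""
    assert 0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF
    return (segment << 4) + offset

# CS:IP = 229B:0112 from the DEBUG display.
addr = physical_address(0x229B, 0x0112)
print(f"{addr:05X}")  # prints "22AC2"

# Different segment:offset pairs can name the same physical location.
assert physical_address(0x229B, 0x0112) == physical_address(0x22AC, 0x0002)
```

The four-bit shift is what turns two 16-bit quantities into a 20-bit result.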
One of the four segment registers will always supply the first 16-bit address. The second 16-bit address is formed by a specific addressing mode operation. The resulting 20-bit address points to one specific location in the Pentium’s 1MB real mode addressing space. We will see that there are a number of different ways the second part of the address may be generated. Protected mode addressing will be covered in the last chapter.

Real Mode Addressing Space

All addressing modes eventually create a physical address that resides somewhere in the 00000 to FFFFF addressing space of the processor. Figure 2.7 shows a brief memory map of the Pentium’s real mode addressing space, which is broken up into 16 blocks of 64KB each. Each 64KB block is called a segment. A segment contains all the memory locations that can be reached when a particular segment register is used. For example, if the data segment register (DS) contains 0000, then addresses 00000 through 0FFFF can be generated when using the data segment. If, instead, register DS contains 1800, then the range of addresses becomes 18000 through 27FFF. It is important to see that a segment can begin on any 16-byte boundary. So, 00000, 00010, 00020, 035A0, 10800, and CCE90 are all acceptable starting addresses for a segment. Altogether, 1,048,576 bytes can be accessed by the processor. This is commonly referred to as 1 megabyte.

Small areas of the addressing space are reserved for special operations. At the very high end of memory, locations FFFF0 through FFFFF are assigned the role of storing the initial instruction used after a RESET operation. At the low end of memory, locations 00000 through 003FF are used to store the addresses for all 256 interrupts (although not all are commonly used in actual practice). This dedication of addressing space is common among processor manufacturers, and may force designers to conform to specific methods or techniques when building systems around the Pentium.
For instance, EPROM is usually mapped into high memory, so that the starting execution instructions will always be there at power-on.

FIGURE 2.7 Addressing space of the Pentium in real mode (sixteen 64KB blocks beginning at physical addresses 00000, 10000, 20000, and so on up to F0000, for a total of 1,048,576 bytes; locations FFFF0 through FFFFF and 00000 through 003FF are marked)

Addressing Modes

The simplest addressing mode is known as immediate. Data needed by the processor are actually included in the instruction. For example:

    MOV  CX,1024

contains the immediate data value 1024. This value is converted into binary and included in the code of the instruction.

When data must be moved between registers, register addressing is used. This form of addressing is very fast, because the processor does not have to access external memory (except for the instruction fetch). An example of register addressing is:

    ADD  AL,BL

where the contents of registers AL and BL are added together, with the result stored in register AL. Notice that both operands are names of internal Pentium registers.

The programmer may refer to a memory location by its specific address by using direct addressing. Two examples of direct addressing are:

    MOV  AX,[3000]

and

    MOV  BL,COUNTER

In each case, the contents of memory are loaded into the specified registers. The first instruction uses square brackets to indicate that a memory address is being supplied. Thus, 3000 and [3000] are allowed to have two different meanings. 3000 means the number 3000, whereas [3000] means the number stored at memory location 3000. The second instruction uses the symbol name COUNTER to refer to memory. COUNTER must be defined somewhere else in the program for it to be used this way.

When a register is used within the square brackets, the processor uses register indirect addressing.
For example:

    MOV  BX,[SI]

instructs the processor to use the 16-bit quantity stored in the SI (source index) register as a memory address.

A slight variation produces indexed addressing, which allows a small offset value to be included in the memory operand. Consider this example:

    MOV  BX,[SI + 10]

The location accessed by the instruction is the sum of the SI register and the offset value 10.

When the register used is the base pointer (BP), based addressing is employed. This addressing mode is especially useful when manipulating data in large tables or arrays. An example of based addressing is:

    MOV  CL,[BP + 4]

Including an index register (SI or DI) in the operand produces based-indexed addressing. The address is now the sum of the base pointer and the index register. An example might be:

    MOV  [BP + DI],AX

When an offset value is also included in the operand, the processor uses based-indexed with displacement addressing. An example is:

    MOV  DL,[BP + SI + 2]

Obviously, the Pentium intends the base pointer to be used in many different ways.

Other addressing modes are used when string operations must be performed.

The processor is designed to access I/O ports, as well as memory locations. When port addressing is used, the address bus contains the address of an I/O port instead of a memory location. I/O ports may be accessed two different ways. The port may be specified in the operand field, as in:

    IN   AL,80H

or indirectly, via the address contained in register DX:

    OUT  DX,AL

Using DX allows a port range from 0000 to FFFF, or 65,536 individual I/O port locations. Only 256 (00 to FF) are allowed when the port address is included as an immediate operand.

All of these addressing modes will be covered again in detail in Chapter 3.

32-Bit Addressing Modes

The Pentium architecture supports an additional method of generating addresses, especially designed to take advantage of protected mode operation.
In protected mode, addresses are 32 bits wide, spanning a 4-gigabyte range. The 32-bit addresses are generated as indicated in Figure 2.8. As usual, a segment register is used as part of the address calculation. Unlike real mode addressing, any general purpose register may be used as a base register or index register. The only exception is ESP, which can be used as a base register but not an index register.

FIGURE 2.8 Generating a 32-bit address (example: MOV EAX,[EBX][ECX*4 + 6], where EBX is the base, ECX is the index, and 4 is the scale)

A scale factor is also included to multiply the contents of the index register by 1, 2, 4, or 8. This is very useful when dealing with arrays of data composed of bytes, words, double-words, or quad-words. The MOV instruction shown in Figure 2.8 multiplies index register ECX by a scale factor of four.

Though designed for protected mode applications, the 32-bit addressing modes may be used while running in real mode by placing an address size prefix byte before the instruction that uses the 32-bit addressing mode. This is illustrated in Example 2.2.

Example 2.2: Examine the following portion of a list file for a real mode program with instructions using 32-bit registers or addressing modes:

    0010  B4 09              MOV  AH,9
    0012  8D 16 0000 R       LEA  DX,TABLE
    0016  CD 21              INT  21H
    0018  B4 09              MOV  AH,9
    001A  66 8D 1E 0000 R    LEA  EBX,TABLE
    001F  66 BA 00000002     MOV  EDX,2
    0025  67 8D 14 93        LEA  DX,[EBX][EDX*4]
    0029  CD 21              INT  21H

The second LEA instruction and the MOV instruction that follows it both contain extended registers in their operand fields. Since real mode code is being generated, the default register size is 16 bits. Thus, the operand size prefix byte 66H is placed ahead of the code for the LEA and MOV instructions to switch to 32-bit mode.
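The address calculation of Figure 2.8 can be sketched in a few lines of Python. This is only an illustration, assuming the effective address is base + index × scale + displacement, wrapped to 32 bits; the segment register's contribution is left out. The function name effective_address and the sample register contents are assumptions made for the sketch, not taken from the text.

```python
# Sketch of 32-bit effective-address generation: base + index*scale + displacement.
# The function name and the sample register values are hypothetical.

VALID_SCALES = (1, 2, 4, 8)  # the scale factors named in the text


def effective_address(base, index, scale=1, displacement=0):
    """Return the 32-bit effective address for [base][index*scale + displacement]."""
    if scale not in VALID_SCALES:
        raise ValueError("scale must be 1, 2, 4, or 8")
    # Keep only the low 32 bits of the sum.
    return (base + index * scale + displacement) & 0xFFFFFFFF


# Operand form of Figure 2.8, [EBX][ECX*4 + 6], with assumed register contents.
ebx, ecx = 0x00001000, 3
print(hex(effective_address(ebx, ecx, scale=4, displacement=6)))  # 0x1012
```

With a scale of 4, consecutive values of the index register step through an array of double-words, which is the use the scale factor is intended for.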
In a similar fashion, the address size prefix byte 67H is used before the third LEA instruction to allow processing of the 32-bit addressing mode specified by EBX, EDX, and the scale factor. As with the operand size prefix, the address size prefix works for a single instruction only, and must precede every instruction that uses 32-bit addressing.

2.9 INTERRUPTS

An interrupt is an event that occurs while the processor is executing an instruction. The instruction might be part of a group of instructions in a main program, such as a word processing application. The interrupt temporarily suspends execution of the main program in favor of a special routine that services the interrupt. When interrupt processing is complete, the processor returns to the exact place in the main program where it left off. For example, a timer interrupt might occur while the word processing application is in the middle of a spell checking procedure. Spell checking is suspended and the timer interrupt service routine takes over. The routine might simply increment the seconds counter on the time-of-day clock. When the interrupt is finished, spell checking resumes.

Let us take a look at how interrupts are implemented by the Pentium and how one particular interrupt is used to control the computer when DOS is running.

Hardware and Software Interrupts

The Pentium microprocessor is capable of responding to 256 different types of interrupts. These interrupts are generated in a number of different ways. External hardware interrupts are caused by activating the processor's NMI and INTR signals. NMI is a nonmaskable interrupt and cannot be ignored by the CPU. INTR is a maskable interrupt that the processor may choose to ignore depending on the state of an internal interrupt enable flag. Internal interrupts are caused by execution of an INT instruction.
INT is followed by an interrupt number from 0 to 255, giving the programmer the option of generating any of these interrupts during program execution. We will see later that machines based on the Pentium that run a disk operating system (DOS) have very specific functions assigned to certain interrupts (INT 21H, for example) that allow the user to read the keyboard, write text to the screen, control disk drives, and so forth.

Some interrupts are generated internally by the processor itself. Divide error is one example. This interrupt is caused when division by zero is detected in the execution unit (during execution of the IDIV or DIV instructions). The processor can also generate single-step interrupts at the end of every instruction if a certain flag called the trap flag is set. Another internal interrupt is INTO (interrupt on overflow).

The Interrupt Vector Table

All interrupts use a dedicated table in memory for storage of their interrupt service routine (ISR) addresses. The table is called an interrupt pointer table (or interrupt vector table) and is 1024 bytes long, enough storage space for 256 4-byte entries. Because an ISR address occupies 4 bytes of storage, the table holds addresses for all 256 interrupts. Each ISR address is composed of a 2-byte CS value and a 2-byte instruction pointer address. Thus, if the table entry for a type-0 interrupt (divide error) were CS:0100 and IP:0400, the divide error ISR code would have to be located at physical address 01400 (01000 + 0400).

Interrupt processing will be covered in detail in Chapter 5. For now, let us examine one particularly useful interrupt.

A Brief Look at DOS Interrupt 21H

One of the most useful DOS interrupts is number 21H. This interrupt was chosen as the entry point into DOS for programmers writing their own DOS applications. Although there are many other interrupts assigned to specific functions by DOS, INT 21H is loaded with so many different functions that we rarely need to use the others.
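The vector table arithmetic described above can be checked with a short Python sketch. It assumes that the entry for interrupt n starts at address 4 × n, that each entry holds a 2-byte IP followed by a 2-byte CS with each word stored low byte first, and that the real mode physical address is CS × 16 + IP. The helper names and the simulated table contents are assumptions made for illustration.

```python
# Sketch of a real mode interrupt vector lookup; the helper names are hypothetical.

def vector_offset(int_number):
    """Offset of an interrupt's 4-byte entry within the vector table."""
    if not 0 <= int_number <= 255:
        raise ValueError("interrupt number must be 0..255")
    return 4 * int_number


def read_vector(memory, int_number):
    """Return (CS, IP) for an interrupt; each word is stored low byte first."""
    off = vector_offset(int_number)
    ip = memory[off] | (memory[off + 1] << 8)
    cs = memory[off + 2] | (memory[off + 3] << 8)
    return cs, ip


def physical_address(cs, ip):
    """Real mode address: segment shifted left 4 bits, plus the offset."""
    return (cs << 4) + ip


# Simulated 1024-byte table holding the type-0 example: CS = 0100H, IP = 0400H.
table = bytearray(1024)
table[0:4] = bytes([0x00, 0x04, 0x00, 0x01])

cs, ip = read_vector(table, 0)
print(f"{physical_address(cs, ip):05X}")  # 01400
print(f"{vector_offset(0x21):05X}")       # 00084
```

The last line shows that the entry for INT 21H sits at offset 00084H, four times the interrupt number.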
It is possible to use DEBUG to determine where the code for DOS's INT 21H routine (or any other interrupt) is located. Because each interrupt vector (CS:IP) requires 4 bytes, the offset into memory for INT 21H's vector is four times 21H, or address 00084H. The following DEBUG session uses this address to find out where the ISR for INT 21H is located:

    C> debug
    -d 0:80 L 10
    0000:0080  94 10 16 01 B4 16 26 07-4F 03 FB 0A 8A 03 FB 0A
    -q

The CS:IP bytes for INT 21H are the four bytes beginning at offset 0084: B4 16 26 07. In this example, CS = 0726H and IP = 16B4H (remember that words are byte-swapped). This gives a 20-bit memory address of 08914H (07260H + 16B4H). To examine the actual instructions within the INT 21H service routine, use DEBUG's unassemble command with the CS:IP addresses shown (e.g., -u 726:16b4).

Using INT 21H is simply a matter of loading specific registers with data and issuing an INT 21H. For example, to read the computer's time we use the following two instructions:

    MOV AH,2CH    ;get system time function number
    INT 21H       ;DOS call

DOS will return the time as follows:

    CH = hours
    CL = minutes
    DH = seconds
    DL = 100ths of seconds

Many of the programs we will examine in later chapters use INT 21H and other interrupts.

2.10 THE 8086: THE FIRST 80x86 MACHINE

This section and those that follow describe the historical evolution of the 80x86 family, to give an appreciation of the improvements made to the 80x86 architecture prior to the design of the Pentium. We begin with the 8086.

The 8086 microprocessor is a 16-bit machine with a 16-bit data bus and a 20-bit address bus. This allows for 2^20 bytes, or 1MB, of addressing space. The instruction set and addressing modes presented in this chapter first became available with the 8086, which operates in one mode only: real mode. Every processor in the 80x86 family to come after the 8086 supports the initial instruction set.
When the 8086 is reset (or first turned on), the processor fetches its first instruction from address FFFF0H. On the PC, this address enables the motherboard's system ROM, which begins the process of booting DOS. All of the 80x86 machines, even the Pentium, follow this mechanism when reset.

A slightly re-engineered version of the 8086 was the 8088 microprocessor, which is identical to the 8086 except for the use of an external data bus that is only 8 bits wide. This forces the 8088 to access memory twice as often as the 8086, resulting in a slight performance penalty in terms of execution speed. Both the 8086 and the 8088 were used in the first PCs to hit the market.

A souped-up version of the 8086, called the 80186, contained special hardware such as programmable timers, counters, interrupt controllers, and address decoders. The 80186 was never used in the PC, but was ideal for systems that required a minimum of hardware.

2.11 A SUMMARY OF THE 80286

The next major improvement in Intel's line of microprocessors was the 80286 High-Performance Microprocessor with Memory Management and Protection™. The 80286 does not contain the internal DMA controllers, timers, and other enhancements of the 80186. Instead, the 80286 concentrates on the features needed to implement multitasking, an operating system environment that allows many programs or tasks to run seemingly simultaneously. In fact, the 80286 was designed with this goal in mind. A 24-bit address bus gives the processor the capability of accessing 16MB of storage. The internal memory management feature increases the storage space to 1 gigabyte of virtual address space. That's over 1 billion locations of virtual memory! Virtual addressing is a concept that has gained much popularity in the computing industry. Virtual memory allows a large program to execute in a smaller physical memory.
For example, if a system using the 80286 contained 5MB of RAM, memory management and virtual addressing permit the system to run a program containing 12MB of code and data, or even multiple programs in a multitasking environment, all of which may be larger than 5MB.

To implement the complicated addressing functions required by virtual addressing, the 80286 has an entire functional unit dedicated to address generation. This unit is called the address unit. It provides two modes of addressing: 8086 real address mode and protected virtual address mode. The 8086 real address mode is used whenever an 8086 program executes on the 80286. The 1MB addressing space of the 8086 is simulated on the 80286 by the use of the lower 20 address lines. Processor registers and instructions are totally upward compatible with the 8086.

Protected virtual address mode uses the full power of the 80286, providing memory management, additional instructions, and protection features, while at the same time retaining the ability to execute 8086 code. The processor switches from 8086 real address mode to protected mode when a special instruction sets the protection enable bit in the machine's status word. Addressing is more complicated in protected mode, and is accomplished through the use of segment descriptors stored in memory. The segment descriptor is the device that really makes it possible for an operating system to control and protect memory. Certain bits within the segment descriptor are used to grant or deny access to memory in certain ways. A section of memory may be write protected or made execute-only by setting the proper bits in the access rights byte of the descriptor. Other bits are used to control how the segment is mapped into virtual memory space and whether the descriptor is for a code segment or a data segment. Special descriptors, called gate descriptors, are used for other functions.
Four types of gate descriptors are call gates, task gates, interrupt gates, and trap gates. They are used to change privilege levels (there are four), switch tasks, and specify interrupt service routines.

The instruction set of the 80286 is identical to that of the 8086, with additional instructions included to handle the new features. Many of the instructions are used to load and store the different types of descriptors found in the 80286. Other instructions are used to manipulate task registers, change privilege levels, adjust the machine status word, and verify read/write accesses.

Clearly, the 80286 differs greatly from the 8086 in the services it offers, while at the same time filling a great need for designers of operating systems.

2.12 A SUMMARY OF THE 80386

Intel continued its upward-compatible trend with the introduction of the 386 High Performance 32-bit CHMOS Microprocessor with Integrated Memory Management™. Software written for the 8088, 8086, 80186, and 80286 will also run on the 386. A 132-pin Grid Array™ package houses the 386, which offers a full 32-bit data bus and 32-bit address bus. The address bus is capable of accessing over 4 gigabytes of physical memory. Virtual addressing pushes this to over 64 trillion bytes of storage.

The register set of the 386 is compatible with earlier models, including all eight general purpose registers plus the four segment registers. Although the general purpose registers are 16 bits wide on all earlier machines, they can be extended to 32 bits on the 386. Their new names are EAX, EBX, ECX, and so on. Two additional 16-bit data segment registers are included, FS and GS. Like the 80286, the 386 has two modes of operation: real mode and protected mode. When in real mode, segments have a maximum size of 64KB. When in protected mode, a segment may be as large as the entire physical addressing space of 4 gigabytes.
The new extended flags register contains status information concerning privilege levels, virtual mode operation, and other flags concerned with protected mode. The 386 also contains three 32-bit control registers. The first, the machine control register, contains the machine status word and additional bits dealing with the coprocessor, paging, and protected mode. The second, the page fault linear address register, is used to store the 32-bit address that caused the last page fault. In a virtual memory environment, physical memory is divided up into a number of fixed-size pages. Each page will at some time be loaded with a portion of an executing program or other type of data. When the processor determines that a page it needs to use has not been loaded into memory, a page fault is generated. The page fault directs the system to load the missing page into memory. Ideally, a low page-fault rate is desired. The third control register, the page directory base address register, stores the physical memory address of the beginning of the page directory table. This table is up to 4KB in length and may contain up to 1024 page directory entries, each of which points to another page table area, whose information is used to generate a physical address.

The segment descriptors used in the 80286 are also used in the 386, as are the gate descriptors and the four levels of privilege. Thus, the 386 functions much the same as the 80286, except for the increase in physical memory space and the enhancements involving page handling in the virtual environment.

The computing power of each of the processors that have been presented can be augmented with the addition of a floating point coprocessor. All sorts of mathematical operations can be performed by the coprocessors with 80-bit binary precision. The 8087 coprocessor is designed for use with the 8088 and 8086, the 80287 with the 80286, and the 80387 with the 386.
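Returning to the page directory described earlier in this section: a table of 1024 entries occupying 4KB implies 4 bytes per entry and a 10-bit index. The Python sketch below assumes the usual 386 two-level arrangement with 4KB pages, in which a 32-bit linear address is split into a 10-bit directory index, a 10-bit page table index, and a 12-bit offset. The text does not spell out this split, so treat it as an assumption; the function name is likewise illustrative.

```python
# Sketch: splitting a 32-bit linear address for two-level paging.
# Assumes 4KB pages and 1024-entry tables (10 + 10 + 12 bits); names are hypothetical.

ENTRY_SIZE = 4   # bytes per directory entry: 4KB table / 1024 entries
ENTRIES = 1024   # entries in the page directory


def split_linear_address(addr):
    """Return (directory index, page table index, byte offset within the page)."""
    addr &= 0xFFFFFFFF
    directory = addr >> 22         # top 10 bits select a page directory entry
    table = (addr >> 12) & 0x3FF   # middle 10 bits select a page table entry
    offset = addr & 0xFFF          # low 12 bits locate the byte within the page
    return directory, table, offset


print(ENTRY_SIZE * ENTRIES)              # 4096
print(split_linear_address(0x00403025))  # (1, 3, 37)
```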
2.13 A SUMMARY OF THE 80486

This processor is the next in Intel's upward-compatible 80x86 architecture. Surprisingly, there are only a few differences between the 80486 and the 80386, but these differences create a significant performance improvement.

Like the 80386, the 80486 is a 32-bit machine containing the same register set as the 80386 and all of the 80386's instruction set with a few additional instructions. The 80486 has a similar 4-gigabyte addressing space using the same addressing features.

The first improvement over the 80386 is the addition of an 8KB cache memory. A cache is a very high-speed memory, with an access time usually ten times faster than that of conventional RAM used for external processor memory. The 80486's internal cache is used to store both instructions and data. Whenever the processor needs to access memory, it will first look in the cache for it. If the data are found in the cache, they are read out much faster than if they had to come from external RAM or EPROM. This is known as a cache hit. If the data are not found in the cache, the processor must then access the slower external memory. This is called a cache miss. The processor tries to keep the cache's hit ratio as high as possible. Consider the following example:

    RAM access time = 70 ns
    Cache access time = 10 ns
    Hit ratio = 0.85

    Average memory access time = 0.85 x (10 ns) [hit] + (1 - 0.85) x (10 ns + 70 ns) [miss]
                               = 20.5 ns

The average memory access time for a hit ratio of 0.85 is less than 21 ns! This is due to the following reasoning: If data are found in the cache (85% of the time), the access time is only 10 ns. If data are not found (15% of the time), the access time equals 80 ns (the cache access time plus the RAM access time), because the processor had to read the cache to find out the data were not there.
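The same calculation can be written as a small Python function. This is a sketch of the reasoning above, assuming that a miss always costs the cache lookup plus the RAM access; the function name is an illustrative choice.

```python
# Sketch of the average memory access time calculation; the function name is hypothetical.

def average_access_time(hit_ratio, cache_ns, ram_ns):
    """Average access time in ns, where a miss costs the cache lookup plus the RAM access."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit ratio must be between 0 and 1")
    hit_cost = cache_ns
    miss_cost = cache_ns + ram_ns
    return hit_ratio * hit_cost + (1.0 - hit_ratio) * miss_cost


# Values from the example: 70 ns RAM, 10 ns cache, hit ratio 0.85.
print(round(average_access_time(0.85, 10, 70), 2))  # 20.5
```

Trying other hit ratios shows how sensitive the average is to misses: with the same timings, a hit ratio of 0.95 gives 0.95 x 10 + 0.05 x 80 = 13.5 ns.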
If you consider that a large portion of a program (or even an entire program) might fit within the 8KB cache, you will agree that the program will execute very quickly, because most instruction fetches will be for code already in the cache. This architectural improvement significantly increases the processing speed of the 80486. Some of the new 80486 instructions are included to help maintain the cache.

The 80486 has two other improvements. Although it executes the same instruction set as the 80386, the 80486 does so with a redesigned internal architecture. This new design allows many 80486 instructions to execute in fewer clock cycles than the 80386 requires, which further speeds up the 80486's execution. Also, the 80486 comes with an on-chip coprocessor. You might recall that the 80386 can be connected to an external 80387 coprocessor to enhance performance. The 80486 has the equivalent of an 80387 built right into it! And because the coprocessor is on the same chip as the CPU, data are transferred more quickly, which leads to another performance boost.

Thus, although the 80386 and 80486 share many similarities, the 80486's differences create a much more powerful processor.

2.14 A SUMMARY OF THE PENTIUM

The newest, and fastest, chip in the Intel high-performance microprocessor line is the Pentium. As usual, upward compatibility has been maintained. The Pentium will run all programs written for any machine in the 80x86 line, though it does so at a speed double that of the fastest 80486. And the Pentium does so with a radically new architecture!

There are two major computer architectures in use: CISC and RISC. CISC stands for Complex Instruction Set Computer. RISC stands for Reduced Instruction Set Computer. All of the 80x86 machines prior to the Pentium can be considered CISC machines.
The Pentium itself is a mixture of both CISC and RISC technologies. The CISC aspect of the Pentium provides for upward compatibility with the other 80x86 architectures. The RISC aspects lead to additional performance improvements. Some of these improvements are separate 8KB data and instruction caches, dual integer pipelines, and branch prediction. Refer back to Figure 2.2 for another look at the Pentium architecture.

Splitting the 80486's integrated instruction/data cache into two separate caches prevents data and instruction accesses from interfering with each other. This helps keep a steady stream of data flowing into the instruction and integer pipelines.

Adding a second integer pipeline (the 80486 has a single integer pipeline) leads to times when two instructions may execute at once. The Pentium has special internal circuitry to recognize when both pipelines may be used. This architectural improvement is borrowed from the RISC world of microprocessing, where multiple pipelines are employed to gain a performance increase.

Another significant addition is that of a branch prediction unit. This circuit keeps track of branch instructions in an executing program and predicts when they will be taken and not taken. By loading special internal data buffers with prefetched instructions, the branch prediction unit can keep the instruction pipeline running smoothly, even when a branch instruction changes the flow of instructions. This is another technique borrowed from RISC technology, and it speeds up execution in programs that employ many branch instructions.

These differences between the Pentium and the earlier 80x86 machines give it a significant speed improvement. This was made possible by blending CISC and RISC technology together.
The benefit to us as programmers lies in the fact that all Intel processors from the 8086 up, including the Pentium, run the same basic instruction set that we will learn about beginning in the next chapter; they just do it faster and faster.

2.15 SUMMARY

This chapter has taken an introductory look at the Pentium microprocessor and the 80x86 family of upward-compatible microprocessors. The software model of the Pentium was examined first, showing all the 32-bit general purpose registers (EAX, EBX, ECX, EDX, EBP, ESI, EDI, and ESP) and the six 16-bit segment registers (CS, DS, SS, ES, FS, and GS). We then examined the Pentium architecture. To improve execution speed, the Pentium is capable of prefetching the next 32 bytes of code from memory. In real mode, the Pentium uses 16-bit registers by default, allowing the generation of 20-bit physical addresses, giving the processor a 1-megabyte addressing space. One of the six segment registers is always involved in a memory access. The general purpose registers were shown to have specific tasks assigned to them by default, such as the use of AX in multiply and divide operations. Techniques to use 32-bit registers and addressing modes were also demonstrated.

A technique called Intel byte-swapping was also introduced, which accounts for the way a 16-bit number is stored in memory (low byte first). This technique is rarely seen on other microprocessors.

The entire instruction set was presented to give you a feel for the type of operations the Pentium is capable of performing. This discussion was followed by a brief explanation of what a segment is, and what addressing modes are available. Some examples were shown to illustrate the use of different addressing modes. This was followed by an explanation of the Pentium's interrupt structure, and a summary of the entire 80x86 family.

In the next chapter we will take a detailed look at the software operation of the Pentium.
STUDY QUESTIONS

1. Name all of the Pentium's general purpose registers and some of their special functions.
2. How are the segment registers used to form a 20-bit address?
3. a) If CS contains 03E0H and IP contains 1F20H, from what address is the next instruction fetched?
   b) If SS contains 0400H and SP contains 3FFEH, where is the top of the stack located?
   c) If a data segment begins at address 24000H, what is the address of the last location in the segment?
4. Explain what the instruction and data caches are used for.
5. Are the U and V pipelines identical in operation?
6. a) Show the DB statement needed to define a list of numbers called FACTORS that contains all the integer factors of the decimal value 50.
   b) Show how a DW statement can be written to reserve 250 words of the value 7.
7. What is a segment?
8. Two memory locations, beginning at address 3000H, contain the bytes 34H and 12H. What is the word stored at location 3000H? See Figure 2.9 for details.
   FIGURE 2.9 For question 8 (memory locations 3000 and 3001)
9. What is Intel byte-swapping?
10. Count the number of different instructions available on the Pentium. How many are there?
11. How many addressing modes does the Pentium provide?
12. What is a physical address?
13. Why is register addressing so fast?
14. What do square brackets mean when they appear in an operand (e.g., MOV AX,[3000])?
15. What is the difference between MOV AX,1000H and MOV AX,[1000H]?
16. How does port addressing differ from memory addressing?
17. What is an interrupt?
18. Name one instruction that can cause an interrupt.
19. How many interrupts does the Pentium support?
20. What are some of the differences between real mode and protected mode?
21. List the important features of the 80286.
22. What is one advantage of virtual memory?
23. What is a page fault?
24. Compare two 386 systems, one containing 512KB of RAM, the second containing 4 megabytes.
How would the number of page faults compare when:
   a) a 220KB application is executed on both machines?
   b) a 6-megabyte application is executed on both machines?
25. Which has the greater effect on the number of page faults, physical memory size or the size of the program being executed?
26. Why would we resist building a complete physical memory for the 386? Does the reason apply to the 8086?
27. Why would anyone possibly need 4.3 billion bytes for a program? Can you think of any applications that may require this much memory?
28. List three differences between the 80286 and the 80386.
29. List three differences between the 80386 and the 80486.
30. Compute the average memory access time from the following information:
       RAM access time = 80 ns
       Cache access time = 10 ns
       Hit ratio = 0.92
31. What makes the Pentium so different from other 80x86 CPUs?
32. Use DEBUG to find the address of INT 21H on your DOS machine.

CHAPTER 3
Pentium Instructions, Part 1: Addressing Modes, Flags, and Data Transfer and String Instructions

OBJECTIVES

In this chapter you will learn about:

    The style of source files written in Pentium assembly language
    The different addressing modes of the Pentium
    The operation and use of the processor flags
    Data transfer and string instructions

3.1 INTRODUCTION

This chapter introduces the first part of the Pentium's instruction set, and the ways that different addressing modes and data types can be used to make the instructions do the most work for you. The combination of instructions and addressing modes found in the Pentium makes the job of writing code much easier and more efficient than before.

In this chapter we will examine the various addressing modes, flags, and conventions used when representing data. We will take a detailed look at the data transfer and string instructions, leaving the remainder of the instruction set for Chapter 4.
Section 3.2 introduces the conventions followed when writing Pentium assembly language source code. Section 3.3 explains the different instruction types available; this is followed by coverage of the Pentium's addressing modes in Section 3.4. Processor flags are detailed in Section 3.5 to set the stage for the first two instruction groups, data transfer and string instructions, which are presented in Sections 3.6 and 3.7, respectively.

3.2 ASSEMBLY LANGUAGE PROGRAMMING

Program execution in any microprocessor system consists of fetching binary information from memory and decoding that information to determine the instruction represented. The information in memory may have been programmed into an EPROM or downloaded from a separate system. But where did the program come from and how was it written? As humans, we have trouble handling many pieces of information simultaneously and thus have difficulty writing programs directly in machine code, the binary language understood by the microprocessor. It is much easier for us to remember the mnemonic SUB AX,AX than the corresponding machine code 2BC0. For this reason, we write source files containing all the instruction mnemonics needed to execute a program. A special program called an assembler converts the source file into an object file containing the actual binary information the machine will understand. Some assemblers allow the entire source file to be written and assembled at one time. Other assemblers, called single-line assemblers, work with one source line at a time and are restricted in operation. These kinds of assemblers are usually found on small microprocessor-based systems that do not have disk storage and text editing capability.

The assembler discussed here is not a single-line assembler but a cross-assembler.
Cross-assemblers are programs written in one language, such as C, that translate source statements into a second language: the machine code of the desired processor. Figure 3.1 shows this translation process. The source file in the example, TOTAL.ASM, is presented as input to the assembler. The assembler will convert all source statements into the correct binary codes and place these into the object file TOTAL.OBJ. Usually, the object file contains additional information concerning program relocation and external references, and thus is not yet ready to be loaded into memory and executed. A second file created by the assembler is the list file, TOTAL.LST, which contains all the original source file text plus the additional code generated by the assembler. The list file may be displayed on the screen or printed. The object file cannot usefully be printed or displayed, since it contains only binary code.

FIGURE 3.1 Source program assembly (source file TOTAL.ASM; object file TOTAL.OBJ; list file TOTAL.LST)

A Sample Source File

Let us look at a sample source file, a subroutine designed to find the sum of 16 bytes stored in memory. It is not important at this time that you understand what each instruction does. We are simply trying to get a feel for what a source file might look like and what conventions to follow when we write our own programs.

            ORG   8000H
    TOTAL:  MOV   AX,7000H   ;load address of data area
            MOV   DS,AX      ;init data segment register
            MOV   AL,0       ;clear result
            MOV   BL,16      ;init loop counter
            MOV   SI,0       ;init data pointer
    ADDUP:  ADD   AL,[SI]    ;add data value to result
            INC   SI         ;increment data pointer
            DEC   BL         ;decrement loop counter
            JNZ   ADDUP      ;jump if counter not zero
            MOV   [SI],AL    ;save sum
            RET              ;and return
            END

The first line of source code contains a command that instructs the assembler to load its program counter with 8000H.
The ORG (for origin) command is known as an assembler pseudo-opcode, a fancy name for a mnemonic that is understood by the assembler but not by the microprocessor. ORG does not generate any object code; it merely sets the value of the assembler's program counter. This is important when a section of code must be loaded at a particular place in memory. The ORG statement is a good way to generate instructions that will access the proper memory locations when the program is loaded into memory.

Hexadecimal numbers are followed by the letter H to distinguish them from decimal numbers. This is necessary since 8000 decimal and 8000 hexadecimal differ greatly in magnitude. For the assembler to tell them apart, we need a symbol that shows the difference. Some assemblers use $8000; others use &H8000. It is really a matter of whose software you purchase. All examples in this book will use the 8000H form.

The second source line contains the major components normally used in a source statement. The label TOTAL is used to point to the address of the first instruction in the subroutine. ADDUP is also a label. Single-line assemblers do not allow the use of labels. The opcode is represented by MOV and the operand field by AX,7000H. The order of the opera
