CPU (Central Processing Unit) Guide PDF
Uploaded by IssueFreeRadon6091
TU Dublin
Summary
This document provides an overview of microprocessors, including their components, functionalities, and relevant concepts like FLOPS and supercomputing. It also details different computing scales and memory types. The content covers both theoretical and practical aspects of microcomputing.
Full Transcript
# Microprocessors (CPUs)

- A microprocessor is the main component of a microcomputer system and is also called the CPU (Central Processing Unit).
- In addition to the CPU, a microcomputer system consists of, at a minimum, memory (e.g. RAM & ROM) and I/O devices (graphics card, other I/O devices, etc.).
- A microcomputer is a programmable machine. Modern computers are electronic and digital. The two principal characteristics of a computer are:
  - It responds to a specific set of instructions in a well-defined manner.
  - It can execute a prerecorded list of instructions (a program).

# Byte Scale (Unit of Storage/Capacity)

| Unit | Value (bytes) |
|---|---|
| Megabyte | 1,000,000 |
| Gigabyte | 1,000,000,000 |
| Terabyte | 1,000,000,000,000 |
| Petabyte | 1,000,000,000,000,000 |
| Exabyte | 1,000,000,000,000,000,000 |
| Zettabyte | 1,000,000,000,000,000,000,000 |
| Yottabyte | 1,000,000,000,000,000,000,000,000 |

# Definition: FLOP

- Computer systems use floating-point numbers to represent extremely large numbers that would otherwise require many digits to record.
- ICT professionals use the term "flops" (floating-point operations per second) to indicate how quickly computers can calculate with these numbers.
- Terms like "gigaflop" parallel terms like "gigabyte", which represents one billion individual bytes of data storage.
- In terms of processing speed and power, even an average device such as a laptop or desktop computer has already advanced beyond a single gigaflop.
- A gigaflop is equal to one billion floating-point operations per second. Floating-point operations are calculations on floating-point numbers.
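
The FLOP definition above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming decimal (SI) prefixes; the names `flops_rate` and `format_flops` and the sample figures are hypothetical and chosen only for illustration.

```python
# Hypothetical helpers illustrating the FLOPS definition.
# Assumption: decimal (SI) prefixes, i.e. 1 GFLOPS = 10**9 operations/second.

PREFIXES = [
    (10**24, "YFLOPS"),
    (10**21, "ZFLOPS"),
    (10**18, "EFLOPS"),
    (10**15, "PFLOPS"),
    (10**12, "TFLOPS"),
    (10**9, "GFLOPS"),
    (10**6, "MFLOPS"),
    (10**3, "KFLOPS"),
]


def flops_rate(operations: float, seconds: float) -> float:
    """Return floating-point operations per second."""
    return operations / seconds


def format_flops(flops: float) -> str:
    """Express a raw FLOPS figure with the largest prefix that fits."""
    for factor, unit in PREFIXES:
        if flops >= factor:
            return f"{flops / factor:.2f} {unit}"
    return f"{flops:.2f} FLOPS"


# Example: 3 billion floating-point operations completed in 1.5 seconds.
rate = flops_rate(3e9, 1.5)
print(format_flops(rate))  # prints "2.00 GFLOPS"
```

Walking the prefix list from largest to smallest keeps the helper short; a real benchmark would also have to define which operations are counted.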
## Scientific Prefixes in Computing

| Name | Unit | Value (FLOPS) |
|---|---|---|
| kiloFLOPS | KFLOPS | $10^3$ |
| megaFLOPS | MFLOPS | $10^6$ |
| gigaFLOPS | GFLOPS | $10^9$ |
| teraFLOPS | TFLOPS | $10^{12}$ |
| petaFLOPS | PFLOPS | $10^{15}$ |
| exaFLOPS | EFLOPS | $10^{18}$ |
| zettaFLOPS | ZFLOPS | $10^{21}$ |
| yottaFLOPS | YFLOPS | $10^{24}$ |

# Supercomputing

There is a timeline of supercomputers. The timeline shows the year each supercomputer was built and a visual representation of its processing power. The most powerful supercomputer shown is Summit.

# Oak Ridge Facility

## Summit Supercomputer

- Cost: €200 million

# Scales in Computing

- **Terascale:** Refers to methods and processes for using supercomputers capable of performing at least 1 TFLOPS or storage systems capable of storing at least 1 TB.
  - Tera = $1 \times 10^{12}$ (1,000 GBytes; the binary equivalent, 1 TiB, is 1,024 GiB).
- **Petascale:** Refers to methods and processes for using supercomputers capable of performing at least 1 PFLOPS or storage systems capable of storing at least 1 PB.
  - Peta = $1 \times 10^{15}$ (1 million GBytes).
- **Exascale:** Refers to methods and processes for using supercomputers capable of performing at least 1 EFLOPS or storage systems capable of storing at least 1 EB.
  - Exa = $1 \times 10^{18}$.

# Comparisons - Computing Power

- PlayStation 4: 1.843 TFLOPS (2013)
- Xbox One S: 1.4 TFLOPS (2016)
- PlayStation 4 Pro: 4.2 TFLOPS (2016)
- Nintendo Switch: 1 TFLOPS (2017)
- Laptops/desktops: ~6 TFLOPS (2017)
- Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz: 45.93 GFLOPS

As of June 2018, the fastest machine was the Summit supercomputer:

- IBM POWER9 and Nvidia GPU machine at Oak Ridge.
- It hit 148.6 petaflops ($148.6 \times 10^{15}$ FLOPS) on 2.2M cores.
- Equivalent to about 63 gigaflops per core (CPU).
- Its theoretical peak is 200 petaflops ($200 \times 10^{15}$ FLOPS).
- Works with 10 petabytes of RAM (= 10 million GBytes of

# Block diagrams

## (Microcomputer & Microprocessor)

### Block diagram of a basic computer system

The diagram shows a basic computer system. The diagram includes:

- CPU
- ROM
- RAM
- I/O interface
- I/O devices
- Microprocessor
- Microcomputer
- Address bus
- Data bus
- Control bus

# Hardware

- All general-purpose computers require the following hardware components:
  - **Memory:** Enables a computer to store data and programs.
  - **Mass storage device:** Allows a computer to permanently retain large amounts of data. Common mass storage devices include disk drives and tape drives.
  - **Input device:** Usually a keyboard and mouse, through which data and instructions enter a computer.
  - **Output device:** A display screen, printer, or other device that lets you see what the computer has accomplished.
  - **Central processing unit (CPU):** The heart of the computer; this is the component that actually executes instructions.
  - **Bus:** A data bus for transmitting information, and an address bus for accessing the locations of devices and memory.

# Software

- The programs and data stored in a microcomputer are called software.
- Programs can be written in low-level languages or high-level languages.
- A low-level language can be binary language or assembly language.
- A CPU recognizes only binary language, which is called machine language.
- Assembly language instructions contain alphabetic and/or numeric characters. To run assembly language programs, a converter called an assembler is required.
- High-level languages are more user friendly and contain simple English words. To run high-level programs, converters such as compilers or interpreters are required.

# Input & Output Devices

- Input devices are used to bring electrical or physical information into a microcomputer system in digital form.
- In embedded applications, commonly used input devices are simple switches and sensors.
- In general-purpose microcomputers, input devices can be scanners, keyboards, mice, network cards, etc.
- Output devices are used to display results or perform a required operation.
- In embedded applications (the CPU in a washing machine, for example), commonly used output devices are LED display units, LCD display units, stepper motors, etc.
- In general-purpose computers, output devices are mainly LCD screens, LED screens, printers, network cards, etc.

# Memories

- Memory in a microcomputer system is used to store data and programs temporarily or permanently.
- The memories of primary concern for the CPU are RAM & ROM, which are called primary memory or main memory. At any one time, the CPU can communicate only with RAM & ROM.
- Besides primary memories, there are also secondary memories, which are used for mass storage of data and programs. Their contents are transferred to primary memory when they need to be executed by the CPU. Examples of secondary memories are hard disks, DVDs/CDs, flash drives, etc.

# Memory Classification

- The image shows a tree diagram of Memory Classification.
- The root node is Memory.
- The first-level nodes are Primary Memory or Main Memory, Cache Memory, and Secondary Memory or Auxiliary Memory.
- Other nodes are:
  - **Primary or Main Memory**
    - Read/write Memory
    - Read-only Memory
    - ROM
    - PROM
    - EPROM
    - EEPROM
    - Flash ROM
    - Sequential Access Memory (SAM)
    - Random Access Memory (RAM)
    - Static RAM (SRAM)
    - Dynamic RAM (DRAM)
    - SDRAM (Synchronous DRAM)
    - DDR RAM (Double Data Rate RAM)
  - **Secondary or Auxiliary Memory**
    - Hard Disk Drive (HDD)
    - Floppy Disk (FD)
    - Compact Disk (CD)
    - Digital Versatile Disk (DVD)
    - Flash Drive

# Internal Structure and Basic Operation of Microprocessor

- The image shows a block diagram of a microprocessor, including:
  - ALU
  - Register Section
  - Control and timing section
  - Address bus
  - Data bus
  - Control bus

# Memory Mapping for I/O Devices

The image shows the memory mapping for I/O devices.

- The top of the System Address Space is 8 GB.
- The space is divided into several sections.
- Each section has a specific purpose:
  - FLASH APIC
  - Reserved
  - PCI Memory Range
  - DRAM Range
  - DOS Compatibility Memory

# Memory Mapping

The following table compares the increased maximum resources of computers based on 64-bit versions of Windows and the 64-bit Intel processor with the existing 32-bit resource maximums.

| Architectural Component | 64-bit Windows | 32-bit Windows |
|---|---|---|
| Virtual memory | 16 terabytes | 4 GB |
| Paging file size | 256 terabytes | 16 terabytes |
| Hyperspace | 8 GB | 4 MB |
| Paged pool | 128 GB | 470 MB |
| Non-paged pool | 128 GB | 256 MB |
| System cache | 1 terabyte | 1 GB |

# Explanation: Virtual Memory

This is a method of extending the available physical memory on a computer. In a virtual memory system, the operating system creates a pagefile, or swapfile, and divides memory into units called pages. Recently referenced pages are located in physical memory, or RAM. If a page of memory is not referenced for a while, it is written to the pagefile. This is called "swapping" or "paging out" memory.
If that piece of memory is later referenced by a program, the operating system reads the memory page back from the pagefile into physical memory; this is called "swapping" or "paging in" memory. The total amount of memory available to programs is the amount of physical memory in the computer plus the size of the pagefile. An important consideration in the short term is that even 32-bit applications will benefit from the increased virtual memory address space when they are running in Windows x64 Editions.

# Other Explanations

- **Paging file:** This is a disk file that the computer uses to increase the amount of physical storage for virtual memory.
- **Hyperspace:** This is a special region that is used to map the process working set list and to temporarily map other physical pages for such operations as zeroing a page on the free list (when the zero list is empty and the zero page is needed), invalidating page table entries in other page tables (such as when a page is removed from the standby list), and, with regard to process creation, setting up the address space of a new process.
- **Paged pool:** This is a region of virtual memory in system space that can be paged in and out of the working set of the system process. Paged pool is created during system initialization and is used by kernel-mode components to allocate system memory. Uniprocessor systems have two paged pools, and multiprocessor systems have four. Having more than one paged pool reduces the frequency of system code blocking on simultaneous calls to pool routines.
- **Non-paged pool:** This is a memory pool that consists of ranges of system virtual addresses that are guaranteed to be resident in physical memory at all times and thus can be accessed from any address space without incurring paging input/output (I/O). Non-paged pool is created during system initialization and is used by kernel-mode components to allocate system memory.
- **System cache:** These are pages that are used to map open files in the system cache.

# Theoretical Limits for Address Buses

- 16 bit = 65,536 bytes (64 kilobytes)
- 32 bit = 4,294,967,296 bytes (4 gigabytes)
- 64 bit = 18,446,744,073,709,551,616 bytes (16 exabytes)

Each limit is $2^n$ bytes for an $n$-bit address bus; the units in parentheses are binary multiples (1 kilobyte = 1,024 bytes).

# System (Bus)

The image shows a block diagram of a computer system, including:

- CPU chip or microprocessor
  - ALU (calculating)
  - Internal communication
  - Registers (temporary storage)
  - Control section
- Storage/input internal memory
  - RAM (read/write)
  - ROM (read only)
- Input interface
- Input devices
  - Keyboard
  - Mouse
  - Joystick
  - Scanner
  - Light pen
- Bus system
- Output interface
- Output devices
  - Monitor
  - Printer
- Storage/input external memory
  - Floppy disc drive
  - Hard disc drive
  - Magnetic tape

# System BUS (Three Types)

The image shows a diagram of a system bus. It has three types of bus:

- Control bus
- Address bus
- Data bus

# The Diagram of a Microprocessor

The image shows a diagram of a microprocessor. It includes the following components:

- CPU chip or microprocessor
  - ALU (calculating)
  - Internal communication
  - Registers (temporary storage)
  - Control section
- DATA
- ADDR
- CTRL
- Clock
- Programme memory (ROM)
- Data memory (RAM)
- Operator devices
- Communication devices
- Process devices

# Explanation of a Microprocessor

- The microprocessor is a semiconductor device (integrated circuit) manufactured using the VLSI (Very Large Scale Integration) technique. It includes the ALU, register arrays and control circuit on a single chip.
- A system designed using a microprocessor as its CPU is called a microcomputer.
- A microprocessor-based system (single-board microcomputer) consists of a microprocessor as the CPU, semiconductor memories such as EPROM and RAM, an input device, an output device and interfacing devices.
- The memories, input device, output device and interfacing devices are called peripherals.
- Popular input devices are the keyboard and floppy disk; popular output devices are printers, LED/LCD displays, CRT monitors, etc.
- In a µP-based system, the microprocessor is the master and all other peripherals are slaves. The master controls all the peripherals and initiates all operations. The work done by the processor can be classified into the following three groups:
  - Work done internal to the processor.
  - Work done external to the processor.
  - Operations initiated by the slaves or peripherals.
- Work done internal to the processor includes addition, subtraction, logical operations, data transfer operations, etc.
- Work done external to the processor includes reading/writing the memory and reading/writing the I/O devices or peripherals. If a peripheral requires the attention of the master, it can interrupt the master and initiate an operation.
- The microprocessor is the master, which controls all the activities of the system. To perform a specific job or task, the microprocessor has to execute a program stored in memory. The program consists of a set of instructions. The microprocessor issues address and control signals and fetches the instructions and data from memory.

# Explanation of a Microprocessor

## Buses:

- A bus is a group of lines that carries data, address or control signals.
- The CPU bus has multiplexed lines, i.e., the same line is used to carry different signals.
- The CPU interface demultiplexes the multiplexed lines and generates chip-select signals and additional control signals.
- The system bus has separate lines for each signal.
- All the slaves in the system are connected to the same system bus. At any instant, communication takes place between the master and one of the slaves.

## Peripheral Devices:

- EPROM is used to store permanent programs and data.
- RAM is used to store temporary programs and data.
- The input device is used to enter the program and data and to operate the system.
- The output device is used for examining the results. Since the speed of I/O devices does not match the speed of the microprocessor, an interface device is provided between the system bus and the I/O devices. Generally, I/O devices are slow devices.

# Microprocessor (µP) Clock

- In general, the clock refers to a microchip that regulates the timing and speed of all computer functions.
- Inside the clock chip is a piezoelectric crystal, such as quartz, that vibrates at a specific frequency when electricity is applied.
- The shortest operation any computer can perform takes one clock cycle, or one vibration of the clock crystal.
- The speed of a computer processor is measured in clock speed, for example:
  - 1 MHz is one million cycles, or vibrations, a second.
  - 2 GHz is two billion cycles, or vibrations, a second.

# System BUS - In Real Life

A picture of a circuit board shows the system bus in real life.

# Motherboard

- The image shows a motherboard.
- The image also shows some of its components:
  - Intel Socket LGA 1366 Connector
  - Intel X58 Chipset
  - DDR3 1333 MHz Triple Channel Memory Slots (Up to 24 GB)
  - PCI-Express Slots
    - 3 PCI-E X16, 2 PCI-E X1
  - PCI Slots
  - Intel ICH10R Chipset
  - DLED2 Display
  - Serial ATA Headers
  - Easy Buttons
  - Easy OC Switch
  - PS/2 Mouse Port
  - PS/2 Keyboard Port
  - USB 2.0 Ports
  - Dual Gigabit LAN Ports
  - eSATA Ports
  - 1394 Port
  - Clear CMOS Button
  - Back Panel Connectors

# A CPU Cache

- A CPU cache is a small memory within the CPU itself, used to reduce the average time to access data from main memory.
- A cache stores recently used data.

The image shows a picture of a CPU, cache and RAM and a diagram of the components of a CPU, including:

- Registers
- L1 Cache
- L2 Cache
- RAM

# CPU Cache Levels

- When the processor needs to read from or write to a location in main memory, it first checks whether a copy of that data is in the cache.
- If so, the processor immediately reads from or writes to the cache, which is much faster than reading from or writing to main memory.
- Caches are faster than main memory, but may not always hold (or have prefetched) the data the CPU needs.
- The fundamental tradeoff in cache design is between cache latency and hit rate.
- Larger caches have better hit rates but longer latency.
- To address this tradeoff, many computers use multiple levels of cache, with small fast caches backed up by larger, slower caches.
- Multi-level caches generally operate by checking the fastest, level 1 (L1) cache first; if it hits, the processor proceeds at high speed.
- If that smaller cache misses, the next fastest cache (level 2, L2) is checked, and so on, before external memory is checked.

# 4004

- First microprocessor (1971)
  - For Busicom calculator
- Characteristics
  - 10 µm process
  - 2300 transistors
  - 400-800 kHz
  - 4-bit word size
  - 16-pin DIP package
- Masks hand cut from Rubylith
  - Drawn with color pencils
  - 1 metal, 1 poly (jumpers)
  - Diagonal connectors

# 80286

- Virtual memory (1982)
  - IBM PC AT
- Characteristics
  - 1.5 µm process
  - 134k transistors
  - 6-12 MHz
  - 16-bit word size
  - 68-pin PGA
- Regular datapaths and internal ROMs

# 80386

- 32-bit processor (1985)
  - Modern x86 ISA
- Characteristics
  - 1.5-1 µm process
  - 275k transistors
  - 16-33 MHz
  - 32-bit word size
  - 100-pin PGA
- 32-bit datapath, microcode ROM, synthesized control

# Pentium

- Superscalar (1993)
  - 2 instructions per cycle
  - Separate 8 KB I$ & D$
- Characteristics
  - 0.8-0.35 µm process
  - 3.2M transistors
  - 60-300 MHz
  - 32-bit word size
  - 296-pin PGA
- Caches, datapath, FPU, control

# Pentium 4

- Deep pipeline (2001)
  - Very fast clock
  - 256-1024 KB L2$
- Characteristics
  - 180-90 nm process
  - 42-125M transistors
  - 1.4-3.4 GHz
  - 32-bit word size
  - 478-pin PGA
- Units start to become invisible on this scale

# Intel™ iCores

- Generational multi-core CPUs
- Many more contact points
- 4
physical CPUs = 8 logical (Hyper-Threading)
- Architecture: 22 nm technology
- 20 MB Intel® Smart Cache
- Intel® 64 architecture
- Level 1 cache: 16 KB data cache
- Level 2 cache: Smart Cache dividing 20 MB between 8 cores
- Up to 3.5 GHz with Intel® Turbo Boost Technology
- 64 GB of addressable memory

# Intel™ iCore 7i

The image shows a diagram of an iCore 7i processor, which includes:

- Processor Graphics
- 4 Cores
- Shared L3 Cache
- Memory Controller I/O
- System Agent, Display Engine & Memory Controller

# Summary

- $10^4$ increase in transistor count and clock frequency over 30 years!

The image shows a table of Intel microprocessors over three decades. The table includes:

- Processor
- Year
- Feature Size (µm)
- Transistors
- Frequency (MHz)
- Word size
- Package

# Newer Intel™ CPUs

A table shows a comparison of different Intel CPUs, including:

- Brand
- Processor Number
- Price
- TDP
- Cores/Threads
- CPU Base Freq (GHz)
- Max Turbo Freq (GHz)
- DDR3 (MHz)
- L3 Cache
- Intel® HD Graphics 2500/4000
- Graphics Base Render Frequency
- Graphics Max Dynamic Frequency

# Growth in Processing

The image shows a chart of the growth in processing power over time. The chart shows the calculations per second per $1000 for different technologies.

- The technologies and machines shown include:
  - Electromechanical
  - Solid-State Relay
  - Vacuum tube
  - Transistor
  - Integrated Circuit
  - UNIVAC I
  - DEC PDP-1
  - COMPAQ DESKPRO 386
  - ALTAIR 8800
  - IBM 1130
  - COLOSSUS
  - IBM 704
  - IBM SSEC
  - Tabulator
  - Hollerith Tabulator
  - National Ellis 3000
  - Bell Calculator Model 1
  - Analytical Engine
  - IBM PC
  - DEC PDP-10
  - Apple II
  - IBM AT-80286
  - Pentium
  - Pentium II
  - Pentium III
  - Pentium 4
  - Core 2 Duo
  - Core i7 Quad
  - Optical, Quantum, DNA Computing
  - Mouse Brain
  - Human Brain

# Growth in Scale (Moore's Law)

The image shows a chart of the growth in transistor counts over the years. The chart has a line showing Moore's Law.
- There is also a separate line for each generation:
  - G6
  - G5
  - G4
  - G3
  - Pentium 4
  - Pentium M
  - Pentium III
  - Pentium MMX
  - Pentium Pro
  - PowerPC 604e
  - PowerPC 604
  - PowerPC 601
  - PowerPC 603e
  - 80486
  - 68040
  - 80386
  - 68030
  - 80286
  - 68020
  - 68000
  - 8008
  - 8086
  - 8088
  - 8085
  - 8080
  - 6800
  - 4004
  - Sandy Bridge
  - Ivy
  - Nehalem

# Moore's Law

- Moore's Law is the observation made in 1965 by Gordon Moore, co-founder of Intel.
- It states that the number of transistors per square inch on integrated circuits had doubled every year since the integrated circuit was invented.
- Moore predicted that this trend would continue for the foreseeable future.
- In subsequent years, the pace slowed down a bit, but data density has doubled approximately every 18 months.
- This is the current definition of Moore's Law.
- Most experts, including Moore himself, expect Moore's Law to hold true until 2020-2025.
- The limitation is that once transistors are as small as atomic particles, there will be no more room for growth in the CPU market where speeds are concerned.

# The Future of CPUs?

- **Non-silicon alternatives**
  - carbon nanotubes
  - molecular computing with organic molecules
- **Optical computers - light instead of electricity**
- **Quantum computing - computing with atoms and their component parts**

# Carbon Nano Tubes

The image shows a diagram of carbon nanotubes used in a CPU.

- Faster
- Dissipate heat
- Better energy efficiency
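
The doubling rule stated in the Moore's Law section above can be sketched numerically. This is a minimal illustration, assuming an idealized, perfectly regular doubling every 18 months; the function name `projected_transistors` is hypothetical, and the starting figure (2,300 transistors, the 4004 in 1971) is taken from the 4004 section. Real chips do not follow the curve exactly.

```python
# Hypothetical helper: idealized Moore's Law projection.
# Assumption: the transistor count doubles every `doubling_months` months.

def projected_transistors(start_count: float, years: float,
                          doubling_months: float = 18.0) -> float:
    """Project a transistor count forward under regular doubling."""
    doublings = years * 12.0 / doubling_months
    return start_count * 2.0 ** doublings


# Starting from the 4004 (2,300 transistors), three years is two
# 18-month doubling periods.
print(projected_transistors(2300, 3))  # prints 9200.0
```

Changing `doubling_months` to 12 reproduces Moore's original yearly-doubling observation, which makes it easy to compare the two versions of the law.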