Lecture 4
OBJECTIVES
this lecture enables the student to:
Explain the function of the EU (execution unit) and BIU (bus interface unit).
Describe pipelining and how it enables the CPU to work faster.
Registers
Describe the function and purpose of each program-visible register in the 8086-Core2 microprocessors, including the 64-bit extensions.
List the bits of the flag register and briefly state the purpose of each bit.
Lecture Outline
1. Brief History of the x86 Family
2. Typical Pins of the x86 Processor
3. Internal Processor Architecture
4. Memory Architecture
5. I/O Operations
6. The Microprocessor-Based Personal Computer System
OBJECTIVES (cont)
Memory Architecture
State the purpose of the code segment, data segment, stack segment, and extra segment.
Explain the difference between a logical address and a physical address.
Describe how memory is accessed using real and protected mode memory-addressing techniques.
Describe the "little endian" storage convention of x86 microprocessors.
State the purpose of the stack.
Explain the function of PUSH and POP instructions.
BRIEF HISTORY OF THE x86 FAMILY
Evolution from 8080/8085 to 8086
In 1978, Intel Corporation introduced the 16-bit 8086 microprocessor, a major improvement over the previous-generation 8080/8085 series.
The 8086's capacity of 1 megabyte of memory exceeded the 8080/8085 maximum of 64K bytes.
The 8080/8085 was an 8-bit system, which could work on only 8 bits of data at a time.
Data larger than 8 bits had to be broken into 8-bit pieces to be processed by the CPU.
Evolution from 8086 to 8088
The 8086 microprocessor has a 16-bit data bus, internally and externally.
All registers are 16 bits wide, and a 16-bit data bus transfers data in and out of the CPU.
There was resistance to a 16-bit external bus because peripherals were designed around 8-bit processors, and a printed circuit board with a 16-bit data bus also cost more.
Intel therefore introduced the 8088, which is identical to the 8086 internally but has an 8-bit external data bus.
Success of the 8088
The 8088-based IBM PC was a great success because IBM and Microsoft made it an open system.
Documentation and specifications of the PC's hardware and software were made public.
This made it possible for many vendors to clone the hardware successfully and spawned major growth in both hardware and software designs based on the IBM PC.
80286, 80386, and 80486
Intel introduced the 80286 in 1982, which IBM picked up for the design of the PC AT.
16-bit internal and external data buses.
24 address lines, for 16 MB of memory (2^24 bytes = 16 MB).
Virtual memory.
Virtual memory is a way of fooling the processor into thinking it has access to an almost unlimited amount of memory, by swapping data between disk storage and RAM.
In 1985 Intel introduced the 80386 (or 80386DX).
32-bit internally and externally, with a 32-bit address bus.
Capable of handling up to 4 gigabytes of memory (2^32 bytes).
Virtual memory increased to 64 terabytes (2^46 bytes).
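The memory sizes quoted for each processor follow directly from the number of address lines. The short Python sketch below checks that arithmetic; the function name `addressable_bytes` is an illustrative choice, not something from the lecture.

```python
def addressable_bytes(address_lines: int) -> int:
    """Number of distinct byte addresses reachable with the given address lines."""
    return 2 ** address_lines


KB = 1024
MB = 1024 * KB
GB = 1024 * MB
TB = 1024 * GB

# Address widths quoted in the slides: 8085 (16), 8088/86 (20), 80286 (24),
# 80386 physical (32), 80386 virtual (46).
sizes = {lines: addressable_bytes(lines) for lines in (16, 20, 24, 32, 46)}
```

With these definitions, `sizes[24] // MB` evaluates to 16 and `sizes[46] // TB` to 64, matching the figures above.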
Later Intel introduced the 386SX, internally identical to the 80386DX but with a 16-bit external data bus and a 24-bit address bus.
This makes 386SX systems much cheaper.
All programs written for the 8088/86 will run on 286, 386, and 486 computers.
Since general-purpose processors could not handle mathematical calculations rapidly, Intel introduced numeric data processing chips, called math coprocessors, such as the 8087, 80287, and 80387.
On the 80486, in 1989, Intel put a greatly enhanced 80386 and a math coprocessor on a single chip, plus additional features such as cache memory.
Cache memory is static RAM with a very fast access time.
Pentium & Pentium Pro
In 1992, Intel announced the Pentium (not 80586); it shipped in 1993.
A name can be trademarked, but a number cannot.
The Pentium is fully compatible with previous x86 processors but includes several new features:
Separate 8K cache memories for code and data.
A 64-bit data bus and a vastly improved floating-point processor.
The Pentium is packaged in a 273-pin PGA chip.
BICMOS technology, which combines the speed of bipolar transistors with the power efficiency of CMOS technology.
64-bit data bus, 32-bit registers, and a 32-bit address bus.
Capable of addressing 4 GB of memory.
Pentium II
In 1997 Intel introduced the Pentium II processor.
This 7.5-million-transistor processor featured MMX (MultiMedia Extension) technology incorporated into the CPU, for fast graphics and audio processing.
Pentium III
In 1999 Intel released the Pentium III.
9.5-million-transistor processor.
70 new SIMD instructions, which enhance video and audio performance in 3-D imaging and streaming audio.
Pentium 4
The Pentium 4 debuted late in 2000.
Speeds of 1.4 to 1.5 GHz.
System bus operates at 400 MHz.
New cache and pipelining technology and expansion of the multimedia instruction set make the P4 a high-end media processing microprocessor.
Intel 64 Architecture
Intel has selected Itanium as the new brand name for the first product in its 64-bit family of processors.
It was formerly called Merced.
8088 MICROPROCESSOR
The 8088 is a 40-pin microprocessor chip that can work in two modes: minimum mode and maximum mode.
Maximum mode is used to connect to the 8087 coprocessor.
If a coprocessor is not needed, the 8088 is used in minimum mode.
[Figure: Pin diagrams of the 8085A and the 8088 in minimum mode. The pins are grouped under the labels address bus (A19-A0), data bus (D7-D0), bus control (ALE, RD, WR, READY, INTA, IO/M, DT/R, DEN), interrupts (INTR, NMI), bus arbitration (HOLD, HLDA), bus status, and miscellaneous.]
Pins 9-16 (AD0-AD7) are used for both data and addresses in the 8088.
AD stands for "address/data."
The ALE (address latch enable) pin signals whether the information on pins AD0-AD7 is address or data.
When ALE is high, AD0-AD7 carry the address (A0-A7). When data is to be sent out or in, ALE is low, which indicates that AD0-AD7 will be used as the data bus (D0-D7).
The process of separating address and data from pins AD0-AD7 is called demultiplexing.
The last 4 bits of the address come from A16-A19, pin numbers 35-38.
74LS373 D Latch
In any system, all addresses must be latched to provide a stable, high-drive-capability address bus.
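The demultiplexing idea can be sketched in software: a transparent latch follows AD0-AD7 while ALE is high and holds the last value once ALE goes low, so the address stays stable while the same pins carry data. This is a behavioral illustration only, not a model of the 74LS373 timing; the class and function names are hypothetical.

```python
class TransparentLatch:
    """Behavioral 8-bit transparent latch: follows D while enabled, holds otherwise."""

    def __init__(self):
        self.q = 0

    def update(self, d: int, enable: int) -> int:
        if enable:
            self.q = d & 0xFF
        return self.q


def demultiplex(samples):
    """Split (ALE, AD0-AD7) samples into (latched address, data) pairs.

    The data entry is None while ALE is high, because the pins carry an
    address at that moment.
    """
    latch = TransparentLatch()
    out = []
    for ale, ad in samples:
        address = latch.update(ad, ale)
        data = None if ale else ad & 0xFF
        out.append((address, data))
    return out


# Hypothetical bus cycle: address 34H is presented with ALE high,
# then two data bytes follow with ALE low.
cycle = demultiplex([(1, 0x34), (0, 0xB0), (0, 0x57)])
```

In `cycle`, the address field stays at 34H for every sample, while the data field changes from None to B0H and then 57H.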
[Figure: Demultiplexing the address and data buses with '373 latches gated by ALE. In the 8088 circuit, AD7-AD0 are latched to give A7-A0 and also supply the data bus D7-D0, while A19/S6-A16/S3 and A15-A8 complete the address bus. In the 8086 circuit, AD15-AD0 are latched to give A15-A0 and supply the data bus D15-D0, and BHE/S7 is latched to give BHE.]
Bus Timing
[Figure: Bus timing. One bus cycle consists of clock states T1-T4. Read cycle: AD7-AD0 carry A7-A0 and then data in (D7-D0), with ALE, RD, and DT/R. Write cycle: AD7-AD0 carry A7-A0 and then data out (D7-D0), with ALE, WR, and DT/R. The accompanying circuit shows '373 address latches gated by ALE and '245 data transceivers controlled by DEN and DT/R.]
Control Bus
The 8088 can access both memory and I/O devices for read and write operations. This gives four operations, which need four control signals:
MEMR (memory read); MEMW (memory write).
IOR (I/O read); IOW (I/O write).
The 8088 provides three pins for control signals: RD, WR, and IO/M.
The RD and WR pins are both active-low. IO/M is low for memory and high for I/O devices.
From these three pins, the four control signals IOR, IOW, MEMR, and MEMW are generated. All of these signals must be active-low.
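The decoding of RD, WR, and IO/M into the four active-low control signals can be written as a small truth function. The sketch below is a logic-level illustration in Python; the function name and the dictionary layout are assumptions made for this example, and a real system would do this with gates or a decoder chip.

```python
def decode_control(rd_n: int, wr_n: int, io_m: int) -> dict:
    """Derive active-low IOR, IOW, MEMR, MEMW from the 8088 control pins.

    rd_n, wr_n: RD and WR pin levels (0 = asserted, since both are active-low).
    io_m:       IO/M pin level (0 = memory access, 1 = I/O access).
    Returned values are pin levels, so 0 means the signal is active.
    """
    is_io = io_m == 1
    return {
        "IOR": 0 if (rd_n == 0 and is_io) else 1,
        "IOW": 0 if (wr_n == 0 and is_io) else 1,
        "MEMR": 0 if (rd_n == 0 and not is_io) else 1,
        "MEMW": 0 if (wr_n == 0 and not is_io) else 1,
    }


# A memory read: RD low, WR high, IO/M low.
memory_read = decode_control(rd_n=0, wr_n=1, io_m=0)
```

Here only `memory_read["MEMR"]` is 0 (active); the other three signals stay at 1.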
Address, Data, and Control Buses in an 8088-based System
If reading or writing takes more than 4 clocks, wait states (WS) can be requested from the CPU.
ALE Timing
Other Pins
Pins 24-32 have different functions depending on whether the 8088 is in minimum or maximum mode.
In maximum mode, the 8088 needs supporting chips to generate the control signals.
Functions of 8088 pins 24-32 in minimum mode.
MN/MX (minimum/maximum) - minimum mode is selected by connecting MN/MX (pin number 33) directly to +5 V.
Maximum mode is selected by grounding this pin.
INTR (interrupt request) - an active-high, level-triggered input signal continuously monitored by the microprocessor for an external interrupt.
This pin and INTA are connected to the 8259 interrupt controller chip.
NMI (nonmaskable interrupt) - an edge-triggered (low to high) input signal to the processor that will make the microprocessor jump to the interrupt vector table after it finishes the current instruction.
It cannot be masked by software.
READY - an input signal, used to insert a wait state for slower memories and I/O.
It inserts wait states when it is low.
TEST - in maximum mode, an input from the 8087 math coprocessor to coordinate communications.
Not used in minimum mode.
RESET - terminates the present activities of the processor when a high is applied to the RESET input pin.
A high on this pin forces the microprocessor to stop all activity and sets the major registers to their reset values: CS = FFFFH, and IP, DS, SS, and ES = 0000H.
INSIDE THE 8088/86
There are two ways to make the CPU process information faster:
Increase the working frequency.
This depends on the technology available, with cost considerations.
Change the internal architecture of the CPU.
Pipelining
The 8085 could either fetch or execute at any given time, but not both.
The idea of pipelining in its simplest form is to allow the CPU to fetch and execute at the same time.
Pipelined Architecture
Intel implemented pipelining in the 8088/86 by splitting the internal structure of the microprocessor into two sections:
The execution unit (EU) and the bus interface unit (BIU).
These two sections work simultaneously.
The BIU accesses memory and peripherals, while the EU executes instructions previously fetched.
This works only if the BIU keeps ahead of the EU, so the BIU of the 8088/86 has a buffer, or queue.
The buffer is 4 bytes long in the 8088 and 6 bytes in the 8086.
[Figure: Bus timing. (a) 8085A microprocessor and 8085A bus timing: the processor steps through Fetch, Decode, and Execute for each instruction, and the bus is busy only during fetches, reads, and writes. (b) 8086/8088 EU, BIU, and bus timing: the EU decodes and executes one instruction while the BIU fetches the next and performs reads and writes, keeping the bus busy.]
If an instruction takes too long to execute, the queue is filled to capacity and the buses will sit idle.
In some circumstances, the microprocessor must flush out the queue.
When a jump instruction is executed, the BIU starts to fetch information from the new location in memory, and information fetched previously is discarded.
The EU must wait until the BIU fetches the new instruction.
In computer science terminology, this is a branch penalty.
In a pipelined CPU, too much jumping around reduces the efficiency of a program.
Registers
In the CPU, registers store information temporarily.
The information can be one or two bytes of data to be processed, or the address of data.
General-purpose registers in 8088/86 processors can be accessed as either 16-bit or 8-bit registers.
All other registers can be accessed only as the full 16 bits.
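To make the 16-bit versus 8-bit register view concrete, the following Python sketch splits a 16-bit general-purpose register such as AX into its high and low halves (AH and AL) and joins them again. The function names and the sample value are illustrative assumptions.

```python
def split_register(x16: int) -> tuple:
    """Return (high byte, low byte) of a 16-bit register value, e.g. (AH, AL) for AX."""
    x16 &= 0xFFFF
    return (x16 >> 8) & 0xFF, x16 & 0xFF


def join_register(high: int, low: int) -> int:
    """Rebuild the 16-bit register value from its two 8-bit halves."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


# Hypothetical contents of AX.
ah, al = split_register(0x24B6)
ax = join_register(ah, al)
```

Writing to AL alone changes only the low byte of AX, which is why the two 8-bit halves can be treated as independent registers.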
Registers (16-bit Architecture)
The bits of a register are numbered in descending order, from d15 (most significant) down to d0 in a 16-bit register and from d7 down to d0 in an 8-bit register.
FLAG REGISTER (16-bit Architecture)
The flag register is a 16-bit register sometimes referred to as the status register.
Although it is 16 bits wide, only some of the bits are used.
The rest are either undefined or reserved by Intel.
Six flags, called conditional flags, indicate some condition resulting after an instruction executes.
These six are CF, PF, AF, ZF, SF, and OF.
The remaining three, often called control flags, control the operation of instructions before they are executed.
Many Assembly language instructions alter flag register bits, and some instructions function differently based on the information in the flag register.
Bits of the flag register
Flag register bits used in x86 Assembly language programming, with a brief explanation of each:
CF (Carry Flag) - Set when there is a carry out from d7 after an 8-bit operation, or from d15 after a 16-bit operation.
Used to detect errors in unsigned arithmetic operations.
ZF (Zero Flag) - Set to 1 if the result of an arithmetic or logical operation is zero; otherwise, it is cleared.
SF (Sign Flag) - Binary representation of signed numbers uses the most significant bit as the sign bit.
After arithmetic or logic operations, the status of this sign bit is copied into the SF, indicating the sign of the result.
PF (Parity Flag) - After certain operations, the parity of the result's low-order byte is checked.
If the byte has an even number of 1s, the parity flag is set to 1; otherwise, it is cleared.
AF (Auxiliary Carry Flag) - If there is a carry from d3 to d4 of an operation, this bit is set; otherwise, it is cleared.
Used by instructions that perform BCD (binary coded decimal) arithmetic.
TF (Trap Flag) - When this flag is set it allows the program to single-step, meaning to execute one instruction at a time.
Single-stepping is used for debugging purposes.
IF (Interrupt Enable Flag) - This bit is set or cleared to enable/disable only external maskable interrupt requests.
DF (Direction Flag) - Used to control the direction of string operations.
OF (Overflow Flag) - Set when the result of a signed number operation is too large, causing the high-order bit to overflow into the sign bit.
Used only to detect errors in signed arithmetic operations.
Flag register and ADD instruction
Flag bits affected by the ADD instruction:
CF (carry flag); PF (parity flag); AF (auxiliary carry flag).
ZF (zero flag); SF (sign flag); OF (overflow flag).
It is important to note the differences between 8- and 16-bit operations in terms of their impact on the flag bits.
The parity flag counts only the lower 8 bits of the result and is set accordingly.
In a 16-bit operation, the carry flag is set if there is a carry beyond bit d15 instead of bit d7.
When the result of the entire 16-bit operation is zero (for example, when BX ends up holding 0000H), ZF is set high.
Instructions such as data transfers (MOV) affect no flags.
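The flag rules above can be checked with a small model of ADD. This Python sketch computes the six conditional flags for an 8-bit or 16-bit addition according to the definitions given in this lecture; the function name `add_flags` and the operand values are illustrative, not taken from the original examples.

```python
def add_flags(a: int, b: int, bits: int = 8):
    """Return (result, flags) for an unsigned add of the given width (8 or 16 bits)."""
    mask = (1 << bits) - 1          # FFH or FFFFH
    sign = 1 << (bits - 1)          # d7 or d15
    a &= mask
    b &= mask
    total = a + b
    result = total & mask
    flags = {
        # Carry out of d7 (8-bit) or d15 (16-bit).
        "CF": int(total > mask),
        # Parity looks only at the low-order byte; set when the count of 1s is even.
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),
        # Carry from d3 into d4.
        "AF": int(((a & 0xF) + (b & 0xF)) > 0xF),
        "ZF": int(result == 0),
        # Copy of the most significant bit of the result.
        "SF": int(bool(result & sign)),
        # Signed overflow: operands share a sign that differs from the result's sign.
        "OF": int(bool(~(a ^ b) & (a ^ result) & sign)),
    }
    return result, flags


byte_result, byte_flags = add_flags(0x38, 0x2F, bits=8)
word_result, word_flags = add_flags(0xFFFF, 0x0001, bits=16)
```

In the 16-bit case the sum wraps to 0000H, so both CF and ZF are set, which is the situation the slide describes for BX.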
Use of the zero flag for looping
A widely used application of the flag register is the use of the zero flag to implement program loops.
A loop is a set of instructions repeated a number of times.
As an example, to add 5 bytes of data, a counter can be used to keep track of how many times the loop needs to be repeated.
Each time the addition is performed, the counter is decremented and the zero flag is checked.
When the counter becomes zero, the zero flag is set (ZF = 1) and the loop is stopped.
Register CX is used to hold the counter.
BX is the offset pointer (SI or DI could have been used instead).
AL is initialized before the start of the loop.
In each iteration, ZF is checked by the JNZ instruction.
JNZ stands for "Jump Not Zero," meaning that if ZF = 0, jump to a new address. If ZF = 1, the jump is not performed, and the instruction below the jump will be executed.
The JNZ instruction must come immediately after the instruction that decrements CX.
JNZ needs to check the effect of "DEC CX" on ZF.
If any instruction were placed between them, that instruction might affect the zero flag.
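The loop described above can be traced step by step in software. The Python sketch below mirrors the roles of CX, BX, AL, and ZF, with comments naming the instruction each line stands for; it is a simulation written for illustration, and the memory layout (five bytes starting at offset 200H) is an assumption that follows the data-segment example used later in the lecture.

```python
def sum_bytes(memory: dict, start_offset: int, count: int) -> int:
    """Add `count` bytes starting at `start_offset`, looping on the zero flag."""
    cx = count            # MOV CX,count   ; loop counter
    bx = start_offset     # MOV BX,offset  ; offset pointer
    al = 0                # MOV AL,00      ; initialize the sum
    while True:
        al = (al + memory[bx]) & 0xFF   # ADD AL,[BX]
        bx = (bx + 1) & 0xFFFF          # INC BX
        cx = (cx - 1) & 0xFFFF          # DEC CX
        zf = int(cx == 0)               # ZF reflects the result of DEC CX
        if zf == 1:                     # JNZ falls through when ZF = 1
            break
    return al


# Assumed placement of the five data bytes at offsets 0200H-0204H.
data = {0x200: 0x25, 0x201: 0x12, 0x202: 0x15, 0x203: 0x1F, 0x204: 0x2B}
total = sum_bytes(data, 0x200, 5)
```

For this data the sum is 96H. Note that the zero-flag test sits directly after the decrement, as the slide requires.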
Flags Register
Bit:   15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
Flag:   -   -   -   -   O   D   I   T   S   Z   -   A   -   P   -   C
Segment Registers
The programming model of the 8086 through the Core2 microprocessor, including the 64-bit extensions.
The 80286 and above contain program-invisible registers to control and operate protected memory and other features of the microprocessor.
The 80386 through Core2 microprocessors contain full 32-bit internal architectures.
The 8086 through the 80286 are fully upward-compatible to the 80386 through Core2.
Multipurpose Registers
RAX (accumulator) - referenced as a 64-bit register (RAX), a 32-bit register (EAX), a 16-bit register (AX), or as either of two 8-bit registers (AH and AL).
RBX - addressable as RBX, EBX, BX, BH, or BL.
RCX - addressable as RCX, ECX, CX, CH, or CL.
RDX (data) - addressable as RDX, EDX, DX, DH, or DL; a general-purpose register.
RBP (base pointer) - addressable as RBP, EBP, or BP; points to a memory location for memory data transfers.
RDI (destination index) - addressable as RDI, EDI, or DI; often addresses string destination data for the string instructions.
RSI (source index) - used as RSI, ESI, or SI; addresses source string data for the string instructions. Like RDI, RSI also functions as a general-purpose register.
R8 - R15 - found in the Pentium 4 and Core2 if 64-bit extensions are enabled. Data are addressed as 64-, 32-, 16-, or 8-bit sizes and are of general purpose. Most applications will not use these registers until 64-bit processors are common. The 8-bit portion is the rightmost 8 bits only; bits 8 to 15 are not directly addressable as a byte.
Special-Purpose Registers
Include RIP, RSP, and RFLAGS.
Segment registers include CS, DS, ES, SS, FS, and GS.
The EFLAG and FLAG register counts for the entire 8086 and Pentium microprocessor family
Flags never change for any data transfer or program control operation. Some of the flags are also used to control features found in the microprocessor.
Segments and Offsets
All real mode memory addresses must consist of a segment address plus an offset address.
The segment address defines the beginning address of any 64K-byte memory segment.
The offset address selects any location within the 64K-byte memory segment.
The 80286 and above operate in either the real or protected mode.
Real mode operation allows addressing of only the first 1M byte of memory space, even in the Pentium 4 or Core2 microprocessor.
The first 1M byte of memory is called the real memory, conventional memory, or DOS memory system.
[Figure: The real mode memory-addressing scheme, using a segment address plus an offset.]
The figure shows how the segment plus offset addressing scheme selects a memory location. It also shows how an offset address, called a displacement, of F000H selects location 1F000H in the memory when the segment begins at 10000H.
Segment and Offset Addressing Scheme Allows Relocation
Segment plus offset addressing allows DOS programs to be relocated in memory.
A relocatable program is one that can be placed into any area of memory and executed without change.
Relocatable data can be placed in any area of memory and used without any change to the program.
Because memory is addressed within a segment by an offset address, the memory segment can be moved to any place in the memory system without changing any of the offset addresses.
Only the contents of the segment register must be changed to address the program in the new area of memory.
PROGRAM SEGMENTS
A typical x86 assembly language program consists of at least 3 segments:
A code segment - which contains the Assembly language instructions that perform the tasks that the program was designed to accomplish.
A data segment - used to store information (data) to be processed by the instructions in the code segment.
A stack segment - used by the CPU to store information temporarily.
Origin and definition of the segment
A segment is an area of memory that includes up to 64K bytes and begins on an address evenly divisible by 16 (such an address ends in 0H).
The 8085 addressed a maximum of 64K of physical memory, since it had only 16 pins for address lines (2^16 = 64K).
This limitation was carried into the 8088/86 design for compatibility.
In the 8085 there was 64K bytes of memory for all code, data, and stack information.
In the 8088/86 there can be up to 64K bytes in each category: the code segment, data segment, and stack segment.
Logical Address and Physical Address
In literature concerning the 8086, there are three types of addresses mentioned frequently:
The physical address - the 20-bit address actually on the address pins of the 8086 processor, decoded by the memory interfacing circuitry.
This address can have a range of 00000H to FFFFFH. It is an actual physical location in RAM or ROM within the 1 MB memory range.
The offset address - a location in a 64K-byte segment range, which can range from 0000H to FFFFH.
The logical address - which consists of a segment value and an offset address.
Code Segment
To execute a program, the 8086 fetches the instructions (opcodes and operands) from the code segment.
The logical address of an instruction always consists of a CS (code segment) and an IP (instruction pointer), shown in CS:IP format.
IP contains the offset address.
The physical address for the location of the instruction is generated by shifting the CS left one hex digit, then adding it to the IP.
The resulting 20-bit address is called the physical address since it is put on the external physical address bus pins.
Assume CS = 2500H and IP = 95F3H:
The offset address, contained in IP, is 95F3H.
The logical address is CS:IP, or 2500:95F3H.
The physical address will be 25000 + 95F3 = 2E5F3H.
The microprocessor will retrieve the instruction from memory locations starting at 2E5F3H.
Since IP can have a minimum value of 0000H and a maximum of FFFFH, the logical address range in this example is 2500:0000 to 2500:FFFF.
This means that the lowest memory location of the code segment above will be 25000H (25000 + 0000) and the highest memory location will be 34FFFH (25000 + FFFF).
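The shift-and-add rule is easy to verify numerically. The Python sketch below computes a real mode physical address from a segment and offset and reproduces the figures from this example; the function names are illustrative.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real mode address: shift the segment left one hex digit (4 bits), add the offset."""
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF


def segment_range(segment: int) -> tuple:
    """Lowest and highest physical addresses reachable through one segment value."""
    return physical_address(segment, 0x0000), physical_address(segment, 0xFFFF)


# CS = 2500H and IP = 95F3H, as in the example above.
instruction_address = physical_address(0x2500, 0x95F3)
low, high = segment_range(0x2500)
```

Formatting the results with `format(instruction_address, "05X")` gives `2E5F3`, and the segment spans 25000H to 34FFFH.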
What happens if the desired instructions are located beyond these two limits?
The value of CS must be changed to access those instructions.
Code Segment - Logical/Physical Address
In the next code segment, CS and IP hold the logical address of the instructions to be executed.
The following Assembly language instructions have been assembled (translated into machine code) and stored in memory.
The three columns show the logical address of CS:IP, the machine code stored at that address, and the corresponding Assembly language code.
The physical address is put on the address bus by the CPU to be decoded by the memory circuitry.
Instruction "MOV AL,57" has a machine code of B057. B0 is the opcode and 57 is the operand.
The byte at address 1132:0100 contains B0, the opcode for moving a value into register AL. Address 1132:0101 contains the operand to be moved to AL.
Data segment
Assume a program to add 5 bytes of data, such as 25H, 12H, 15H, 1FH, and 2BH.
One way to add them is to write each data byte directly into the instructions.
In such a program, the data and code are mixed together in the instructions.
If the data changes, the code must be searched for every place it is included, and the data retyped.
From this arose the idea of an area of memory strictly for data.
In x86 microprocessors, the area of memory set aside for data is called the data segment.
The data segment uses register DS and an offset value.
DEBUG assumes that all numbers are in hex, so no "H" suffix is required.
The next program demonstrates how data can be stored in the data segment and the program rewritten so that it can be used for any set of data.
Assume the data segment offset begins at 200H.
The data is placed in memory locations:
DS:0200 = 25
DS:0201 = 12
DS:0202 = 15
DS:0203 = 1F
DS:0204 = 2B
The offset address is enclosed in brackets, which indicate that the operand represents the address of the data and not the data itself.
If the brackets were not included, as in "MOV AL,0200", the CPU would attempt to move 200 into AL instead of the contents of offset address 200.
This program will run with any set of data. Changing the data has no effect on the code.
If the data had to be stored at a different offset address, the program would have to be rewritten.
A way to solve this problem is to use a register to hold the offset address, and before each ADD, increment the register to access the next byte.
The 8088/86 allows only the use of registers BX, SI, and DI as offset registers for the data segment.
The term pointer is often used for a register holding an offset address.
Data segment logical/physical address
The physical address for data is calculated using the same rule as for the code segment: shift DS left one hex digit and add the offset value, as shown in Examples 1-2, 1-3, and 1-4.
Little endian convention
Previous examples used 8-bit or 1-byte data.
What happens when 16-bit data is used?
The low byte goes to the low memory location and the high byte goes to the high memory address.
For example, when the 16-bit value 35F3H is stored at offset 1500, memory location DS:1500 contains F3H and memory location DS:1501 contains 35H.
In the big endian method, the high byte goes to the low address.
In the little endian method, the high byte goes to the high address and the low byte to the low address.
All Intel microprocessors and many microcontrollers use the little endian convention.
Freescale (formerly Motorola) microprocessors, along with some other microcontrollers, use big endian.
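The two byte orders can be compared directly with Python's built-in `int.to_bytes`. This sketch stores a 16-bit value at consecutive offsets in each convention; the helper name `store_word` and the use of a dictionary as memory are assumptions made for the illustration.

```python
def store_word(memory: dict, offset: int, value: int, byteorder: str = "little") -> None:
    """Write a 16-bit value into two consecutive byte locations."""
    encoded = (value & 0xFFFF).to_bytes(2, byteorder)
    memory[offset] = encoded[0]        # lower address
    memory[offset + 1] = encoded[1]    # higher address


little_memory = {}
big_memory = {}
store_word(little_memory, 0x1500, 0x35F3, "little")   # x86 convention
store_word(big_memory, 0x1500, 0x35F3, "big")         # big endian, for comparison
```

In `little_memory`, offset 1500 holds F3H and offset 1501 holds 35H; in `big_memory` the two bytes are swapped.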
Extra segment (ES)
ES is a segment register used as an extra data segment.
In many normal programs this segment is not used, but its use is essential for string operations.
THE STACK
What is a stack? Why is it needed?
The stack is a section of read/write memory (RAM) used by the CPU to store information temporarily.
The CPU needs this storage area since there are only a limited number of registers.
There must be some place for the CPU to store information safely and temporarily.
How stacks are accessed
The stack is a section of RAM, so there must be registers inside the CPU to point to it:
The SS (stack segment) register.
The SP (stack pointer) register.
These registers must be loaded before any instructions accessing the stack are used.
The x86 stack pointer register (SP) points at the current memory location used as the top of the stack.
As data is pushed onto the stack, SP is decremented. As data is popped off the stack into the CPU, SP is incremented.
Almost every 16-bit register inside the x86 can be stored on the stack and brought back into the CPU from the stack memory. The main exception is CS, which can be pushed but cannot be loaded with POP.
Storing a CPU register in the stack is called a push. Loading the contents of the stack into the CPU register is called a pop.
When an instruction pushes or pops a general-purpose register, it must be the entire 16-bit register.
One must code "PUSH AX".
There are no instructions such as "PUSH AL" or "PUSH AH".
SP is decremented on each push so that the stack grows downward from upper addresses to lower addresses.
This is the opposite of IP (the instruction pointer), which moves upward as instructions are fetched.
Pushing onto the stack
As each PUSH is executed, the register contents are saved on the stack and SP is decremented by 2.
To ensure the code section and stack section of the program never write over each other, they are located at opposite ends of the RAM set aside for the program.
They grow toward each other but must not meet.
If they meet, the program will crash.
For every byte of data saved on the stack, SP is decremented once.
Since the push is saving the contents of a 16-bit register, it decrements twice.
In the x86, the lower byte is always stored in the memory location with the lower address.
For example, when AX is pushed with SP = 1236, the content of AH, 24H, is saved in the memory location with the address 1235, and AL is stored in location 1234.
Popping the stack
With every pop, the top 2 bytes of the stack are copied to the x86 CPU register specified by the instruction, and the stack pointer is incremented twice.
While the data actually remains in memory, it is not accessible, since the stack pointer, SP, is beyond that point.
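The push and pop behavior described above can be modeled with a few lines of Python. This is a simplified sketch that tracks only SP and the bytes in the stack segment; the class name is hypothetical, and the value B6H used for AL is an assumed example, since the slide gives only AH = 24H.

```python
class Stack8086:
    """Minimal model of 16-bit PUSH and POP within one stack segment."""

    def __init__(self, sp: int):
        self.sp = sp & 0xFFFF
        self.memory = {}            # offset within SS -> byte

    def push(self, value: int) -> None:
        self.sp = (self.sp - 2) & 0xFFFF                          # stack grows downward
        self.memory[self.sp] = value & 0xFF                       # low byte, lower address
        self.memory[(self.sp + 1) & 0xFFFF] = (value >> 8) & 0xFF  # high byte, higher address

    def pop(self) -> int:
        low = self.memory[self.sp]
        high = self.memory[(self.sp + 1) & 0xFFFF]
        self.sp = (self.sp + 2) & 0xFFFF                          # data stays in memory
        return (high << 8) | low


stack = Stack8086(sp=0x1236)
stack.push(0x24B6)                  # PUSH AX with AH = 24H, AL = B6H (AL assumed)
sp_after_push = stack.sp
restored = stack.pop()              # POP back into a 16-bit register
```

After the push, SP is 1234, offset 1235 holds 24H, and offset 1234 holds the low byte. After the pop, SP is back at 1236 while both bytes are still present in `stack.memory`.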
Logical vs physical stack address
The exact physical location of the stack depends on the value of the stack segment (SS) register and SP, the stack pointer.
To compute physical addresses for the stack, shift SS left one hex digit, then add the offset SP, the stack pointer register.
A few more words about x86 segments
Can a single physical address belong to many different logical addresses?
Observe the physical address value of 15020H.
Many possible logical addresses represent this single physical address, for example 1000:5020, 1500:0020, and 1502:0000.
When adding the offset to the shifted segment register results in an address beyond the maximum allowed range of FFFFFH, wrap-around will occur.
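Both points about segments, aliasing and wrap-around, can be explored with a short Python sketch. It assumes a 20-bit address bus as on the 8086/8088; the function names are illustrative, and the alias search deliberately ignores aliases created by wrap-around.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real mode address on a 20-bit bus; results above FFFFFH wrap around to 00000H."""
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF


def logical_aliases(physical: int) -> list:
    """All (segment, offset) pairs that reach `physical` without wrap-around."""
    aliases = []
    for segment in range(0x10000):
        offset = physical - (segment << 4)
        if 0 <= offset <= 0xFFFF:
            aliases.append((segment, offset))
    return aliases


aliases = logical_aliases(0x15020)
wrapped = physical_address(0xFFFF, 0x0010)   # FFFF0H + 0010H exceeds FFFFFH
```

The list `aliases` contains pairs such as (1000H, 5020H) and (1502H, 0000H), and `wrapped` comes out as 00000H because the twenty-first bit is dropped.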
An illustration of the dynamic behavior of the segment and offset concept in the 8086 CPU.
Overlapping
In calculating the physical address, it is possible that two segments can overlap.
Segment Address Sources for Different Operations
Operation Type                            Segment   Alternative Segment   Offset
Instruction fetch                         CS        none                  IP
Stack operation                           SS        none                  SP
Data operation (except those below)       DS        CS, ES, or SS         several
String source                             DS        CS, ES, or SS         SI
String destination                        ES        none                  DI
BP (used as base register)                SS        CS, DS, or ES         several
[Figure: Segmented Addressing. Memory map of the 1M-byte address space, from 00000h to FFFFFh.]
The base address of the descriptor indicates the starting location of the memory segment.
The paragraph boundary limitation is removed in protected mode; segments may begin at any address.
The G, or granularity, bit allows a segment length of 4K to 4G bytes in steps of 4K bytes.
A 32-bit offset address allows segment lengths of 4G bytes.
A 16-bit offset address allows segment lengths of 64K bytes.
Operating systems operate in a 16- or 32-bit environment.
DOS uses a 16-bit environment.
Most Windows applications use a 32-bit environment called WIN32.
MSDOS/PCDOS and Windows 3.1 operating systems require 16-bit instruction mode.
The 32-bit instruction mode is accessible only in a protected mode system such as Windows Vista.
The access rights byte for the 80286 through Core2 descriptor
The contents of a segment register during protected mode operation of the 80286 through Core2 microprocessors
Descriptors are chosen from the descriptor table by the segment register.
The segment register contains a 13-bit selector field, a table selector (TI) bit, and a requested privilege level (RPL) field.
The TI bit selects either the global or the local descriptor table.
The requested privilege level (RPL) requests the access privilege level of a memory segment.
If privilege levels are violated, the system normally indicates an application or privilege level violation.
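The three fields of a protected mode segment register can be separated with shifts and masks. The sketch below assumes the standard x86 selector layout, which the slide does not spell out: RPL in bits 0-1, TI in bit 2, and the 13-bit selector (index) in bits 3-15. The function name and dictionary keys are illustrative.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into its index, TI, and RPL fields.

    Assumed layout: bits 15-3 = descriptor index, bit 2 = TI
    (0 = global table, 1 = local table), bits 1-0 = RPL.
    """
    selector &= 0xFFFF
    return {
        "index": selector >> 3,
        "ti": (selector >> 2) & 1,
        "rpl": selector & 0b11,
    }


# Selector 0008H: descriptor 1 in the global table, requested privilege level 0.
fields = decode_selector(0x0008)
```

Because descriptor zero is the null descriptor, the first selector that can address memory through the global table is 0008H.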
Using the DS register to select a description from the global descriptor table. In this example, the DS register accesses memory locations 00100000H001000FFH as a data segment.
Figure 29 shows how the segment register, containing a selector, chooses a descriptor from the global descriptor table. The entry in the global descriptor table selects a segment in the memory system. Descriptor zero is called the null descriptor, must contain all zeros, and may not be used for accessing memory.
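The three fields of a protected mode segment register can be pulled apart with shifts and masks. The sketch below assumes the usual layout (bits 15-3 selector index, bit 2 TI, bits 1-0 RPL) and 8-byte descriptors; the function name and the sample selector values are illustrative.

```python
DESCRIPTOR_SIZE = 8  # bytes per descriptor table entry

def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment register value into its protected mode fields."""
    return {
        "index": selector >> 3,      # 13-bit selector: which descriptor
        "ti": (selector >> 2) & 1,   # 0 = global table, 1 = local table
        "rpl": selector & 0b11,      # requested privilege level, 0-3
    }

fields = decode_selector(0x0008)
print(fields)
# Byte offset of the chosen descriptor inside its table.
print(fields["index"] * DESCRIPTOR_SIZE)
```

A 13-bit index gives 8,192 possible descriptors per table; index 0 of the global table is the null descriptor.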
Program-Invisible Registers
Global and local descriptor tables are found in the memory system. To access and specify the table addresses, the 80286 through Core2 contain program-invisible registers, which are not directly addressed by software. Each segment register contains a program-invisible portion used in the protected mode. This portion is often called cache memory, because cache is any memory that stores information.
When a new segment number is placed in a segment register, the microprocessor accesses a descriptor table and loads the descriptor into the program-invisible portion of the segment register. It is held there and used to access the memory segment until the segment number is changed. This allows the microprocessor to repeatedly access a memory segment without referring to the descriptor table, hence the term cache.
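The caching behavior can be modeled loosely in Python. This is only a sketch: the class name, the table layout (a list of base/limit pairs), and the read counter are assumptions made for illustration, and real descriptors carry more fields such as access rights.

```python
class SegmentRegister:
    """Toy model of a segment register with a program-invisible cache."""

    def __init__(self, descriptor_table):
        self.table = descriptor_table  # hypothetical list of (base, limit)
        self.selector = None           # program-visible 16-bit part
        self.cache = None              # program-invisible part
        self.table_reads = 0           # counts descriptor table accesses

    def load(self, selector: int) -> None:
        """Loading a new selector is the only time the table is read."""
        self.selector = selector
        self.cache = self.table[selector >> 3]
        self.table_reads += 1

    def linear_address(self, offset: int) -> int:
        """Memory accesses use the cached base and limit only."""
        base, limit = self.cache
        if offset > limit:
            raise ValueError("offset beyond segment limit")
        return base + offset

# Entry 0 is the null descriptor; entry 1 describes 00100000H-001000FFH.
gdt = [(0, 0), (0x00100000, 0xFF)]
ds = SegmentRegister(gdt)
ds.load(0x0008)
print(hex(ds.linear_address(0x00)), hex(ds.linear_address(0xFF)))
print(ds.table_reads)
```

However many addresses are formed, the table is consulted once per selector load, which is the point of the program-invisible portion.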
The GDTR (global descriptor table register) and IDTR (interrupt descriptor table register) contain the base address of the descriptor table and its limit. When protected mode operation is desired, the address of the global descriptor table and its limit are loaded into the GDTR. The location of the local descriptor table is selected from the global descriptor table: one of the global descriptors is set up to address the local descriptor table.
To access the local descriptor table, the LDTR (local descriptor table register) is loaded with a selector. The selector accesses the global descriptor table and loads the local descriptor table address, limit, and access rights into the cache portion of the LDTR. The TR (task register) holds a selector, which accesses a descriptor that defines a task. A task is most often a procedure or application. This allows multitasking systems to switch from one task to another in a simple and orderly fashion.
Memory Paging
The memory paging mechanism allows any physical memory location to be assigned to any linear address. The linear address is defined as the address generated by a program. The physical address is the actual memory location accessed by a program. With memory paging, the linear address is invisibly translated to any physical address.
Paging Registers
The paging unit is controlled by the contents of the microprocessor's control registers. Beginning with the Pentium, an additional control register labeled CR4 controls extensions to the basic architecture. See Figure 2-11 for the contents of control registers CR0 through CR4.
The format for the linear address (a) and a page directory or page table entry (b)
The linear address, as generated by software, is broken into three sections that are used to access the page directory entry, page table entry, and memory page offset address: a 10-bit directory field, a 10-bit page table field, and a 12-bit offset. Figure 2-12 shows the linear address and its makeup for paging. If page table entry 0 contains address 00100000H, then when the program accesses a location between 00000000H and 00000FFFH, the microprocessor physically addresses location 00100000H-00100FFFH.
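The three-way split of the linear address can be sketched with shifts and masks. This assumes the standard 4K-byte paging layout (10-bit directory, 10-bit table, 12-bit offset); the function name is illustrative.

```python
def split_linear_address(linear: int) -> tuple:
    """Return (directory index, page table index, page offset)."""
    directory = (linear >> 22) & 0x3FF   # bits 31-22
    table = (linear >> 12) & 0x3FF       # bits 21-12
    offset = linear & 0xFFF              # bits 11-0
    return directory, table, offset

# Linear address 000C8123H: directory entry 0, table entry C8H, offset 123H.
directory, table, offset = split_linear_address(0x000C8123)
print(directory, hex(table), hex(offset))
```

Because the offset is 12 bits, each page is 4K bytes; because the table index is 10 bits, one page table covers 1,024 pages, or 4M bytes.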
DOS and EMM386.EXE use page tables to redefine memory between locations C8000H-EFFFFH as upper memory blocks. This is done by repaging extended memory to backfill the conventional memory system, allowing DOS access to additional memory. Each entry in the page directory corresponds to 4M bytes of physical memory. Each entry in the page table repages 4K bytes of physical memory. Windows also repages the memory system.
The page directory, page table 0, and two memory pages. Note how the address of page 000C8000-000C9000 has been moved to 00110000-00110FFF.
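The repaging in this figure can be imitated with a toy two-level lookup. The dictionaries below are a simplification assumed for illustration: real page directory and page table entries are 32-bit words that also carry control bits, and only the single mapping named in the caption is filled in.

```python
# Hypothetical page table 0: entry index -> physical page base address.
page_table_0 = {0xC8: 0x00110000}
# Hypothetical page directory: entry index -> page table.
page_directory = {0: page_table_0}

def translate(linear: int) -> int:
    """Translate a linear address to a physical address (4K-byte pages)."""
    directory_index = (linear >> 22) & 0x3FF
    table_index = (linear >> 12) & 0x3FF
    offset = linear & 0xFFF
    page_table = page_directory[directory_index]   # KeyError if unmapped
    page_base = page_table[table_index]            # KeyError if unmapped
    return page_base | offset

print(hex(translate(0x000C8000)))   # first byte of the repaged page
print(hex(translate(0x000C8FFF)))   # last byte of the repaged page
```

The offset passes through unchanged; only the upper 20 bits of the address are replaced by the page table entry.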
Flat Mode Memory A flat mode memory system is one in which there is no segmentation.
It does not use a segment register to address a location in the memory.
The first byte address is at 00 0000 0000H; the last location is at FF FFFF FFFFH.
The address is 40 bits wide.
The segment register still selects the privilege level of the software.
I/O OPERATIONS
Two Addressing Modes
1) Immediate Port Address
 - can only be 1 byte
 - can only address ports 00h through FFh
2) Port Address Present in DX
 - can address all ports 0000h through FFFFh
 - can only use DX for port addresses
 - can only use AL, AX, EAX for port data
in  al, 40h    ; al gets 1 byte from port 40h
in  ax, 0ffh   ; ax gets 2 bytes from port ffh
in  al, dx     ; al gets 1 byte from port address in dx
in  eax, dx    ; eax gets 4 bytes from port addr. in dx
out 80h, al    ; send contents of al to port 80h
out dx, eax    ; send contents of eax to port addr. in dx
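The choice between the two addressing modes follows directly from the port number. The Python sketch below generates the instruction text for an input operation under those rules; the helper name `build_in` and the sample port 3F8h are assumptions for illustration.

```python
# Transfer size in bytes -> the only registers allowed for port data.
DATA_REGISTER = {1: "al", 2: "ax", 4: "eax"}

def build_in(port: int, size: int) -> list:
    """Return the assembly lines needed to read `size` bytes from `port`."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("I/O ports range from 0000h to FFFFh")
    register = DATA_REGISTER[size]   # KeyError for unsupported sizes
    if port <= 0xFF:
        # Immediate form: the port number fits in one byte.
        return [f"in {register}, 0{port:02X}h"]
    # Larger port numbers must be placed in DX first.
    return [f"mov dx, 0{port:04X}h", f"in {register}, dx"]

print(build_in(0x40, 1))
print(build_in(0x3F8, 2))
```

The DX form also works for ports 00h through FFh; the immediate form is simply shorter when it is available.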
THE MICROPROCESSOR-BASED PERSONAL COMPUTER SYSTEM
IBM PC MEMORY MAP All x86 CPUs in real mode provide 20 address bits.
Maximum memory access is one megabyte.
The type of microprocessor present determines whether an extended memory system exists. The first 1M byte of memory is often called the real or conventional memory system.
Intel microprocessors are designed to function in this area using real mode operation.
The 20 system address bus lines, A0-A19, can take the lowest value of all 0s to the highest value of all 1s in binary.
Converted to hex, this gives an address range of 00000H to FFFFFH.
Fig. 10-14: 20-Bit Address Range in Real Mode for x86 CPUs
IBM PC MEMORY MAP BIOS Data Area The BIOS data area is used by BIOS to store some extremely important system information.
The operating system navigates the system hardware with the help of information stored in the BIOS data area.
More About RAM In the early 80s, most PCs came with 64K to 256K bytes of RAM, more than adequate at the time.
Users had to buy memory to expand up to 640K.
Because the amount of installed RAM varies from system to system, managing it is left to the operating system. For this reason, we do not assign any values for the CS, DS, and SS registers.
Such an assignment means specifying an exact physical address in the range 00000H-9FFFFH, and this is beyond the knowledge of the user.
Function of BIOS ROM There must be some permanent (nonvolatile) memory to hold the programs telling the CPU what to do when the power is turned on.
This collection of programs is referred to as BIOS.
The 64K bytes from location F0000H-FFFFFH are used by BIOS (basic input/output system) ROM.
Some of the remaining space is used by various adapter cards (such as the network card), and the rest is free.
IBM PC MEMORY MAP Video Display RAM (VDR) map To display information on the monitor of the PC, the CPU must first store it in video display RAM (VDR).
The video controller displays VDR contents on the screen.
The address of the VDR must be within the CPU address range.
In the x86, from A0000H to BFFFFH, a total of 128K bytes of addressable memory is allocated for video.
IBM PC MEMORY MAP ROM Address and Cold Boot in the PC When power is applied to a CPU it must wake up at an address that belongs to ROM.
The first code executed by the CPU must be stored in nonvolatile memory. On RESET, the 8088 starts to fetch information from CS:IP of FFFF:0000.
This corresponds to physical address FFFF0H.
As the microprocessor starts to fetch and execute instructions from FFFF0H, there must be an opcode in that ROM location.
The CPU finds the opcode for the FAR jump, and the target address of the JUMP.
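A short calculation shows why a jump is needed at the reset address. The sketch assumes the direct far jump encoding (one opcode byte, a 16-bit offset, and a 16-bit segment, five bytes in total); the constant names are illustrative.

```python
RESET_CS, RESET_IP = 0xFFFF, 0x0000   # CS:IP after RESET
TOP_OF_MEMORY = 0xFFFFF               # last address in the 1M-byte space
FAR_JUMP_LENGTH = 5                   # opcode + 16-bit offset + 16-bit segment

reset_address = (RESET_CS << 4) + RESET_IP
bytes_available = TOP_OF_MEMORY - reset_address + 1

print(f"{reset_address:05X}H")
print(bytes_available)
print(FAR_JUMP_LENGTH <= bytes_available)
```

Only a handful of bytes lie between the reset address and the top of memory, so the ROM can hold little more than a far jump there, and the jump redirects execution to the main body of the BIOS.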
Computers based on the 80286 through the Core2 contain the TPA (640K bytes) and system area (384K bytes). They also contain extended memory and are often called AT class machines. The PS/1 and PS/2 by IBM are other versions of the same basic memory design. These machines are also referred to as ISA (industry standard architecture) or EISA (extended ISA) machines. The PS/2 is referred to as a micro-channel architecture or ISA system, depending on the model number.
Pentium and ATX class machines feature the addition of the PCI (peripheral component interconnect) bus, now used in all Pentium through Core2 systems.
Extended memory is up to 15M bytes in the 80286 and 80386SX, and 4095M bytes in the 80386DX, 80486, and Pentium microprocessors. The Pentium Pro through Core2 computer systems have up to 1M less than 4G or 1M less than 64G of extended memory. Servers tend to use the larger memory map.
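These figures follow from the width of the address bus: extended memory is whatever the processor can address beyond the first 1M byte. The sketch below assumes 24, 32, and 36 address bits for the three groups; the function name is illustrative.

```python
MB = 2**20

def extended_memory_bytes(address_bits: int) -> int:
    """Addressable memory above the first 1M byte of real memory."""
    return 2**address_bits - MB

for name, bits in [
    ("80286, 80386SX", 24),
    ("80386DX, 80486, Pentium", 32),
    ("Pentium Pro through Core2, larger map", 36),
]:
    print(f"{name}: {extended_memory_bytes(bits) // MB}M bytes")
```

The 36-bit case is the "1M less than 64G" map mentioned for servers.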
Many 80486 systems use the VESA local (VL) bus to interface disk and video to the microprocessor at the local bus level.
This allows 32-bit interfaces to function at the same clocking speed as the microprocessor. A recent modification supporting a 64-bit data bus has generated little interest.
Data transfer rates are 12 Mbps for USB1 and increase to 480 Mbps in USB2.
The AGP (advanced graphics port) is used for video cards. The port transfers data between the video card and the microprocessor at higher speeds.
66 MHz, with 64-bit data path
The latest buses are the serial ATA interface (SATA) for hard disk drives and the PCI Express bus for the video card. The SATA bus transfers data from the PC to the hard disk at rates of 150M bytes per second, or 300M bytes per second for SATA-2.
The serial ATA standard will eventually reach speeds of 450M bytes per second.
The TPA
The transient program area (TPA) holds the DOS (disk operating system) and other programs that control the computer system.
The TPA is a DOS concept and not really applicable in Windows. It also stores any currently active or inactive DOS application programs. The length of the TPA is 640K bytes.
Figure 1-8 The memory map of the TPA in a personal computer. (Note that this map will vary between systems.)
The DOS memory map shows how areas of the TPA are used for system programs, data, and drivers.
It also shows a large area of memory available for application programs. The hexadecimal number to the left of each area represents the memory addresses that begin and end each data area.
Interrupt vectors access DOS, BIOS (basic I/O system), and applications. Areas contain transient data to access I/O devices and internal features of the system.
These are stored in the TPA so they can be changed as DOS operates.
The IO.SYS program loads into the TPA from the disk whenever an MS-DOS system is started. IO.SYS contains programs that allow DOS to use the keyboard, video display, printer, and other I/O devices often found in computers. The IO.SYS program links DOS to the programs stored on the system BIOS ROM.
Installable drivers control or drive devices or programs added to the computer system. DOS drivers normally have an extension of .SYS, as in MOUSE.SYS. In DOS version 3.2 and later, the files have an extension of .EXE, as in EMM386.EXE.
Though these drivers are not used by Windows, they are still used to execute DOS applications, even with Windows XP. Windows uses a file called SYSTEM.INI to load drivers used by Windows. Newer versions of Windows have a registry added to contain information about the system and the drivers used. You can view the registry with the REGEDIT program.
COMMAND.COM (command processor) controls operation of the computer from the keyboard when operated in the DOS mode. COMMAND.COM processes DOS commands as they are typed from the keyboard. If COMMAND.COM is erased, the computer cannot be used from the keyboard in DOS mode.
Never erase COMMAND.COM, IO.SYS, or MSDOS.SYS to make room for other software, or your computer will not function.
The first area of the system space contains video display RAM and video control programs on ROM or flash memory.
This area starts at location A0000H and extends to C7FFFH. The size/amount of memory depends on the type of video display adapter attached.
Memory at B0000H-BFFFFH stores text data. The video BIOS, on a ROM or flash memory, is at locations C0000H-C7FFFH.
It contains programs to control the DOS video display.
An expanded memory system (EMS) allows a 64K-byte page frame of memory for use by applications.
The page frame (D0000H-DFFFFH) is used to expand the memory system by switching in pages of memory from EMS into this range of memory addresses.
The system BIOS ROM is located in the top 64K bytes of the system area (F0000H-FFFFFH).
It controls operation of the basic I/O devices connected to the computer system. It does not control operation of video.
Locations E0000H-EFFFFH contain cassette BASIC on ROM found in early IBM systems.
This area is often open or free in newer computer systems.
The first part of the system BIOS (F0000H-F7FFFH) often contains programs that set up the computer. The second part contains procedures that control the basic I/O system.
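The regions described in this section can be collected into one lookup table. The sketch below is a rough summary: the address ranges come from the text, while the label for C8000H-CFFFFH is an assumption, and actual maps vary between systems.

```python
# (first address, last address, description) for the first 1M byte.
REAL_MODE_MAP = [
    (0x00000, 0x9FFFF, "TPA (conventional memory)"),
    (0xA0000, 0xBFFFF, "video display RAM"),
    (0xC0000, 0xC7FFF, "video BIOS"),
    (0xC8000, 0xCFFFF, "free or adapter cards"),   # assumed label
    (0xD0000, 0xDFFFF, "EMS page frame"),
    (0xE0000, 0xEFFFF, "cassette BASIC or free"),
    (0xF0000, 0xFFFFF, "system BIOS ROM"),
]

def region_of(address: int) -> str:
    """Name the memory map region that contains a physical address."""
    for first, last, name in REAL_MODE_MAP:
        if first <= address <= last:
            return name
    raise ValueError("address is outside the first 1M byte")

print(region_of(0xB8000))
print(region_of(0xFFFF0))   # the reset address lies in the system BIOS
```

The TPA entry accounts for 640K bytes and the remaining entries for the 384K-byte system area.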
Windows Systems
Modern computers use a different memory map with Windows than the DOS memory maps. The Windows memory map in Figure 1-10 has two main areas: a TPA and a system area. The differences between it and the DOS memory map are the sizes and locations of these areas.
The TPA is the first 2G bytes, from locations 00000000H to 7FFFFFFFH. Every Windows program can use up to 2G bytes of memory located at linear addresses 00000000H through 7FFFFFFFH. The system area is the last 2G bytes, from 80000000H to FFFFFFFFH.
The physical map of the memory system is much different. Every process in a Windows Vista, XP, or 2000 system has its own set of page tables. The process can be located anywhere in the memory, even in noncontiguous pages. The operating system assigns physical memory to the application.
If not enough exists, it uses the hard disk for any that is not available.
I/O Space
I/O devices allow the microprocessor to communicate with the outside world. I/O (input/output) space in a computer system extends from I/O port 0000H to port FFFFH.
An I/O port address is similar to a memory address, but instead of memory, it addresses an I/O device.
Figure 1-11 shows the I/O map found in many personal computer systems.
Access to most I/O devices should always be made through Windows, DOS, or BIOS function calls. The map shown is provided as a guide to illustrate the I/O space in the system.
The area below I/O location 0400H is considered reserved for system devices. The area available for expansion extends from I/O port 0400H through FFFFH. Generally, 0000H-00FFH addresses main board components; 0100H-03FFH handles devices located on plug-in cards or also on the main board. The limitation of I/O addresses between 0000H and 03FFH comes from original standards specified by IBM for the PC standard.
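The division of the I/O space can be expressed as a simple range check. This sketch only restates the general ranges given in the text; the function name and the sample ports are illustrative, and real port assignments depend on the system.

```python
def port_region(port: int) -> str:
    """Classify a 16-bit I/O port number by the general ranges in the text."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("I/O ports range from 0000H to FFFFH")
    if port <= 0x00FF:
        return "main board components"
    if port <= 0x03FF:
        return "plug-in cards or main board"
    return "expansion area"

for port in (0x0040, 0x03F8, 0x0400):
    print(f"{port:04X}H: {port_region(port)}")
```

Everything below 0400H falls in one of the two reserved system ranges.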
Acknowledgement
The slides have been based in part upon original slides of a number of books, including: The x86 PC: Assembly Language, Design, and Interfacing, 5/E, M. A. Mazidi, J. G. Mazidi, D. Causey, Prentice Hall, 2010. Intel Microprocessors, 8th Ed., B. B. Brey, Prentice Hall, 2009. Mikroişlemciler ve Bilgisayarlar, 6th Ed., H. Gümüşkaya, ALFA, 2011.