
Computer architecture basics

1. What is pipelining?
Pipelining is an implementation technique in which multiple instructions are
overlapped in execution, with each instruction in a different stage of its
execution.

2. What are the five stages of a DLX pipeline?
The five stages of a DLX pipeline are instruction fetch, instruction decode,
execution, memory access, and write back.

3. What is a dependency in a pipeline?
Instructions in a pipeline may depend on one another, and these dependencies
lead to hazards.

4. What is a hazard in a pipeline?
There are situations in a pipeline where the next instruction cannot execute
until the previous instruction completes. These events are called hazards.

5. Explain the different types of hazards?
The different types of hazards are structural hazards, data hazards and
control hazards.
Structural hazards occur when the hardware cannot support the combination of
instructions we want to execute in the same clock cycle.
Data hazards occur when an instruction uses data produced by a previous
instruction and cannot execute until that previous instruction completes.
Control hazards arise from the need to make a decision based on the result of
one instruction while others are still executing.

6. Explain RAW, WAR and WAW hazards?
These are the three different types of data hazards possible in a pipeline. A
RAW (read after write) hazard means instruction J cannot read until
instruction I has written. A WAW (write after write) hazard means instruction
J cannot write until instruction I has written. A WAR (write after read)
hazard means instruction J cannot write until instruction I has read all its
sources.

7. What are the names given to the various data Hazards?
The RAW hazard is called a true dependency, the WAR hazard an
anti-dependency, and the WAW hazard an output dependency.
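
The three data hazards above can be detected mechanically from the register sets each instruction reads and writes. The helper below is an illustrative sketch (the function name and register names are my own, not from the text):

```python
# Classify the data hazards between an earlier instruction i and a later
# instruction j, given the sets of registers each reads and writes.
def classify_hazards(i_reads, i_writes, j_reads, j_writes):
    hazards = []
    if i_writes & j_reads:
        hazards.append("RAW (true dependency)")    # j reads what i writes
    if i_reads & j_writes:
        hazards.append("WAR (anti-dependency)")    # j overwrites what i reads
    if i_writes & j_writes:
        hazards.append("WAW (output dependency)")  # both write the same place
    return hazards

# ADD R1, R2, R3 followed by SUB R4, R1, R5: R1 is written, then read.
print(classify_hazards({"R2", "R3"}, {"R1"}, {"R1", "R5"}, {"R4"}))
# ['RAW (true dependency)']
```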




8. What is stalling in a pipeline and when does it occur?
A pipeline stalls when it encounters a hazard. While a pipeline is stalled,
no new instructions are fetched in the stalled clock cycle. Instructions
issued earlier continue execution, while instructions issued later than the
stall stop execution.

9. What is the impact of hazards on a pipeline and how will you overcome
each of the three different hazards?
Hazards limit the performance of the pipeline, and each of the three kinds of
hazards has to be handled separately. Structural hazards can be eliminated by
adding hardware resources. Data hazards can be eliminated by data forwarding,
and control hazards by early evaluation and branch prediction.

10. What is throughput of a pipeline?
Pipeline throughput is defined as the number of instructions executed per
second.

11. What is latency of a pipeline?
Latency is defined as the time taken to execute one single instruction in a
pipeline.

12. What is the effect of pipelining on latency and throughput of the machine?
Pipelining improves the throughput of the entire workload but does not reduce
the latency of a single task. The potential speedup of a pipeline is equal to
the number of pipeline stages.

13. What are the features of the CISC?
CISC stands for complex instruction set computer. A variable-length
instruction format, a large number of addressing modes and a small number of
general-purpose registers are the typical characteristics of CISC. The main
advantage of the CISC design philosophy is lower cost; the disadvantage is
that different instructions take different amounts of time to execute, and
many instructions are executed infrequently. VAX and Intel 80x86 computers
use CISC.

14. What are the features of RISC?
RISC stands for reduced instruction set computer, and IBM was the first
company to use this technology. The main advantages are speed, simpler
hardware and a shorter design cycle. The main disadvantage is that debugging
is difficult because of instruction scheduling. RISC has few addressing
modes, many general-purpose registers and a fixed-length instruction format.






15. Explain what a superpipelined processor is, what its advantages are and how
it differs from the DLX pipeline?
A superpipelined processor is one with a large number of pipeline stages
(roughly eight or more) and a faster clock than a conventional 5-stage DLX
pipeline. Its advantage is that the processor clock, and hence the pipeline
step time, is faster. This means that as long as the pipeline is running, we
get more instructions completed per second.

16. What is DLX?
DLX is an instruction set that is a subset of the MIPS instruction set
architecture.

17. What is compiler scheduling for data hazards?
Instead of just stalling the pipeline, we can reorder the instruction code to
eliminate the stalls. This technique is called compiler scheduling.

18. What is the difference between Big Endian and Little Endian?
This classification arises from the two different ways in which a computer
can store the bytes of a multi-byte number. In the little-endian format, the
lowest-order byte of the number is stored at the lowest address and the
highest-order byte at the highest address. In the big-endian format, the
highest-order byte is stored at the lowest address and the lowest-order byte
at the highest address. In little endian the LSB comes first; in big endian
the MSB comes first.
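
The two byte orders can be seen directly with Python's `struct` module, which lets us pack the same 32-bit value both ways:

```python
import struct

# The same 32-bit value packed in each byte order; only the order of the
# bytes in memory differs, not the value itself.
value = 0x12345678
little = struct.pack('<I', value)   # '<' = little endian
big = struct.pack('>I', value)      # '>' = big endian
print(little.hex())   # 78563412  (LSB stored first)
print(big.hex())      # 12345678  (MSB stored first)
```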

19. List computers where Little Endian and Big Endian are used?
Intel processors in PCs use little endian, and most UNIX machines are big
endian.

20. What are the various advantages of pipelining?
The advantages of pipelining include efficient use of the hardware, the
possibility of a higher frequency of operation, and quicker execution of many
instructions.

21. List any disadvantage of pipelining?
Pipelining involves adding extra hardware to the chip and requires a large
instruction cache to keep the pipeline full and minimize stalling.

22. What are the five main components of a computer?
The five main components of a computer are the datapath and the control
(which together form the processor), the memory, the input, and the output.


23. Where is pipelining preferred and why?
Pipelining is better suited to RISC architectures than to CISC. In MIPS all
instructions are 32 bits long, so only one word has to be read each clock
cycle; this is not the case in CISC because of variable instruction lengths.
Also, MIPS is a register-to-register architecture, so ALU instructions cannot
contain memory references, which keeps the pipeline simpler and shorter. CISC
architectures have memory-memory instructions, so additional pipeline stages
would be required to first access and read memory. Hence pipelining is better
suited to the MIPS architecture, which is a reduced instruction set
architecture.

24. What is cycle time of a pipeline?
The cycle time of a pipeline is defined as the time taken for the complete
execution of one single instruction. It is also called the latency of the
pipeline.

25. What is the speed up of a pipeline and what is the effect of unbalanced
pipeline?
The speedup of a pipeline is defined as the ratio of the time taken to
execute the instructions on an unpipelined processor to the time taken on a
pipelined processor. The ideal speedup of a pipeline equals the number of
pipeline stages. If the delays through the pipeline stages are unbalanced,
the speedup decreases, so creating a pipeline with balanced stages is one of
the most difficult tasks.

26. Draw the schematic of a modern Von Neumann processor?




The basic schematic of a Von Neumann processor shows the processor connected
to memory and input/output. The processor is the active part of the computer,
responsible for data manipulation and decision making. It is made up of two
components: the datapath and the control. The datapath is the hardware that
performs all the operations, and the control is the hardware that tells the
datapath what to do.

27. What is the difference between single cycle and multi cycle implementation
of a data path?
In a single-cycle implementation, the clock period is determined by the
longest instruction. The load instruction takes five steps, so the clock
period is set by the load instruction; when we execute an R-type instruction,
which requires only four steps, part of the cycle is idle and time is wasted.
In a multicycle implementation, the clock period is determined by the longest
step of an instruction, not by the longest instruction itself. The CPI is
exactly one for the single-cycle implementation and greater than one for the
multicycle implementation. Overall, the multicycle implementation has better
performance.

28. What is Cache Memory?
A cache is a small amount of high-speed memory close to the CPU that
temporarily stores recently accessed data. When the CPU needs a piece of
data, the cache is checked first; if the data is there, that copy is used,
otherwise main memory is accessed. The main idea that makes caches successful
is that they exploit the principles of locality.

29. What are the principles of locality?
There are two principles of locality on which memory hierarchies are
designed. The first is temporal locality, or locality in time: if an item is
referenced, it will tend to be referenced again soon. The second is spatial
locality, or locality in space: if an item is referenced, items with nearby
addresses will tend to be referenced soon.

30. Give Examples of temporal and Spatial Locality?
Programs that contain loops execute the same instructions repeatedly and
exhibit temporal locality. Instructions are normally executed in sequence, so
they exhibit spatial locality. Accesses to the elements of an array also
exhibit spatial locality.
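
A simple loop illustrates both principles at once (the loop itself is an assumed example, not from the text):

```python
# 'total' is re-used on every iteration (temporal locality); the list
# elements are touched in address order (spatial locality).
data = list(range(8))
total = 0
for x in data:      # sequential access over the array -> spatial locality
    total += x      # repeated access to 'total'      -> temporal locality
print(total)        # 28
```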

31. What is memory hierarchy?
Memory hierarchy is a structure that uses multiple levels of memory; as the
distance from the CPU increases, both the size and the access time of the
memory increase.

32. What is the need for memory hierarchy?
DRAM and SRAM are the available memory technologies, and they differ in cost
and access time. SRAM has a fast access time but is very expensive; DRAM is
cheaper but has a long access time. Because of these differences, it is
advantageous to build memory as a hierarchy of levels, with the faster memory
closer to the processor and the slower memory farther away. The main goal is
to present the user with as much memory as is available in the cheapest
technology, while providing the speed offered by the fastest memory.

33. What is hit rate and miss rate?
Hit rate is defined as the fraction of memory accesses found in the upper
level of the memory hierarchy. It is often used as a measure of the
performance of the hierarchy.
Miss rate is defined as the fraction of memory accesses not found in the
upper level of the memory hierarchy.

34. What is a block in a memory?
A block is defined as the minimum unit of information that can be either
present or not present in a two-level hierarchy.

35. What is hit time?
Hit time is defined as the time required to access a level of the memory
hierarchy, including the time needed to determine whether the access is a hit
or a miss.

36. What is miss penalty?
Miss penalty is defined as the time required to fetch a block into a level of
the memory hierarchy from a lower level, including the time to access the
block, transmit it from one level to the other, and insert it into the level
that experienced the miss.

37. What is DMA?
DMA stands for direct memory access. In this technique, data is transferred
between a device and main memory in either direction without the use of the
microprocessor. The CPU is bypassed: the DMA controller is a special device
that transfers data directly between a fast I/O device and memory.

38. Why and how does the DMA operate?
The time involved in transferring data from an I/O device to the CPU and then
to memory can be very long. To make the transfer fast we use a DMA device,
which can operate in burst mode. The DMA controller takes control of the bus,
transfers the data to memory at top speed, and then restores control of the
bus to the CPU. While the DMA controller holds the bus, the CPU is stalled.

39. What are the various addressing modes?
The various addressing modes are as follows:
- Register: used when values are in registers: ADD R4, R3 : R4 ← R4 + R3
- Immediate: used for constants: ADD R4, #3 : R4 ← R4 + 3
- Displacement: used for accessing local variables: ADD R4, 100(R1) :
  R4 ← R4 + M[100+R1]
- Direct: useful for accessing static data: ADD R1, (1001) :
  R1 ← R1 + M[1001]
- Auto-increment: useful for stepping through arrays in a loop:
  ADD R1, (R2)+ : R1 ← R1 + M[R2] ; R2 ← R2 + d
- Auto-decrement: ADD R1, -(R2) : R2 ← R2 - d ; R1 ← R1 + M[R2]
(where d is the size of the operand)
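
The effective-address arithmetic of these modes can be traced with a tiny sketch. The register values, memory contents and operand size below are assumptions chosen for illustration, not from any real ISA:

```python
# Trace the addressing-mode examples above, one per line.
regs = {"R1": 100, "R2": 200, "R3": 7, "R4": 1}
mem = {200: 5, 1001: 9}
d = 4  # assumed operand size for auto-increment/decrement

regs["R4"] += regs["R3"]             # Register:       ADD R4, R3
regs["R4"] += 3                      # Immediate:      ADD R4, #3
regs["R4"] += mem[100 + regs["R1"]]  # Displacement:   ADD R4, 100(R1)
regs["R1"] += mem[1001]              # Direct:         ADD R1, (1001)
regs["R1"] += mem[regs["R2"]]        # Auto-increment: ADD R1, (R2)+
regs["R2"] += d
regs["R2"] -= d                      # Auto-decrement: ADD R1, -(R2)
regs["R1"] += mem[regs["R2"]]
print(regs["R4"], regs["R1"], regs["R2"])   # 16 119 200
```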

40. What are the different classifications of instruction sets of a machine?
The different types of instruction sets for various machines are as follows:
- Stack: the operands are implicitly on top of the stack.
  Push A
  Push B
  Add
  Pop C
- Accumulator: one operand is implicitly the accumulator.
  Load A
  Add B
  Store C
- Registers: all operands are explicitly either registers or memory
  locations.
  Load R1, A
  Add R1, B
  Store C, R1

41. Which is the most commonly used architecture?
General-purpose register machines are the most commonly used, because
registers are faster than memory, they are easier for compilers to use, and
they can be used effectively. This is the third type of architecture in the
previous question.

42. How are general-purpose register machines classified?
There are two ways in which they are classified:
- Based on whether an ALU instruction has 2 or 3 operands:
  ADD R3, R1, R2
  ADD R1, R2
- Based on how many of the operands of an ALU instruction may be memory
  addresses:
  Register-register (load/store): ADD R3, R1, R2
  Register-memory: ADD R1, A :: R1 ← R1 + A
  Memory-memory: ADD C, A, B :: C ← A + B

43. What are pipeline latches?
Pipeline latches are registers located between pipeline stages, used to pass
information between stages. Without them, data would be overwritten by the
next instruction as instructions advance through the pipeline.

44. Assuming ideal conditions, if I have a pipelined machine with n stages, how
fast does a pipelined instruction execute compared to the same instruction on
an identical machine that is not pipelined?
Under ideal conditions, it will execute n times faster on the pipelined
machine.



45. We have a non-pipelined computer that previously took 2.3 microseconds to
execute an instruction, and now it has a pipeline with three stages that take
.5, .8, and .9 microseconds each. What is the speedup?
The speedup is 2.3/0.9 ≈ 2.56, since the pipeline's clock cycle is set by its
slowest stage (0.9 microseconds).
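
The same calculation, using the numbers from the question:

```python
# The pipeline's clock must accommodate its slowest stage, so the cycle time
# of the unbalanced pipeline is the maximum of the stage delays.
stage_times = [0.5, 0.8, 0.9]        # microseconds per stage
cycle_time = max(stage_times)        # 0.9 us: slowest stage sets the clock
unpipelined_time = 2.3               # us per instruction, non-pipelined
speedup = unpipelined_time / cycle_time
print(round(speedup, 2))             # 2.56
```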

46. What is a Cache miss?
A cache miss is defined as a request for data from the cache that cannot be
filled because the data is not present in the cache.

47. How is a Cache miss handled?
When a cache miss occurs, the following steps are taken:
- Send the original PC to main memory
- Instruct main memory to perform a read and wait for it to complete the
  access
- Write the cache entry and restart the execution of the instruction, which
  will refetch the instruction and result in a hit this time

48. What is the write-through in cache memory?
Write-through is a cache scheme that automatically updates both the cache and
the memory, ensuring that the data is always consistent between the two. Its
main disadvantage is that writing the data to memory takes a long time, which
can slow the processor down considerably.

49. What is write buffer?
A write buffer is a queue that holds data while the data is waiting to be
written to memory.

50. What is write back cache?
Write-back is a cache scheme that handles writes by updating values only in
the block in the cache, writing the modified block to the lower level of the
memory hierarchy only when the block is replaced.

51. What are the difference between the write through and write back?
The write-back scheme can improve performance and is faster than
write-through, since write-through writes data to memory on every store,
which takes time and slows down the processor. Write-back is more complex to
implement than write-through. Because write-back requires few writes to the
next memory level, it uses less memory bandwidth, whereas write-through uses
a lot of memory bandwidth.

52. What is Cache coherency?
When a cache and the next level of the hierarchy do not hold the same value
for a given address, they are said to be incoherent. For CPU caches,
incoherency can also arise from below, if an I/O device writes to main memory
but not to the cache.

53. What is Snoopy cache?
Snoopy caches are used in multiprocessor architectures with local caches to
maintain cache coherency when the local caches may contain copies of data.
Snoopy caches "snoop" the bus, and if data that has a copy in the local cache
is modified, the local copy is updated.

54. Instead of just 5-8 pipe stages, why not have, say, a pipeline with 50
pipe stages?
The main reason a 50-stage pipeline is not adopted is that it would involve a
large amount of hardware resources and hence increase the chip area. In
addition, the instruction cache would have to be very large to keep the
pipeline full and minimize stalling.

55. What are the different ways of placing a block in a cache?
The different ways of placing a block in a cache are direct mapped, set
associative, and fully associative.

56. What is a direct mapped cache?
A direct-mapped cache is a cache structure in which each memory location is
mapped to exactly one location in the cache.

57. What is set associative cache?
A cache that has a fixed number of locations where each block can be placed
is called a set-associative cache.

58. What is fully associative cache?
A cache structure in which a block can be placed in any location in the cache
is called a fully associative cache.

59. What is the difference between 1-way, 2-way, 4-way and 8-way set
associative caches in an 8-block cache memory?
A 1-way set-associative cache is a direct-mapped structure, and an 8-way
set-associative cache is a fully associative structure. A 2-way
set-associative cache has 2 blocks in each set, and a 4-way set-associative
cache has 4 blocks in each set.

60. What is to be done to improve the cache performance?
To improve the performance of the cache, we can reduce the miss rate, reduce
the miss penalty, and reduce the time to hit in the cache.






61. What are the different ways to reduce the miss rate?
Increasing the size of the cache is one method of reducing the miss rate. The
disadvantage is that it increases the hit time, because the larger the cache,
the longer its access time. Increasing the block size also reduces the miss
rate, but again may increase the hit time, since more data has to be read
from the cache per block. Increasing the associativity of a cache also
decreases the miss rate, but at the same time an increase in hit time is
observed.

62. How do you improve cache performance?
The various ways to improve cache performance are as follows:
- Increase the size of the cache
- Increase the block size
- Increase the degree of associativity of the cache
- Use a second-level cache

63. How long does a memory access take?
Average access time = Hit time + Miss rate × Miss penalty
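
Plugging assumed example numbers into this formula (1-cycle hit, 5% miss rate, 100-cycle miss penalty, all chosen for illustration):

```python
# Average memory access time (AMAT) = hit time + miss rate * miss penalty.
hit_time = 1         # cycles for a hit
miss_rate = 0.05     # fraction of accesses that miss
miss_penalty = 100   # extra cycles on a miss
amat = hit_time + miss_rate * miss_penalty
print(amat)          # 6.0
```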

64. Write the equation governing the cache parameters?
The equation governing the cache parameters is as follows:
Cache size = number of sets × block size × degree of associativity

65. How will you calculate the offset, index and the tag for a block?
Offset = log2(block size)
Index = log2(number of blocks / associativity)
Tag size = address size − offset − index
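
The three formulas can be packaged into one helper. This is a sketch of the calculation above (sizes in bytes, powers of two assumed; the function name is my own):

```python
from math import log2

# Split an address into (offset bits, index bits, tag bits) for a cache of
# the given size, block size, and degree of associativity.
def cache_fields(address_bits, cache_size, block_size, associativity):
    num_sets = cache_size // (block_size * associativity)
    offset = int(log2(block_size))
    index = int(log2(num_sets))
    tag = address_bits - offset - index
    return offset, index, tag

# 32 KB 4-way cache with 32-byte blocks on a 32-bit machine:
print(cache_fields(32, 32 * 1024, 32, 4))   # (5, 8, 19)
```

The same helper reproduces the worked answers later in this document: an 8 KB cache with 32-byte lines gives (5, 7, 20) at 2-way and (5, 8, 19) direct mapped.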

66. What are the different types of cache misses?
There are three different types of cache misses:
- Compulsory misses occur when a block is accessed for the first time and is
  not found in the cache
- Capacity misses occur when the cache cannot contain all the blocks needed
  by the program
- Conflict misses occur in direct-mapped or set-associative caches when two
  or more frequently used blocks map to the same cache line, causing each
  other to be discarded unnecessarily

67. Consider two equally sized caches, one of which is direct mapped and the
other is two way set-associative. There are 256 lines, with 8 words per line,
and 4 bytes per word. The machine has 32-bit addresses and is byte
addressable, with a word of 4 bytes. How many bits are used for the tag,
index, and offset?
In both caches we need 5 bits for the offset, since there are 8 words/line ×
4 bytes/word = 32 bytes/line. In the two-way set-associative cache we have
128 sets to index into, which requires 7 bits; this leaves 32 − 7 − 5 = 20
bits for the tag. The direct-mapped cache has 256 lines, which results in 8
index bits and 19 tag bits.

68. A 32 KB 4-way set-associative data cache array with 32-byte line sizes:
- How many sets?
- How many index bits, offset bits, tag bits?
- How large is the tag array?

Number of sets = cache data size / (degree of associativity × block size)
= 2^8 = 256 sets
Offset = 5, index = 8, tag = 32 − 8 − 5 = 19
The tag array holds one 19-bit tag per line: 19 bits × 4 ways × 256 sets =
19,456 bits, roughly 2.4 KB.

69. What are the different ways to speed up access to the main memory?
Two typical ways of speeding up access to main memory are:
- Use a wider memory to provide more bytes at a time
- Use independent memory banks to allow multiple independent accesses

70. What are the methods to improve the miss penalty of a cache?
The various methods are as follows:
- Give priority to reads over writes
- Use a second-level cache

71. What is a tag?
A tag is a field used in the memory hierarchy that contains the address
information required to identify whether the associated word in the
hierarchy corresponds to the requested word.

72. What is LRU and where is it used and where is it not used?
LRU stands for the least recently used scheme. It is a replacement scheme in
which the block replaced is the one that has been unused for the longest
time. It is not used in a direct-mapped cache, because on a miss the
requested block can go into only one position, so there is no choice to make.
In a fully associative cache the requested block can go anywhere, which makes
full LRU tracking impractical. LRU is therefore used in set-associative
caches.
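
A minimal sketch of LRU replacement within one set of a set-associative cache. The class is illustrative, not from the text; `OrderedDict` insertion order stands in for the recency bookkeeping real hardware does:

```python
from collections import OrderedDict

# One set of a set-associative cache with LRU replacement; 'ways' is the
# number of blocks the set can hold.
class LRUSet:
    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()          # tag -> data; order = recency

    def access(self, tag):
        if tag in self.blocks:
            self.blocks.move_to_end(tag)     # mark as most recently used
            return "hit"
        if len(self.blocks) == self.ways:
            self.blocks.popitem(last=False)  # evict the least recently used
        self.blocks[tag] = None
        return "miss"

s = LRUSet(2)
print([s.access(t) for t in ["A", "B", "A", "C", "B"]])
# ['miss', 'miss', 'hit', 'miss', 'miss']
```

Note that the second access to "B" misses: "B" was evicted when "C" arrived, because at that moment "B" was the least recently used of the two resident blocks.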

73. Explain how a cache hit or miss is determined, given a memory address?
A memory address is broken into three parts: the tag, the index and the
offset. We first compute the offset, index and tag, then look at the cache
line corresponding to the index in each way of the set. We compare the tag of
each such line with the tag of our memory address. If one matches, we have a
cache hit; otherwise, we have a cache miss.





74. Are you familiar with the term snooping?
Snooping is the process whereby snoopy caches monitor the bus; when a value
that has a copy in the local cache is modified, the value in the local cache
is also updated.

75. Are you familiar with the term MESI?
Processors providing cache coherence commonly implement the MESI protocol,
where the letters of the acronym represent the four states a cache line may
be in:
- M (Modified): the line has been modified and the memory copy is invalid
- E (Exclusive): this cache has the only copy of the data and memory is valid
- S (Shared): more than one cache holds this value and memory is valid
- I (Invalid): this cache line is not valid
State transitions take place depending on reads and writes to the cache.

76. What is ACBF (Hex) divided by 16?
ACB (hex), which is 101011001011 in binary. Dividing a hex number by 16
simply drops its last hex digit.

77. Convert 65(Hex) to Binary?
01100101
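
Both conversions can be checked quickly in Python; integer division by 16 shifts out the last hex digit:

```python
# ACBF (hex) divided by 16, in hex and in binary.
print(format(0xACBF // 16, 'X'))   # ACB
print(format(0xACBF // 16, 'b'))   # 101011001011

# 65 (hex) as an 8-bit binary value.
print(format(0x65, '08b'))         # 01100101
```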

78. How is cache incoherency fixed?
Cache incoherency can be fixed by using snoopy caches and the process of
snooping.

79. What are superscalar processors?
Superscalar processors issue varying numbers of instructions each clock
cycle, which implies multiple pipelines. They may be statically scheduled by
a compiler or dynamically scheduled using techniques based on scoreboarding
or Tomasulo's algorithm. Their main advantage over VLIW processors is that
there is very little impact on code, and unscheduled programs can be run.

80. What are VLIW processors?
VLIW stands for very long instruction word. These processors issue a fixed
number of instructions each cycle, encoded into one very large instruction.
Their main advantage over superscalar processors is that they use simpler
hardware.

81. What is virtual memory?
Virtual memory is a technique that uses main memory as a cache for secondary
storage, which is usually implemented on magnetic disks.

82. What is a page?
Each block of virtual memory is called a page.

83. What is a page fault?
A miss in virtual memory is called a page fault.

84. What is a virtual address?
A virtual address is an address that corresponds to a location in the virtual
address space; it is translated by address mapping to a physical address when
memory is accessed. The processor generates virtual addresses, while memory
is accessed using physical addresses.

85. What is address translation?
Address translation, also called address mapping, is the process of
converting a virtual address to a physical address when main memory is
accessed.

86. What are the components of the virtual memory address?
The virtual address is broken into a virtual page number and a page offset.
During address mapping the page offset stays the same, while the virtual page
number is translated to a physical page number.

87. Define speed up of a machine?
The speedup of a machine is defined as the ratio of the execution time
without an enhancement to the execution time with the enhancement.
Equivalently, it is the ratio of the performance of the machine with the
enhancement to its performance without the enhancement.

88. What is Amdahl's law?
Amdahl's law states that the maximum speedup achievable by using a faster
mode of operation is limited by the fraction of the time the faster mode can
be used.

89. What is the formula involved in Amdahl's law?
Ex(new) = Ex(old) × [(1 − Fraction(enhanced)) + Fraction(enhanced)/Speedup(enhanced)]
The overall speedup, Ex(old)/Ex(new), is the inverse of the bracketed factor.
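
The overall-speedup form of the law is easy to evaluate. The 40%/10x figures below are an assumed example, not from the text:

```python
# Overall speedup when 'fraction_enhanced' of the execution time can be
# sped up by a factor of 'speedup_enhanced' (Amdahl's law).
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    return 1 / ((1 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# Speeding up 40% of the program by 10x gives only about a 1.56x overall win.
print(round(amdahl_speedup(0.4, 10), 2))   # 1.56
```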

90. What is the parameter that determines the performance of CPU?
The performance of a machine is determined solely by its execution time.

91. What is the relation between performance and the execution time of a
machine?
They are inversely proportional

92. What is meant by CPU time?
CPU time is defined as the time the CPU takes to compute a particular
program, not including the time for accessing I/O devices or running other
programs. CPU time is used as a measure of performance because it is
independent of the operating system and other factors.

93. What are benchmarks?
Benchmarks are a collection of programs that try to explore and capture the
strengths and weaknesses of a computer system. The weakness of any one
benchmark is lessened by the presence of the others. A well-known example is
the SPEC (System Performance Evaluation Corporation) benchmark suite.

94. What are the various programs for measuring the performance?
The various programs for measuring performance are as follows:
- Real applications
- Modified applications
- Kernels
- Toy benchmarks
- Synthetic benchmarks

95. What is the CPU time?
CPU time is defined as the product of the CPU clock cycles for a program and
the clock cycle time.

96. What is Instruction count?
Instruction count is defined as the number of instructions executed.

97. What is CPI?
CPI is defined as the average number of clock cycles per instruction:
CPI = CPU clock cycles for a program / IC
So CPU time = CPI × IC × clock cycle time
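
Plugging assumed example numbers into the CPU time equation (2.0 CPI, one million instructions, a 1 ns cycle, all chosen for illustration):

```python
# CPU time = CPI x instruction count x clock cycle time.
cpi = 2.0                    # average cycles per instruction
instruction_count = 1_000_000
cycle_time_s = 1e-9          # 1 ns cycle, i.e. a 1 GHz clock
cpu_time = cpi * instruction_count * cycle_time_s
print(f"{cpu_time:.3f}")     # 0.002 (seconds)
```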


98. What is Dynamic Scheduling and what are its advantages?
Dynamic scheduling removes the restriction that instructions execute in order
by allowing the hardware to rearrange them. Its main advantages are that it
can handle dependencies that are unknown at compile time and allows code
compiled with one pipeline in mind to run efficiently on a different
pipeline. It uses in-order issue and out-of-order execution.

99. What is scoreboarding?
Scoreboarding is a technique that allows pipelined instructions to execute
out of order when there are sufficient resources and no data dependencies. A
centralized table keeps track of the status of instructions, functional units
and registers. Instructions are executed when they are ready and stalled if
hazards exist.

100. What is Tomasulo's algorithm?
This is a method of implementing dynamic scheduling. Register renaming is
used to prevent WAW and WAR hazards, and each execution unit has a number of
reservation stations to temporarily hold instructions and operands.
Tomasulo's algorithm uses in-order issue, out-of-order execution and
out-of-order completion.

101. What are the three different stages of Tomasulo's algorithm?
The three stages of Tomasulo's algorithm are as follows:
- Issue the instruction
- Execute the instruction
- Write the result

102. What is register renaming?
Register renaming is a way of preventing name dependencies between
instructions. A name dependency occurs when two instructions use the same
register or memory location, but no data is transmitted between them.
Register renaming is used in Tomasulo's algorithm together with instruction
reordering.

103. What are the various features of Tomasulo's algorithm?
The various features are:
- It has a window size of 14 instructions: 6 load, 3 store, 3 add,
  2 multiply/divide
- It avoids WAW and WAR hazards by register renaming
- No issue on structural hazards

104. What are the features of scoreboarding?
The various features are:
- It has a window size of 5 instructions: 1 load/store, 1 add, 2 multiply,
  1 divide
- No issue on structural hazards
- It stalls when there are WAR or WAW hazards

105. What are the various disadvantages of scoreboarding?
The small window size and the inability to prevent WAR and WAW hazards are
the main disadvantages of scoreboarding.

106. What is static branch prediction?
In static branch prediction, all decisions are made at compile time. This
does not allow the prediction scheme to adapt to program behavior that
changes over time.

107. Why are static and dynamic branch prediction schemes used?
They are used to overcome control hazards.

108. What is dynamic branch prediction?
Dynamic branch prediction combines history information about branches with
algorithms that predict their behavior. It is supported in hardware with a
branch prediction buffer or a branch target buffer.

109. What is a branch prediction buffer?
A branch prediction buffer is a small cache indexed by the lower bits of the
branch instruction address, containing information about recent executions of
the branch. An address is presented to the buffer, and if it is there, the
associated bits are used to predict whether the branch is taken.

110. What is a branch target buffer?
In a normal branch prediction scheme, we must wait until the instruction
decode stage to find the destination of the branch. If, however, we store the
destination address of the branch after it is computed, a subsequent
prediction can supply this address along with the prediction at the end of
the instruction fetch stage. A branch target buffer is a table of destination
program counters, indexed by the current program counter.

111. What are exceptions?
An exception is an unexpected event from within the processor that disrupts
normal program execution. Arithmetic overflow is a typical example of an
exception.

112. What is an interrupt?
An interrupt is an event that causes an unexpected change in control flow but
comes from outside the processor. In general, both are referred to by the
name interrupts.

113. How are exceptions handled?
The basic action performed when an exception occurs is to save the address of
the offending instruction in the exception program counter (EPC) and transfer
control to the operating system at some specified address. Required measures
are then taken, such as stopping execution or taking some predefined action
in response to an overflow, after which the OS can terminate the program or
restart the instruction at the address saved in the EPC.

Topics
1. Cache and virtual memory
2. Pipelining/single and multi cycle
3. Dynamic scheduling
4. Branch prediction
5. Number systems
6. Parallelism/exceptions and interrupts
