
Q1
Suppose that a 2M x 16 main memory is built using 256K x 8 RAM chips and memory is word-addressable.
a. How many RAM chips are necessary?


b. How many RAM chips are there per memory word?
c. How many address bits are needed for each RAM chip?
d. How many address bits are needed for all of memory?
Answer:
a. 16 (8 rows of 2 columns)
b. 2
c. 256K = 2^18, so 18 bits
d. 2M = 2^21, so 21 bits
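The chip-count and address-bit arithmetic above can be double-checked with a short script (a sketch; the variable names are mine, not from the problem):

```python
import math

# Sizes in words (memory is word-addressable) and bits per location.
MAIN_WORDS, MAIN_BITS = 2 * 2**20, 16   # 2M x 16 main memory
CHIP_WORDS, CHIP_BITS = 256 * 2**10, 8  # 256K x 8 RAM chips

chips_per_word = MAIN_BITS // CHIP_BITS  # chips spanning one 16-bit word
rows = MAIN_WORDS // CHIP_WORDS          # banks of chips along the address space
total_chips = rows * chips_per_word

print(total_chips)                # 16 chips (8 rows of 2)
print(chips_per_word)             # 2 chips per memory word
print(int(math.log2(CHIP_WORDS))) # 18 address bits per chip
print(int(math.log2(MAIN_WORDS))) # 21 address bits for all of memory
```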

Q5
The parameters of a hierarchical memory system are specified as follows: Main
memory size = 8K blocks
Cache memory size = 512 blocks
Block size = 16 words
Determine the size of the tag field under the following conditions:
a. Fully associative mapping
b. Direct mapping
c. Set associative mapping with 16 blocks/set
Answer (field widths in bits):
      TAG   SET/BLOCK   WORD
a)     13           0      4
b)      4           9      4
c)      8           5      4
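The tag widths follow from a 17-bit word address (13 bits of block number plus 4 bits of word offset). A minimal sketch of the computation:

```python
import math

MAIN_BLOCKS = 8 * 2**10   # 8K blocks of main memory
CACHE_BLOCKS = 512
BLOCK_WORDS = 16

word_bits = int(math.log2(BLOCK_WORDS))              # 4-bit word field
addr_bits = int(math.log2(MAIN_BLOCKS)) + word_bits  # 13 + 4 = 17 bits total

# a) fully associative: no index field, the tag is the whole block number
tag_fa = addr_bits - word_bits
# b) direct mapped: the index selects one of the 512 cache lines (9 bits)
tag_dm = addr_bits - int(math.log2(CACHE_BLOCKS)) - word_bits
# c) 16-way set associative: 512 / 16 = 32 sets (5 bits of set index)
sets = CACHE_BLOCKS // 16
tag_sa = addr_bits - int(math.log2(sets)) - word_bits

print(tag_fa, tag_dm, tag_sa)   # 13 4 8
```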

Q6
What is the average access time of a system having three levels of memory hierarchy: a cache memory, a semiconductor main memory, and a magnetic disk secondary memory? The access times of these memories are 20 ns, 200 ns, and 2 ms, respectively. The cache hit ratio is 80 per cent and the main memory hit ratio is 99 per cent.
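No answer is given in the source for this one. A sketch of one common interpretation, where the main-memory hit ratio is taken over the accesses that miss the cache and each level's access time counts only on a hit at that level:

```python
t_cache, t_main, t_disk = 20, 200, 2_000_000   # ns (2 ms = 2,000,000 ns)
h_cache, h_main = 0.80, 0.99                   # main-memory hits among cache misses

t_avg = (h_cache * t_cache
         + (1 - h_cache) * h_main * t_main
         + (1 - h_cache) * (1 - h_main) * t_disk)
print(t_avg)   # 4055.6 ns
```

Under this interpretation the disk dominates: 0.2% of all accesses reach it, contributing 4000 ns of the 4055.6 ns average.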

Q7
Processor X has a clock speed of 1 GHz and takes 1 cycle for integer operations, 2 cycles for memory operations, and 4 cycles for floating point operations. Empirical data shows that programs run on Processor X are typically composed of 35% floating point operations, 30% memory operations, and 35% integer operations. You are designing Processor Y, an improvement on Processor X that will run the same programs, and you have 2 options to improve the performance:
1. Increase the clock speed to 1.2 GHz, but memory operations take 3 cycles
2. Decrease the clock speed to 900 MHz, but floating point operations only take 3 cycles
Compute the speedup for both options and decide which option Processor Y should take.

Answer:
First, compute the CPI for each processor:
Processor X: 1 * 0.35 + 2 * 0.30 + 4 * 0.35
= 0.35 + 0.60 + 1.40
= 2.35 cycles/instruction
Y- Option 1: 1 * 0.35 + 3 * 0.30 + 4 * 0.35
= 0.35 + 0.90 + 1.40
= 2.65 cycles/instruction
Y- Option 2: 1 * 0.35 + 2 * 0.30 + 3 * 0.35
= 0.35 + 0.60 + 1.05
= 2.00 cycles/instruction
Next, compute how long it would take to execute an “average”
instruction. This is done by dividing CPI (cycles/instruction) by clock
rate (cycles/second) to give (seconds/instruction):
Processor X: 2.35 / 1.0 = 2.35 ns/instruction
Y- Option 1: 2.65 / 1.2 = 2.21 ns/instruction
Y- Option 2: 2.00 / 0.9 = 2.22 ns/instruction
Then, compute the speedups:
Y- Option 1: 2.35 / 2.21 = 1.063
Y- Option 2: 2.35 / 2.22 = 1.059
So the speedups can be phrased either as "Option 1 is 6.3% faster than
X, and Option 2 is 5.9% faster than X" or as "Option 1 is 1.063 times
as fast as X, and Option 2 is 1.059 times as fast as X". Option 1 gives
the larger speedup, so Processor Y should take Option 1.
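The CPI and speedup arithmetic above can be checked with a short script (a sketch; the helper names are mine):

```python
mix = {"int": 0.35, "mem": 0.30, "fp": 0.35}   # instruction mix from the problem

def cpi(c_int, c_mem, c_fp):
    """Weighted cycles per instruction for a given cost of each class."""
    return c_int * mix["int"] + c_mem * mix["mem"] + c_fp * mix["fp"]

def ns_per_instr(cpi_val, ghz):
    """Seconds/instruction = CPI / clock rate; in ns when the rate is in GHz."""
    return cpi_val / ghz

t_x  = ns_per_instr(cpi(1, 2, 4), 1.0)   # 2.35 ns
t_y1 = ns_per_instr(cpi(1, 3, 4), 1.2)   # ~2.208 ns
t_y2 = ns_per_instr(cpi(1, 2, 3), 0.9)   # ~2.222 ns

print(t_x / t_y1)   # ~1.064, Option 1
print(t_x / t_y2)   # ~1.058, Option 2
```

Note that carrying full precision gives speedups of about 1.064 and 1.058; the 1.063 and 1.059 in the solution come from rounding the per-instruction times to 2.21 ns and 2.22 ns first. Either way, Option 1 wins.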

Q9
A disk has a rotational speed of 6000 RPM, a seek time of 15 ms, and negligible controller overhead. Each track has 256 sectors and each sector is 512 B. The disk is connected to memory via an I/O bus capable of transferring 4 MB/s. The disk contains a cache to buffer in-flight data, and this cache allows the disk to overlap data transfer over the I/O bus with the next disk access.
(a). (4 points) What is the maximum bandwidth of the disk? What is
the minimum amount of time (in seconds) in which a program could
scan 40 MB of data transferred from the disk?
Answer:
To achieve maximum bandwidth, the disk must read sequential data
from a track with no seek overhead and no delay waiting for the
proper sector. The disk can read one full track (256 sectors * 512 B
= 128 KB) in the time it takes for one rotation (10 ms). 128 KB
divided by 10 ms = 12,800 KB/s, or 12.5 MB/s. However, the I/O
bus is the bottleneck in the system (it has a bandwidth of only
4 MB/s), so the minimum time to scan 40 MB is 40 MB / 4 MB/s =
10 seconds.
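A quick check of part (a), assuming binary units throughout (1 KB = 1024 B, 1 MB = 1024 KB), which is what the solution's 12.5 MB/s figure implies:

```python
SECTORS, SECTOR_B = 256, 512
track_bytes = SECTORS * SECTOR_B       # 131072 B = 128 KB per track
rotation_s = 60 / 6000                 # 6000 RPM -> 0.010 s per rotation

disk_bw = track_bytes / rotation_s     # sequential read rate off the platter
bus_bw = 4 * 2**20                     # 4 MB/s I/O bus

print(disk_bw / 2**20)                         # 12.5 MB/s disk bandwidth
scan_time = 40 * 2**20 / min(disk_bw, bus_bw)  # the slower link limits the scan
print(scan_time)                               # 10.0 s
```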
(b). (5 points) How long does it take to transfer 128KB data from disk
to memory assuming the data is found sequentially on one track
(assume the disk still must seek and rotate to find the start of the
data)?
Answer:
We are given that the seek time (t_seek) is 15 ms. 6000 RPM is
equivalent to 100 RPS, or 10 ms per rotation. On average, the disk
must wait for half of a rotation (t_rotation), or 5 ms.
The time to read the whole track from the disk is 10 ms. However,
the transfer is constrained by the I/O bus, so t_transfer is
31.25 ms (a 128 KB transfer at 4 MB/s takes 31.25 ms).
Therefore, the total time for the disk to read 128 KB of sequential
data is t_seek + t_rotation + t_transfer = 15 ms + 5 ms + 31.25 ms =
51.25 ms.
(c). (5 points) How long does it take to transfer 128KB data from
disk to memory assuming the data is found in sectors which are
randomly scattered across the disk?
Answer:
As before, we are given that the seek time (t_seek) is 15 ms. 6000 RPM
is equivalent to 100 RPS, or 10 ms per rotation. On average, the
disk must wait for half of a rotation (t_rotation), or 5 ms. 128 KB
of data spans 2^(17-9) = 2^8 = 256 sectors.
The time to read a single sector (t_transfer) is 10 ms * (1/256) =
0.04 ms. The time for the I/O bus to transfer a single sector is
512 B / 4 MB/s = 0.128 ms. Although the I/O bus takes longer to
transfer the 512 B, this extra time can be overlapped with the seek
and rotation time for the next sector.
The total disk access time for each of the first 255 sectors is
t_seek + t_rotation + t_transfer = 15 + 5 + 0.04 = 20.04 ms. The
total disk access time for the last sector is t_seek + t_rotation +
t_transfer = 15 + 5 + 0.128 = 20.128 ms.
Notice that the last sector takes a little longer, since it cannot
overlap its I/O bus transfer time with the next seek and rotation.
The total time to move 128 KB of randomly spaced data from the
disk to memory is 20.04 * 255 + (15 + 5 + 0.128) = 5130.328 ms.
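Parts (b) and (c) can be reproduced in a few lines. This sketch follows the solution's own rounding (0.04 ms per sector off the platter, and a decimal 4 MB = 4,000,000 B for the per-sector bus figure, which is where 0.128 ms comes from):

```python
t_seek, t_rot = 15.0, 5.0                 # ms; half of a 10 ms rotation on average

# (b) sequential: one track, but the 4 MB/s bus limits the 128 KB transfer
t_bus_track = 128 * 1024 / (4 * 2**20) * 1000   # 31.25 ms
print(t_seek + t_rot + t_bus_track)             # 51.25 ms

# (c) random: 256 scattered sectors; the first 255 hide their bus
# transfer behind the next sector's seek + rotation, the last cannot
t_sector_disk = round(10 / 256, 2)        # 0.04 ms, rounded as in the solution
t_sector_bus = 512 / 4e6 * 1000           # 0.128 ms over the bus
total = 255 * (t_seek + t_rot + t_sector_disk) \
        + (t_seek + t_rot + t_sector_bus)
print(total)                              # 5130.328 ms
```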

Q
For this problem, assume that you have a processor with a cache connected to main memory
via a bus. A successful cache access by the processor (a hit) takes 1 cycle. After an
unsuccessful cache access (a miss), an entire cache block must be fetched from main memory
over the bus. The fetch is not initiated until the cycle following the miss. A bus transaction
consists of one cycle to send the address to memory, four cycles of idle time for main-memory
access, and then one cycle to transfer each word in the block from main memory to the cache.
Assume that the processor continues execution only after the last word of the block has
arrived. In other words, if the block size is B words (at 32 bits/word), a cache miss will cost 1
+ 1 + 4 + B cycles. The following table gives the average cache miss rates of a 1 Mbyte cache
for various block sizes:

Write an expression for the average memory access time for a 1-Mbyte cache and a B-word block size (in
terms of the miss ratio m and B).
Average access time = (1-m)(1 cycle) + (m)(6 + B cycles) = 1 + (m)(5+B) cycles

What block size yields the best average memory access time?

If bus contention adds three cycles to the main-memory access time, which block size yields the best average
memory access time?

If bus width is quadrupled to 128 bits, reducing the time spent in the transfer portion of a bus transaction to
25% of its previous value, what is the optimal block size? Assume that a minimum one transfer cycle is
needed and don't include the contention cycles introduced in part (C).
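The miss-rate table itself is not reproduced above, so the sub-parts cannot be answered numerically here, but the access-time model for all four parts can be sketched as one function. The `contention` and `wide_bus` parameters are my names for the variations in parts (c) and (d):

```python
def amat(m, B, contention=0, wide_bus=False):
    """Average memory access time in cycles for miss ratio m, block size B words.

    Base miss cost: 1 (miss) + 1 (send address) + 4 (memory access)
    + B transfer cycles, i.e. 6 + B, so AMAT = 1 + m * (5 + B).
    """
    # Part (d): a 128-bit bus moves 4 words per cycle, minimum one cycle.
    transfer = max(1, B // 4) if wide_bus else B
    # Part (c): bus contention adds cycles to the main-memory access.
    penalty = 1 + 1 + (4 + contention) + transfer
    return (1 - m) * 1 + m * penalty

# Hypothetical miss ratio, only to show the call shape; real values
# would come from the table of measured miss rates.
print(amat(0.1, 4))   # 1 + 0.1 * (5 + 4) = 1.9 cycles
```

Plugging each block size's measured miss ratio into `amat` and taking the minimum answers parts (b) through (d).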