Friday, January 4, 2013

Cache performance

UNIT V 
           A better measure of memory-hierarchy performance is the average memory access time:

             Average memory access time = Hit time + Miss rate × Miss penalty

                       Hit time is the time to hit in the cache.

The components of average access time can be measured either in absolute time, say 0.25 to 1.0 nanoseconds on a hit, or in the number of clock cycles that the CPU waits for the memory, such as a miss penalty of 75 to 100 clock cycles.
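As a worked illustration of the formula above, here is a short calculation with assumed numbers (the hit time, miss rate, and miss penalty below are illustrative, not taken from the text):

```python
# Average memory access time (AMAT) = Hit time + Miss rate * Miss penalty
hit_time = 1.0        # ns to hit in the cache (assumed)
miss_rate = 0.05      # 5% of accesses miss (assumed)
miss_penalty = 100.0  # ns to service a miss (assumed)

amat = hit_time + miss_rate * miss_penalty
print(f"AMAT = {amat} ns")  # 1.0 + 0.05 * 100 = 6.0 ns
```

Note that even a small miss rate has a large effect when the miss penalty is two orders of magnitude larger than the hit time.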
             
Average memory access time and Processor Performance

Is average memory access time a reliable predictor of processor performance? Two observations matter:
·         First, there are other reasons for stalls, such as contention due to I/O devices using memory. Designers often assume that all memory stalls are due to cache misses, since the memory hierarchy typically dominates other sources of stalls. We use this simplifying assumption here, but be sure to account for all memory stalls when calculating final performance.
·         Second, the CPU stalls during misses, and the memory stall time is strongly correlated with average memory access time.
                
               CPU time = (CPU execution clock cycles + Memory stall clock cycles) × Clock cycle time
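The CPU time equation can be evaluated directly. The following sketch uses assumed values for the instruction count, base CPI, misses per instruction, miss penalty, and clock period:

```python
# CPU time = (CPU execution cycles + Memory stall cycles) * Clock cycle time
instructions = 1_000_000         # program size (assumed)
cpi_execution = 1.0              # base CPI with a perfect memory system (assumed)
misses_per_instruction = 0.02    # assumed
miss_penalty_cycles = 100        # assumed
clock_cycle_time = 0.5e-9        # 0.5 ns clock, i.e. 2 GHz (assumed)

cpu_exec_cycles = instructions * cpi_execution
memory_stall_cycles = instructions * misses_per_instruction * miss_penalty_cycles
cpu_time = (cpu_exec_cycles + memory_stall_cycles) * clock_cycle_time
print(f"CPU time = {cpu_time * 1e3:.2f} ms")
```

With these numbers the memory stalls contribute two cycles per instruction, twice the base CPI, so the memory hierarchy dominates execution time.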

Miss Penalty and Out-of-Order Execution Processors
     Let’s redefine memory stalls to lead to a new definition of miss penalty as nonoverlapped latency:

Memory stall cycles / Instruction = (Misses / Instruction) × (Total miss latency − Overlapped miss latency)

This equation could be further expanded to account for contention for memory resources in an out-of-order processor by dividing total miss latency into latency without contention and latency due to contention.

 Let’s just concentrate on miss latency.
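The nonoverlapped-latency formula above can be sketched numerically. The values below are assumptions chosen only to show how out-of-order execution hides part of the miss latency:

```python
# Memory stall cycles per instruction with overlapped (hidden) miss latency
misses_per_instruction = 0.03   # assumed
total_miss_latency = 100        # cycles from miss start to completion (assumed)
overlapped_latency = 70         # cycles hidden by out-of-order execution (assumed)

stalls_per_instruction = misses_per_instruction * (total_miss_latency - overlapped_latency)
print(f"{stalls_per_instruction:.2f} stall cycles per instruction")
```

An in-order processor would charge the full 100 cycles per miss (3.0 stall cycles per instruction here); overlapping 70 of them cuts the stalls to roughly 0.9.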

We now have to decide
·         Length of memory latency: what to consider as the start and the end of a memory operation in an out-of-order processor.
·         Length of latency overlap: when overlap with the processor begins (or, equivalently, when we say a memory operation is stalling the processor).
   

Improving Cache Performance
Cache optimizations fall into four categories:
·         Reducing the miss penalty: multilevel caches, critical word first, read miss before write miss, merging write buffers, and victim caches;
·         Reducing the miss rate: larger block size, larger cache size, higher associativity, pseudo-associativity, and compiler optimizations;
·         Reducing the miss penalty or miss rate via parallelism: nonblocking caches, hardware prefetching, and compiler prefetching;
·         Reducing the time to hit in the cache: small and simple caches, avoiding address translation, pipelined cache access, and trace caches.
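To make one of the miss-rate optimizations above concrete, here is a minimal sketch of a direct-mapped cache model showing how a larger block size exploits spatial locality on a sequential access pattern. The cache parameters and access pattern are hypothetical:

```python
# Tiny direct-mapped cache model: count the miss rate for a given block size.
def miss_rate(addresses, num_sets, block_size):
    tags = [None] * num_sets           # one stored block address per set
    misses = 0
    for addr in addresses:
        block = addr // block_size     # block address containing addr
        index = block % num_sets       # set index (direct-mapped)
        if tags[index] != block:       # tag mismatch or empty -> miss
            misses += 1
            tags[index] = block        # fill the set with the new block
    return misses / len(addresses)

seq = list(range(1024))                # sequential access pattern (assumed)
print(miss_rate(seq, 64, 1))   # block of 1 word: every access misses -> 1.0
print(miss_rate(seq, 64, 4))   # block of 4 words: 1 miss per block -> 0.25
```

The larger block turns three of every four sequential accesses into hits, quartering the miss rate; the trade-off (not modeled here) is a larger miss penalty per block fill.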
