
Friday, January 4, 2013

Performance issues


Determining performance in a multiprocessor that uses a snoopy coherence protocol:
The overall cache performance is a combination of the behavior of uniprocessor cache miss traffic and the traffic caused by communication, which results in invalidations and subsequent cache misses.
Factors affecting the two components of the miss rate:
i) processor count,
ii) cache size, and
iii) block size.
3C's classification of the uniprocessor miss rate:
a) capacity,
b) compulsory, and
c) conflict.
The misses that arise from interprocessor communication, which are often called coherence misses, can be broken into two separate sources.
· The first source is the so-called true sharing misses, which arise from the communication of data through the cache coherence mechanism. They arise directly from the sharing of data among processors.
· The second effect, called false sharing, arises from the use of an invalidation-based coherence algorithm with a single valid bit per cache block. A block is invalidated, and a later reference misses, because some word in the block other than the one being read was written by another processor.
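The difference between the two kinds of coherence miss can be sketched with a small simulation. This is a minimal sketch, not a full protocol model: it assumes a single cache block shared by all processors, a write-invalidate policy, and a simplified rule that a miss is a true sharing miss only if the word being accessed was written by another processor since this processor lost the block. The function name `classify_accesses` and the trace format are hypothetical, chosen for illustration.

```python
def classify_accesses(trace, num_procs=2):
    """Classify each access to ONE shared cache block.

    trace: list of (proc, op, word) tuples, op is "r" or "w".
    All words are assumed to live in the same cache block.
    """
    valid = [False] * num_procs          # does this processor hold the block?
    ever_held = [False] * num_procs      # has it ever held the block?
    # Words written by other processors since this processor lost the block.
    written_since_loss = [set() for _ in range(num_procs)]
    results = []

    for proc, op, word in trace:
        if valid[proc]:
            kind = "hit"
        elif not ever_held[proc]:
            kind = "compulsory miss"
        elif word in written_since_loss[proc]:
            kind = "true sharing miss"   # the accessed word itself was communicated
        else:
            kind = "false sharing miss"  # only a neighbouring word was written
        results.append(kind)

        # The block is now present in this processor's cache.
        valid[proc] = True
        ever_held[proc] = True
        written_since_loss[proc].clear()

        # A write invalidates the whole block in every other cache,
        # because there is a single valid bit per block.
        if op == "w":
            for other in range(num_procs):
                if other != proc:
                    valid[other] = False
                    written_since_loss[other].add(word)
    return results


trace = [
    (0, "r", "x1"), (1, "r", "x2"),   # first touches by each processor
    (0, "w", "x1"), (1, "r", "x2"),   # P1 rereads a word nobody wrote
    (0, "w", "x1"), (1, "r", "x1"),   # P1 reads the word P0 wrote
]
for access, kind in zip(trace, classify_accesses(trace)):
    print(access, kind)
```

In this trace, processor 1 misses on `x2` only because processor 0 wrote `x1` in the same block; with one word per block that miss would not occur, which is why larger blocks tend to increase false sharing.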

Components of execution time:
1. Idle: execution in the kernel mode idle loop
2. User: execution in user code
3. Synchronization: execution or waiting for synchronization variables
4. Kernel: execution in the OS that is neither idle nor in synchronization access
The behavior of the operating system can cause more cache misses than the user processes for two reasons beyond larger code size and lack of locality.
o First, the kernel initializes all pages before allocating them to a user, which significantly increases the compulsory component of the kernel’s miss rate.
o Second, the kernel actually shares data and thus has a nontrivial coherence miss rate.
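The first point can be made concrete with a rough count. This is a back-of-the-envelope sketch only: the 4 KiB page size and 64-byte block size are assumed values, not figures from the measured system, and the estimate assumes that none of the page is cached beforehand and that each block is brought into the cache when it is first written. The function name `compulsory_misses_per_page` is hypothetical.

```python
def compulsory_misses_per_page(page_bytes=4096, block_bytes=64):
    """Estimate first-touch misses when the kernel initializes one page.

    Every cache block in the page is touched for the first time,
    so each block contributes one compulsory miss.
    """
    return page_bytes // block_bytes


# Assumed sizes: 4 KiB page, 64-byte cache block.
print(compulsory_misses_per_page())          # blocks touched per page
# The same page with smaller 32-byte blocks needs twice as many first touches.
print(compulsory_misses_per_page(4096, 32))
```

Under these assumptions every page handed to a user process costs the kernel one miss per block, which is charged to the kernel's compulsory miss rate rather than to the user process.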
