148 lines
		
	
	
	
		
			5.3 KiB
		
	
	
	
		
			HTML
		
	
	
	
	
	
			
		
		
	
	
			148 lines
		
	
	
	
		
			5.3 KiB
		
	
	
	
		
			HTML
		
	
	
	
	
	
| <html>
 | |
| <head>
 | |
| <title>Lab: locks</title>
 | |
| <link rel="stylesheet" href="homework.css" type="text/css" />
 | |
| </head>
 | |
| <body>
 | |
| 
 | |
| <h1>Lab: locks</h1>
 | |
| 
 | |
| <p>In this lab you will try to avoid lock contention for certain
 | |
| workloads.
 | |
| 
 | |
| <h2>lock contention</h2>
 | |
| 
 | |
| <p>The program user/kalloctest stresses xv6's memory allocator: three
 | |
|   processes grow and shrink there address space, which will results in
 | |
|   many calls to <tt>kalloc</tt> and <tt>kfree</tt>,
 | |
|   respectively.  <tt>kalloc</tt> and <tt>kfree</tt>
 | |
|   obtain <tt>kmem.lock</tt>.  To see if there is lock contention for
 | |
|   <tt>kmem.lock</tt> replace the call to <tt>acquire</tt>
 | |
|   in <tt>kalloc</tt> with the following code:
 | |
| 
 | |
|   <pre>
 | |
|     while(!tryacquire(&kmem.lock)) {
 | |
|       printf("!");
 | |
|     }
 | |
|   </pre>
 | |
| 
 | |
| <p><tt>tryacquire</tt> tries to acquire <tt>kmem.lock</tt>: if the
 | |
|   lock is taking it returns false (0); otherwise, it returns true (1)
 | |
|   and with the lock acquired.  Your first job is to
 | |
|   implement <tt>tryacquire</tt> in kernel/spinlock.c.
 | |
| 
 | |
| <p>A few hints:
 | |
|   <ul>
 | |
|     <li>look at <tt>acquire</tt>.
 | |
|     <li>don't forget to restore interrupts when acquision fails
 | |
|     <li>Add tryacquire's signature to defs.h.
 | |
|   </ul>
 | |
| 
 | |
| <p>Run usertests to see if you didn't break anything.  Note that
 | |
|   usertests never prints "!"; there is never contention
 | |
|   for <tt>kmem.lock</tt>.  The caller is always able to immediately
 | |
|   acquire the lock and never has to wait because some other process
 | |
|   has the lock.
 | |
| 
 | |
| <p>Now run kalloctest.  You should see quite a number of "!" on the
 | |
|   console.  kalloctest causes many processes to contend on
 | |
|   the <tt>kmem.lock</tt>.  This lock contention is a bit artificial,
 | |
|   because qemu is simulating 3 processors, but it is likely on real
 | |
|   hardware, there would be contention too.
 | |
|   
 | |
| <h2>Removing lock contention</h2>
 | |
| 
 | |
| <p>The root cause of lock contention in kalloctest is that there is a
 | |
|   single free list, protected by a single lock.  To remove lock
 | |
|   contention, you will have to redesign the memory allocator to avoid
 | |
|   a single lock and list.  The basic idea is to maintain a free list
 | |
|   per CPU, each list with its own lock. Allocations and frees on each
 | |
|   CPU can run in parallel, because each CPU will operate on a
 | |
|   different list.
 | |
|   
 | |
| <p> The main challenge will be to deal with the case that one CPU runs
 | |
|   out of memory, but another CPU has still free memory; in that case,
 | |
|   the one CPU must "steal" part of the other CPU's free list.
 | |
|   Stealing may introduce lock contention, but that may be acceptable
 | |
|   because it may happen infrequently.
 | |
| 
 | |
| <p>Your job is to implement per-CPU freelists and stealing when one
 | |
|   CPU is out of memory.  Run kalloctest() to see if your
 | |
|   implementation has removed lock contention.
 | |
| 
 | |
| <p>Some hints:
 | |
|   <ul>
 | |
|     <li>You can use the constant <tt>NCPU</tt> in kernel/param.h
 | |
|     <li>Let <tt>freerange</tt> give all free memory to the CPU
 | |
|       running <tt>freerange</tt>.
 | |
|     <li>The function <tt>cpuid</tt> returns the current core, but note
 | |
|     that you can use it when interrupts are turned off and so you will
 | |
|     need to turn on/off interrupts in your solution.
 | |
|   </ul>
 | |
| 
 | |
| <p>Run usertests to see if you don't break anything.
 | |
| 
 | |
| <h2>More scalabale bcache lookup</h2>
 | |
| 
 | |
| 
 | |
| <p>Several processes reading different files repeatedly will
 | |
|   bottleneck in the buffer cache, bcache, in bio.c.  Replace the
 | |
|   acquire in <tt>bget</tt> with
 | |
|   
 | |
|   <pre>
 | |
|     while(!tryacquire(&bcache.lock)) {
 | |
|       printf("!");
 | |
|     }
 | |
|   </pre>
 | |
| 
 | |
|   and run test0 from bcachetest and you will see "!"s.
 | |
| 
 | |
| <p>Modify <tt>bget</tt> so that a lookup for a buffer that is in the
 | |
|   bcache doesn't need to acquire <tt>bcache.lock</tt>.  This is more
 | |
|   tricky than the kalloc assignment, because bcache buffers are truly
 | |
|   shared among processes. You must maintain the invariant that a
 | |
|   buffer is only once in memory.
 | |
| 
 | |
| <p> There are several races that <tt>bcache.lock</tt> protects
 | |
| against, including:
 | |
|   <ul>
 | |
|     <li>A <tt>brelse</tt> may set <tt>b->ref</tt> to 0,
 | |
|       while concurrent <tt>bget</tt> is incrementing it.
 | |
|     <li>Two <tt>bget</tt> may see <tt>b->ref = 0</tt> and one may re-use
 | |
|     the buffer, while the other may replaces it with another block.
 | |
|     <li>A concurrent <tt>brelse</tt> modifies the list
 | |
|       that <tt>bget</tt> traverses.
 | |
|   </ul>
 | |
| 
 | |
| <p>A challenge is testing whether you code is still correct.  One way
 | |
|   to do is to artificially delay certain operations
 | |
|   using <tt>sleepticks</tt>.  <tt>test1</tt> trashes the buffer cache
 | |
|   and exercises more code paths.
 | |
| 
 | |
| <p>Here are some hints:
 | |
|   <ul>
 | |
|     <li>Read the description of buffer cache in the xv6 book (Section 7.2).
 | |
|     <li>Use a simple design: i.e., don't design a lock-free implementation.
 | |
|     <li>Use a simple hash table with locks per bucket.
 | |
|     <li>Searching in hash table for a buffer and allocating an entry
 | |
|       for that buffer when the buffer is not found must be atomic.
 | |
|     <li>It is fine to acquire <tt>bcache.lock</tt> in <tt>brelse</tt>
 | |
|       to update the LRU/MRU list.
 | |
|   </ul>
 | |
| 
 | |
| <p>Check that your implementation has less contention
 | |
|   on <tt>test0</tt>
 | |
| 
 | |
| <p>Make sure your implementation passes bcachetest and usertests.
 | |
| 
 | |
| <p>Optional:
 | |
|   <ul>
 | |
|   <li>make the buffer cache more scalable (e.g., avoid taking
 | |
|   out <tt>bcache.lock</tt> on <tt>brelse</tt>).
 | |
|   <li>make lookup lock-free (Hint: use gcc's <tt>__sync_*</tt>
 | |
|     functions.) How do you convince yourself that your implementation is correct?
 | |
|   </ul>
 | |
|   
 | |
|   
 | |
| </body>
 | |
| </html>
 |