Move labs into 6.828 repo. The lab text isn't dependent on specific
xv6 code. Lab submission instructions etc. are likely going to be more MIT 6.828 specific.
This commit is contained in:
parent
c612d452fd
commit
7241838b4c
109
labs/cow.html
109
labs/cow.html
|
@ -1,109 +0,0 @@
|
||||||
<html>
|
|
||||||
<head>
|
|
||||||
<title>Lab: Copy-on-Write Fork for xv6</title>
|
|
||||||
<link rel="stylesheet" href="homework.css" type="text/css" />
|
|
||||||
</head>
|
|
||||||
<body>
|
|
||||||
|
|
||||||
<h1>Lab: Copy-on-Write Fork for xv6</h2>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Your task is implement copy-on-write fork in the xv6 kernel. You are
|
|
||||||
done if your modified kernel executes both the cow and usertests
|
|
||||||
programs successfully.
|
|
||||||
|
|
||||||
<h2>The problem</h2>
|
|
||||||
|
|
||||||
The fork() system call in xv6 copies all of the parent process's
|
|
||||||
user-space memory into the child. If the parent is large, copying can
|
|
||||||
take a long time. In addition, the copies often waste memory; in many
|
|
||||||
cases neither the parent nor the child modifies a page, so that in
|
|
||||||
principle they could share the same physical memory. The inefficiency
|
|
||||||
is particularly clear if the child calls exec(), since then most of
|
|
||||||
the copied pages are thrown away without ever being used. Of course,
|
|
||||||
sometimes both child and parent modify memory at the same virtual
|
|
||||||
address after a fork(), so for some pages the copying is truly needed.
|
|
||||||
|
|
||||||
<h2>The solution</h2>
|
|
||||||
|
|
||||||
The goal of copy-on-write (COW) fork() is to defer allocating and
|
|
||||||
copying physical memory pages for the child until they are actually
|
|
||||||
needed, in the hope that they may never be needed.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
COW fork() creates just a pagetable for the child, with PTEs for user
|
|
||||||
memory pointing to the parent's physical pages. COW fork() marks all
|
|
||||||
the user PTEs in both parent and child as read-only. When either
|
|
||||||
process tries to write one of these COW pages, the CPU will force a
|
|
||||||
page fault. The kernel page-fault handler detects this case, allocates
|
|
||||||
a page of physical memory for the faulting process, copies the
|
|
||||||
original page into the new page, and modifies the relevant PTE in the
|
|
||||||
faulting process to refer to the new page, this time with the PTE
|
|
||||||
marked writeable. When the page fault handler returns, the user
|
|
||||||
process will be able to write its copy of the page.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
COW fork() makes freeing of the physical pages that implement user
|
|
||||||
memory a little trickier. A given physical page may be referred to by
|
|
||||||
multiple processes' page tables, and should be freed when the last
|
|
||||||
reference disappears.
|
|
||||||
|
|
||||||
<h2>The cow test program</h2>
|
|
||||||
|
|
||||||
To help you test your implementation, we've provided an xv6 program
|
|
||||||
called cow (source in user/cow.c). cow runs various tests, but
|
|
||||||
even the first will fail on unmodified xv6. Thus, initially, you
|
|
||||||
will see:
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
$ cow
|
|
||||||
simple: fork() failed
|
|
||||||
$
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
The "simple" test allocates more than half of available physical
|
|
||||||
memory, and then fork()s. The fork fails because there is not enough
|
|
||||||
free physical memory to give the child a complete copy of the parent.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
When you are done, your kernel should be able to run both cow and
|
|
||||||
usertests. That is:
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
$ cow
|
|
||||||
simple: ok
|
|
||||||
simple: ok
|
|
||||||
three: zombie!
|
|
||||||
ok
|
|
||||||
three: zombie!
|
|
||||||
ok
|
|
||||||
three: zombie!
|
|
||||||
ok
|
|
||||||
file: ok
|
|
||||||
ALL COW TESTS PASSED
|
|
||||||
$ usertests
|
|
||||||
...
|
|
||||||
ALL TESTS PASSED
|
|
||||||
$
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<h2>Hints</h2>
|
|
||||||
|
|
||||||
Here's one reasonable plan of attack. Modify uvmcopy() to map the
|
|
||||||
parent's physical pages into the child, instead of allocating new
|
|
||||||
pages, and clear PTE_W in the PTEs of both child and parent.
|
|
||||||
Modify usertrap() to recognize a page fault. When a page fault occurs
|
|
||||||
on a COW page, allocate a new page with kalloc(), copy the old page to
|
|
||||||
the new page, and install the new page in the PTE with PTE_W set.
|
|
||||||
Next, ensure that each physical page is freed when the last PTE
|
|
||||||
reference to it goes away (but not before!), perhaps by implementing
|
|
||||||
reference counts in kalloc.c. Finally, modify copyout() to use the
|
|
||||||
same scheme as page faults when it encounters a COW page.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
It may be useful to have a way to record, for each PTE, whether it is
|
|
||||||
a COW mapping. You can use the RSW (reserved for software) bits in
|
|
||||||
the RISC-V PTE for this.
|
|
||||||
|
|
||||||
</body>
|
|
||||||
</html>
|
|
360
labs/fs.html
360
labs/fs.html
|
@ -1,360 +0,0 @@
|
||||||
<html>
|
|
||||||
<head>
|
|
||||||
<title>Lab: file system</title>
|
|
||||||
<link rel="stylesheet" href="homework.css" type="text/css" />
|
|
||||||
</head>
|
|
||||||
<body>
|
|
||||||
|
|
||||||
<h1>Lab: file system</h1>
|
|
||||||
|
|
||||||
<p>In this lab you will add large files and <tt>mmap</tt> to the xv6 file system.
|
|
||||||
|
|
||||||
<h2>Large files</h2>
|
|
||||||
|
|
||||||
<p>In this assignment you'll increase the maximum size of an xv6
|
|
||||||
file. Currently xv6 files are limited to 268 blocks, or 268*BSIZE
|
|
||||||
bytes (BSIZE is 1024 in xv6). This limit comes from the fact that an
|
|
||||||
xv6 inode contains 12 "direct" block numbers and one "singly-indirect"
|
|
||||||
block number, which refers to a block that holds up to 256 more block
|
|
||||||
numbers, for a total of 12+256=268. You'll change the xv6 file system
|
|
||||||
code to support a "doubly-indirect" block in each inode, containing
|
|
||||||
256 addresses of singly-indirect blocks, each of which can contain up
|
|
||||||
to 256 addresses of data blocks. The result will be that a file will
|
|
||||||
be able to consist of up to 256*256+256+11 blocks (11 instead of 12,
|
|
||||||
because we will sacrifice one of the direct block numbers for the
|
|
||||||
double-indirect block).
|
|
||||||
|
|
||||||
<h3>Preliminaries</h3>
|
|
||||||
|
|
||||||
<p>Modify your Makefile's <tt>CPUS</tt> definition so that it reads:
|
|
||||||
<pre>
|
|
||||||
CPUS := 1
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<b>XXX doesn't seem to speedup things</b>
|
|
||||||
<p>Add
|
|
||||||
<pre>
|
|
||||||
QEMUEXTRA = -snapshot
|
|
||||||
</pre>
|
|
||||||
right before
|
|
||||||
<tt>QEMUOPTS</tt>
|
|
||||||
<p>
|
|
||||||
The above two steps speed up qemu tremendously when xv6
|
|
||||||
creates large files.
|
|
||||||
|
|
||||||
<p><tt>mkfs</tt> initializes the file system to have fewer
|
|
||||||
than 1000 free data blocks, too few to show off the changes
|
|
||||||
you'll make. Modify <tt>param.h</tt> to
|
|
||||||
set <tt>FSSIZE</tt> to:
|
|
||||||
<pre>
|
|
||||||
#define FSSIZE 20000 // size of file system in blocks
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>Download <a href="big.c">big.c</a> into your xv6 directory,
|
|
||||||
add it to the UPROGS list, start up xv6, and run <tt>big</tt>.
|
|
||||||
It creates as big a file as xv6 will let
|
|
||||||
it, and reports the resulting size. It should say 140 sectors.
|
|
||||||
|
|
||||||
<h3>What to Look At</h3>
|
|
||||||
|
|
||||||
The format of an on-disk inode is defined by <tt>struct dinode</tt>
|
|
||||||
in <tt>fs.h</tt>. You're particularly interested in <tt>NDIRECT</tt>,
|
|
||||||
<tt>NINDIRECT</tt>, <tt>MAXFILE</tt>, and the <tt>addrs[]</tt> element
|
|
||||||
of <tt>struct dinode</tt>. Look Figure 7.3 in the xv6 text for a
|
|
||||||
diagram of the standard xv6 inode.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
The code that finds a file's data on disk is in <tt>bmap()</tt>
|
|
||||||
in <tt>fs.c</tt>. Have a look at it and make sure you understand
|
|
||||||
what it's doing. <tt>bmap()</tt> is called both when reading and
|
|
||||||
writing a file. When writing, <tt>bmap()</tt> allocates new
|
|
||||||
blocks as needed to hold file content, as well as allocating
|
|
||||||
an indirect block if needed to hold block addresses.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
<tt>bmap()</tt> deals with two kinds of block numbers. The <tt>bn</tt>
|
|
||||||
argument is a "logical block" -- a block number relative to the start
|
|
||||||
of the file. The block numbers in <tt>ip->addrs[]</tt>, and the
|
|
||||||
argument to <tt>bread()</tt>, are disk block numbers.
|
|
||||||
You can view <tt>bmap()</tt> as mapping a file's logical
|
|
||||||
block numbers into disk block numbers.
|
|
||||||
|
|
||||||
<h3>Your Job</h3>
|
|
||||||
|
|
||||||
Modify <tt>bmap()</tt> so that it implements a doubly-indirect
|
|
||||||
block, in addition to direct blocks and a singly-indirect block.
|
|
||||||
You'll have to have only 11 direct blocks, rather than 12,
|
|
||||||
to make room for your new doubly-indirect block; you're
|
|
||||||
not allowed to change the size of an on-disk inode.
|
|
||||||
The first 11 elements of <tt>ip->addrs[]</tt> should be
|
|
||||||
direct blocks; the 12th should be a singly-indirect block
|
|
||||||
(just like the current one); the 13th should be your new
|
|
||||||
doubly-indirect block.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
You don't have to modify xv6 to handle deletion of files with
|
|
||||||
doubly-indirect blocks.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
If all goes well, <tt>big</tt> will now report that it
|
|
||||||
can write sectors. It will take <tt>big</tt> minutes
|
|
||||||
to finish.
|
|
||||||
|
|
||||||
<b>XXX this runs for a while!</b>
|
|
||||||
|
|
||||||
<h3>Hints</h3>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Make sure you understand <tt>bmap()</tt>. Write out a diagram of the
|
|
||||||
relationships between <tt>ip->addrs[]</tt>, the indirect block, the
|
|
||||||
doubly-indirect block and the singly-indirect blocks it points to, and
|
|
||||||
data blocks. Make sure you understand why adding a doubly-indirect
|
|
||||||
block increases the maximum file size by 256*256 blocks (really -1),
|
|
||||||
since you have to decrease the number of direct blocks by one).
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Think about how you'll index the doubly-indirect block, and
|
|
||||||
the indirect blocks it points to, with the logical block
|
|
||||||
number.
|
|
||||||
|
|
||||||
<p>If you change the definition of <tt>NDIRECT</tt>, you'll
|
|
||||||
probably have to change the size of <tt>addrs[]</tt>
|
|
||||||
in <tt>struct inode</tt> in <tt>file.h</tt>. Make sure that
|
|
||||||
<tt>struct inode</tt> and <tt>struct dinode</tt> have the
|
|
||||||
same number of elements in their <tt>addrs[]</tt> arrays.
|
|
||||||
|
|
||||||
<p>If you change the definition of <tt>NDIRECT</tt>, make sure to create a
|
|
||||||
new <tt>fs.img</tt>, since <tt>mkfs</tt> uses <tt>NDIRECT</tt> too to build the
|
|
||||||
initial file systems. If you delete <tt>fs.img</tt>, <tt>make</tt> on Unix (not
|
|
||||||
xv6) will build a new one for you.
|
|
||||||
|
|
||||||
<p>If your file system gets into a bad state, perhaps by crashing,
|
|
||||||
delete <tt>fs.img</tt> (do this from Unix, not xv6). <tt>make</tt> will build a
|
|
||||||
new clean file system image for you.
|
|
||||||
|
|
||||||
<p>Don't forget to <tt>brelse()</tt> each block that you
|
|
||||||
<tt>bread()</tt>.
|
|
||||||
|
|
||||||
<p>You should allocate indirect blocks and doubly-indirect
|
|
||||||
blocks only as needed, like the original <tt>bmap()</tt>.
|
|
||||||
|
|
||||||
<p>Optional challenge: support triple-indirect blocks.
|
|
||||||
|
|
||||||
<h2>Writing with a Log</h2>
|
|
||||||
|
|
||||||
Insert a print statement in bwrite (in bio.c) so that you get a
|
|
||||||
print every time a block is written to disk:
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
printf("bwrite block %d\n", b->blockno);
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
Build and boot a new kernel and run this:
|
|
||||||
<pre>
|
|
||||||
$ rm README
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>You should see a sequence of bwrite prints after the <tt>rm</tt>.</p>
|
|
||||||
|
|
||||||
<div class="question">
|
|
||||||
<ol>
|
|
||||||
<li>Annotate the bwrite lines with the kind of information that is
|
|
||||||
being written to the disk (e.g., "README's inode", "allocation
|
|
||||||
bitmap"). If the log is being written, note both that the log is being
|
|
||||||
written and also what kind of information is being written to the log.
|
|
||||||
<li>Mark with an arrow the first point at which, if a
|
|
||||||
crash occured, README would be missing after a reboot
|
|
||||||
(after the call to <tt>recover_from_log()</tt>).
|
|
||||||
</ol>
|
|
||||||
</p>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
|
|
||||||
<h2>Crash safety</h2>
|
|
||||||
|
|
||||||
<p>This assignment explores the xv6 log in two parts.
|
|
||||||
First, you'll artificially create a crash which illustrates
|
|
||||||
why logging is needed. Second, you'll remove one
|
|
||||||
inefficiency in the xv6 logging system.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Submit your solution before the beginning of the next lecture
|
|
||||||
to <a href="https://6828.scripts.mit.edu/2018/handin.py/">the submission
|
|
||||||
web site</a>.
|
|
||||||
|
|
||||||
<h3>Creating a Problem</h3>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
The point of the xv6 log is to cause all the disk updates of a
|
|
||||||
filesystem operation to be atomic with respect to crashes.
|
|
||||||
For example, file creation involves both adding a new entry
|
|
||||||
to a directory and marking the new file's inode as in-use.
|
|
||||||
A crash that happened after one but before the other would
|
|
||||||
leave the file system in an incorrect state after a reboot,
|
|
||||||
if there were no log.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
The following steps will break the logging code in a way that
|
|
||||||
leaves a file partially created.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
First, replace <tt>commit()</tt> in <tt>log.c</tt> with
|
|
||||||
this code:
|
|
||||||
<pre>
|
|
||||||
#include "kernel/proc.h"
|
|
||||||
void
|
|
||||||
commit(void)
|
|
||||||
{
|
|
||||||
int pid = myproc()->pid;
|
|
||||||
if (log.lh.n > 0) {
|
|
||||||
write_log();
|
|
||||||
write_head();
|
|
||||||
if(pid > 1) // AAA
|
|
||||||
log.lh.block[0] = 0; // BBB
|
|
||||||
install_trans();
|
|
||||||
if(pid > 1) // AAA
|
|
||||||
panic("commit mimicking crash"); // CCC
|
|
||||||
log.lh.n = 0;
|
|
||||||
write_head();
|
|
||||||
}
|
|
||||||
}
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
The BBB line causes the first block in the log to be written to
|
|
||||||
block zero, rather than wherever it should be written. During file
|
|
||||||
creation, the first block in the log is the new file's inode updated
|
|
||||||
to have non-zero <tt>type</tt>.
|
|
||||||
Line BBB causes the block
|
|
||||||
with the updated inode to be written to block 0 (whence
|
|
||||||
it will never be read), leaving the on-disk inode still marked
|
|
||||||
unallocated. The CCC line forces a crash.
|
|
||||||
The AAA lines suppress this buggy behavior for <tt>init</tt>,
|
|
||||||
which creates files before the shell starts.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Second, replace <tt>recover_from_log()</tt> in <tt>log.c</tt>
|
|
||||||
with this code:
|
|
||||||
<pre>
|
|
||||||
static void
|
|
||||||
recover_from_log(void)
|
|
||||||
{
|
|
||||||
read_head();
|
|
||||||
printf("recovery: n=%d but ignoring\n", log.lh.n);
|
|
||||||
// install_trans();
|
|
||||||
log.lh.n = 0;
|
|
||||||
// write_head();
|
|
||||||
}
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
This modification suppresses log recovery (which would repair
|
|
||||||
the damage caused by your change to <tt>commit()</tt>).
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Finally, remove the <tt>-snapshot</tt> option from the definition
|
|
||||||
of <tt>QEMUEXTRA</tt> in your Makefile so that the disk image will see the
|
|
||||||
changes.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Now remove <tt>fs.img</tt> and run xv6:
|
|
||||||
<pre>
|
|
||||||
% rm fs.img ; make qemu
|
|
||||||
</pre>
|
|
||||||
<p>
|
|
||||||
Tell the xv6 shell to create a file:
|
|
||||||
<pre>
|
|
||||||
$ echo hi > a
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
You should see the panic from <tt>commit()</tt>. So far
|
|
||||||
it is as if a crash occurred in a non-logging system in the middle
|
|
||||||
of creating a file.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Now re-start xv6, keeping the same <tt>fs.img</tt>:
|
|
||||||
<pre>
|
|
||||||
% make qemu
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
And look at file <tt>a</tt>:
|
|
||||||
<pre>
|
|
||||||
$ cat a
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
You should see <tt>panic: ilock: no type</tt>. Make sure you understand what happened.
|
|
||||||
Which of the file creation's modifications were written to the disk
|
|
||||||
before the crash, and which were not?
|
|
||||||
|
|
||||||
<h3>Solving the Problem</h3>
|
|
||||||
|
|
||||||
Now fix <tt>recover_from_log()</tt>:
|
|
||||||
<pre>
|
|
||||||
static void
|
|
||||||
recover_from_log(void)
|
|
||||||
{
|
|
||||||
read_head();
|
|
||||||
cprintf("recovery: n=%d\n", log.lh.n);
|
|
||||||
install_trans();
|
|
||||||
log.lh.n = 0;
|
|
||||||
write_head();
|
|
||||||
}
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Run xv6 (keeping the same <tt>fs.img</tt>) and read <tt>a</tt> again:
|
|
||||||
<pre>
|
|
||||||
$ cat a
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
This time there should be no crash. Make sure you understand why
|
|
||||||
the file system now works.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Why was the file empty, even though you created
|
|
||||||
it with <tt>echo hi > a</tt>?
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Now remove your modifications to <tt>commit()</tt>
|
|
||||||
(the if's and the AAA and BBB lines), so that logging works again,
|
|
||||||
and remove <tt>fs.img</tt>.
|
|
||||||
|
|
||||||
<h3>Streamlining Commit</h3>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Suppose the file system code wants to update an inode in block 33.
|
|
||||||
The file system code will call <tt>bp=bread(block 33)</tt> and update the
|
|
||||||
buffer data. <tt>write_log()</tt> in <tt>commit()</tt>
|
|
||||||
will copy the data to a block in the log on disk, for example block 3.
|
|
||||||
A bit later in <tt>commit</tt>, <tt>install_trans()</tt> reads
|
|
||||||
block 3 from the log (containing block 33), copies its contents into the in-memory
|
|
||||||
buffer for block 33, and then writes that buffer to block 33 on the disk.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
However, in <tt>install_trans()</tt>, it turns out that the modified
|
|
||||||
block 33 is guaranteed to be still in the buffer cache, where the
|
|
||||||
file system code left it. Make sure you understand why it would be a
|
|
||||||
mistake for the buffer cache to evict block 33 from the buffer cache
|
|
||||||
before the commit.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Since the modified block 33 is guaranteed to already be in the buffer
|
|
||||||
cache, there's no need for <tt>install_trans()</tt> to read block
|
|
||||||
33 from the log. Your job: modify <tt>log.c</tt> so that, when
|
|
||||||
<tt>install_trans()</tt> is called from <tt>commit()</tt>,
|
|
||||||
<tt>install_trans()</tt> does not perform the needless read from the log.
|
|
||||||
|
|
||||||
<p>To test your changes, create a file in xv6, restart, and make sure
|
|
||||||
the file is still there.
|
|
||||||
|
|
||||||
<b>XXX Does this speedup bigfile?</b>
|
|
||||||
|
|
||||||
<b>XXX Maybe support lseek and modify shell to append to a file?</b>
|
|
||||||
|
|
||||||
|
|
||||||
</body>
|
|
||||||
</html>
|
|
215
labs/fs1.html
215
labs/fs1.html
|
@ -1,215 +0,0 @@
|
||||||
<html>
|
|
||||||
<head>
|
|
||||||
<title>Lab: mount/umount</title>
|
|
||||||
<link rel="stylesheet" href="homework.css" type="text/css" />
|
|
||||||
</head>
|
|
||||||
<body>
|
|
||||||
|
|
||||||
<h1>Lab: mount/umount</h1>
|
|
||||||
|
|
||||||
<p>In this lab you will add support for mounting/unmounting of file
|
|
||||||
systems to xv6. This lab will expose you to many parts of the xv6
|
|
||||||
file system, including pathname lookup, inodes, logging/recovery, disk
|
|
||||||
driver, concurrency, etc.
|
|
||||||
|
|
||||||
<p>Your job is modify xv6 so that your modified kernel passes the
|
|
||||||
tests in mounttest. You will have to implement two system
|
|
||||||
calls: <tt>mount(char *source, char *target)</tt>
|
|
||||||
and <tt>umount(char *target)</tt>. Mount attaches the device
|
|
||||||
referenced by <tt>source</tt> (e.g., <tt>/disk1</tt>) at the
|
|
||||||
location specified by <tt>target</tt>. For
|
|
||||||
example, <tt>mount("/disk1", "/m")</tt> will attach <tt>disk1</tt>
|
|
||||||
at the directory <tt>/m</tt>. After this mount call, users can use
|
|
||||||
pathnames such as <tt>/m/README</tt> to read the
|
|
||||||
file <tt>README</tt> stored in the root directory
|
|
||||||
on <tt>disk1</tt>. <tt>Umount</tt> removes the attachment. For
|
|
||||||
example, <tt>umount("/m")</tt> unmounts disk1 from <tt>/m</tt>.
|
|
||||||
|
|
||||||
<p>There are several major challenges in implementing the mount system
|
|
||||||
calls:
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
|
|
||||||
<li>Adding the actual system calls so that user programs can call
|
|
||||||
them. This is similar to previous labs in which you added
|
|
||||||
systems calls xv6.
|
|
||||||
|
|
||||||
<li>Supporting several disks. You will have generalize to
|
|
||||||
virtio_disk.c to support at least two disks.
|
|
||||||
|
|
||||||
<li>Logging file system modifications to the right disk. xv6
|
|
||||||
assumes there is only disk and file system calls typically start
|
|
||||||
with <tt>begin_op</tt> and end with <tt>end_op</tt>, logging all
|
|
||||||
modifications between these two calls to the log on the one
|
|
||||||
disk. With mount, modifications to the file system on the
|
|
||||||
second disk must be logged to the second disk.
|
|
||||||
|
|
||||||
<li>Modifying pathname lookup (<tt>namex</tt>) so that when a
|
|
||||||
lookup cross a mount point, it continues at the root inode of
|
|
||||||
the attached disk.
|
|
||||||
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<p>The rest of this assignment provides some hints how you might go
|
|
||||||
about the above challenges.
|
|
||||||
|
|
||||||
<h2>Adding system calls</h2>
|
|
||||||
|
|
||||||
<p>Add the stubs for the two systems calls to xv6 so that you can
|
|
||||||
compile mounttest and add two empty functions for the two system calls
|
|
||||||
to sysfile.c. Run mounttest and it will fail on the first call
|
|
||||||
to <tt>mount</tt>.
|
|
||||||
|
|
||||||
|
|
||||||
<h2>Adding a second disk</h2>
|
|
||||||
|
|
||||||
<p>To be able to mount another disk, you need to extend xv6 to support
|
|
||||||
at least two disks. Modify virtio_disk.c to support an array of two
|
|
||||||
disks instead of a single disk. The address of the second disk
|
|
||||||
is <tt>0x10002000</tt>; modify the macro <tt>R</tt> to take a disk
|
|
||||||
number (0, 1,..) and read/write to the memory address for that disk.
|
|
||||||
|
|
||||||
<p>All functions in <tt>virtio_disk.c</tt> need to take the disk
|
|
||||||
number as an argument to update the state of the disk that is
|
|
||||||
read/written to or to receive an interrupt from the disk.
|
|
||||||
Modify <tt>virtio_disk_init</tt> to take a disk number as an argument
|
|
||||||
and update is to that it initializes that disk. Similar, go through
|
|
||||||
the other functions; make these changes should be most mechanical
|
|
||||||
(i.e., text substitutions).
|
|
||||||
|
|
||||||
<p>The second disk interrupts at IRQ 2; modify trap.c to receive that
|
|
||||||
interrupt and <tt>virtio_disk_intr</tt> with the number of the disk
|
|
||||||
that generated the interrupt.
|
|
||||||
|
|
||||||
<p>Modify the file Makefile to tell qemu to provide a second
|
|
||||||
disk. Define the variable <tt>QEMUEXTRA = -drive
|
|
||||||
file=fs1.img,if=none,format=raw,id=x1 -device
|
|
||||||
virtio-blk-device,drive=x1,bus=virtio-mmio-bus.1</tt> and
|
|
||||||
add <tt>$(QEMUEXTRA)</tt> to the end of <tt>QEMUOPTS</tt>.
|
|
||||||
|
|
||||||
<p>Create a second disk image <tt>fs1.img</tt>. Easiest thing to do
|
|
||||||
is just copy the file <tt>fs.img</tt>. You might want to add rules
|
|
||||||
to the Makefile to make this image and remove it on <tt>make
|
|
||||||
clean</tt>.
|
|
||||||
|
|
||||||
<p>Add to the user program init a call to create a device for the new
|
|
||||||
disk. For example, add the line <tt>mknod("disk1", DISK, 1);</tt> to
|
|
||||||
init.c. This will create an inode of type device in the root
|
|
||||||
directory with major number <tt>DISK</tt> and minor number 1.
|
|
||||||
|
|
||||||
<p>The first argument of the <tt>mount</tt> system call ("disk1") will
|
|
||||||
refer to the device you created using <tt>mknod</tt> above. In your
|
|
||||||
implementation of the mount system call,
|
|
||||||
call <tt>virtio_disk_init</tt> with the minor number as the argument
|
|
||||||
to initialize the second disk. (We reserve minor number 0 for the
|
|
||||||
first disk.)
|
|
||||||
|
|
||||||
<p>Boot xv6, run mounttest, and make sure <tt>virtio_disk_init</tt>
|
|
||||||
gets called (e.g., add print statement). You won't know if your
|
|
||||||
changes are correct, but your code should compile and invoke the
|
|
||||||
driver for the second disk.
|
|
||||||
|
|
||||||
<h2>Modify the logging system</h2>
|
|
||||||
|
|
||||||
<p>After calling <tt>virtio_disk_init</tt>, you need to also
|
|
||||||
call <tt>loginit</tt> to initialize the logging system for the
|
|
||||||
second disk (and restore the second disk if a power failure happened
|
|
||||||
while modifying the second disk). Generalize the logging system to
|
|
||||||
support to two logs, one on disk 0 and one disk 1. These changes
|
|
||||||
are mostly mechanical (e.g., <tt>log.</tt> changes
|
|
||||||
to <tt>log[n].</tt>), similar to generalizing the disk driver to
|
|
||||||
support two disks.
|
|
||||||
|
|
||||||
<p>To make xv6 compile, you need to provide a disk number
|
|
||||||
to <tt>begin_op</tt> and <tt>end_op</tt>. It will be a challenge to
|
|
||||||
figure out what the right value is; for now just specify the first
|
|
||||||
disk (i.e., 0). This isn't correct, since modifications to the
|
|
||||||
second disk should be logged on the second disk, but we have no way
|
|
||||||
yet to read/write the second disk. Come back to this later when you
|
|
||||||
have a better idea how things will fit together, but make sure that
|
|
||||||
xv6 compiles and still runs.
|
|
||||||
|
|
||||||
<h2>Pathname lookup</h2>
|
|
||||||
|
|
||||||
<p>Modify <tt>namex</tt> to traverse mount points: when <tt>namex</tt>
|
|
||||||
sees an inode to which a file system is attached, it should traverse
|
|
||||||
to the root inode of that file system. Hint: modify the in-memory
|
|
||||||
inode in file.h to keep some additional state, and initialize that
|
|
||||||
state in the mount system call. Note that the inode already has a
|
|
||||||
field for disk number (i.e., <tt>dev</tt>), which is initialized and
|
|
||||||
passed to reads and writes to the driver. <tt>dev</tt> corresponds
|
|
||||||
to the minor number for disk devices.
|
|
||||||
|
|
||||||
<p>Your modified xv6 should be able to pass the first tests in
|
|
||||||
mounttest (i.e., <tt>stat</tt>). This is likely to be challenging,
|
|
||||||
however, because now your kernel will be reading from the second
|
|
||||||
disk for the first time, and you may run into many issues.
|
|
||||||
|
|
||||||
<p>Even though <tt>stat</tt> may return correctly, your code is likely
|
|
||||||
to be incorrect, because in <tt>namex</tt>
|
|
||||||
because <tt>iunlockput</tt> may modify the second disk (e.g., if
|
|
||||||
another process removes the file or directory) and those
|
|
||||||
modifications must be written to the second disk. Your job is to
|
|
||||||
fix the calls to <tt>begin_op</tt> and <tt>end_op</tt> to take the
|
|
||||||
right device. One challenge is that <tt>begin_op</tt> is called at
|
|
||||||
the beginning of a system call but then you don't know the device
|
|
||||||
that will be involved; you will have to postpone this call until you
|
|
||||||
know which inode is involved (which tells you will which device is
|
|
||||||
involved). Another challenge is that you cannot postpone
|
|
||||||
calling <tt>begin_op</tt> passed <tt>ilock</tt> because that
|
|
||||||
violates lock ordering in xv6; you should not be
|
|
||||||
calling <tt>begin_op</tt> while holding locks on inodes. (The log
|
|
||||||
system allows a few systems calls to run; if a system call that
|
|
||||||
holds an inode lock isn't admitted and one of the admitted system
|
|
||||||
calls needs that inode to complete, then xv6 will deadlock.)
|
|
||||||
|
|
||||||
<p>Once you have implemented a plan for <tt>begin_op</tt>
|
|
||||||
and <tt>end_op</tt>, see if your kernel can pass <tt>test0</tt>. It
|
|
||||||
is likely that you will have to modify your implementation of the
|
|
||||||
mount system call to handle several corner cases. See the tests
|
|
||||||
in <tt>test0</tt>.
|
|
||||||
|
|
||||||
<p>Run usertests to see if you didn't break anything else. Since you
|
|
||||||
modified <tt>namex</tt> and <tt>begin/end_op</tt>, which are at the
|
|
||||||
core of the xv6 file system, you might have introduced bugs, perhaps
|
|
||||||
including deadlocks. Deadlocks manifest themselves as no output
|
|
||||||
being produced because all processes are sleeping (hit ctrl-p a few
|
|
||||||
times). Your kernel might also suffer kernel panics, because your
|
|
||||||
changes violate invariants. You may have to iterate a few times to
|
|
||||||
get a good design and implementation.
|
|
||||||
|
|
||||||
<h2>umount</h2>
|
|
||||||
|
|
||||||
<p>Once your kernel passes usertests and test0 of mounttest, implement
|
|
||||||
umount. The main challenge is that umount of a file system should
|
|
||||||
fail if the file system is still in use; that is, if there is an
|
|
||||||
inode on the mounted device that has a <tt>ref > 0</tt>.
|
|
||||||
Furthermore, this test and unmounting should be an atomic
|
|
||||||
operation. (Hint: lock the inode cache.) Make sure your kernel
|
|
||||||
passes test1 of mounttest.
|
|
||||||
|
|
||||||
<p>Test2 of mounttest stresses <tt>namex</tt> more; if you have done
|
|
||||||
everything right above, your kernel should pass it. Test3 tests
|
|
||||||
concurrent mount/unmounts with file creation.
|
|
||||||
|
|
||||||
<h2>crash safety</h2>
|
|
||||||
|
|
||||||
<p>One of the main goals of the file system is to provide crash
|
|
||||||
safety: if there is a power failure during a file system operation,
|
|
||||||
xv6 should recover correctly. It is difficult to introduce power
|
|
||||||
failure at the critical steps of logging; instead, we added a system
|
|
||||||
call that causes a kernel panic after committing an operation but
|
|
||||||
before installing the operation. Test4 with crashtest tests if your
|
|
||||||
xv6 recovers the mounted disk correctly.
|
|
||||||
|
|
||||||
|
|
||||||
</body>
|
|
||||||
</html>
|
|
||||||
|
|
||||||
<h2>Optional challenges</h2>
|
|
||||||
|
|
||||||
<p>Modify xv6 so that init mounts the first disk on the root inode.
|
|
||||||
This will allow you to remove some code specific for the first disk
|
|
||||||
from the kernel.
|
|
||||||
|
|
||||||
<p>Support mounts on top of mounts.
|
|
132
labs/lazy.html
132
labs/lazy.html
|
@ -1,132 +0,0 @@
|
||||||
<html>
|
|
||||||
<head>
|
|
||||||
<title>Lab: xv6 lazy page allocation</title>
|
|
||||||
<link rel="stylesheet" href="homework.css" type="text/css" />
|
|
||||||
</head>
|
|
||||||
<body>
|
|
||||||
|
|
||||||
<h1>Lab: xv6 lazy page allocation</h1>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
One of the many neat tricks an O/S can play with page table hardware
|
|
||||||
is lazy allocation of heap memory. Xv6 applications ask the kernel for
|
|
||||||
heap memory using the sbrk() system call. In the kernel we've given
|
|
||||||
you, sbrk() allocates physical memory and maps it into the process's
|
|
||||||
virtual address space. There are programs that allocate memory but
|
|
||||||
never use it, for example to implement large sparse arrays.
|
|
||||||
Sophisticated kernels delay allocation of each page of memory until
|
|
||||||
the application tries to use that page -- as signaled by a page fault.
|
|
||||||
You'll add this lazy allocation feature to xv6 in this lab.
|
|
||||||
|
|
||||||
<h2>Part One: Eliminate allocation from sbrk()</h2>
|
|
||||||
|
|
||||||
Your first task is to delete page allocation from the sbrk(n) system
|
|
||||||
call implementation, which is the function sys_sbrk() in sysproc.c. The
|
|
||||||
sbrk(n) system call grows the process's memory size by n bytes, and
|
|
||||||
then returns the start of the newly allocated region (i.e., the old
|
|
||||||
size). Your new sbrk(n) should just increment the process's size
|
|
||||||
(myproc()->sz) by n and return the old size. It should not allocate memory
|
|
||||||
-- so you should delete the call to growproc() (but you still need to
|
|
||||||
increase the process's size!).
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Try to guess what the result of this modification will be: what will
|
|
||||||
break?
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Make this modification, boot xv6, and type <tt>echo hi</tt> to the shell.
|
|
||||||
You should see something like this:
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
init: starting sh
|
|
||||||
$ echo hi
|
|
||||||
usertrap(): unexpected scause 0x000000000000000f pid=3
|
|
||||||
sepc=0x00000000000011dc stval=0x0000000000004008
|
|
||||||
va=0x0000000000004000 pte=0x0000000000000000
|
|
||||||
panic: unmappages: not mapped
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
The "usertrap(): ..." message is from the user trap handler in trap.c;
|
|
||||||
it has caught an exception that it does not know how to handle. Make
|
|
||||||
sure you understand why this page fault occurs. The "stval=0x0..04008"
|
|
||||||
indicates that the virtual address that caused the page fault is
|
|
||||||
0x4008.
|
|
||||||
|
|
||||||
<h2>Part Two: Lazy allocation</h2>
|
|
||||||
|
|
||||||
Modify the code in trap.c to respond to a page fault from user space
|
|
||||||
by mapping a newly-allocated page of physical memory at the faulting
|
|
||||||
address, and then returning back to user space to let the process
|
|
||||||
continue executing. You should add your code just before
|
|
||||||
the <tt>printf</tt> call that produced the "usertrap(): ..."
|
|
||||||
message.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Hint: look at the printf arguments to see how to find the virtual
|
|
||||||
address that caused the page fault.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Hint: steal code from allocuvm() in vm.c, which is what sbrk()
|
|
||||||
calls (via growproc()).
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Hint: use PGROUNDDOWN(va) to round the faulting virtual address
|
|
||||||
down to a page boundary.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Hint: <tt>usertrapret()</tt> in order to avoid
|
|
||||||
the <tt>printf</tt> and the <tt>myproc()->killed = 1</tt>.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Hint: you'll need to call mappages().
|
|
||||||
|
|
||||||
<p>Hint: you can check whether a fault is a page fault by r_scause()
|
|
||||||
is 13 or 15 in trap().
|
|
||||||
|
|
||||||
<p>Hint: modify unmappages() to not free pages that aren't mapped.
|
|
||||||
|
|
||||||
<p>Hint: if the kernel crashes, look up sepc in kernel/kernel.asm
|
|
||||||
|
|
||||||
<p>Hint: if you see the error "imcomplete type proc", include "proc.h"
|
|
||||||
(and "spinlock.h").
|
|
||||||
|
|
||||||
<p>Hint: the first test in sbrk() allocates something large, this
|
|
||||||
should succeed now.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
If all goes well, your lazy allocation code should result in <tt>echo
|
|
||||||
hi</tt> working. You should get at least one page fault (and thus lazy
|
|
||||||
allocation) in the shell, and perhaps two.
|
|
||||||
|
|
||||||
<p>If you have the basics working, now turn your implementation into
|
|
||||||
one that handles the corner cases too:
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
|
|
||||||
<li> Handle negative sbrk() arguments. sbrktest() in usertests will
|
|
||||||
tests this.
|
|
||||||
|
|
||||||
<li> Handle fork correctly. sbrktst() will test this.
|
|
||||||
|
|
||||||
<li> Make sure that kernel use of not-yet-allocated user addresses
|
|
||||||
works; for example, if a program passes an sbrk()-allocated
|
|
||||||
address to write(). sbrktest() will test this.
|
|
||||||
|
|
||||||
<li> Handle out of memory correctly. sbrktst() will test this.
|
|
||||||
|
|
||||||
<li> Handle faults on the invalid page below the stack. stacktest()
|
|
||||||
in usertests will tests this.
|
|
||||||
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<p>Run all tests in usertests() to make sure your solution doesn't
|
|
||||||
break other tests.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
<div class="question">
|
|
||||||
<p><b>Submit</b>: The code that you added to trap.c in a file named <em>hwN.c</em> where <em>N</em> is the homework number as listed on the schedule.
|
|
||||||
</div>
|
|
||||||
|
|
||||||
|
|
||||||
</body>
|
|
||||||
</html>
|
|
148
labs/lock.html
148
labs/lock.html
|
@ -1,148 +0,0 @@
|
||||||
<html>
|
|
||||||
<head>
|
|
||||||
<title>Lab: locks</title>
|
|
||||||
<link rel="stylesheet" href="homework.css" type="text/css" />
|
|
||||||
</head>
|
|
||||||
<body>
|
|
||||||
|
|
||||||
<h1>Lab: locks</h1>
|
|
||||||
|
|
||||||
<p>In this lab you will try to avoid lock contention for certain
|
|
||||||
workloads.
|
|
||||||
|
|
||||||
<h2>lock contention</h2>
|
|
||||||
|
|
||||||
<p>The program user/kalloctest stresses xv6's memory allocator: three
|
|
||||||
processes grow and shrink there address space, which will results in
|
|
||||||
many calls to <tt>kalloc</tt> and <tt>kfree</tt>,
|
|
||||||
respectively. <tt>kalloc</tt> and <tt>kfree</tt>
|
|
||||||
obtain <tt>kmem.lock</tt>. To see if there is lock contention for
|
|
||||||
<tt>kmem.lock</tt> replace the call to <tt>acquire</tt>
|
|
||||||
in <tt>kalloc</tt> with the following code:
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
while(!tryacquire(&kmem.lock)) {
|
|
||||||
printf("!");
|
|
||||||
}
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p><tt>tryacquire</tt> tries to acquire <tt>kmem.lock</tt>: if the
|
|
||||||
lock is taking it returns false (0); otherwise, it returns true (1)
|
|
||||||
and with the lock acquired. Your first job is to
|
|
||||||
implement <tt>tryacquire</tt> in kernel/spinlock.c.
|
|
||||||
|
|
||||||
<p>A few hints:
|
|
||||||
<ul>
|
|
||||||
<li>look at <tt>acquire</tt>.
|
|
||||||
<li>don't forget to restore interrupts when acquision fails
|
|
||||||
<li>Add tryacquire's signature to defs.h.
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<p>Run usertests to see if you didn't break anything. Note that
|
|
||||||
usertests never prints "!"; there is never contention
|
|
||||||
for <tt>kmem.lock</tt>. The caller is always able to immediately
|
|
||||||
acquire the lock and never has to wait because some other process
|
|
||||||
has the lock.
|
|
||||||
|
|
||||||
<p>Now run kalloctest. You should see quite a number of "!" on the
|
|
||||||
console. kalloctest causes many processes to contend on
|
|
||||||
the <tt>kmem.lock</tt>. This lock contention is a bit artificial,
|
|
||||||
because qemu is simulating 3 processors, but it is likely on real
|
|
||||||
hardware, there would be contention too.
|
|
||||||
|
|
||||||
<h2>Removing lock contention</h2>
|
|
||||||
|
|
||||||
<p>The root cause of lock contention in kalloctest is that there is a
|
|
||||||
single free list, protected by a single lock. To remove lock
|
|
||||||
contention, you will have to redesign the memory allocator to avoid
|
|
||||||
a single lock and list. The basic idea is to maintain a free list
|
|
||||||
per CPU, each list with its own lock. Allocations and frees on each
|
|
||||||
CPU can run in parallel, because each CPU will operate on a
|
|
||||||
different list.
|
|
||||||
|
|
||||||
<p> The main challenge will be to deal with the case that one CPU runs
|
|
||||||
out of memory, but another CPU has still free memory; in that case,
|
|
||||||
the one CPU must "steal" part of the other CPU's free list.
|
|
||||||
Stealing may introduce lock contention, but that may be acceptable
|
|
||||||
because it may happen infrequently.
|
|
||||||
|
|
||||||
<p>Your job is to implement per-CPU freelists and stealing when one
|
|
||||||
CPU is out of memory. Run kalloctest() to see if your
|
|
||||||
implementation has removed lock contention.
|
|
||||||
|
|
||||||
<p>Some hints:
|
|
||||||
<ul>
|
|
||||||
<li>You can use the constant <tt>NCPU</tt> in kernel/param.h
|
|
||||||
<li>Let <tt>freerange</tt> give all free memory to the CPU
|
|
||||||
running <tt>freerange</tt>.
|
|
||||||
<li>The function <tt>cpuid</tt> returns the current core, but note
|
|
||||||
that you can use it when interrupts are turned off and so you will
|
|
||||||
need to turn on/off interrupts in your solution.
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<p>Run usertests to see if you don't break anything.
|
|
||||||
|
|
||||||
<h2>More scalabale bcache lookup</h2>
|
|
||||||
|
|
||||||
|
|
||||||
<p>Several processes reading different files repeatedly will
|
|
||||||
bottleneck in the buffer cache, bcache, in bio.c. Replace the
|
|
||||||
acquire in <tt>bget</tt> with
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
while(!tryacquire(&bcache.lock)) {
|
|
||||||
printf("!");
|
|
||||||
}
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
and run test0 from bcachetest and you will see "!"s.
|
|
||||||
|
|
||||||
<p>Modify <tt>bget</tt> so that a lookup for a buffer that is in the
|
|
||||||
bcache doesn't need to acquire <tt>bcache.lock</tt>. This is more
|
|
||||||
tricky than the kalloc assignment, because bcache buffers are truly
|
|
||||||
shared among processes. You must maintain the invariant that a
|
|
||||||
buffer is only once in memory.
|
|
||||||
|
|
||||||
<p> There are several races that <tt>bcache.lock</tt> protects
|
|
||||||
against, including:
|
|
||||||
<ul>
|
|
||||||
<li>A <tt>brelse</tt> may set <tt>b->ref</tt> to 0,
|
|
||||||
while concurrent <tt>bget</tt> is incrementing it.
|
|
||||||
<li>Two <tt>bget</tt> may see <tt>b->ref = 0</tt> and one may re-use
|
|
||||||
the buffer, while the other may replaces it with another block.
|
|
||||||
<li>A concurrent <tt>brelse</tt> modifies the list
|
|
||||||
that <tt>bget</tt> traverses.
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<p>A challenge is testing whether you code is still correct. One way
|
|
||||||
to do is to artificially delay certain operations
|
|
||||||
using <tt>sleepticks</tt>. <tt>test1</tt> trashes the buffer cache
|
|
||||||
and exercises more code paths.
|
|
||||||
|
|
||||||
<p>Here are some hints:
|
|
||||||
<ul>
|
|
||||||
<li>Read the description of buffer cache in the xv6 book (Section 7.2).
|
|
||||||
<li>Use a simple design: i.e., don't design a lock-free implementation.
|
|
||||||
<li>Use a simple hash table with locks per bucket.
|
|
||||||
<li>Searching in hash table for a buffer and allocating an entry
|
|
||||||
for that buffer when the buffer is not found must be atomic.
|
|
||||||
<li>It is fine to acquire <tt>bcache.lock</tt> in <tt>brelse</tt>
|
|
||||||
to update the LRU/MRU list.
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<p>Check that your implementation has less contention
|
|
||||||
on <tt>test0</tt>
|
|
||||||
|
|
||||||
<p>Make sure your implementation passes bcachetest and usertests.
|
|
||||||
|
|
||||||
<p>Optional:
|
|
||||||
<ul>
|
|
||||||
<li>make the buffer cache more scalable (e.g., avoid taking
|
|
||||||
out <tt>bcache.lock</tt> on <tt>brelse</tt>).
|
|
||||||
<li>make lookup lock-free (Hint: use gcc's <tt>__sync_*</tt>
|
|
||||||
functions.) How do you convince yourself that your implementation is correct?
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
|
|
||||||
</body>
|
|
||||||
</html>
|
|
171
labs/mmap.html
171
labs/mmap.html
|
@ -1,171 +0,0 @@
|
||||||
<html>
|
|
||||||
<head>
|
|
||||||
<title>Lab: mmap</title>
|
|
||||||
<link rel="stylesheet" href="homework.css" type="text/css" />
|
|
||||||
</head>
|
|
||||||
<body>
|
|
||||||
|
|
||||||
<h1>Lab: mmap</h1>
|
|
||||||
|
|
||||||
<p>In this lab you will use </tt>mmap</tt> on Linux to demand-page a
|
|
||||||
very large table and add memory-mapped files to xv6.
|
|
||||||
|
|
||||||
<h2>Using mmap on Linux</h2>
|
|
||||||
|
|
||||||
<p>This assignment will make you more familiar with how to manage virtual memory
|
|
||||||
in user programs using the Unix system call interface. You can do this
|
|
||||||
assignment on any operating system that supports the Unix API (a Linux Athena
|
|
||||||
machine, your laptop with Linux or MacOS, etc.).
|
|
||||||
|
|
||||||
<p>Download the <a href="mmap.c">mmap homework assignment</a> and look
|
|
||||||
it over. The program maintains a very large table of square root
|
|
||||||
values in virtual memory. However, the table is too large to fit in
|
|
||||||
physical RAM. Instead, the square root values should be computed on
|
|
||||||
demand in response to page faults that occur in the table's address
|
|
||||||
range. Your job is to implement the demand faulting mechanism using a
|
|
||||||
signal handler and UNIX memory mapping system calls. To stay within
|
|
||||||
the physical RAM limit, we suggest using the simple strategy of
|
|
||||||
unmapping the last page whenever a new page is faulted in.
|
|
||||||
|
|
||||||
<p>To compile <tt>mmap.c</tt>, you need a C compiler, such as gcc. On Athena,
|
|
||||||
you can type:
|
|
||||||
<pre>
|
|
||||||
$ add gnu
|
|
||||||
</pre>
|
|
||||||
Once you have gcc, you can compile mmap.c as follows:
|
|
||||||
<pre>
|
|
||||||
$ gcc mmap.c -lm -o mmap
|
|
||||||
</pre>
|
|
||||||
Which produces a <tt>mmap</tt> file, which you can run:
|
|
||||||
<pre>
|
|
||||||
$ ./mmap
|
|
||||||
page_size is 4096
|
|
||||||
Validating square root table contents...
|
|
||||||
oops got SIGSEGV at 0x7f6bf7fd7f18
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>When the process accesses the square root table, the mapping does not exist
|
|
||||||
and the kernel passes control to the signal handler code in
|
|
||||||
<tt>handle_sigsegv()</tt>. Modify the code in <tt>handle_sigsegv()</tt> to map
|
|
||||||
in a page at the faulting address, unmap a previous page to stay within the
|
|
||||||
physical memory limit, and initialize the new page with the correct square root
|
|
||||||
values. Use the function <tt>calculate_sqrts()</tt> to compute the values.
|
|
||||||
The program includes test logic that verifies if the contents of the
|
|
||||||
square root table are correct. When you have completed your task
|
|
||||||
successfully, the process will print “All tests passed!”.
|
|
||||||
|
|
||||||
<p>You may find that the man pages for mmap() and munmap() are helpful references.
|
|
||||||
<pre>
|
|
||||||
$ man mmap
|
|
||||||
$ man munmap
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
|
|
||||||
<h2>Implement memory-mapped files in xv6</h2>
|
|
||||||
|
|
||||||
<p>In this assignment you will implement memory-mapped files in xv6.
|
|
||||||
The test program <tt>mmaptest</tt> tells you what should work.
|
|
||||||
|
|
||||||
<p>Here are some hints about how you might go about this assignment:
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Start with adding the two systems calls to the kernel, as you
|
|
||||||
done for other systems calls (e.g., <tt>sigalarm</tt>), but
|
|
||||||
don't implement them yet; just return an
|
|
||||||
error. run <tt>mmaptest</tt> to observe the error.
|
|
||||||
|
|
||||||
<li>Keep track for each process what <tt>mmap</tt> has mapped.
|
|
||||||
You will need to allocate a <tt>struct vma</tt> to record the
|
|
||||||
address, length, permissions, etc. for each virtual memory area
|
|
||||||
(VMA) that maps a file. Since the xv6 kernel doesn't have a
|
|
||||||
memory allocator in the kernel, you can use the same approach has
|
|
||||||
for <tt>struct file</tt>: have a global array of <tt>struct
|
|
||||||
vma</tt>s and have for each process a fixed-sized array of VMAs
|
|
||||||
(like the file descriptor array).
|
|
||||||
|
|
||||||
<li>Implement <tt>mmap</tt>: allocate a VMA, add it to the process's
|
|
||||||
table of VMAs, fill in the VMA, and find a hole in the process's
|
|
||||||
address space where you will map the file. You can assume that no
|
|
||||||
file will be bigger than 1GB. The VMA will contain a pointer to
|
|
||||||
a <tt>struct file</tt> for the file being mapped; you will need to
|
|
||||||
increase the file's reference count so that the structure doesn't
|
|
||||||
disappear when the file is closed (hint:
|
|
||||||
see <tt>filedup</tt>). You don't have worry about overlapping
|
|
||||||
VMAs. Run <tt>mmaptest</tt>: the first <tt>mmap</tt> should
|
|
||||||
succeed, but the first access to the mmaped- memory will fail,
|
|
||||||
because you haven't updated the page fault handler.
|
|
||||||
|
|
||||||
<li>Modify the page-fault handler from the lazy-allocation and COW
|
|
||||||
labs to call a VMA function that handles page faults in VMAs.
|
|
||||||
This function allocates a page, reads a 4KB from the mmap-ed
|
|
||||||
file into the page, and maps the page into the address space of
|
|
||||||
the process. To read the page, you can use <tt>readi</tt>,
|
|
||||||
which allows you to specify an offset from where to read in the
|
|
||||||
file (but you will have to lock/unlock the inode passed
|
|
||||||
to <tt>readi</tt>). Don't forget to set the permissions correctly
|
|
||||||
on the page. Run <tt>mmaptest</tt>; you should get to the
|
|
||||||
first <tt>munmap</tt>.
|
|
||||||
|
|
||||||
<li>Implement <tt>munmap</tt>: find the <tt>struct vma</tt> for
|
|
||||||
the address and unmap the specified pages (hint:
|
|
||||||
use <tt>uvmunmap</tt>). If <tt>munmap</tt> removes all pages
|
|
||||||
from a VMA, you will have to free the VMA (don't forget to
|
|
||||||
decrement the reference count of the VMA's <tt>struct
|
|
||||||
file</tt>); otherwise, you may have to shrink the VMA. You can
|
|
||||||
assume that <tt>munmap</tt> will not split a VMA into two VMAs;
|
|
||||||
that is, we don't unmap a few pages in the middle of a VMA. If
|
|
||||||
an unmapped page has been modified and the file is
|
|
||||||
mapped <tt>MAP_SHARED</tt>, you will have to write the page back
|
|
||||||
to the file. RISC-V has a dirty bit (<tt>D</tt>) in a PTE to
|
|
||||||
record whether a page has ever been written too; add the
|
|
||||||
declaration to kernel/riscv.h and use it. Modify <tt>exit</tt>
|
|
||||||
to call <tt>munmap</tt> for the process's open VMAs.
|
|
||||||
Run <tt>mmaptest</tt>; you should <tt>mmaptest</tt>, but
|
|
||||||
probably not <tt>forktest</tt>.
|
|
||||||
|
|
||||||
<li>Modify <tt>fork</tt> to copy VMAs from parent to child. Don't
|
|
||||||
forget to increment reference count for a VMA's <tt>struct
|
|
||||||
file</tt>. In the page fault handler of the child, it is OK to
|
|
||||||
allocate a new page instead of sharing the page with the
|
|
||||||
parent. The latter would be cooler, but it would require more
|
|
||||||
implementation work. Run <tt>mmaptest</tt>; make sure you pass
|
|
||||||
both <tt>mmaptest</tt> and <tt>forktest</tt>.
|
|
||||||
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<p>Run usertests to make sure you didn't break anything.
|
|
||||||
|
|
||||||
<p>Optional challenges:
|
|
||||||
<ul>
|
|
||||||
|
|
||||||
<li>If two processes have the same file mmap-ed (as
|
|
||||||
in <tt>forktest</tt>), share their physical pages. You will need
|
|
||||||
reference counts on physical pages.
|
|
||||||
|
|
||||||
<li>The solution above allocates a new physical page for each page
|
|
||||||
read from the mmap-ed file, even though the data is also in kernel
|
|
||||||
memory in the buffer cache. Modify your implementation to mmap
|
|
||||||
that memory, instead of allocating a new page. This requires that
|
|
||||||
file blocks be the same size as pages (set <tt>BSIZE</tt> to
|
|
||||||
4096). You will need to pin mmap-ed blocks into the buffer cache.
|
|
||||||
You will need worry about reference counts.
|
|
||||||
|
|
||||||
<li>Remove redundancy between your implementation for lazy
|
|
||||||
allocation and your implementation of mmapp-ed files. (Hint:
|
|
||||||
create an VMA for the lazy allocation area.)
|
|
||||||
|
|
||||||
<li>Modify <tt>exec</tt> to use a VMA for different sections of
|
|
||||||
the binary so that you get on-demand-paged executables. This will
|
|
||||||
make starting programs faster, because <tt>exec</tt> will not have
|
|
||||||
to read any data from the file system.
|
|
||||||
|
|
||||||
<li>Implement on-demand paging: don't keep a process in memory,
|
|
||||||
but let the kernel move some parts of processes to disk when
|
|
||||||
physical memory is low. Then, page in the paged-out memory when
|
|
||||||
the process references it. Port your linux program from the first
|
|
||||||
assignment to xv6 and run it.
|
|
||||||
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
</body>
|
|
||||||
</html>
|
|
|
@ -1,443 +0,0 @@
|
||||||
<html>
|
|
||||||
<head>
|
|
||||||
<title>Lab: Alarm and uthread</title>
|
|
||||||
<link rel="stylesheet" href="homework.css" type="text/css" />
|
|
||||||
</head>
|
|
||||||
<body>
|
|
||||||
|
|
||||||
<h1>Lab: Alarm and uthread</h1>
|
|
||||||
|
|
||||||
This lab will familiarize you with the implementation of system calls
|
|
||||||
and switching between threads of execution. In particular, you will
|
|
||||||
implement new system calls (<tt>sigalarm</tt> and <tt>sigreturn</tt>)
|
|
||||||
and switching between threads in a user-level thread package.
|
|
||||||
|
|
||||||
<h2>Warmup: RISC-V assembly</h2>
|
|
||||||
|
|
||||||
<p>For this lab it will be important to understand a bit of RISC-V assembly.
|
|
||||||
|
|
||||||
<p>Add a file user/call.c with the following content, modify the
|
|
||||||
Makefile to add the program to the user programs, and compile (make
|
|
||||||
fs.img). The Makefile also produces a binary and a readable
|
|
||||||
assembly a version of the program in the file user/call.asm.
|
|
||||||
<pre>
|
|
||||||
#include "kernel/param.h"
|
|
||||||
#include "kernel/types.h"
|
|
||||||
#include "kernel/stat.h"
|
|
||||||
#include "user/user.h"
|
|
||||||
|
|
||||||
int g(int x) {
|
|
||||||
return x+3;
|
|
||||||
}
|
|
||||||
|
|
||||||
int f(int x) {
|
|
||||||
return g(x);
|
|
||||||
}
|
|
||||||
|
|
||||||
void main(void) {
|
|
||||||
printf(1, "%d %d\n", f(8)+1, 13);
|
|
||||||
exit();
|
|
||||||
}
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>Read through user/call.asm and understand it. The instruction manual
|
|
||||||
for RISC-V is in the doc directory (doc/riscv-spec-v2.2.pdf). Here
|
|
||||||
are some questions that you should answer for yourself:
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Which registers contain arguments to functions? Which
|
|
||||||
register holds 13 in the call to <tt>printf</tt>? Which register
|
|
||||||
holds the second argument? Which register holds the third one? Etc.
|
|
||||||
|
|
||||||
<li>Where is the function call to <tt>f</tt> from main? Where
|
|
||||||
is the call to <tt>g</tt>?
|
|
||||||
(Hint: the compiler may inline functions.)
|
|
||||||
|
|
||||||
<li>At what address is the function <tt>printf</tt> located?
|
|
||||||
|
|
||||||
<li>What value is in the register <tt>ra</tt> just after the <tt>jalr</tt>
|
|
||||||
to <tt>printf</tt> in <tt>main</tt>?
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<h2>Warmup: system call tracing</h2>
|
|
||||||
|
|
||||||
<p>In this exercise you will modify the xv6 kernel to print out a line
|
|
||||||
for each system call invocation. It is enough to print the name of the
|
|
||||||
system call and the return value; you don't need to print the system
|
|
||||||
call arguments.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
When you're done, you should see output like this when booting
|
|
||||||
xv6:
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
...
|
|
||||||
fork -> 2
|
|
||||||
exec -> 0
|
|
||||||
open -> 3
|
|
||||||
close -> 0
|
|
||||||
$write -> 1
|
|
||||||
write -> 1
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
That's init forking and execing sh, sh making sure only two file descriptors are
|
|
||||||
open, and sh writing the $ prompt. (Note: the output of the shell and the
|
|
||||||
system call trace are intermixed, because the shell uses the write syscall to
|
|
||||||
print its output.)
|
|
||||||
|
|
||||||
<p> Hint: modify the syscall() function in kernel/syscall.c.
|
|
||||||
|
|
||||||
<p>Run the xv6 programs you wrote in earlier labs and inspect the system call
|
|
||||||
trace. Are there many system calls? Which system calls correspond
|
|
||||||
to code in the applications you wrote?
|
|
||||||
|
|
||||||
<p>Optional: print the system call arguments.
|
|
||||||
|
|
||||||
|
|
||||||
<h2>Alarm</h2>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
In this exercise you'll add a feature to xv6 that periodically alerts
|
|
||||||
a process as it uses CPU time. This might be useful for compute-bound
|
|
||||||
processes that want to limit how much CPU time they chew up, or for
|
|
||||||
processes that want to compute but also want to take some periodic
|
|
||||||
action. More generally, you'll be implementing a primitive form of
|
|
||||||
user-level interrupt/fault handlers; you could use something similar
|
|
||||||
to handle page faults in the application, for example.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
You should add a new <tt>sigalarm(interval, handler)</tt> system call.
|
|
||||||
If an application calls <tt>sigalarm(n, fn)</tt>, then after every
|
|
||||||
<tt>n</tt> "ticks" of CPU time that the program consumes, the kernel
|
|
||||||
should cause application function
|
|
||||||
<tt>fn</tt> to be called. When <tt>fn</tt> returns, the application
|
|
||||||
should resume where it left off. A tick is a fairly arbitrary unit of
|
|
||||||
time in xv6, determined by how often a hardware timer generates
|
|
||||||
interrupts.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
You'll find a file <tt>user/alarmtest.c</tt> in your xv6
|
|
||||||
repository. Add it to the Makefile. It won't compile correctly
|
|
||||||
until you've added <tt>sigalarm</tt> and <tt>sigreturn</tt>
|
|
||||||
system calls (see below).
|
|
||||||
|
|
||||||
<p>
|
|
||||||
<tt>alarmtest</tt> calls <tt>sigalarm(2, periodic)</tt> in <tt>test0</tt> to
|
|
||||||
ask the kernel to force a call to <tt>periodic()</tt> every 2 ticks,
|
|
||||||
and then spins for a while.
|
|
||||||
You can see the assembly
|
|
||||||
code for alarmtest in user/alarmtest.asm, which may be handy
|
|
||||||
for debugging.
|
|
||||||
When you've finished the lab,
|
|
||||||
<tt>alarmtest</tt> should produce output like this:
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
$ alarmtest
|
|
||||||
test0 start
|
|
||||||
......................................alarm!
|
|
||||||
test0 passed
|
|
||||||
test1 start
|
|
||||||
..alarm!
|
|
||||||
..alarm!
|
|
||||||
..alarm!
|
|
||||||
.alarm!
|
|
||||||
..alarm!
|
|
||||||
..alarm!
|
|
||||||
..alarm!
|
|
||||||
..alarm!
|
|
||||||
..alarm!
|
|
||||||
..alarm!
|
|
||||||
test1 passed
|
|
||||||
$
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>The main challenge will be to arrange that the handler is invoked
|
|
||||||
when the process's alarm interval expires. You'll need to modify
|
|
||||||
usertrap() in kernel/trap.c so that when a
|
|
||||||
process's alarm interval expires, the process executes
|
|
||||||
the handler. How can you do that? You will need to understand
|
|
||||||
how system calls work (i.e., the code in kernel/trampoline.S
|
|
||||||
and kernel/trap.c). Which register contains the address to which
|
|
||||||
system calls return?
|
|
||||||
|
|
||||||
<p>Your solution will be only a few lines of code, but it may be tricky to
|
|
||||||
get it right.
|
|
||||||
We'll test your code with the version of alarmtest.c in the original
|
|
||||||
repository; if you modify alarmtest.c, make sure your kernel changes
|
|
||||||
cause the original alarmtest to pass the tests.
|
|
||||||
|
|
||||||
<h3>test0: invoke handler</h3>
|
|
||||||
|
|
||||||
<p>Get started by modifying the kernel to jump to the alarm handler in
|
|
||||||
user space, which will cause test0 to print "alarm!". Don't worry yet
|
|
||||||
what happens after the "alarm!" output; it's OK for now if your
|
|
||||||
program crashes after printing "alarm!". Here are some hints:
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
|
|
||||||
<li>You'll need to modify the Makefile to cause <tt>alarmtest.c</tt>
|
|
||||||
to be compiled as an xv6 user program.
|
|
||||||
|
|
||||||
<li>The right declarations to put in <tt>user/user.h</tt> are:
|
|
||||||
<pre>
|
|
||||||
int sigalarm(int ticks, void (*handler)());
|
|
||||||
int sigreturn(void);
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<li>Update user/sys.pl (which generates user/usys.S),
|
|
||||||
kernel/syscall.h, and kernel/syscall.c
|
|
||||||
to allow <tt>alarmtest</tt> to invoke the sigalarm and
|
|
||||||
sigreturn system calls.
|
|
||||||
|
|
||||||
<li>For now, your <tt>sys_sigreturn</tt> should just return zero.
|
|
||||||
|
|
||||||
<li>Your <tt>sys_sigalarm()</tt> should store the alarm interval and
|
|
||||||
the pointer to the handler function in new fields in the <tt>proc</tt>
|
|
||||||
structure, defined in <tt>kernel/proc.h</tt>.
|
|
||||||
|
|
||||||
<li>You'll need to keep track of how many ticks have passed since the
|
|
||||||
last call (or are left until the next call) to a process's alarm
|
|
||||||
handler; you'll need a new field in <tt>struct proc</tt> for this
|
|
||||||
too. You can initialize <tt>proc</tt> fields in <tt>allocproc()</tt>
|
|
||||||
in <tt>proc.c</tt>.
|
|
||||||
|
|
||||||
<li>Every tick, the hardware clock forces an interrupt, which is handled
|
|
||||||
in <tt>usertrap()</tt>; you should add some code here.
|
|
||||||
|
|
||||||
<li>You only want to manipulate a process's alarm ticks if there's a a
|
|
||||||
timer interrupt; you want something like
|
|
||||||
<pre>
|
|
||||||
if(which_dev == 2) ...
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<li>Only invoke the alarm function if the process has a
|
|
||||||
timer outstanding. Note that the address of the user's alarm
|
|
||||||
function might be 0 (e.g., in alarmtest.asm, <tt>periodic</tt> is at
|
|
||||||
address 0).
|
|
||||||
|
|
||||||
<li>It will be easier to look at traps with gdb if you tell qemu to
|
|
||||||
use only one CPU, which you can do by running
|
|
||||||
<pre>
|
|
||||||
make CPUS=1 qemu
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<li>You've succeeded if alarmtest prints "alarm!".
|
|
||||||
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<h3>test1(): resume interrupted code</h3>
|
|
||||||
|
|
||||||
Chances are that alarmtest crashes at some point after it prints
|
|
||||||
"alarm!". Depending on how your solution works, that point may be in
|
|
||||||
test0, or it may be in test1. Crashes are likely caused
|
|
||||||
by the alarm handler (<tt>periodic</tt> in alarmtest.c) returning
|
|
||||||
to the wrong point in the user program.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Your job now is to ensure that, when the alarm handler is done,
|
|
||||||
control returns to
|
|
||||||
the instruction at which the user program was originally
|
|
||||||
interrupted by the timer interrupt. You must also ensure that
|
|
||||||
the register contents are restored to values they held
|
|
||||||
at the time of the interrupt, so that the user program
|
|
||||||
can continue undisturbed after the alarm.
|
|
||||||
|
|
||||||
<p>Your solution is likely to require you to save and restore
|
|
||||||
registers---what registers do you need to save and restore to resume
|
|
||||||
the interrupted code correctly? (Hint: it will be many).
|
|
||||||
Several approaches are possible; for this lab you should make
|
|
||||||
the <tt>sigreturn</tt> system call
|
|
||||||
restore registers and return to the original
|
|
||||||
interrupted user instruction.
|
|
||||||
The user-space alarm handler
|
|
||||||
calls sigreturn when it is done.
|
|
||||||
|
|
||||||
Some hints:
|
|
||||||
<ul>
|
|
||||||
<li>Have <tt>usertrap</tt> save enough state in
|
|
||||||
<tt>struct proc</tt> when the timer goes off
|
|
||||||
that <tt>sigreturn</tt> can correctly return to the
|
|
||||||
interrupted user code.
|
|
||||||
|
|
||||||
<li>Prevent re-entrant calls to the handler----if a handler hasn't
|
|
||||||
returned yet, the kernel shouldn't call it again.
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<p>Once you pass <tt>test0</tt> and <tt>test1</tt>, run usertests to
|
|
||||||
make sure you didn't break any other parts of the kernel.
|
|
||||||
|
|
||||||
<h2>Uthread: switching between threads</h2>
|
|
||||||
|
|
||||||
<p>Download <a href="uthread.c">uthread.c</a> and <a
|
|
||||||
href="uthread_switch.S">uthread_switch.S</a> into your xv6 directory.
|
|
||||||
Make sure <tt>uthread_switch.S</tt> ends with <tt>.S</tt>, not
|
|
||||||
<tt>.s</tt>. Add the
|
|
||||||
following rule to the xv6 Makefile after the _forktest rule:
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
$U/_uthread: $U/uthread.o $U/uthread_switch.o
|
|
||||||
$(LD) $(LDFLAGS) -N -e main -Ttext 0 -o $U/_uthread $U/uthread.o $U/uthread_switch.o $(ULIB)
|
|
||||||
$(OBJDUMP) -S $U/_uthread > $U/uthread.asm
|
|
||||||
</pre>
|
|
||||||
Make sure that the blank space at the start of each line is a tab,
|
|
||||||
not spaces.
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Add <tt>_uthread</tt> in the Makefile to the list of user programs defined by UPROGS.
|
|
||||||
|
|
||||||
<p>Run xv6, then run <tt>uthread</tt> from the xv6 shell. The xv6 kernel will print an error message about <tt>uthread</tt> encountering a page fault.
|
|
||||||
|
|
||||||
<p>Your job is to complete <tt>uthread_switch.S</tt>, so that you see output similar to
|
|
||||||
this (make sure to run with CPUS=1):
|
|
||||||
<pre>
|
|
||||||
~/classes/6828/xv6$ make CPUS=1 qemu
|
|
||||||
...
|
|
||||||
$ uthread
|
|
||||||
my thread running
|
|
||||||
my thread 0x0000000000002A30
|
|
||||||
my thread running
|
|
||||||
my thread 0x0000000000004A40
|
|
||||||
my thread 0x0000000000002A30
|
|
||||||
my thread 0x0000000000004A40
|
|
||||||
my thread 0x0000000000002A30
|
|
||||||
my thread 0x0000000000004A40
|
|
||||||
my thread 0x0000000000002A30
|
|
||||||
my thread 0x0000000000004A40
|
|
||||||
my thread 0x0000000000002A30
|
|
||||||
...
|
|
||||||
my thread 0x0000000000002A88
|
|
||||||
my thread 0x0000000000004A98
|
|
||||||
my thread: exit
|
|
||||||
my thread: exit
|
|
||||||
thread_schedule: no runnable threads
|
|
||||||
$
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p><tt>uthread</tt> creates two threads and switches back and forth between
|
|
||||||
them. Each thread prints "my thread ..." and then yields to give the other
|
|
||||||
thread a chance to run.
|
|
||||||
|
|
||||||
<p>To observe the above output, you need to complete <tt>uthread_switch.S</tt>, but before
|
|
||||||
jumping into <tt>uthread_switch.S</tt>, first understand how <tt>uthread.c</tt>
|
|
||||||
uses <tt>uthread_switch</tt>. <tt>uthread.c</tt> has two global variables
|
|
||||||
<tt>current_thread</tt> and <tt>next_thread</tt>. Each is a pointer to a
|
|
||||||
<tt>thread</tt> structure. The thread structure has a stack for a thread and a
|
|
||||||
saved stack pointer (<tt>sp</tt>, which points into the thread's stack). The
|
|
||||||
job of <tt>uthread_switch</tt> is to save the current thread state into the
|
|
||||||
structure pointed to by <tt>current_thread</tt>, restore <tt>next_thread</tt>'s
|
|
||||||
state, and make <tt>current_thread</tt> point to where <tt>next_thread</tt> was
|
|
||||||
pointing to, so that when <tt>uthread_switch</tt> returns <tt>next_thread</tt>
|
|
||||||
is running and is the <tt>current_thread</tt>.
|
|
||||||
|
|
||||||
<p>You should study <tt>thread_create</tt>, which sets up the initial stack for
|
|
||||||
a new thread. It provides hints about what <tt>uthread_switch</tt> should do.
|
|
||||||
Note that <tt>thread_create</tt> simulates saving all callee-save registers
|
|
||||||
on a new thread's stack.
|
|
||||||
|
|
||||||
<p>To write the assembly in <tt>thread_switch</tt>, you need to know how the C
|
|
||||||
compiler lays out <tt>struct thread</tt> in memory, which is as
|
|
||||||
follows:
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
--------------------
|
|
||||||
| 4 bytes for state|
|
|
||||||
--------------------
|
|
||||||
| stack size bytes |
|
|
||||||
| for stack |
|
|
||||||
--------------------
|
|
||||||
| 8 bytes for sp |
|
|
||||||
-------------------- <--- current_thread
|
|
||||||
......
|
|
||||||
|
|
||||||
......
|
|
||||||
--------------------
|
|
||||||
| 4 bytes for state|
|
|
||||||
--------------------
|
|
||||||
| stack size bytes |
|
|
||||||
| for stack |
|
|
||||||
--------------------
|
|
||||||
| 8 bytes for sp |
|
|
||||||
-------------------- <--- next_thread
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
The variables <tt>&next_thread</tt> and <tt>¤t_thread</tt> each
|
|
||||||
contain the address of a pointer to <tt>struct thread</tt>, and are
|
|
||||||
passed to <tt>thread_switch</tt>. The following fragment of assembly
|
|
||||||
will be useful:
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
ld t0, 0(a0)
|
|
||||||
sd sp, 0(t0)
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
This saves <tt>sp</tt> in <tt>current_thread->sp</tt>. This works because
|
|
||||||
<tt>sp</tt> is at
|
|
||||||
offset 0 in the struct.
|
|
||||||
You can study the assembly the compiler generates for
|
|
||||||
<tt>uthread.c</tt> by looking at <tt>uthread.asm</tt>.
|
|
||||||
|
|
||||||
<p>To test your code it might be helpful to single step through your
|
|
||||||
<tt>uthread_switch</tt> using <tt>riscv64-linux-gnu-gdb</tt>. You can get started in this way:
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
(gdb) file user/_uthread
|
|
||||||
Reading symbols from user/_uthread...
|
|
||||||
(gdb) b *0x230
|
|
||||||
|
|
||||||
</pre>
|
|
||||||
0x230 is the address of uthread_switch (see uthread.asm). When you
|
|
||||||
compile it may be at a different address, so check uthread_asm.
|
|
||||||
You may also be able to type "b uthread_switch". <b>XXX This doesn't work
|
|
||||||
for me; why?</b>
|
|
||||||
|
|
||||||
<p>The breakpoint may (or may not) be triggered before you even run
|
|
||||||
<tt>uthread</tt>. How could that happen?
|
|
||||||
|
|
||||||
<p>Once your xv6 shell runs, type "uthread", and gdb will break at
|
|
||||||
<tt>thread_switch</tt>. Now you can type commands like the following to inspect
|
|
||||||
the state of <tt>uthread</tt>:
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
(gdb) p/x *next_thread
|
|
||||||
$1 = {sp = 0x4a28, stack = {0x0 (repeats 8088 times),
|
|
||||||
0x68, 0x1, 0x0 <repeats 102 times>}, state = 0x1}
|
|
||||||
</pre>
|
|
||||||
What address is <tt>0x168</tt>, which sits on the bottom of the stack
|
|
||||||
of <tt>next_thread</tt>?
|
|
||||||
|
|
||||||
With "x", you can examine the content of a memory location
|
|
||||||
<pre>
|
|
||||||
(gdb) x/x next_thread->sp
|
|
||||||
0x4a28 <all_thread+16304>: 0x00000168
|
|
||||||
</pre>
|
|
||||||
Why does that print <tt>0x168</tt>?
|
|
||||||
|
|
||||||
<h3>Optional challenges</h3>
|
|
||||||
|
|
||||||
<p>The user-level thread package interacts badly with the operating system in
|
|
||||||
several ways. For example, if one user-level thread blocks in a system call,
|
|
||||||
another user-level thread won't run, because the user-level threads scheduler
|
|
||||||
doesn't know that one of its threads has been descheduled by the xv6 scheduler. As
|
|
||||||
another example, two user-level threads will not run concurrently on different
|
|
||||||
cores, because the xv6 scheduler isn't aware that there are multiple
|
|
||||||
threads that could run in parallel. Note that if two user-level threads were to
|
|
||||||
run truly in parallel, this implementation won't work because of several races
|
|
||||||
(e.g., two threads on different processors could call <tt>thread_schedule</tt>
|
|
||||||
concurrently, select the same runnable thread, and both run it on different
|
|
||||||
processors.)
|
|
||||||
|
|
||||||
<p>There are several ways of addressing these problems. One is
|
|
||||||
using <a href="http://en.wikipedia.org/wiki/Scheduler_activations">scheduler
|
|
||||||
activations</a> and another is to use one kernel thread per
|
|
||||||
user-level thread (as Linux kernels do). Implement one of these ways
|
|
||||||
in xv6. This is not easy to get right; for example, you will need to
|
|
||||||
implement TLB shootdown when updating a page table for a
|
|
||||||
multithreaded user process.
|
|
||||||
|
|
||||||
<p>Add locks, condition variables, barriers,
|
|
||||||
etc. to your thread package.
|
|
||||||
|
|
||||||
</body>
|
|
||||||
</html>
|
|
||||||
|
|
238
labs/xv6.html
238
labs/xv6.html
|
@ -1,238 +0,0 @@
|
||||||
<html>
|
|
||||||
<head>
|
|
||||||
<title>Lab: xv6</title>
|
|
||||||
<link rel="stylesheet" href="homework.css" type="text/css" />
|
|
||||||
</head>
|
|
||||||
<body>
|
|
||||||
|
|
||||||
<h1>Lab: xv6</h1>
|
|
||||||
|
|
||||||
This lab makes you familiar with xv6 and its system calls.
|
|
||||||
|
|
||||||
<h2>Boot xv6</h2>
|
|
||||||
|
|
||||||
<p>Login to Athena (e.g., ssh -X athena.dialup.mit.edu) and attach the course
|
|
||||||
locker: (You must run this command every time you log in; or add it to your
|
|
||||||
~/.environment file.)
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
$ add -f 6.828
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>Fetch the xv6 source:
|
|
||||||
|
|
||||||
<pre>
|
|
||||||
$ mkdir 6.828
|
|
||||||
$ cd 6.828
|
|
||||||
$ git clone git://github.com/mit-pdos/xv6-riscv.git
|
|
||||||
Cloning into 'xv6-riscv'...
|
|
||||||
...
|
|
||||||
$
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>XXX pointer to an update tools page
|
|
||||||
|
|
||||||
<p>Build xv6 on Athena:
|
|
||||||
<pre>
|
|
||||||
$ cd xv6-public
|
|
||||||
$ makeriscv64-linux-gnu-gcc -c -o kernel/entry.o kernel/entry.S
|
|
||||||
riscv64-linux-gnu-gcc -Wall -Werror -O -fno-omit-frame-pointer -ggdb -MD -mcmodel=medany -ffreestanding -fno-common -nostdlib -mno-relax -I. -fno-stack-protector -fno-pie -no-pie -c -o kernel/start.o kernel/start.c
|
|
||||||
...
|
|
||||||
$ make qemu
|
|
||||||
...
|
|
||||||
mkfs/mkfs fs.img README user/_cat user/_echo user/_forktest user/_grep user/_init user/_kill user/_ln user/_ls user/_mkdir user/_rm user/_sh user/_stressfs user/_usertests user/_wc user/_zombie user/_cow
|
|
||||||
nmeta 46 (boot, super, log blocks 30 inode blocks 13, bitmap blocks 1) blocks 954 total 1000
|
|
||||||
balloc: first 497 blocks have been allocated
|
|
||||||
balloc: write bitmap block at sector 45
|
|
||||||
qemu-system-riscv64 -machine virt -kernel kernel/kernel -m 3G -smp 3 -nographic -drive file=fs.img,if=none,format=raw,id=x0 -device virtio-blk-device,drive=x0,bus=virtio-mmio-bus.0
|
|
||||||
hart 0 starting
|
|
||||||
hart 2 starting
|
|
||||||
hart 1 starting
|
|
||||||
init: starting sh
|
|
||||||
$
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
If you type <tt>ls</tt> at the prompt, you should output similar to the following:
|
|
||||||
<pre>
|
|
||||||
$ ls
|
|
||||||
. 1 1 1024
|
|
||||||
.. 1 1 1024
|
|
||||||
README 2 2 2181
|
|
||||||
cat 2 3 21024
|
|
||||||
echo 2 4 19776
|
|
||||||
forktest 2 5 11456
|
|
||||||
grep 2 6 24512
|
|
||||||
init 2 7 20656
|
|
||||||
kill 2 8 19856
|
|
||||||
ln 2 9 19832
|
|
||||||
ls 2 10 23280
|
|
||||||
mkdir 2 11 19952
|
|
||||||
rm 2 12 19936
|
|
||||||
sh 2 13 38632
|
|
||||||
stressfs 2 14 20912
|
|
||||||
usertests 2 15 106264
|
|
||||||
wc 2 16 22160
|
|
||||||
zombie 2 17 19376
|
|
||||||
cow 2 18 27152
|
|
||||||
console 3 19 0
|
|
||||||
</pre>
|
|
||||||
These are the programs/files that <tt>mkfs</tt> includes in the
|
|
||||||
initial file system. You just ran one of them: <tt>ls</tt>.
|
|
||||||
|
|
||||||
<h2>sleep</h2>
|
|
||||||
|
|
||||||
<p>Implement the UNIX program sleep for xv6; your sleep should pause
|
|
||||||
for a user-specified number of ticks.
|
|
||||||
|
|
||||||
<p>Some hints:
|
|
||||||
<ul>
|
|
||||||
<li>Look at some of the other programs in <tt>user/</tt> to see
|
|
||||||
how you can obtain the command-line arguments passed to a program. If the user
|
|
||||||
forgets to pass an argument, sleep should print an error message.
|
|
||||||
|
|
||||||
<li>The command-line argument is passed as a string; you can convert it to an
|
|
||||||
integer using <tt>atoi</tt> (see user/ulib.c).
|
|
||||||
|
|
||||||
<li>Use the system call <tt>sleep</tt> (see user/usys.S and kernel/sysproc.c).
|
|
||||||
|
|
||||||
<li>Make sure <tt>main</tt> calls <tt>exit()</tt> in order to exit
|
|
||||||
your program.
|
|
||||||
|
|
||||||
<li>Add the program to <tt>UPROGS</tt> in Makefile and compile
|
|
||||||
user programs by typing <tt>make fs.img</tt>.
|
|
||||||
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<p>Run the program from the xv6 shell:
|
|
||||||
<pre>
|
|
||||||
$ make qemu
|
|
||||||
...
|
|
||||||
init: starting sh
|
|
||||||
$ sleep 10
|
|
||||||
(waits for a little while)
|
|
||||||
$
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>Optional: write an uptime program that prints the uptime in terms
|
|
||||||
of ticks using the <tt>uptime</tt> system call.
|
|
||||||
|
|
||||||
<h2>pingpong</h2>
|
|
||||||
|
|
||||||
<p> Write a program that uses UNIX system calls to ``ping-pong'' a
|
|
||||||
byte between two processes over a pair of pipes, one for each
|
|
||||||
direction. The parent sends by writing a byte to <tt>fd[1]</tt> and
|
|
||||||
the child receives it by reading from <tt>fd[0]</tt>. After
|
|
||||||
receiving a byte from parent, the child responds with its own byte
|
|
||||||
by writing to <tt>fd[1]</tt>, which the parent then reads.
|
|
||||||
|
|
||||||
<p>Some hints:
|
|
||||||
<ul>
|
|
||||||
<li>Use <tt>pipe</tt> to create a pipe.
|
|
||||||
<li>Use <tt>fork</tt> to create a child.
|
|
||||||
<li>Use <tt>read</tt> to read from the pipe, and <tt>write</tt> to write to the pipe.
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<h2>primes</h2>
|
|
||||||
|
|
||||||
<p>Write a concurrent version of prime sieve using pipes. This idea
|
|
||||||
is due to Doug McIlroy, inventor of Unix pipes. The picture
|
|
||||||
halfway down <a href="http://swtch.com/~rsc/thread/">the page</a>
|
|
||||||
and the text surrounding it explain how to do it.
|
|
||||||
|
|
||||||
<p>Your goal is to use <tt>pipe</tt> and <tt>fork</tt> to set up
|
|
||||||
the pipeline. The first process feeds the numbers 2 through 35
|
|
||||||
into the pipeline. For each prime number, you will arrange to
|
|
||||||
create one process that reads from its left neighbor over a pipe
|
|
||||||
and writes to its right neighbor over another pipe. Since xv6 has
|
|
||||||
limited number of file descriptors and processes, the first
|
|
||||||
process can stop at 35.
|
|
||||||
|
|
||||||
<p>Some hints:
|
|
||||||
<ul>
|
|
||||||
<li>Be careful to close file descriptors that a process doesn't
|
|
||||||
need, because otherwise your program will run xv6 out of resources
|
|
||||||
before the first process reaches 35.
|
|
||||||
|
|
||||||
<li>Once the first process reach 35, you should arrange that the
|
|
||||||
pipeline terminates cleanly (Hint: read will return an end-of-file
|
|
||||||
when the write-side of the pipe is closed).
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<h2>find</h2>
|
|
||||||
|
|
||||||
<p>Write a simple version of the UNIX find program: find all the files
|
|
||||||
in a directory tree whose name matches a string. For example if the
|
|
||||||
file system contains a file <tt>a/b</tt>, then running find as
|
|
||||||
follows should produce:
|
|
||||||
<pre>
|
|
||||||
$ find . b
|
|
||||||
./a/b
|
|
||||||
$
|
|
||||||
</pre>
|
|
||||||
|
|
||||||
<p>Some hints:
|
|
||||||
<ul>
|
|
||||||
<li>Look at user/ls.c to see how to read directories.
|
|
||||||
<li>Use recursion to run find in sub-directories.
|
|
||||||
<li>Don't recurse into "." and "..".
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<p>Optional: support regular expressions in name matching. Grep has some
|
|
||||||
primitive support for regular expressions.
|
|
||||||
|
|
||||||
<h2>xargs</h2>
|
|
||||||
|
|
||||||
<p>Write a simple version of the UNIX xargs program: read lines from
|
|
||||||
standard in and run a command for each line, supplying the line as
|
|
||||||
arguments to the command. The following example illustrates xarg's
|
|
||||||
behavior:
|
|
||||||
<pre>
|
|
||||||
$ xargs echo bye
|
|
||||||
hello too
|
|
||||||
bye hello too
|
|
||||||
<ctrl-d>
|
|
||||||
$
|
|
||||||
</pre>
|
|
||||||
Note that the command here is "echo bye" and the additional
|
|
||||||
arguments are "hello too", making the command "echo bye hello too",
|
|
||||||
which outputs "bye hello too".
|
|
||||||
|
|
||||||
<p>xargs and find combine well:
|
|
||||||
<pre>
|
|
||||||
find . b | xargs grep hello
|
|
||||||
</pre>
|
|
||||||
will run "grep hello" on each file named b in the directories below ".".
|
|
||||||
|
|
||||||
<p>Some hints:
|
|
||||||
<ul>
|
|
||||||
<li>Use <tt>fork</tt> and <tt>exec</tt> system call to invoke the
|
|
||||||
command on each line of input. Use <tt>wait</tt> in the parent
|
|
||||||
to wait for the child to complete running the command.
|
|
||||||
<li>Read from stdin a character at the time until the newline
|
|
||||||
character ('\n').
|
|
||||||
<li>kernel/param.h declares MAXARG, which may be useful if you need
|
|
||||||
to declare an argv.
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<h2>Optional: modify the shell</h2>
|
|
||||||
|
|
||||||
There are endless ways in which the shell could be extended. Here are
|
|
||||||
some suggestions:
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
|
|
||||||
<li>Modify the shell to support wait.
|
|
||||||
|
|
||||||
<li>Modify the shell to support lists of commands, separated by ";"
|
|
||||||
|
|
||||||
<li>Modify the shell to support sub-shells by implementing "(" and ")"
|
|
||||||
|
|
||||||
<li>Modify the shell to allow users to edit the command line
|
|
||||||
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
</body>
|
|
||||||
</html>
|
|
||||||
|
|
||||||
|
|
Loading…
Reference in a new issue