First draft of text for mmap assignment.

Frans Kaashoek 2019-08-01 07:56:39 -04:00
parent d600026c3f
commit 77da01abb1


@ -1,4 +1,4 @@
<html>
<head>
<title>Lab: file system</title>
<link rel="stylesheet" href="homework.css" type="text/css" />
@ -140,6 +140,110 @@ blocks only as needed, like the original <tt>bmap()</tt>.
<h2>Memory-mapped files</h2>
<p>In this assignment you will implement the core of the system
calls <tt>mmap</tt> and <tt>munmap</tt>; see the man pages for an
explanation of what they do (run <tt>man 2 mmap</tt> in your terminal).
The test program <tt>mmaptest</tt> tells you what should work.
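<p>For orientation, here is a minimal sketch of how a user program might
call <tt>mmap</tt> and <tt>munmap</tt>. It assumes a POSIX host such as
Linux rather than xv6, and <tt>sum_file_bytes</tt> is a hypothetical helper
name chosen for illustration; it is not part of <tt>mmaptest</tt>.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map the file at `path` read-only and return the
 * sum of its bytes, or -1 on error (including an empty file). */
long sum_file_bytes(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    /* Ask the kernel to pick the address; map the whole file from offset 0. */
    unsigned char *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
                            MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping remains valid after close */
    if (p == MAP_FAILED)
        return -1;

    long sum = 0;
    for (off_t i = 0; i < st.st_size; i++)
        sum += p[i];            /* plain memory reads, no read() calls */

    munmap(p, (size_t)st.st_size);
    return sum;
}
```

<p>Note that the file descriptor is closed while the mapping is still in
use; the kernel has to keep the underlying file alive until the mapping
goes away.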
<p>Here are some hints about how you might go about this assignment:
<ul>
<li>Start by adding the two system calls to the kernel, as you
have done for other system calls (e.g., <tt>sigalarm</tt>), but
don't implement them yet; just return an
error. Run <tt>mmaptest</tt> to observe the error.
<li>Keep track, for each process, of what <tt>mmap</tt> has mapped.
You will need to allocate a <tt>struct vma</tt> to record the
address, length, permissions, etc. for each virtual memory area
(VMA) that maps a file. Since xv6 doesn't have a
memory allocator in the kernel, you can use the same approach as
for <tt>struct file</tt>: have a global array of <tt>struct
vma</tt>s and have for each process a fixed-size array of VMAs
(like the file descriptor array).
<li>Implement <tt>mmap</tt>: allocate a VMA, add it to the process's
table of VMAs, fill in the VMA, and find a hole in the process's
address space where you will map the file. You can assume that no
file will be bigger than 1GB. The VMA will contain a pointer to
a <tt>struct file</tt> for the file being mapped; you will need to
increase the file's reference count so that the structure doesn't
disappear when the file is closed (hint:
see <tt>filedup</tt>). You don't have to worry about overlapping
VMAs. Run <tt>mmaptest</tt>: the first <tt>mmap</tt> should
succeed, but the first access to the mmap-ed memory will fail,
because you haven't updated the page-fault handler.
<li>Modify the page-fault handler from the lazy-allocation and COW
labs to call a VMA function that handles page faults in VMAs.
This function allocates a page, reads 4KB from the mmap-ed
file into the page, and maps the page into the address space of
the process. To read the page, you can use <tt>readi</tt>,
which allows you to specify an offset from where to read in the
file (but you will have to lock/unlock the inode passed
to <tt>readi</tt>). Don't forget to set the permissions correctly
on the page. Run <tt>mmaptest</tt>; you should get to the
first <tt>munmap</tt>.
<li>Implement <tt>munmap</tt>: find the <tt>struct vma</tt> for
the address and unmap the specified pages (hint:
use <tt>uvmunmap</tt>). If <tt>munmap</tt> removes all pages
from a VMA, you will have to free the VMA (don't forget to
decrement the reference count of the VMA's <tt>struct
file</tt>); otherwise, you may have to shrink the VMA. You can
assume that <tt>munmap</tt> will not split a VMA into two VMAs;
that is, we don't unmap a few pages in the middle of a VMA. If
an unmapped page has been modified and the file is
mapped <tt>MAP_SHARED</tt>, you will have to write the page back
to the file. RISC-V has a dirty bit (<tt>D</tt>) in a PTE to
record whether a page has ever been written to; add the
declaration to <tt>kernel/riscv.h</tt> and use it. Modify <tt>exit</tt>
to call <tt>munmap</tt> for the process's open VMAs.
Run <tt>mmaptest</tt>; you should pass <tt>mmaptest</tt>, but
probably not <tt>forktest</tt>.
<li>Modify <tt>fork</tt> to copy VMAs from parent to child. Don't
forget to increment the reference count for a VMA's <tt>struct
file</tt>. In the page-fault handler of the child, it is OK to
allocate a new page instead of sharing the page with the
parent. The latter would be cooler, but it would require more
implementation work. Run <tt>mmaptest</tt>; make sure you pass
both <tt>mmaptest</tt> and <tt>forktest</tt>.
</ul>
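<p>The bookkeeping described in the hints above can be sketched in
ordinary user-space C. This is only one possible layout, not the intended
solution: the field names, <tt>NVMA</tt>, and the helpers <tt>vma_alloc</tt>,
<tt>vma_find</tt>, <tt>vma_file_offset</tt>, and <tt>vma_trim</tt> are all
hypothetical, the <tt>file</tt> field stands in for a kernel <tt>struct
file</tt> pointer, and locking, page-table updates, and write-back are left
out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PGSIZE 4096u
#define NVMA   16              /* hypothetical per-process limit */

/* Hypothetical layout; the lab leaves the exact fields up to you. */
struct vma {
    int      used;             /* is this slot in use? */
    uint64_t addr;             /* page-aligned start of the mapping */
    uint64_t length;           /* length in bytes, a multiple of PGSIZE */
    int      prot;             /* PROT_READ / PROT_WRITE bits */
    int      flags;            /* MAP_SHARED or MAP_PRIVATE */
    uint64_t offset;           /* file offset that corresponds to addr */
    void    *file;             /* stands in for struct file * in the kernel */
};

/* Claim a free slot in a process's VMA table, or return NULL if it is full. */
struct vma *vma_alloc(struct vma table[NVMA])
{
    for (int i = 0; i < NVMA; i++) {
        if (!table[i].used) {
            table[i] = (struct vma){ .used = 1 };
            return &table[i];
        }
    }
    return NULL;
}

/* Find the VMA containing va; both the fault path and munmap need this. */
struct vma *vma_find(struct vma table[NVMA], uint64_t va)
{
    for (int i = 0; i < NVMA; i++) {
        struct vma *v = &table[i];
        if (v->used && va >= v->addr && va < v->addr + v->length)
            return v;
    }
    return NULL;
}

/* File offset of the page containing va, i.e. where a fault handler
 * would start reading the 4KB for that page. */
uint64_t vma_file_offset(const struct vma *v, uint64_t va)
{
    uint64_t page = va & ~(uint64_t)(PGSIZE - 1);
    return v->offset + (page - v->addr);
}

/* Bookkeeping for munmap(addr, len) on one VMA. Returns 0 on success and
 * -1 if the range is not inside the VMA or would leave a hole in the
 * middle. Sets *freed to 1 when the whole VMA went away, so the caller
 * knows to drop its reference on the file. */
int vma_trim(struct vma *v, uint64_t addr, uint64_t len, int *freed)
{
    assert(addr % PGSIZE == 0 && len % PGSIZE == 0);
    uint64_t end = v->addr + v->length;
    *freed = 0;
    if (addr < v->addr || addr + len > end)
        return -1;                 /* range is not inside this VMA */
    if (addr == v->addr && len == v->length) {
        v->used = 0;               /* whole mapping removed */
        *freed = 1;
    } else if (addr == v->addr) {
        v->addr   += len;          /* trimmed from the front */
        v->offset += len;
        v->length -= len;
    } else if (addr + len == end) {
        v->length -= len;          /* trimmed from the back */
    } else {
        return -1;                 /* hole in the middle: not supported */
    }
    return 0;
}
```

<p>Keeping <tt>offset</tt> in step with <tt>addr</tt> when trimming from
the front means that later faults in the remaining pages still read the
right part of the file.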
<p>Run <tt>usertests</tt> to make sure you didn't break anything.
<p>Optional challenges:
<ul>
<li>If two processes have the same file mmap-ed (as
in <tt>forktest</tt>), share their physical pages. You will need
reference counts on physical pages.
<li>The solution above allocates a new physical page for each page
read from the mmap-ed file, even though the data is also in kernel
memory in the buffer cache. Modify your implementation to mmap
that memory, instead of allocating a new page. This requires that
file blocks be the same size as pages (set <tt>BSIZE</tt> to
4096). You will need to pin mmap-ed blocks into the buffer cache.
You will also need to worry about reference counts.
<li>Remove redundancy between your implementation of lazy
allocation and your implementation of mmap-ed files. (Hint:
create a VMA for the lazy allocation area.)
<li>Modify <tt>exec</tt> to use a VMA for different sections of
the binary so that you get on-demand-paged executables. This will
make starting programs faster, because <tt>exec</tt> will not have
to read any data from the file system.
<li>Implement on-demand paging: don't keep a process in memory,
but let the kernel move some parts of processes to disk when
physical memory is low. Then, page in the paged-out memory when
the process references it.
</ul>
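<p>For the first challenge, reference counts on physical pages could be
kept in a simple array indexed by page number. The following is a
user-space sketch under stated assumptions: <tt>PHYSBASE</tt>,
<tt>NPAGES</tt>, <tt>page_incref</tt>, and <tt>page_decref</tt> are
hypothetical names, and a real kernel would also need a lock around the
counters.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE   4096u
#define NPAGES   1024u             /* hypothetical number of tracked pages */
#define PHYSBASE 0x80000000ull     /* hypothetical start of physical memory */

/* One counter per physical page; zero means the page is unreferenced. */
static uint16_t refcnt[NPAGES];

/* Translate a page-aligned physical address into an index into refcnt. */
static unsigned page_index(uint64_t pa)
{
    assert(pa >= PHYSBASE && pa % PGSIZE == 0);
    uint64_t idx = (pa - PHYSBASE) / PGSIZE;
    assert(idx < NPAGES);
    return (unsigned)idx;
}

/* Record one more mapping of the page at pa. */
void page_incref(uint64_t pa)
{
    refcnt[page_index(pa)]++;
}

/* Drop one mapping; returns 1 when the last reference is gone, meaning
 * the caller may free the page, and 0 otherwise. */
int page_decref(uint64_t pa)
{
    unsigned i = page_index(pa);
    assert(refcnt[i] > 0);
    return --refcnt[i] == 0;
}
```

<p>With counters like these, unmapping a shared page would free it only
when <tt>page_decref</tt> reports that no other process still maps it.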
</body>
</html>