Lab: mmap
In this lab you will use mmap on Linux to demand-page a
very large table and add memory-mapped files to xv6.
Using mmap on Linux
This assignment will make you more familiar with how to manage virtual memory
in user programs using the Unix system call interface. You can do this
assignment on any operating system that supports the Unix API (a Linux Athena
machine, your laptop with Linux or macOS, etc.).
Download the mmap homework assignment and look
it over. The program maintains a very large table of square root
values in virtual memory. However, the table is too large to fit in
physical RAM. Instead, the square root values should be computed on
demand in response to page faults that occur in the table's address
range. Your job is to implement the demand faulting mechanism using a
signal handler and UNIX memory mapping system calls. To stay within
the physical RAM limit, we suggest using the simple strategy of
unmapping the last page whenever a new page is faulted in.
To compile mmap.c, you need a C compiler, such as gcc. On Athena,
you can type:
$ add gnu
Once you have gcc, you can compile mmap.c as follows:
$ gcc mmap.c -lm -o mmap
This produces an executable named mmap, which you can run:
$ ./mmap
page_size is 4096
Validating square root table contents...
oops got SIGSEGV at 0x7f6bf7fd7f18
When the process accesses the square root table, the mapping does not exist
and the kernel passes control to the signal handler code in
handle_sigsegv(). Modify the code in handle_sigsegv() to map
in a page at the faulting address, unmap a previous page to stay within the
physical memory limit, and initialize the new page with the correct square root
values. Use the function calculate_sqrts() to compute the values.
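As a rough illustration of the mechanism described above, here is a minimal, self-contained sketch. It is not the contents of mmap.c: the names table, resident, fill_page(), on_segv(), and table_init(), as well as the table size, are made up for this example. fill_page() stores squares rather than square roots so that the sketch builds without -lm; in the lab, that step is where calculate_sqrts() would be called. The sketch assumes Linux, where touching an unmapped or inaccessible page delivers SIGSEGV, and it calls mmap() and munmap() from a signal handler, which works for a synchronous fault like this one but is not guaranteed by POSIX.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define TABLE_ENTRIES ((size_t)1 << 20)   /* hypothetical table size */

double *table;            /* base of the reserved, initially inaccessible range */
static size_t page_size;
static void *resident;    /* the one table page currently mapped in, or NULL */

/* Hypothetical stand-in for the lab's calculate_sqrts(): stores squares. */
static void fill_page(double *page, size_t first, size_t count) {
    for (size_t i = 0; i < count; i++)
        page[i] = (double)(first + i) * (double)(first + i);
}

static void on_segv(int sig, siginfo_t *si, void *ctx) {
    (void)sig; (void)ctx;
    uintptr_t addr = (uintptr_t)si->si_addr;
    uintptr_t base = (uintptr_t)table;
    if (addr < base || addr >= base + TABLE_ENTRIES * sizeof(double))
        _exit(1);                       /* a genuine crash, not a table fault */

    /* Round the faulting address down to the start of its page. */
    void *page = (void *)(addr & ~((uintptr_t)page_size - 1));

    if (resident != NULL)
        munmap(resident, page_size);    /* evict the previously faulted page */
    if (mmap(page, page_size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED)
        _exit(2);
    resident = page;

    fill_page((double *)page,
              ((uintptr_t)page - base) / sizeof(double),
              page_size / sizeof(double));
    /* Returning from the handler restarts the faulting instruction. */
}

/* Install the handler and reserve address space for the table. */
int table_init(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    table = mmap(NULL, TABLE_ENTRIES * sizeof(double), PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return table == MAP_FAILED ? -1 : 0;
}
```

After table_init() returns 0, reading table[i] for any valid i runs the handler on the first touch of that page, and at most one page of the table is resident at a time.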
The program includes test logic that verifies that the contents of the
square root table are correct. When you have completed your task
successfully, the process will print “All tests passed!”.
You may find that the man pages for mmap() and munmap() are helpful references.
$ man mmap
$ man munmap
Implement memory-mapped files in xv6
In this assignment you will implement memory-mapped files in xv6.
The test program mmaptest tells you what should work.
Here are some hints about how you might go about this assignment:
- Start by adding the two system calls to the kernel, as you have
done for other system calls (e.g., sigalarm), but
don't implement them yet; just return an
error. Run mmaptest to observe the error.
- Keep track of what mmap has mapped for each process.
You will need to allocate a struct vma to record the
address, length, permissions, etc. for each virtual memory area
(VMA) that maps a file. Since the xv6 kernel doesn't have a
memory allocator in the kernel, you can use the same approach as
for struct file: keep a global array of struct
vmas and give each process a fixed-size array of VMAs
(like the file descriptor array).
- Implement mmap: allocate a VMA, add it to the process's
table of VMAs, fill in the VMA, and find a hole in the process's
address space where you will map the file. You can assume that no
file will be bigger than 1GB. The VMA will contain a pointer to
a struct file for the file being mapped; you will need to
increase the file's reference count so that the structure doesn't
disappear when the file is closed (hint:
see filedup). You don't have to worry about overlapping
VMAs. Run mmaptest: the first mmap should
succeed, but the first access to the mmap-ed memory will fail,
because you haven't updated the page fault handler.
- Modify the page-fault handler from the lazy-allocation and COW
labs to call a VMA function that handles page faults in VMAs.
This function allocates a page, reads 4KB from the mmap-ed
file into the page, and maps the page into the address space of
the process. To read the page, you can use readi,
which allows you to specify an offset from where to read in the
file (but you will have to lock/unlock the inode passed
to readi). Don't forget to set the permissions correctly
on the page. Run mmaptest; you should get to the
first munmap.
- Implement munmap: find the struct vma for
the address and unmap the specified pages (hint:
use uvmunmap). If munmap removes all pages
from a VMA, you will have to free the VMA (don't forget to
decrement the reference count of the VMA's struct
file); otherwise, you may have to shrink the VMA. You can
assume that munmap will not split a VMA into two VMAs;
that is, we don't unmap a few pages in the middle of a VMA. If
an unmapped page has been modified and the file is
mapped MAP_SHARED, you will have to write the page back
to the file. RISC-V has a dirty bit (D) in a PTE to
record whether a page has ever been written to; add the
declaration to kernel/riscv.h and use it. Modify exit
to call munmap for the process's open VMAs.
Run mmaptest; you should pass mmaptest, but
probably not forktest.
- Modify fork to copy VMAs from parent to child. Don't
forget to increment the reference count for a VMA's struct
file. In the page fault handler of the child, it is OK to
allocate a new page instead of sharing the page with the
parent. The latter would be cooler, but it would require more
implementation work. Run mmaptest; make sure you pass
both mmaptest and forktest.
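The bookkeeping in the hints above can be sketched in ordinary user-space C. This is only an illustration, not xv6 kernel code: the field names, NVMA, struct vma_table, and the helpers vma_alloc(), vma_find(), vma_file_offset(), and vma_trim() are all hypothetical, and the real kernel would additionally take locks, call filedup/fileclose, and unmap pages with uvmunmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NVMA   16      /* hypothetical per-process limit */
#define PGSIZE 4096

struct file;           /* opaque here; the kernel has its own definition */

struct vma {
    int used;          /* slot is in use */
    uint64_t addr;     /* page-aligned start of the mapping */
    uint64_t len;      /* length in bytes */
    int prot;          /* PROT_* bits passed to mmap */
    int flags;         /* MAP_SHARED or MAP_PRIVATE */
    uint64_t offset;   /* file offset that corresponds to addr */
    struct file *f;    /* mapped file; the kernel would hold a reference */
};

struct vma_table { struct vma slots[NVMA]; };   /* one per process */

/* Claim a free slot, like allocating a file descriptor. */
struct vma *vma_alloc(struct vma_table *t) {
    for (int i = 0; i < NVMA; i++) {
        if (!t->slots[i].used) {
            t->slots[i].used = 1;
            return &t->slots[i];
        }
    }
    return NULL;
}

/* Page-fault path: which VMA, if any, covers the faulting address? */
struct vma *vma_find(struct vma_table *t, uint64_t va) {
    for (int i = 0; i < NVMA; i++) {
        struct vma *v = &t->slots[i];
        if (v->used && va >= v->addr && va < v->addr + v->len)
            return v;
    }
    return NULL;
}

/* File offset of the page containing va: the offset to read from. */
uint64_t vma_file_offset(const struct vma *v, uint64_t va) {
    uint64_t page = va & ~(uint64_t)(PGSIZE - 1);
    return v->offset + (page - v->addr);
}

/* munmap path: shrink from the front or the back, or free the VMA.
   Returns -1 for a range outside the VMA or one that would split it. */
int vma_trim(struct vma *v, uint64_t va, uint64_t len) {
    if (va < v->addr || va + len > v->addr + v->len)
        return -1;
    if (va == v->addr) {                       /* front, possibly everything */
        v->addr += len;
        v->offset += len;
        v->len -= len;
    } else if (va + len == v->addr + v->len) { /* back */
        v->len -= len;
    } else {
        return -1;                             /* hole in the middle */
    }
    if (v->len == 0)
        v->used = 0;    /* the kernel would also drop the file reference */
    return 0;
}
```

Advancing offset together with addr when trimming from the front keeps vma_file_offset() correct for the pages that remain mapped.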
Run usertests to make sure you didn't break anything.
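Two of the hints above touch PTE bits: setting permissions on a faulted-in page and checking the dirty bit before writing back. A small sketch of both follows. The D bit is bit 7 of a RISC-V PTE and the other bit positions follow the RISC-V privileged specification; the PROT_* and MAP_* values are assumed to mirror the Linux ones, and the helper names prot_to_pte() and needs_writeback() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* RISC-V PTE bits. */
#define PTE_V (1L << 0)   /* valid */
#define PTE_R (1L << 1)   /* readable */
#define PTE_W (1L << 2)   /* writable */
#define PTE_X (1L << 3)   /* executable */
#define PTE_U (1L << 4)   /* accessible from user mode */
#define PTE_D (1L << 7)   /* dirty: the page has been written */

/* Assumed values, mirroring Linux; check them against your tree. */
#ifndef PROT_READ
#define PROT_READ   0x1
#define PROT_WRITE  0x2
#define PROT_EXEC   0x4
#define MAP_SHARED  0x01
#define MAP_PRIVATE 0x02
#endif

/* Translate the prot argument of mmap into PTE permission bits.
   Note: RISC-V reserves the encoding W=1, R=0, so a kernel may want
   to reject or adjust PROT_WRITE without PROT_READ. */
uint64_t prot_to_pte(int prot) {
    uint64_t perm = PTE_U;
    if (prot & PROT_READ)  perm |= PTE_R;
    if (prot & PROT_WRITE) perm |= PTE_W;
    if (prot & PROT_EXEC)  perm |= PTE_X;
    return perm;
}

/* Does an unmapped page have to be written back to the file? */
int needs_writeback(uint64_t pte, int flags) {
    return (flags & MAP_SHARED) && (pte & PTE_V) && (pte & PTE_D);
}
```

Only pages that are valid, dirty, and part of a MAP_SHARED mapping qualify for write-back in this sketch; pages that were never faulted in have no valid PTE and are skipped.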
Optional challenges:
- If two processes have the same file mmap-ed (as
in forktest), share their physical pages. You will need
reference counts on physical pages.
- The solution above allocates a new physical page for each page
read from the mmap-ed file, even though the data is also in kernel
memory in the buffer cache. Modify your implementation to mmap
that memory, instead of allocating a new page. This requires that
file blocks be the same size as pages (set BSIZE to
4096). You will need to pin mmap-ed blocks into the buffer cache.
You will need to worry about reference counts.
- Remove redundancy between your implementation for lazy
allocation and your implementation of mmap-ed files. (Hint:
create a VMA for the lazy allocation area.)
- Modify exec to use a VMA for different sections of
the binary so that you get on-demand-paged executables. This will
make starting programs faster, because exec will not have
to read any data from the file system.
- Implement on-demand paging: don't keep a process in memory,
but let the kernel move some parts of processes to disk when
physical memory is low. Then, page in the paged-out memory when
the process references it. Port your Linux program from the first
assignment to xv6 and run it.
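For the first challenge, reference counts on physical pages could be kept in an array indexed by page number. The following is a minimal user-space sketch under stated assumptions: KERNBASE and PHYSTOP are taken to be the usual xv6 values (RAM from 0x80000000, 128MB of it), and the names refcnt, page_ref(), and page_unref() are hypothetical. A kernel version would also need a lock around the counters.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE   4096
#define KERNBASE 0x80000000ULL                       /* assumed start of RAM */
#define PHYSTOP  (KERNBASE + 128ULL * 1024 * 1024)   /* assumed end of RAM */
#define NPAGES   ((PHYSTOP - KERNBASE) / PGSIZE)

static int refcnt[NPAGES];   /* one counter per physical page */

static uint64_t pa_index(uint64_t pa) {
    return (pa - KERNBASE) / PGSIZE;
}

/* Another mapping now points at this physical page. */
void page_ref(uint64_t pa) {
    refcnt[pa_index(pa)]++;
}

/* A mapping went away. Returns 1 when the last reference is gone,
   which is the point at which the page could be freed. */
int page_unref(uint64_t pa) {
    return --refcnt[pa_index(pa)] == 0;
}
```

With this scheme, fork would call page_ref() for each shared page instead of allocating a copy, and munmap or exit would free a page only when page_unref() returns 1.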