<title>Lecture 5/title> <html> <head> </head> <body> <h2>Address translation and sharing using page tables</h2> <p> Reading: <a href="../readings/i386/toc.htm">80386</a> chapters 5 and 6<br> <p> Handout: <b> x86 address translation diagram</b> - <a href="x86_translation.ps">PS</a> - <a href="x86_translation.eps">EPS</a> - <a href="x86_translation.fig">xfig</a> <br> <p>Why do we care about x86 address translation? <ul> <li>It can simplify s/w structure by placing data at fixed known addresses. <li>It can implement tricks like demand paging and copy-on-write. <li>It can isolate programs to contain bugs. <li>It can isolate programs to increase security. <li>JOS uses paging a lot, and segments more than you might think. </ul> <p>Why aren't protected-mode segments enough? <ul> <li>Why did the 386 add translation using page tables as well? <li>Isn't it enough to give each process its own segments? </ul> <p>Translation using page tables on x86: <ul> <li>paging hardware maps linear address (la) to physical address (pa) <li>(we will often interchange "linear" and "virtual") <li>page size is 4096 bytes, so there are 1,048,576 pages in 2^32 <li>why not just have a big array with each page #'s translation? <ul> <li>table[20-bit linear page #] => 20-bit phys page # </ul> <li>386 uses 2-level mapping structure <li>one page directory page, with 1024 page directory entries (PDEs) <li>up to 1024 page table pages, each with 1024 page table entries (PTEs) <li>so la has 10 bits of directory index, 10 bits table index, 12 bits offset <li>What's in a PDE or PTE? <ul> <li>20-bit phys page number, present, read/write, user/supervisor </ul> <li>cr3 register holds physical address of current page directory <li>puzzle: what do PDE read/write and user/supervisor flags mean? <li>puzzle: can supervisor read/write user pages? <li>Here's how the MMU translates an la to a pa: <pre> uint translate (uint la, bool user, bool write) { uint pde; pde = read_mem (%CR3 + 4*(la >> 22)); access (pde, user, read); pte = read_mem ( (pde & 0xfffff000) + 4*((la >> 12) & 0x3ff)); access (pte, user, read); return (pte & 0xfffff000) + (la & 0xfff); } // check protection. pxe is a pte or pde. // user is true if CPL==3 void access (uint pxe, bool user, bool write) { if (!(pxe & PG_P) => page fault -- page not present if (!(pxe & PG_U) && user) => page fault -- not access for user if (write && !(pxe & PG_W)) if (user) => page fault -- not writable else if (!(pxe & PG_U)) => page fault -- not writable else if (%CR0 & CR0_WP) => page fault -- not writable } </pre> <li>CPU's TLB caches vpn => ppn mappings <li>if you change a PDE or PTE, you must flush the TLB! <ul> <li>by re-loading cr3 </ul> <li>turn on paging by setting CR0_PE bit of %cr0 </ul> Can we use paging to limit what memory an app can read/write? <ul> <li>user can't modify cr3 (requires privilege) <li>is that enough? <li>could user modify page tables? after all, they are in memory. </ul> <p>How we will use paging (and segments) in JOS: <ul> <li>use segments only to switch privilege level into/out of kernel <li>use paging to structure process address space <li>use paging to limit process memory access to its own address space <li>below is the JOS virtual memory map <li>why map both kernel and current process? why not 4GB for each? <li>why is the kernel at the top? <li>why map all of phys mem at the top? i.e. why multiple mappings? <li>why map page table a second time at VPT? <li>why map page table a third time at UVPT? <li>how do we switch mappings for a different process? </ul> <pre> 4 Gig --------> +------------------------------+ | | RW/-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ : . : : . : : . : |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~| RW/-- | | RW/-- | Remapped Physical Memory | RW/-- | | RW/-- KERNBASE -----> +------------------------------+ 0xf0000000 | Cur. Page Table (Kern. RW) | RW/-- PTSIZE VPT,KSTACKTOP--> +------------------------------+ 0xefc00000 --+ | Kernel Stack | RW/-- KSTKSIZE | | - - - - - - - - - - - - - - -| PTSIZE | Invalid Memory | --/-- | ULIM ------> +------------------------------+ 0xef800000 --+ | Cur. Page Table (User R-) | R-/R- PTSIZE UVPT ----> +------------------------------+ 0xef400000 | RO PAGES | R-/R- PTSIZE UPAGES ----> +------------------------------+ 0xef000000 | RO ENVS | R-/R- PTSIZE UTOP,UENVS ------> +------------------------------+ 0xeec00000 UXSTACKTOP -/ | User Exception Stack | RW/RW PGSIZE +------------------------------+ 0xeebff000 | Empty Memory | --/-- PGSIZE USTACKTOP ---> +------------------------------+ 0xeebfe000 | Normal User Stack | RW/RW PGSIZE +------------------------------+ 0xeebfd000 | | | | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ . . . . . . |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~| | Program Data & Heap | UTEXT --------> +------------------------------+ 0x00800000 PFTEMP -------> | Empty Memory | PTSIZE | | UTEMP --------> +------------------------------+ 0x00400000 | Empty Memory | PTSIZE 0 ------------> +------------------------------+ </pre> <h3>The VPT </h3> <p>Remember how the X86 translates virtual addresses into physical ones: <p><img src=pagetables.png> <p>CR3 points at the page directory. The PDX part of the address indexes into the page directory to give you a page table. The PTX part indexes into the page table to give you a page, and then you add the low bits in. <p>But the processor has no concept of page directories, page tables, and pages being anything other than plain memory. So there's nothing that says a particular page in memory can't serve as two or three of these at once. The processor just follows pointers: pd = lcr3(); pt = *(pd+4*PDX); page = *(pt+4*PTX); <p>Diagramatically, it starts at CR3, follows three arrows, and then stops. <p>If we put a pointer into the page directory that points back to itself at index Z, as in <p><img src=vpt.png> <p>then when we try to translate a virtual address with PDX and PTX equal to V, following three arrows leaves us at the page directory. So that virtual page translates to the page holding the page directory. In Jos, V is 0x3BD, so the virtual address of the VPD is (0x3BD<<22)|(0x3BD<<12). <p>Now, if we try to translate a virtual address with PDX = V but an arbitrary PTX != V, then following three arrows from CR3 ends one level up from usual (instead of two as in the last case), which is to say in the page tables. So the set of virtual pages with PDX=V form a 4MB region whose page contents, as far as the processor is concerned, are the page tables themselves. In Jos, V is 0x3BD so the virtual address of the VPT is (0x3BD<<22). <p>So because of the "no-op" arrow we've cleverly inserted into the page directory, we've mapped the pages being used as the page directory and page table (which are normally virtually invisible) into the virtual address space. </body>