Virtual memory - memory hierarchy levels of main memory and disk type of traffic level access time between levels unit agent ----- ----------- --------------- ---- ----- registers 100's picoseconds ^ | load/store word programmer v cache 10's nanoseconds ^ | cache miss/repl. line hardware v main memory 100's nanoseconds ^ | page fault/repl. page OS v disk 10's milliseconds note the "access gap" between main memory and disk (100,000 x speed difference) for virtual memory demand fetch is page fault - OS invoked, process switch rather than wait placement - typically any available page frame (although sometimes cache-related issues result in placement constraints, which are known as page coloring) replacement - FIFO, pseudo LRU, clock (more on this in OS course) write-back (because of access gap, write-through is not feasible) virtual memory supports multiple users (processes), each with a copy of their own address space (leads to simplified addressing and access control) virtual address physical memory spaces (1 per user) mapping address space +----------+ +----------+ | +----------+ +----------+ | | segmentation - 0| | 0| | | | virtual memory is divided 1| | 1| | | | into logical blocks 2| | 2| | | | (variable-sized segments) 3| | 3| | | | 4| | 4| | | | paging - .| | 5| | | | virtual memory is divided .| | .| | | | into physical blocks .| | .| | |-+ (fixed-size pages) +----------+ .| |-+ +----------+ <- virtual pages / page frames -> Segments reflect the logical division of a program, but are variable-length and typically result in wasted space between allocated segments in memory, called external fragmentation Logical segment names will be produced by the compiler and will be visible to the linker, loader, and at run-time. Segments also represent distinct protection domains. Pages are fixed-size blocks, so the last page which holds an object may have wasted space, called internal fragmentation. Page size considerations: - small pages have less internal fragmentation - small pages require large mapping resources (i.e., large page tables) - match page size to blocksize of most efficient disk I/O transfer Pages can be used as protection domains; but, because of the granularity, different objects should be page aligned (with attendant increase in internal fragmentation). For objects that need limit checks (like stack underflow and overflow), empty "guard pages" can be allocated in the virtual address space immediately before and after. +-----+ +-------+--------+ | CPU | virtual addresses | VPN | offset | +-----+ +-------+--------+ |cache| v | +-----+ +-----+ | | | page| | =========================== |table| | | | +-----+ | +-----+ +------------+ v v |cntlr| | | physical +-------+--------+ +-----+ | main | addresses | PFN | offset | | | memory | +-------+--------+ _|_ | | .-_ _-. +------------+ active parts of | | program in memory | disk| . . memory image of -_ _- whole program on disk The page table answers these questions: 1) is a virtual page currently resident in memory? (if not => page fault occurs when referenced) 2) what is the page frame number? 3) has the page in memory been modified? (implements write-back) Page table entry (PTE) fields: +---+-----+---+---+---+---+ PTE | P | PFN |s/u|r/w| R | M | +---+-----+---+---+---+---+ - presence bit to indicate if page is present in main memory - physical page frame number (PFN) - access rights bits, such as system/user and read/write - referenced (accessed) bit, indicates recently used - modified (dirty/changed) bit so that only the modified blocks need to be returned to disk on replacement Compare to STE +---+------------------+--------+---+---+---+---+---+---+ STE | P | full-length-base | length |s/u|r/w| x | a | R | M | +---+------------------+--------+---+---+---+---+---+---+ - base field is base address of segment within main memory when present - note that segmentation requires an adder (base + offset) while paging just concatenates the translated PFN with the page offset - length field used for memory protection - additional access control bits for execute (x) and append (a) Note recent introduction of NX (no execute) bit in AMD and now Intel processors reducing page table size variable-sized bage table - use page table base and limit multi-level page table Too slow to always refer to page table in main memory, so cache the most recent PTEs close to the CPU in a translation lookaside buffer (TLB) - can be between CPU and cache or between cache and bus (memory always uses physical addresses) +-----+ +-----+ | CPU | or | CPU | +-----+ +-----+ virtually- | TLB | |cache| addressed +-----+ physically- +-----+ cache |cache| addressed | TLB | +-----+ cache +-----+ | | =========================== | | +-----+ +------------+ |cntlr| | | physical +-----+ | main | addresses | | memory | _|_ | | .-_ _-. +------------+ active pages of | | program in memory | disk| . . memory image of -_ _- whole program on disk Traditional page size is 4KB. Rather than changing the page size, most systems now support super-pages, such as 2MB or 4MB contiguous blocks, using a single TLB entry. Paging example Consider a paging system with a virtual address of one hex digit for the virtual page number and one hex digit for the offset within the page (i.e., a page = 16 units). Let the physical address also be one hex digit for the physical memory page frame and one hex digit for the offset within the page. Starting each time from the page table below, what is the result of the virtual address references (give either the OS actions or the physical address that results from translation). page table P - presence bit (0 = not present, 1 = present) entry fields R - read permission bit (0 = read not allowed, 1 = allowed) W - write permission bit (0 = write not allowed, 1 = allowed) X - execute permission bit (0 = fetch not allowed, 1 = allowed) M - modified bit (0 = unmodified, 1 = modified) => write a changed page back to disk when replaced PFN - page frame number page table physical memory (each dot is a byte or word) -----offset----- P R W X M PFN page 0123456789abcdef +---+---+---+---+---+-----+ frame +----------------+ 0 | 1 | 1 | 0 | 0 | 0 | 6 | 0 |................| vpn a R X 1 | 1 | 1 | 0 | 0 | 0 | 2 | 1 | | 2 | 1 | 1 | 1 | 0 | 0 | 5 | 2 |................| vpn 1 R 3 | 1 | 1 | 1 | 0 | 1 | 4 | .----> 3 |................| vpn 5 R 4 | 0 | 1 | 0 | 0 | 0 | 0 | / 4 |................| vpn 3 RW M 5 | 1 | 1 | 0 | 0 | 0 | 3 ---------' 5 |................| vpn 2 RW 6 | 0 | 1 | 1 | 0 | 0 | 0 | 6 |................| vpn 0 R 7 | 1 | 1 | 1 | 1 | 1 | a -----. 7 | | 8 | 0 | 1 | 0 | 0 | 0 | 0 | \ 8 | | 9 | 0 | 1 | 0 | 0 | 0 | 0 | \ 9 | | a | 1 | 1 | 0 | 1 | 0 | 0 | `-------> a |..W............X| vpn 7 RWXM b | 0 | 0 | 0 | 0 | 0 | 0 | b | | c | 0 | 0 | 0 | 0 | 0 | 0 | c | | d | 0 | 0 | 0 | 0 | 0 | 0 | d | | e | 0 | 0 | 0 | 0 | 0 | 0 | e | | f | 0 | 0 | 0 | 0 | 0 | 0 | f | | +---+---+---+---+---+-----+ +----------------+ ^ ^ write 72 -- (W) memory writes at a2 ------' | | read 65 -- page fault (since not present) | | inst. fetch 7f -- (X) memory responds with contents of af ---' write 52 -- protection error (since read-only page)