I have found an exploitable flaw in the page fault handler, however only in the SMP case. The problem is this:

[A]     down_read(&mm->mmap_sem);

        vma = find_vma(mm, address);
        if (!vma)
                goto bad_area;
        if (vma->vm_start <= address)
                goto good_area;
        if (!(vma->vm_flags & VM_GROWSDOWN))
                goto bad_area;
        if (error_code & 4) {
                /*
                 * accessing the stack below %esp is always a bug.
                 * The "+ 32" is there due to some instructions (like
                 * pusha) doing post-decrement on the stack and that
                 * doesn't show up until later..
                 */
                if (address + 32 < regs->esp)
                        goto bad_area;
        }
        if (expand_stack(vma, address))
[B]             goto bad_area;

An exploitable race scenario looks as follows:

1) one thread issues down_write on the sem (mremap, madvise, ...)

2) two other threads fault below a VM_GROWSDOWN segment (note that they can fault anywhere below vm_start since esp is arbitrary) and sleep in [A]

3) the first thread releases the sem and the two others run again; both find the same VMA, but:

        thread1 ----------F1----------[ VMA ]
        thread2 ---------------F2-----[ VMA ]

where F1/F2 are the fault addresses. If timed carefully we get:

        thread1 expands stack to F1, installs pte1
        thread2 expands stack to F2, installs pte2

Because F2 lies above F1, the second expansion moves vm_start back up to F2, resulting in pte1 not being covered by the VMA. Techniques like those used in mremap_pte can be applied to further exploit this condition.

Please do not argue that the race window is small - I have seen even smaller windows opening like a barn :-] The critical section is only from [A] to [B]; we do not care about the timings of handle_mm_fault etc., since the VMA is later consulted only for page flags.

Note that this also races with ptrace/proc etc. (everything using access_process_vm/get_user_pages).
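The interleaving in step 3 can be replayed deterministically in user space. The following is only a toy sketch, not kernel code: `toy_vma`, `toy_expand_stack_unpatched` and `toy_replay_race` are hypothetical names, the addresses are made up, and the model assumes that the unpatched expand_stack simply sets vm_start to the page-aligned fault address. It replays "thread1 expands to F1, then thread2 expands to F2" after both threads have already passed the vm_start check.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL
#define TOY_PAGE_MASK (~(TOY_PAGE_SIZE - 1))

/* Hypothetical stand-in for vm_area_struct; only the range is modelled. */
struct toy_vma {
    unsigned long vm_start;
    unsigned long vm_end;
};

/*
 * Assumed model of the unpatched expansion: the caller already decided
 * that address < vm_start, so vm_start is overwritten unconditionally.
 */
void toy_expand_stack_unpatched(struct toy_vma *vma, unsigned long address)
{
    assert(vma != NULL);
    address &= TOY_PAGE_MASK;
    vma->vm_start = address;
}

/*
 * Replay the race: both faults were checked against the ORIGINAL vm_start,
 * then the expansions run one after the other (f1 first, f2 second).
 * Returns 1 if the page touched by f1 ends up below vm_start, i.e. the
 * pte installed for f1 is no longer covered by the VMA.
 */
int toy_replay_race(unsigned long f1, unsigned long f2)
{
    struct toy_vma vma = { 0x80000000UL, 0x80100000UL };

    toy_expand_stack_unpatched(&vma, f1);  /* thread1: expand to F1 */
    toy_expand_stack_unpatched(&vma, f2);  /* thread2: expand to F2 */

    return (f1 & TOY_PAGE_MASK) < vma.vm_start;
}
```

With F1 below F2 (for example `toy_replay_race(0x7ff00000UL, 0x7ff80000UL)`) the function reports an uncovered page; with the order of the faults swapped it does not, which is why the outcome depends on timing.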
Fix provided.

Description: Fix expand_stack() SMP race

Two threads sharing the same VMA can race in expand_stack, resulting in incorrect VMA size accounting and possibly an "uncovered-by-VMA" pte leak.

Fix is to check if the stack has already been expanded after acquiring a lock which guarantees exclusivity (page_table_lock in v2.4 and anon_vma lock in v2.6).

v2.4:

--- linux-2.4.28.orig/include/linux/mm.h	2005-01-07 09:12:48.000000000 -0200
+++ linux-2.4.28/include/linux/mm.h	2005-01-07 14:51:20.595060272 -0200
@@ -647,12 +647,19 @@
 	unsigned long grow;

 	/*
-	 * vma->vm_start/vm_end cannot change under us because the caller is required
-	 * to hold the mmap_sem in write mode. We need to get the spinlock only
-	 * before relocating the vma range ourself.
+	 * vma->vm_start/vm_end cannot change under us because the caller
+	 * is required to hold the mmap_sem in read mode. We need the
+	 * page_table_lock lock to serialize against concurrent expand_stacks.
 	 */
 	address &= PAGE_MASK;
 	spin_lock(&vma->vm_mm->page_table_lock);
+
+	/* already expanded while we were spinning? */
+	if (vma->vm_start <= address) {
+		spin_unlock(&vma->vm_mm->page_table_lock);
+		return 0;
+	}
+
 	grow = (vma->vm_start - address) >> PAGE_SHIFT;
 	if (vma->vm_end - address > current->rlim[RLIMIT_STACK].rlim_cur ||
 	    ((vma->vm_mm->total_vm + grow) << PAGE_SHIFT) > current->rlim[RLIMIT_AS].rlim_cur) {

v2.6:

--- linux-2.6.10-mm1.orig/mm/mmap.c	2005-01-05 15:58:26.000000000 -0200
+++ linux-2.6.10-mm1/mm/mmap.c	2005-01-07 14:47:05.894780600 -0200
@@ -1373,6 +1373,13 @@
 	 */
 	address += 4 + PAGE_SIZE - 1;
 	address &= PAGE_MASK;
+
+	/* already expanded while waiting for anon_vma lock? */
+	if (vma->vm_end >= address) {
+		anon_vma_unlock(vma);
+		return 0;
+	}
+
 	grow = (address - vma->vm_end) >> PAGE_SHIFT;

 	/* Overcommit.. */
@@ -1432,6 +1439,12 @@
 		return -ENOMEM;
 	anon_vma_lock(vma);

+	/* already expanded while waiting for anon_vma lock? */
+	if (vma->vm_start <= address) {
+		anon_vma_unlock(vma);
+		return 0;
+	}
+
 	/*
 	 * vma->vm_start/vm_end cannot change under us because the caller
 	 * is required to hold the mmap_sem in read mode. We need the
_
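The effect of the added re-check can be illustrated with the same kind of user-space toy model. This is a sketch under stated assumptions, not the kernel implementation: `toy_mm` and `toy_expand_stack_patched` are hypothetical names, locking is only indicated by comments, and rlimit checks are left out. It mirrors the v2.4 hunk: after the point where page_table_lock would be held, the function returns early if another thread has already grown the stack far enough.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_MASK  (~((1UL << TOY_PAGE_SHIFT) - 1))

/* Hypothetical stand-in combining the VMA start and the mm page count. */
struct toy_mm {
    unsigned long vm_start;  /* lowest address covered by the stack VMA */
    unsigned long total_vm;  /* accounted size in pages */
};

/* Toy model of expand_stack with the "already expanded?" re-check. */
int toy_expand_stack_patched(struct toy_mm *mm, unsigned long address)
{
    unsigned long grow;

    assert(mm != NULL);
    address &= TOY_PAGE_MASK;

    /* the real code holds page_table_lock (v2.4) from here on */

    /* already expanded by another thread while we were waiting? */
    if (mm->vm_start <= address)
        return 0;  /* nothing to do, and no bogus accounting */

    grow = (mm->vm_start - address) >> TOY_PAGE_SHIFT;
    mm->vm_start = address;
    mm->total_vm += grow;

    /* the real code drops the lock here */
    return 0;
}
```

Replaying the racy order (F1 first, then the higher F2) now leaves vm_start at F1 and counts the grown pages only once. Without the re-check, `grow` would be computed from `vm_start - address` with `address` above `vm_start`, which wraps around for an unsigned type and corrupts the accounting.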
Alternative RH fix (http://rhn.redhat.com/errata/RHBA-2004-550.html):

+
+	/* check if another thread has already expanded the stack */
+	if (address >= vma->vm_start) {
+		spin_unlock(&vma->vm_mm->page_table_lock);
+		vm_validate_enough("exiting expand_stack - NOTHING TO DO");
+		return 0;
+	}
+
Disclosure is set to 20050112. This will be handled on a new bug as this one is CLASSIFIED and should _never_ be opened.
Followup in bug 77666.

*** This bug has been marked as a duplicate of 77666 ***