Bug 76818

Summary: Kernel SMP Issue CAN-2005-0001 (Vendor-Sec)
Product: Gentoo Security
Reporter: Sune Kloppenborg Jeppesen (RETIRED) <jaervosz>
Component: Kernel
Assignee: Gentoo Security <security>
Status: RESOLVED DUPLICATE    
Severity: normal    
Priority: High    
Version: unspecified   
Hardware: All   
OS: All   
Whiteboard: CLASSIFIED
Package list:
Runtime testing required: ---

Description Sune Kloppenborg Jeppesen (RETIRED) gentoo-dev 2005-01-05 14:19:45 UTC
I have found an exploitable flaw in the page fault handler, however only 
in the SMP case.

The problem is this:

[A]     down_read(&mm->mmap_sem);

        vma = find_vma(mm, address);
        if (!vma)
                goto bad_area;
        if (vma->vm_start <= address)
                goto good_area;
        if (!(vma->vm_flags & VM_GROWSDOWN))
                goto bad_area;
        if (error_code & 4) {
                /*
                 * accessing the stack below %esp is always a bug.
                 * The "+ 32" is there due to some instructions (like
                 * pusha) doing post-decrement on the stack and that
                 * doesn't show up until later..
                 */
                if (address + 32 < regs->esp)
                        goto bad_area;
        }
        if (expand_stack(vma, address))
[B]             goto bad_area;


an exploitable race scenario looks as follows:

1) one thread issues down_write on the sem (remap, madvise, ...)

2) two other threads fault below a VM_GROWSDOWN segment (note that they can 
fault anywhere below vm_start since esp is arbitrary) and sleep in [A]

3) first thread releases the sem and the two others run again, both find 
the same VMA but:

thread1 ----------F1----------[   VMA   ]
thread2 ---------------F2-----[   VMA   ]

where F1 and F2 are the fault addresses.

If timed carefully we get:

thread1 expands stack to F1, installs pte1
thread2 expands stack to F2, installs pte2

resulting in pte1 not covered by the VMA. Techniques like in mremap_pte 
can be applied to further exploit this condition.

Please do not argue that the race window is small - I have seen even 
smaller windows opening like a barn :-]

The critical section is only from [A] to [B], we do not care about timings 
of handle_mm_fault etc, since the VMA is later consulted only for page 
flags.

Note that this also races with ptrace/proc etc (everything using 
access_process_vm/get_user_pages).
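
The interleaving described above can be replayed in a few lines of user-space C. This is only a toy model under stated assumptions: struct toy_vma, toy_expand_unpatched() and the addresses are hypothetical stand-ins, not kernel code, and the two "threads" are simply run one after the other in the losing order.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_MASK  (~((1UL << TOY_PAGE_SHIFT) - 1))

/* Hypothetical stand-in for a VM_GROWSDOWN vma: covers [vm_start, vm_end). */
struct toy_vma {
    unsigned long vm_start;
    unsigned long vm_end;
};

/* Model of the unpatched behaviour: vm_start is moved to the faulting
 * page without checking whether another thread already moved it lower. */
static void toy_expand_unpatched(struct toy_vma *vma, unsigned long address)
{
    vma->vm_start = address & TOY_PAGE_MASK;
}

/* Returns 1 if address lies inside the vma, 0 otherwise. */
static int toy_covers(const struct toy_vma *vma, unsigned long address)
{
    return vma->vm_start <= address && address < vma->vm_end;
}

/* Replays "thread1 expands to F1, then thread2 expands to F2" and
 * reports whether the page installed at F1 is still covered. */
static int toy_replay_race(void)
{
    struct toy_vma vma = { 0xbffff000UL, 0xc0000000UL };
    unsigned long f1 = 0xbfff0000UL;   /* thread1 fault, further down */
    unsigned long f2 = 0xbfff8000UL;   /* thread2 fault, closer to the vma */

    /* Both faults were looked up before any expansion happened. */
    assert(f1 < f2 && f2 < vma.vm_start);

    toy_expand_unpatched(&vma, f1);    /* vm_start = F1, pte1 installed */
    toy_expand_unpatched(&vma, f2);    /* vm_start = F2, pte2 installed */

    return toy_covers(&vma, f1);
}
```

In this model toy_replay_race() returns 0: vm_start ends up at F2, so the page at F1 lies outside the vma although a pte was installed for it.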
Comment 1 Sune Kloppenborg Jeppesen (RETIRED) gentoo-dev 2005-01-10 12:30:20 UTC
Fix provided.

Description: Fix expand_stack() SMP race

Two threads sharing the same VMA can race in expand_stack, resulting in incorrect VMA 
size accounting and possibly an "uncovered-by-VMA" pte leak.

The fix is to check whether the stack has already been expanded after acquiring a lock which 
guarantees exclusivity (page_table_lock in v2.4 and the anon_vma lock in v2.6).
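
As a rough illustration of why the re-check matters for the accounting, here is a user-space sketch. It assumes a 32-bit unsigned long (modelled with uint32_t) and assumes that the computed grow value is later added to a page counter; struct toy_vma, total_vm as used here and both helpers are hypothetical names, and the lock is only indicated by comments.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12

/* Hypothetical stand-in; uint32_t models a 32-bit unsigned long. */
struct toy_vma {
    uint32_t vm_start;
    uint32_t total_vm;   /* pages accounted to the mapping */
};

/* Without the re-check: if another thread already moved vm_start below
 * address, the subtraction wraps and yields a huge page count. */
static uint32_t toy_grow_unchecked(uint32_t vm_start, uint32_t address)
{
    return (uint32_t)(vm_start - address) >> TOY_PAGE_SHIFT;
}

/* With the re-check: a caller that lost the race returns early and
 * neither vm_start nor the accounting is touched. */
static int toy_expand_checked(struct toy_vma *vma, uint32_t address)
{
    /* the exclusive lock would be taken here */
    if (vma->vm_start <= address) {
        /* already expanded by someone else; unlock and leave */
        return 0;
    }
    vma->total_vm += (uint32_t)(vma->vm_start - address) >> TOY_PAGE_SHIFT;
    vma->vm_start = address;
    /* unlock */
    return 0;
}

/* Runs both expansions in the losing order and returns the page count. */
static uint32_t toy_checked_total(void)
{
    struct toy_vma vma = { 0xbffff000u, 1 };

    toy_expand_checked(&vma, 0xbfff0000u);   /* grows by 15 pages */
    toy_expand_checked(&vma, 0xbfff8000u);   /* no-op: already covered */
    assert(vma.vm_start == 0xbfff0000u);
    return vma.total_vm;
}
```

With these addresses the unchecked computation gives 0xFFFF8 pages instead of a no-op, while the checked variant ends with 16 pages accounted and vm_start left at the lower address.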

v2.4:

--- linux-2.4.28.orig/include/linux/mm.h        2005-01-07 09:12:48.000000000 -0200
+++ linux-2.4.28/include/linux/mm.h     2005-01-07 14:51:20.595060272 -0200
@@ -647,12 +647,19 @@
        unsigned long grow;
 
        /*
-        * vma->vm_start/vm_end cannot change under us because the caller is required
-        * to hold the mmap_sem in write mode. We need to get the spinlock only
-        * before relocating the vma range ourself.
+        * vma->vm_start/vm_end cannot change under us because the caller
+        * is required to hold the mmap_sem in read mode.  We need the
+        * page_table_lock lock to serialize against concurrent expand_stacks.
         */
        address &= PAGE_MASK;
        spin_lock(&vma->vm_mm->page_table_lock);
+
+       /* already expanded while we were spinning? */
+       if (vma->vm_start <= address) {
+               spin_unlock(&vma->vm_mm->page_table_lock);
+               return 0;
+       }
+
        grow = (vma->vm_start - address) >> PAGE_SHIFT;
        if (vma->vm_end - address > current->rlim[RLIMIT_STACK].rlim_cur ||
            ((vma->vm_mm->total_vm + grow) << PAGE_SHIFT) > current->rlim[RLIMIT_AS].rlim_cur) {

v2.6: 

--- linux-2.6.10-mm1.orig/mm/mmap.c     2005-01-05 15:58:26.000000000 -0200
+++ linux-2.6.10-mm1/mm/mmap.c  2005-01-07 14:47:05.894780600 -0200
@@ -1373,6 +1373,13 @@
         */
        address += 4 + PAGE_SIZE - 1;
        address &= PAGE_MASK;
+
+       /* already expanded while waiting for anon_vma lock? */
+       if (vma->vm_end >= address) {
+               anon_vma_unlock(vma);
+               return 0;
+       }
+
        grow = (address - vma->vm_end) >> PAGE_SHIFT;
 
        /* Overcommit.. */
@@ -1432,6 +1439,12 @@
                return -ENOMEM;
        anon_vma_lock(vma);
 
+       /* already expanded while waiting for anon_vma lock? */
+       if (vma->vm_start <= address) {
+               anon_vma_unlock(vma);
+               return 0;
+       }
+
        /*
         * vma->vm_start/vm_end cannot change under us because the caller
         * is required to hold the mmap_sem in read mode.  We need the
_
Comment 2 Sune Kloppenborg Jeppesen (RETIRED) gentoo-dev 2005-01-10 12:32:30 UTC
Alternative RH fix (http://rhn.redhat.com/errata/RHBA-2004-550.html):

+
+       /* check if another thread has already expanded the stack */
+       if (address >= vma->vm_start) {
+               spin_unlock(&vma->vm_mm->page_table_lock);
+               vm_validate_enough("exiting expand_stack - NOTHING TO DO");
+               return 0;
+       }
+
Comment 3 Sune Kloppenborg Jeppesen (RETIRED) gentoo-dev 2005-01-11 04:55:53 UTC
Disclosure is set to 20050112. This will be handled on a new bug as this one is CLASSIFIED and should _never_ be opened.
Comment 4 Thierry Carrez (RETIRED) gentoo-dev 2005-01-12 05:18:20 UTC
followup in bug 77666

*** This bug has been marked as a duplicate of 77666 ***