Bug 115285

Summary:	tpop3d crashes with "stack smashing attack in function connection_new()" when built with gcc-3.3.4-r1
Product:	Gentoo Linux	Reporter:	Alexander Stoll <technoworx>
Component:	[OLD] Server	Assignee:	The Gentoo Linux Hardened Team <hardened>
Status:	RESOLVED OBSOLETE
Severity:	critical	CC:	alex_y_xu, griffon26, kevquinn, vapier
Priority:	High
Version:	unspecified
Hardware:	x86
OS:	Linux
Whiteboard:
Package list:		Runtime testing required:	---
Attachments:	Removes the local blocks declared with the alloc_struct macro, thus avoiding a gcc-3.x SSP compiler bug

Description Alexander Stoll 2005-12-12 03:11:41 UTC

After upgrading to gcc-3.4.4-r1 an recompile system / world the former working
tpop3d daemon crashes on first connect via network:

tpop3d: stack smashing attack in function connection_new()

This is verified on several hardened boxes with different CFLAGS, the binary
built on the system with gcc-3.3.6 before upgrade runs fine in the fresh
gcc-3.4.4-r1 compiled system, it is definitely triggered from nre stable gcc
version.

Reproducible: Always
Steps to Reproduce:
1. upgrade to current stable gcc-3.4.4-r1
2. recompile whole box with this gcc
3. watch tpop3d abort on first connect

Actual Results:  
crash, coredump

Expected Results:  
proper operation, of course

Portage 2.0.51.22-r3 (default-linux/x86/no-nptl/2.4, gcc-3.4.4, glibc-2.3.5-r2,
2.4.32-grsec i686)
=================================================================
System uname: 2.4.32-grsec i686 Pentium III (Katmai)
Gentoo Base System version 1.6.13
dev-lang/python:     2.4.2
sys-apps/sandbox:    1.2.12
sys-devel/autoconf:  2.13, 2.59-r6
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils:  2.16.1
sys-devel/libtool:   1.5.20
virtual/os-headers:  2.4.22-r1
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O3 -march=pentiumpro -fomit-frame-pointer -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/share/
config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O3 -march=pentiumpro -fomit-frame-pointer -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks sandbox sfperms strict"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/Linux/
distributions/gentoo"
LC_ALL="de_DE@euro"
LINGUAS="de en"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="x86 activefilter bitmap-fonts bzip2 crypt ctype dbm eds emboss erandom ethe
real expat flatfile ftp gd gdbm gif gnutls hardened icc idn ifc ipppd jpeg maild
ir mbox memlimit mhash mime mmap mmx mp3 mysql mysqli ncurses nls ogg pam pcntl
pcre perl pic pie png posix readline recode sasl shared sharedmem slang sockets
socks5 spf sse ssl sysvipc szip tcpd tools truetype-fonts type1-fonts unicode vo
rbis zlib linguas_de linguas_en userland_GNU kernel_linux elibc_glibc"
Unset:  ASFLAGS, CTARGET, LANG, LDFLAGS, PORTDIR_OVERLAY

Comment 1 Alexander Stoll 2005-12-12 05:19:44 UTC

sorry for messing up the revision number of gcc in header... *grrr* short night...

additional info via strace:

open("/etc/hosts.allow", O_RDONLY)      = -1 ENOENT (No such file or directory)
open("/etc/hosts.deny", O_RDONLY)       = -1 ENOENT (No such file or directory)
setsockopt(5, SOL_SOCKET, SO_SNDBUF, [8192], 4) = 0
fcntl64(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
getsockname(5, {sa_family=AF_INET, sin_port=htons(110), sin_addr=inet_addr("0.0
.0.0")}, [16]) = 0
open("/dev/urandom", O_RDONLY|O_LARGEFILE) = 6
read(6, " @-G\27\307\21<\357Hygq\364\324 ", 16) = 16
close(6)                                = 0
time(NULL)                              = 1134392024
write(5, "+OK <20402d4717c7113cef48796771f"..., 53) = 53
time(NULL)                              = 1134392024
rt_sigprocmask(SIG_BLOCK, ~[ABRT], NULL, 8) = 0
write(2, "tpop3d", 6tpop3d)                   = 6
write(2, ": stack smashing attack in funct"..., 36: stack smashing attack in fun
ction ) = 36
write(2, "connection_new", 14connection_new)          = 14
write(2, "()\n", 3()
)                     = 3
rt_sigaction(SIGABRT, {SIG_DFL}, NULL, 8) = 0
getpid()                                = 16811
kill(16811, SIGABRT)                    = 0
--- SIGABRT (Aborted) @ 0 (0) ---
+++ killed by SIGABRT (core dumped) +++

BT is no help, because gdb has no symbols from external libs and a appropriate
debug environment is not in place for more than only building the tpop3d binary
with debug unformation and -pie ...

Comment 2 Maurice van der Pot (RETIRED) gentoo-dev

2005-12-12 09:13:46 UTC

Some other things I've noticed:

- It occurs with -O2 and higher, but not if -fstrict-aliasing is in CFLAGS
- Removing the call to make_timestamp() in connection_new() gets rid of the crash
- If i add a char array of 113 or larger as the first local variable of
make_timestamp(), the crash does not occur. If it's 112 bytes or smaller or if
it is added at the end of the list of local vars, then it does occur.

Comment 3 Alexander Stoll 2005-12-13 06:14:30 UTC

(In reply to comment #2)

> - It occurs with -O2 and higher, but not if -fstrict-aliasing is in CFLAGS
I can verify this on a few boxes, used FLAGS:
-O3 -march=pentiumpro -fomit-frame-pointer -fstrict-aliasing -pipe
-O3 -march=pentium3 -fomit-frame-pointer -fstrict-aliasing -pipe

No crash...

Comment 4 Alexander Stoll 2006-01-18 02:01:33 UTC

If nothing new comes up, please incorporate the CFLAG that circumvents this issue in an updated ebuild to restore proper function of compiled binaries.
Maybe it would be another option to ask upstream to release 1.6.0 from the current cvs version, that is - according to mail list archive - already distributed by many others for fixing other bugs...
If someone will provide an ebuild with cvs version, I will happily test it...

Comment 5 Maurice van der Pot (RETIRED) gentoo-dev

2006-01-23 12:28:42 UTC

Alexander, I'm not comfortable with adding -fstrict-aliasing, because 
there can be code that works without it, but not with it. And just because
it doesn't crash, doesn't mean it does the right thing.

Security people, I could really use some insight into this problem.

Comment 6 Alexander Stoll 2006-02-08 05:12:05 UTC

(In reply to comment #5)
> Alexander, I'm not comfortable with adding -fstrict-aliasing, because 
> there can be code that works without it, but not with it. And just because
> it doesn't crash, doesn't mean it does the right thing.

This is totally understood, I agree with you that this is the last and worst option that should be considered to keep the package working...
I would be sleeping better if the cause of this bug is understood - and of course - fixed.
I don

Comment 7 Alexander Stoll 2006-02-08 05:12:05 UTC

(In reply to comment #5)
> Alexander, I'm not comfortable with adding -fstrict-aliasing, because 
> there can be code that works without it, but not with it. And just because
> it doesn't crash, doesn't mean it does the right thing.

This is totally understood, I agree with you that this is the last and worst option that should be considered to keep the package working...
I would be sleeping better if the cause of this bug is understood - and of course - fixed.
I don´t want to bother or rush people, but this bug lurks around some time and I have the impression that the severity of this is underestimated. It may be not as widely deployed as apache, but it is identically exposed and people rely on pop3 service, so it deserves the same attention - my opinion.
Even when I don´t have a debug environment handy, if the only measure to fix this soon is, that I provide a full backtrace, I will do the escessary work - and stop complaining - until it is available to the security herd... ;-)
I hope my points are considered with benevolence, I don´t want to bug people...

Security, please advice.

Comment 8 solar (RETIRED) gentoo-dev

2006-02-08 10:29:50 UTC

connection.c:102:Bounds error: attempt to reference memory overrunning the end of an object.
connection.c:102:  Pointer value: 0x5d619f90, Size: 16
connection.c:102:  Object `sinlocal':
connection.c:102:    Address in memory:    0x5d619f80 .. 0x5d619f8f
connection.c:102:    Size:                 16 bytes
connection.c:102:    Element size:         1 bytes
connection.c:102:    Number of elements:   16
connection.c:102:    Created at:           netloop.c, line 125
connection.c:102:    Storage class:        stack
quit: signal 6 post_fork = 0
Aborted

Comment 9 solar (RETIRED) gentoo-dev

2006-02-08 11:17:33 UTC

Thats forcing -fno-striact-aliasing and with -fstack-protector -fbounds-checking 
enabled. Now with -fno-stack-protector -fno-strict-aliasing -fbounds-checking I 
cant make comment #7 happen again. So this almost appears to be a bug in ssp. 
Adding kevin quinn to the CC: list to ask him to take a look also.

Comment 10 Maurice van der Pot (RETIRED) gentoo-dev

2006-02-08 12:41:11 UTC

To quickly test, put these two lines in a tpop3d.conf:

  listen-address: 0.0.0.0:12345
  auth-pam-enable: yes

and start tpop3d with:

  ./tpop3d -f /path/to/tpop3d.conf -d

Then connect to it:

  telnet localhost 12345

Comment 11 Kevin F. Quinn (RETIRED) gentoo-dev

2006-02-08 16:12:14 UTC

Hmm; definitely looks like a bug in SSP.

First here's the preprocessed source at the start of connection_new:


connection connection_new(int s,
                          const struct sockaddr_in *sin,
                          listener L) {
    int n;
    connection c = ((void *)0);

    do { struct _connection as__z = {0};
         c = xmalloc(sizeof *c);
         *c = as__z;
       } while (0);
    ...

Note the static initialiser '= {0}' is built by the compiler as a call to 'memset' on a block declared locally, that is then memcpy'd to the xmalloc'd pointer.  Here's the assembler compiled with fno-stack-protector:


connection_new:
    pushl   %ebp
    movl    $128, %eax
    movl    %esp, %ebp
    pushl   %edi
    pushl   %esi
    pushl   %ebx
    subl    $588, %esp
    movl    12(%ebp), %ebx
    movl    8(%ebp), %edi
    movl    %eax, 8(%esp)
    xorl    %eax, %eax
    movl    %eax, 4(%esp)
    leal    -152(%ebp), %eax
    movl    %eax, (%esp)
    call    memset

As we can see, the a frame of 588 bytes is setup, which includes the a stack buffer at 152 bytes below the frame pointer (ebp - stack pointer on entry).  This is quite sufficient for the 128-byte _connection structure allocated with macro alloc_struct.

However if we look at the output when built with -fstack-protector we get:


connection_new:
    pushl   %ebp
    movl    %esp, %ebp
    leal    -56(%ebp), %edx
    pushl   %edi
    pushl   %esi
    pushl   %ebx
    subl    $604, %esp
    movl    __guard, %eax
    movl    %edx, (%esp)
    movl    12(%ebp), %ebx
    movl    8(%ebp), %edi
    movl    %eax, -40(%ebp)
    movl    16(%ebp), %eax
    movl    %eax, -584(%ebp)
    movl    $128, %eax
    movl    %eax, 8(%esp)
    xorl    %eax, %eax
    movl    %eax, 4(%esp)
    call    memset


Things are a little more involved, but importantly the address passed to the memset call is only 56 bytes less than the frame pointer (see the leal instruction to edx), so when memset blats it with 128 (0x80) bytes of zeros, the guard gets zeroed along with a bunch of other stuff.

This is clearly wrong.  It seems the 3.x stack protector messes up references to buffers declared in local blocks while trying to reorder frame data to put buffers after scalars.  Fixing the stack protector is another matter of course - but anything that can eliminate the buffers declared in local blocks may work.

It's easy to patch the source to fix the problem and I'll attach a patch shortly that avoids the buffer in the local block, however this is a serious shortcoming of the 3.x stack protector.  Incidentally, declaring stuff in local blocks is an extension to standard C so is probably relatively rarely used for C code, however it is standard C++.  Perhaps this explains why we find SSP so unreliable on C++ code, but have relatively few problems with C code.

I tried the original code with a 4.1.0 beta, which has a completely different stack-protector implementation, and this created code that is correct. However notably the 4.1.0 beta didn't reorder the data on the stack, which makes it weaker than the 3.x SSP implementation. So although the 4.1.0 stack protector generates working code, it's just not doing the part of the 3.x stack protector functionality that causes the problem in 3.x.

Comment 12 Kevin F. Quinn (RETIRED) gentoo-dev

2006-02-08 16:19:41 UTC

Created attachment 79288 [details, diff]
Removes the local blocks declared with the alloc_struct macro, thus avoiding a gcc-3.x SSP compiler bug

This seems to work on the limited testing I've done here.

It also has added benefits; it reduces overall stack usage (for what that's worth :) ) and saves a block copy (memcpy) whenever the alloc_struct macro is used.

Fixing the GCC-3.x stack protector implementation itself is left as an exercise for the reader ;)

Comment 13 Kevin F. Quinn (RETIRED) gentoo-dev

2006-02-11 04:25:40 UTC

reassigned - it's not security problem, just a bug in the compiler.

Maurice - up to you how to proceed; you could apply the patch, or you could filter fstack-protector.

It might be worth seeing what upstream think of the patch.  Not for the problem with SSP, but for code efficiency.  Their code is possibly more portable (but not much - memset is standard in C90 at least), but the patched code is quicker and uses less stack.

Comment 14 Maurice van der Pot (RETIRED) gentoo-dev

2006-02-13 13:12:22 UTC

tpop3d's Chris Lightfoot wrote:

  Your implementation of alloc_struct is not valid. It                                                                                                                                                                                      
  assumes that all-bits-zero is a valid representation of                                                                                                                                                                                   
  any element of a struct on any architecture, and that that                                                                                                                                                                                
  is the same representation as would be yielded by the =                                                                                                                                                                                   
  {0} declaration. There is no such guarantee in the C or                                                                                                                                                                                   
  POSIX standards, and while using memset(..., 0, ...) may                                                                                                                                                                                  
  work on your architecture it is not portable and therefore                                                                                                                                                                                
  best avoided unless there is some compelling reason to use                                                                                                                                                                                
  it.                                                                                                                                                                                                                                       
                                                                                                                                                                                                                                          
Kevin/hardened, what timeframe would we be talking about regarding a fix for
the bug in SSP? I'll disable stack protector for tpop3d for now.

Comment 15 Kevin F. Quinn (RETIRED) gentoo-dev

2006-02-14 00:10:05 UTC

I do assume the struct is contiguous, and had no idea '= {0}' could generate non-zero data.

> Kevin/hardened, what timeframe would we be talking about regarding a fix for
> the bug in SSP? I'll disable stack protector for tpop3d for now.

Well, upstream 3.x SSP author isn't responding to me so I can't say if we'll have any luck there.  I've been learning gcc internals so hope to be able to fix it eventually, but it's a lot to learn.

In other words, don't hold your breath ;)  I'll raise a separate bug to keep the SSP problem itself in mind so you can close this by filtering stack-protector; when you do so please add a comment referring to this bug so we know why the filtering is present.

Comment 16 Kevin F. Quinn (RETIRED) gentoo-dev

2006-02-14 02:17:59 UTC

Quick follow-up on the '={0}' thing; I had a look through the C90 standard to see how it could generate non-zero data, and it can occur on platforms where the null pointer is non-zero.  The '0' in the initialiser is interpreted as a "null pointer constant" (always zero) for any pointer elements of the structure.  The null pointer constant is converted into a null pointer - however the standard does not define the representation of a null pointer, leaving it to the implementation to choose a representation that it considers most sensible for the architecture.

Relevant bits of ISO 9899-1999 are sections 6.7.8 paras 10 and 21, and for definition of 'null pointer constant see section 6.3.2.3 para 3.

Comment 17 Maurice van der Pot (RETIRED) gentoo-dev

2006-02-14 09:43:06 UTC

You can use this bug for SSP, because this report really is about that issue. 
Once this one is closed, I can remove filter-flags again from tpop3d.

Comment 18 Kelly Price 2007-02-01 01:23:24 UTC

Has anyone tested this with GCC 4.1.1?

Comment 19 Kevin F. Quinn (RETIRED) gentoo-dev

2007-03-05 19:18:31 UTC

For info - it looks fine when built with 4.1.2.  It does re-order parameters (contrary to what I said before) so all in all it's a much better SSP implementation :)


vapier - are you happy to have gcc-major-version() used to control the filtering according to compiler version, e.g.:

[[ $(gcc-major-version) -eq 3 ]] && filter-flags -fstack-protector

?

Comment 20 SpanKY gentoo-dev

2007-03-06 19:15:37 UTC

since we're looking at deprecating gcc-3.x once gcc-4.x hardened is in place, that sounds fine to me

Comment 21 solar (RETIRED) gentoo-dev

2007-03-06 22:24:15 UTC

(In reply to comment #20)
> since we're looking at deprecating gcc-3.x once gcc-4.x hardened is in place,
> that sounds fine to me

As far as I know qemu can't be built under gcc-4.x without patching. Patches which are not in place at gentoo. It would be a shame to lose the qemu package.

Comment 22 SpanKY gentoo-dev

2007-03-07 01:33:11 UTC

what does that have to do with anything ?

gcc-3.x is deprecated/not-supported now for non-hardened systems, it doesnt mean i'm removing them from the tree

Comment 23 solar (RETIRED) gentoo-dev

2007-03-07 02:42:08 UTC

(In reply to comment #22)
> it doesnt mean i'm removing them from the tree

Ignore the last comment then.

Comment 24 Alex Xu (Hello71) 2014-02-09 16:47:24 UTC

This hasn't been touched in years, is it OBSOLETE?

Comment 25 Magnus Granberg gentoo-dev

2014-02-09 22:17:44 UTC

Reopen if is still fail with gcc 4.X for we don't support gcc 3.x on hardened any longer.

Comment 26 Larry the Git Cow gentoo-dev

2025-01-28 07:22:10 UTC

The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=97a888d8fd580498c116c643ad10b6fe937abdb9

commit 97a888d8fd580498c116c643ad10b6fe937abdb9
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2025-01-28 07:14:36 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2025-01-28 07:15:35 +0000

    net-mail/tpop3d: drop filter-flags -fno-stack-protector
    
    This was a bug in some ancient GCC when SSP was still brand-new and
    was confirmed to work w/ newer GCC (4.1.2) in https://bugs.gentoo.org/115285#c19
    in 2007.
    
    Bug: https://bugs.gentoo.org/115285
    Signed-off-by: Sam James <sam@gentoo.org>

 net-mail/tpop3d/tpop3d-1.5.5-r5.ebuild | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)