
Bug 517028

Summary: www-servers/nginx-1.7.2 on ABI_X86=x32 - if compiled with -O2, crash when closing TCP connection.
Product: Gentoo Linux
Reporter: Thibaud CANALE <thican>
Component: [OLD] Server
Assignee: Tiziano Müller (RETIRED) <dev-zero>
Status: RESOLVED TEST-REQUEST
Severity: normal
CC: bugs, proxy-maint
Priority: Normal
Version: unspecified
Hardware: All
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:
Bug Blocks: 393673
Attachments: emerge --info
gdb /usr/sbin/nginx /tmp/cores/core
build.log with -O2
build.log with -O0

Description Thibaud CANALE 2014-07-13 15:16:54 UTC
Created attachment 380680 [details]
emerge --info

Hello,

When running www-servers/nginx-1.7.2 on the x32 ABI, the process crashes after a TCP segment with the FIN flag set is seen following a GET request (this behaviour also occurred in version 1.5.13, and in version 1.7.3 from nginx-overlay).

I don't know why it occurs, but I am sure the problem is related to this TCP FIN flag; here is how I found it and how to reproduce it:

Steps to Reproduce:
1. Emerge nginx, write a basic configuration file, and create a simple HTML index page.
2. Start a network analyser (wireshark, tcpdump).
3. Do a GET of this web page (LAN and/or WAN).
4. Wait until a TCP segment with the FIN flag set appears.
5. Refresh the web page.
6. The web page is not displayed; nginx already crashed when the FIN flag was seen.

Other method:
4. Request a full refresh (with Ctrl+Shift+R in Firefox).
5. The web page is not displayed: the full refresh opened a new connection, and therefore a TCP segment with the FIN flag set was sent.

Reproducibility: on x32, always.

The problem with this bug is that no message is logged, and I am unable to run nginx in the foreground.

Thanks for your support.

PS: this may be related to bug 509978.
Comment 1 Tiziano Müller (RETIRED) gentoo-dev 2014-07-14 22:10:39 UTC
Please make sure that you have debug symbols installed for nginx (with FEATURES="splitdebug compressdebug"), then enable core dumps (using `ulimit -c unlimited`), restart nginx (which should pick up the new ulimit; check with `cat /proc/$(pidof nginx)/limits`) and check, before it crashes, where the core dump will go (`ls -l /proc/$(pidof nginx)/cwd`).

After that, use `gdb /usr/sbin/nginx /your-core-dump-file` to load the core dump, and generate a backtrace using the `bt` command within gdb.

Post the result here.
Comment 2 Thibaud CANALE 2014-07-15 11:22:49 UTC
Hello Tiziano,

Sorry, I am sure I am missing a step, but I didn't find any core dump where it should be (the /proc/.../cwd symlink points to /).

As you said, I did:
1. FEATURES="splitdebug compressdebug" emerge nginx
2. ulimit -c unlimited
3. restarting nginx, with rc-service or directly
4. checking the symlink /proc/.../cwd
5. crashing it.
6. nothing found where it should be.

Thanks for your support.
Comment 3 Thibaud CANALE 2014-07-15 12:17:02 UTC
I found this document on the wiki; I will keep you updated:
https://wiki.gentoo.org/wiki/Project:Hardened/Debugging
Comment 4 Thibaud CANALE 2014-07-15 13:58:13 UTC
Created attachment 380744 [details]
gdb /usr/sbin/nginx /tmp/cores/core

Ok, finally, here I am.

Because /proc/<pid>/cwd is a symlink to /, which is owned by root:root with mode 755, and the process runs as nginx:www-data, I needed to create a directory to store the core dumps and redefine /proc/sys/kernel/core_pattern with sysctl:

# mkdir /tmp/cores/
# chown root:root /tmp/cores/
# chmod 1773 /tmp/cores/
# sysctl kernel.core_pattern
kernel.core_pattern = core
# sysctl -w kernel.core_pattern="/tmp/cores/core"
kernel.core_pattern = /tmp/cores/core
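
To illustrate why the core file did not show up at first, here is a minimal sketch of how the kernel picks the destination from the process's working directory and kernel.core_pattern. `core_dest` is a hypothetical helper name, and the sketch ignores the `%` specifiers that core_pattern may contain. The file is still only written if the crashing process is allowed to write to that location.

```shell
#!/bin/sh
# core_dest CWD PATTERN
# Print where a core dump would be written, given the crashing process's
# working directory and the value of kernel.core_pattern.
# (hypothetical helper; '%' specifiers in the pattern are not expanded)
core_dest() {
    cwd=$1
    pattern=$2
    case $pattern in
        "|"*) echo "piped to: ${pattern#"|"}" ;;  # handed to a helper program
        /*)   echo "$pattern" ;;                  # absolute path, used as-is
        *)    echo "${cwd%/}/$pattern" ;;         # relative to the working directory
    esac
}

# The two situations from this comment:
core_dest / core               # default pattern, cwd is /
core_dest / /tmp/cores/core    # after the sysctl change
```

With the default pattern and a working directory of /, the dump would have to be created directly in /, which the unprivileged nginx process cannot write to.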

Another detail: when I was only doing `ulimit -c unlimited`, the soft limit was still 0:
# cat /proc/<pid>/limits
Limit                     Soft Limit           Hard Limit           Units
../..
Max core file size        0                    unlimited            bytes

So it seems we should use `ulimit -Sc unlimited`. And now it works :-)

(Note: I followed https://wiki.gentoo.org/wiki/Project:Hardened/Debugging but I was unable to resolve the "??" in frame #0)

PS: I didn't find any library named linux-vdso.so.1 related to the "warning: Could not load shared library symbols for linux-vdso.so.1." message (cf. attachment).
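
A small sketch for checking both values at once, assuming a file in the /proc/<pid>/limits format shown above. `core_limits` is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# core_limits FILE
# Print the soft and hard "Max core file size" limits from a file in the
# /proc/<pid>/limits format. (hypothetical helper name)
core_limits() {
    # Columns: "Max core file size" (4 words), soft limit, hard limit, units.
    awk '/^Max core file size/ { print "soft=" $5, "hard=" $6 }' "$1"
}

# Example on Linux, for the current shell itself:
# core_limits /proc/self/limits
```

A soft limit of 0 means no core file is written, even when the hard limit is unlimited.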
Comment 5 Tiziano Müller (RETIRED) gentoo-dev 2014-07-16 07:57:37 UTC
hmm, I don't see any obvious problems with the code at the positions given by the backtrace, but then I'm not the x32 specialist.

Do you mind rebuilding nginx without the optimization flags (no "-O2" and the like) so that we get a view of the values __len=<optimized out>, __src=<optimized out>, __dest=<optimized out> passed to memcpy?

Can you possibly also attach the full build log? Maybe the compiler has some warnings for us.

The problem is that I don't have access to an x32 system.

If I'm still clueless after this I will have to open a bug report upstream.
Comment 6 Thibaud CANALE 2014-07-18 11:23:33 UTC
Oh!

Extraordinary: I found that with -O0 the process no longer crashes. I was using -O2, as for every package on my system; only nginx was crashing with it.

I am sending build logs as attachments.

Thanks for your support.
Comment 7 Thibaud CANALE 2014-07-18 11:36:13 UTC
Created attachment 380954 [details]
build.log with -O2

CFLAGS="-march=corei7-avx -O2 -pipe -ggdb"
Comment 8 Thibaud CANALE 2014-07-18 11:41:08 UTC
Created attachment 380956 [details]
build.log with -O0

As you can see in the build.log, many lines contain two "-Ox" options:
x86_64-pc-linux-gnux32-gcc -c -march=corei7-avx -O2 -pipe -ggdb -O0 ...

This is because I used /etc/portage/package.env and a related env file, with these settings:
in /etc/portage/make.conf: CFLAGS="-march=corei7-avx -O2 -pipe -ggdb"
in /etc/portage/env/: CFLAGS="${CFLAGS} -O0"
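
For anyone reproducing this, a minimal sketch of such a per-package override. The file name `nginx-O0.conf` is hypothetical (the comment does not name the file), and `ROOT_DIR` defaults to a scratch directory so the sketch does not touch a real /etc/portage. When several -O options are given, GCC uses the last one, which is why the appended -O0 wins over the -O2 from make.conf.

```shell
#!/bin/sh
# Sketch: write a per-package CFLAGS override under a Portage config root.
# ROOT_DIR defaults to a scratch directory so nothing real is modified;
# "nginx-O0.conf" is a hypothetical file name.
ROOT_DIR=${ROOT_DIR:-$(mktemp -d)}

mkdir -p "$ROOT_DIR/etc/portage/env"

# Single quotes keep ${CFLAGS} literal in the file, so the flags from
# make.conf are kept and -O0 is appended after them.
printf '%s\n' 'CFLAGS="${CFLAGS} -O0"' > "$ROOT_DIR/etc/portage/env/nginx-O0.conf"

# Apply that env file to the nginx package only.
printf '%s\n' 'www-servers/nginx nginx-O0.conf' >> "$ROOT_DIR/etc/portage/package.env"

echo "wrote files under $ROOT_DIR"
```

On a real system the two files would live under /etc/portage, and nginx would then need to be re-emerged for the change to take effect.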
Comment 9 Tiziano Müller (RETIRED) gentoo-dev 2014-08-06 06:16:36 UTC
TBH, at the moment it seems to me like gcc is doing something strange here. Maybe the x32-porting people have some more insight.
You could also try to set CFLAGS to "-O2" only (without the march) and then play around with the optimization flags (not only -Ox but the ones enabled by -O1/-O2) to see which one is causing the issue.
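
One possible way to "play around" with the individual flags, as a sketch: GCC can list the optimizer flags it enables at a given level with `-Q --help=optimizers`, and the two lists can be compared. `flags_added` is a hypothetical helper name. Note that not every optimization is controlled by its own flag, so this may not isolate the cause.

```shell
#!/bin/sh
# flags_added FILE1 FILE2
# List optimizer flags marked [enabled] in FILE2 but not in FILE1, where both
# files hold output of "gcc -Ox -Q --help=optimizers". (hypothetical helper)
flags_added() {
    a=$(mktemp)
    b=$(mktemp)
    awk '$2 == "[enabled]" { print $1 }' "$1" | sort > "$a"
    awk '$2 == "[enabled]" { print $1 }' "$2" | sort > "$b"
    comm -13 "$a" "$b"    # lines that appear only in the second list
    rm -f "$a" "$b"
}

# Typical use (requires gcc):
# gcc -O1 -Q --help=optimizers > O1.txt
# gcc -O2 -Q --help=optimizers > O2.txt
# flags_added O1.txt O2.txt
```

Each flag in the resulting list can then be tried in turn, for example by building with -O2 plus the matching -fno-... option, until the crash disappears.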
Comment 10 Thibaud CANALE 2014-08-08 15:10:25 UTC
Hello,

(In reply to Tiziano Müller from comment #9)
> TBH, at the moment it seems to me like gcc is doing something strange here.
> Maybe the x32-porting people have some more insight.
> You could also try to set CFLAGS to "-O2" only (without the march) and then
> play around with the optimization flags (not only -Ox but the ones enabled
> by -O1/-O2) to see which one is causing the issue.

So, as requested, I set CFLAGS without -march (but still with "-pipe -ggdb"), and I get the same error and the same backtrace.
Unfortunately, I don't understand your request to "play around with the optimization flags (not only -Ox but the ones enabled by -O1/-O2)".

One more piece of information: I "succeeded" in triggering another segmentation fault simply by running /usr/sbin/nginx without the right permissions on /var/log/nginx/ (for example as an unprivileged user).

Here is the backtrace for this error (using -O2):
#0  0xf64c1650 in ?? () from /libx32/libc.so.6
#1  0x56575914 in memcpy (__len=2027, __src=<optimized out>, 
    __dest=<optimized out>) at /usr/include/bits/string3.h:51
#2  ngx_vslprintf (buf=<optimized out>, last=<optimized out>, 
    fmt=0x56620a36 "V] ", args=args@entry=0xffffcc6c)
    at src/core/ngx_string.c:238
#3  0x56575cba in ngx_slprintf (buf=<optimized out>, 
    last=last@entry=0xffffd560 "0", fmt=fmt@entry=0x56620a33 " [%V] ")
    at src/core/ngx_string.c:139
#4  0x56571237 in ngx_log_error_core (level=level@entry=1, 
    log=<optimized out>, err=0, fmt=fmt@entry=0x5662c4fe "%*s")
    at src/core/ngx_log.c:104
#5  0x56581cf0 in ngx_conf_log_error (level=level@entry=1, 
    cf=cf@entry=0xffffdd80, err=13, 
    fmt=fmt@entry=0x56620fdc "open() \"%s\" failed")
    at src/core/ngx_conf_file.c:926
#6  0x56582550 in ngx_conf_parse (cf=cf@entry=0xffffdd80, 
    filename=filename@entry=0x56870df8) at src/core/ngx_conf_file.c:125
#7  0x56580092 in ngx_init_cycle (old_cycle=<optimized out>)
    at src/core/ngx_cycle.c:264
#8  0x5656fc76 in main (argc=<optimized out>, argv=<optimized out>)
    at src/core/nginx.c:333

As you can see, frames #4 to #0 are the same in the previous backtrace (attachment 380744 [details]) and in this one.

Thanks again for your support.
Comment 11 Johan Bergström 2015-05-04 01:45:59 UTC
@Thibaud: are you able to reproduce on newer versions?
Comment 12 Manuel Rüger (RETIRED) gentoo-dev 2016-02-06 14:00:34 UTC
Please report back if it's still an issue.