
Bug 672040

Summary: io_submit syscall oops on alpha linux 4.19.0+
Product: Gentoo Linux Reporter: Dmitry V. Levin <gentoo.dl>
Component: Current packages Assignee: Alpha Porters <alpha>
Status: RESOLVED FIXED    
Severity: normal CC: slyfox
Priority: Normal    
Version: unspecified   
Hardware: Alpha   
OS: Linux   
URL: https://lkml.org/lkml/2018/12/30/141
Whiteboard:
Package list:
Runtime testing required: ---
Attachments: 0001-alpha-fix-page-fault-handling-for-r16-r18-targets.patch

Description Dmitry V. Levin 2018-11-27 06:00:57 UTC
Hello, strace upstream is speaking. :)

The strace test suite started to hang on aio.test around 4.18 kernels due to a kernel oops in io_submit.

Here is a stripped down test:

$ cat aio.c
#include <err.h>
#include <unistd.h>
#include <sys/mman.h>
#include <asm/unistd.h>
int main(void)
{
	unsigned long ctx = 0;
	if (syscall(__NR_io_setup, 1, &ctx))
		err(1, "io_setup");
	const size_t page_size = sysconf(_SC_PAGESIZE);
	const size_t size = page_size * 2;
	void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (MAP_FAILED == ptr)
		err(1, "mmap(%zu)", size);
	if (munmap(ptr, size))
		err(1, "munmap");
	syscall(__NR_io_submit, ctx, 1, ptr + page_size);
	syscall(__NR_io_destroy, ctx);
	return 0;
}
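
A variant of the test above that also checks the outcome may be handy when verifying a kernel. When the pointer passed as the third argument is unmapped, io_submit is expected to fail with EFAULT rather than oops. This is only a sketch: the function name check_io_submit_efault and its return convention are made up for illustration and are not part of the strace test suite.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/syscall.h>

/*
 * Hypothetical helper (not from strace).
 * Returns 0 if io_submit() rejected the unmapped pointer with EFAULT,
 * 1 if it behaved differently, and -1 if the check could not be set up
 * (for example, the AIO syscalls are unavailable in this environment).
 */
int check_io_submit_efault(void)
{
	unsigned long ctx = 0;
	if (syscall(SYS_io_setup, 1, &ctx))
		return -1;

	const long page_size = sysconf(_SC_PAGESIZE);
	if (page_size <= 0) {
		syscall(SYS_io_destroy, ctx);
		return -1;
	}

	/* Map two pages and unmap them again to obtain an invalid address. */
	const size_t size = (size_t) page_size * 2;
	char *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (ptr == MAP_FAILED || munmap(ptr, size)) {
		syscall(SYS_io_destroy, ctx);
		return -1;
	}

	/* ptr + page_size now points into the unmapped region. */
	errno = 0;
	long rc = syscall(SYS_io_submit, ctx, 1L, ptr + page_size);
	int saved_errno = errno;
	syscall(SYS_io_destroy, ctx);

	return (rc == -1 && saved_errno == EFAULT) ? 0 : 1;
}
```

Calling check_io_submit_efault() from main() and treating a return value of 1 as a failure gives a pass/fail check. On an affected kernel the call is expected to oops inside io_submit instead of returning.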

Here is the relevant part of dmesg:
Unable to handle kernel paging request at virtual address ffffffffffff9468
CPU 3 
aio(26027): Oops 0
pc = [<fffffc00004eddf8>]  ra = [<fffffc00004edd5c>]  ps = 0000    Not tainted
pc is at sys_io_submit+0x108/0x200
ra is at sys_io_submit+0x6c/0x200
v0 = fffffc00c58e6300  t0 = fffffffffffffff2  t1 = 000002000025e000
t2 = fffffc01f159fef8  t3 = fffffc0001009640  t4 = fffffc0000e0f6e0
t5 = 0000020001002e9e  t6 = 4c41564e49452031  t7 = fffffc01f159c000
s0 = 0000000000000002  s1 = 000002000025e000  s2 = 0000000000000000
s3 = 0000000000000000  s4 = 0000000000000000  s5 = fffffffffffffff2
s6 = fffffc00c58e6300
a0 = fffffc00c58e6300  a1 = 0000000000000000  a2 = 000002000025e000
a3 = 00000200001ac260  a4 = 00000200001ac1e8  a5 = 0000000000000001
t8 = 0000000000000008  t9 = 000000011f8bce30  t10= 00000200001ac440
t11= 0000000000000000  pv = fffffc00006fd320  at = 0000000000000000
gp = 0000000000000000  sp = 00000000265fd174
Disabling lock debugging due to kernel taint
Trace:
[<fffffc0000311404>] entSys+0xa4/0xc0

Feel free to forward upstream if it's not a Gentoo bug.

Reproducible: Always




The host is monolith.alpha.dev.gentoo.org.
$ uname -a
Linux monolith 4.19.0+ #48 SMP Thu Oct 25 16:26:53 CEST 2018 alpha EV68AL Tsunami GNU/Linux
$ emerge --info
Portage 2.3.52 (python 3.6.6-final-0, default/linux/alpha/17.0, gcc-8.2.0, glibc-2.28-r2, 4.19.0+ alpha)
=================================================================
System uname: Linux-4.19.0+-alpha-EV68AL-with-gentoo-2.6
KiB Mem:     8300472 total,    315976 free
KiB Swap:     977912 total,    976264 free
Timestamp of repository gentoo: Mon, 26 Nov 2018 22:44:55 +0000
Head commit of repository gentoo: 605a36df0de7c862cd21665983d0e21460a748a3

sh bash 4.4_p23
ld GNU ld (Gentoo 2.31.1 p3) 2.31.1
ccache version 3.5 [disabled]
app-shells/bash:          4.4_p23::gentoo
dev-lang/perl:            5.26.2::gentoo
dev-lang/python:          2.7.15::gentoo, 3.6.6::gentoo
dev-util/ccache:          3.5-r1::gentoo
dev-util/cmake:           3.13.0::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.6-r1::gentoo
sys-apps/openrc:          0.39.2::gentoo
sys-apps/sandbox:         2.13::gentoo
sys-devel/autoconf:       2.69-r4::gentoo
sys-devel/automake:       1.16.1-r1::gentoo
sys-devel/binutils:       2.31.1-r1::gentoo
sys-devel/gcc:            8.2.0-r4::gentoo
sys-devel/gcc-config:     2.0::gentoo
sys-devel/libtool:        2.4.6-r5::gentoo
sys-devel/make:           4.2.1-r4::gentoo
sys-kernel/linux-headers: 4.19::gentoo (virtual/os-headers)
sys-libs/glibc:           2.28-r2::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: git
    sync-uri: https://github.com/gentoo-mirror/gentoo
    priority: -1000

local
    location: /usr/local/portage
    masters: gentoo
    priority: 0

ACCEPT_KEYWORDS="alpha ~alpha"
ACCEPT_LICENSE="* -@EULA"
CBUILD="alpha-unknown-linux-gnu"
CFLAGS="-mieee -pipe -O2 -mcpu=ev67"
CHOST="alpha-unknown-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-mieee -pipe -O2 -mcpu=ev67"
DISTDIR="/space/distfiles"
ENV_UNSET="DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync multilib-strict news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LC_ALL="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j5"
PKGDIR="/space/packages/system"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/space/portage-tmp"
USE="acl alpha bash-completion berkdb bzip2 cli crypt cxx dri fortran ftp gdbm iconv ipv6 libtirpc mmap ncurses nls nptl nptlonly offensive pam pcre readline recode sharedmem sockets ssl unicode vim vim-pager xattr zlib" ALSA_CARDS="ali5451 als4000 bt87x ca0106 cmipci emu10k1 ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 maestro3 trident usb-audio via82xx ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon plan sheets stage words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="keyboard mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-6 php7-1" POSTGRES_TARGETS="postgres9_5 postgres10" PYTHON_SINGLE_TARGET="python3_6" PYTHON_TARGETS="python2_7 python3_6" RUBY_TARGETS="ruby23" USERLAND="GNU" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LINGUAS, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 1 Sergei Trofimovich (RETIRED) gentoo-dev 2018-12-28 10:18:37 UTC
Confirmed it locally in qemu-system-alpha. Bisected down to:

commit 95af8496ac48263badf5b8dde5e06ef35aaace2b
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Sat May 26 19:43:16 2018 -0400

    aio: shift copyin of iocb into io_submit_one()
    
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

:040000 040000 20dd44ac4706540b1c1d4085e4269bd8590f4e80 05d477161223e5062f2f781b462e0222c733fe3d M      fs
Comment 2 Sergei Trofimovich (RETIRED) gentoo-dev 2018-12-28 14:32:26 UTC
Poked at it a bit more. I think this commit only exposed either a bug in gcc's code generation or a latent bug in the declared register effects of the alpha assembly macros in arch/alpha. Still trying to pinpoint exactly where things go wrong.

So far adding printk() statements after 'get_user':
    if (unlikely(get_user(user_iocb, iocbpp + i))) {
        ret = -EFAULT;
        goto err;
    }
in 'SYSCALL_DEFINE3(io_submit, ...' makes the bug disappear.

Looking more into it.
Comment 3 Sergei Trofimovich (RETIRED) gentoo-dev 2018-12-30 20:27:09 UTC
Created attachment 559046
0001-alpha-fix-page-fault-handling-for-r16-r18-targets.patch

0001-alpha-fix-page-fault-handling-for-r16-r18-targets.patch fixes the kernel crash for me in qemu-system-alpha.

Proposed the patch upstream as:
    https://lkml.org/lkml/2018/12/30/141
Comment 4 Sergei Trofimovich (RETIRED) gentoo-dev 2018-12-31 11:35:01 UTC
Tobias applied the patch on monolith and I ran 'make check' from strace git master. Machine seems to have survived.
Comment 5 Tobias Klausmann (RETIRED) gentoo-dev 2018-12-31 12:08:58 UTC
FTR, the machine always survived those calls (we don't panic on oops). However, it would leave behind a stuck-in-D aio kernel thread. These would accumulate over time and cause bogus load averages. With the patch applied, the strace test suite no longer causes stuck aio threads or oopses, so we're good.
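
For anyone wanting to watch for such leftovers: on Linux, tasks in uninterruptible sleep (state D) count towards the load average, so counting them is a quick sanity check. Below is a minimal sketch, assuming the usual /proc/<pid>/stat layout where the state letter follows the parenthesised command name. The function names are hypothetical, and only top-level /proc entries are scanned, not individual threads under /proc/<pid>/task.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the state character from one /proc/<pid>/stat line.
 * The command name is wrapped in parentheses and may itself contain
 * spaces or ')', so look for the last ')' and read the field after it.
 * Returns 0 if the line does not have the expected shape.
 */
char stat_line_state(const char *line)
{
	const char *p = strrchr(line, ')');
	if (!p || p[1] != ' ' || p[2] == '\0')
		return 0;
	return p[2];
}

/* Count tasks currently in state D; returns -1 if /proc cannot be opened. */
int count_d_state_tasks(void)
{
	DIR *proc = opendir("/proc");
	if (!proc)
		return -1;

	int count = 0;
	struct dirent *de;
	while ((de = readdir(proc)) != NULL) {
		/* Only numeric entries are process directories. */
		if (!isdigit((unsigned char) de->d_name[0]))
			continue;
		char path[280], line[512];
		snprintf(path, sizeof(path), "/proc/%s/stat", de->d_name);
		FILE *f = fopen(path, "r");
		if (!f)
			continue;	/* the task may have exited in the meantime */
		if (fgets(line, sizeof(line), f) && stat_line_state(line) == 'D')
			count++;
		fclose(f);
	}
	closedir(proc);
	return count;
}
```

A count that keeps growing across runs of the test suite would match the accumulation described above. A briefly non-zero count on its own is normal, since tasks waiting on I/O also show up in state D.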
Comment 6 Dmitry V. Levin 2018-12-31 13:51:11 UTC
The affected tests pass on monolith now, thanks!
Comment 7 Matt Turner gentoo-dev 2019-02-16 01:48:20 UTC
The patch is now in the master branch and will be in Linux 5.0.