Created attachment 440978 [details]
4.4.8-config

When access_ok (arch/x86/include/asm/uaccess.h) is given a large range of RAM to check, it takes quite a long time. For example, when booting System Rescue CD in a qemu virtual machine with >= 4GB of RAM, the guest freezes for a few seconds at "Loading kernel modules" (considerably longer if the guest RAM size is 32GB) while access_ok faults all of the virtual machine's RAM into host RAM. That must mean the size parameter isn't a multiple of PAGE_SIZE, but there could be other apps that have this problem as well.

After patching access_ok to be the same as access_ok_noprefault, there is no freeze and the host RAM usage stays low until the guest actually starts using the RAM. However, I was wondering whether it would be possible to improve the performance of the get/put operations instead, as it looks like they provide additional security checks. Or are those checks perhaps not needed for access_ok?

I had trouble finding info about what the code does. It looks like it has been in the grsecurity patch since at least hardened-patches-2.6.32-5.extras, but I can't see it in hardened-patches-2.6.28-8.extras.

My kernel build environment:

Portage 2.2.20.1 (python 3.4.3-final-0, hardened/linux/amd64, gcc-4.9.3, glibc-2.21-r1, 4.4.8-hardened-r1 x86_64)
=================================================================
System uname: Linux-4.4.8-hardened-r1-x86_64-AMD_FX-tm-4100_Quad-Core_Processor-with-gentoo-2.2
KiB Mem: 8176488 total, 66052 free
KiB Swap: 8383484 total, 8060968 free
Timestamp of repository gentoo: Tue, 12 Jul 2016 12:45:01 +0000
sh bash 4.3_p39
ld GNU ld (Gentoo 2.25.1 p1.1) 2.25.1
app-shells/bash: 4.3_p39::gentoo
dev-lang/perl: 5.20.2::gentoo
dev-lang/python: 2.7.10::gentoo, 3.4.3::gentoo
dev-util/pkgconfig: 0.28-r2::gentoo
sys-apps/baselayout: 2.2::gentoo
sys-apps/openrc: 0.17::gentoo
sys-apps/sandbox: 2.6-r1::gentoo
sys-devel/autoconf: 2.69::gentoo
sys-devel/automake: 1.14.1::gentoo, 1.15::gentoo
sys-devel/binutils: 2.25.1-r1::gentoo
sys-devel/gcc: 4.9.3::gentoo
sys-devel/gcc-config: 1.7.3::gentoo
sys-devel/libtool: 2.4.6::gentoo
sys-devel/make: 4.1-r1::gentoo
sys-kernel/linux-headers: 3.18::gentoo (virtual/os-headers)
sys-libs/glibc: 2.21-r1::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://mirror.internode.on.net/gentoo-portage
    priority: -1000

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -march=nehalem -mtune=bdver1 -fomit-frame-pointer -ftree-vectorize -fpredictive-commoning"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/gconf /etc/gentoo-release /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -march=nehalem -mtune=bdver1 -fomit-frame-pointer -ftree-vectorize -fpredictive-commoning"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs compress-build-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms split-elog split-log strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://mirror.internode.on.net/pub/gentoo"
LANG="en_AU.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
USE="acl amd64 berkdb bzip2 caps cli cracklib crypt cxx dri gdbm hardened iconv ipv6 justify mmx mmxext modules multilib ncurses nls nptl openmp pam pax_kernel pcre pie readline seccomp session sse sse2 ssl ssp tcpd unicode urandom xattr xtpax zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2 sse3 ssse3 sse4_1 sse4_2" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-5" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7 python3_4" RUBY_TARGETS="ruby20 ruby21" USERLAND="GNU" VIDEO_CARDS="amdgpu fbdev intel nouveau radeon radeonsi vesa dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset: CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, MAKEOPTS, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
(In reply to Matthew Stapleton from comment #0)
> Created attachment 440978 [details]
> 4.4.8-config
>
> When access_ok (arch/x86/include/asm/uaccess.h) has a large amount of ram to
> process, it takes quite a long time to process it. For example, when
> booting System Rescue CD in a >= 4GB ram qemu virtual machine, when it says
> "Loading kernel modules", it freezes for a few seconds (quite a few seconds
> if the guest ram size is 32GB) while access_ok faults all of the virtual
> machine's ram into host ram (which must mean the size parameter isn't a
> multiple of PAGE_SIZE but there could be other apps that have this problem
> as well).
>

can we first determine whether this is a hardened issue? can you test the equivalent vanilla kernel with the same config, minus grsecurity, and see if you get the same slow path? also, any clue in dmesg?
When I first noticed the problem, I think I started by disabling grsecurity in hardened-sources, and then looked at the guest dmesg and at System Rescue CD to see if I could figure out what the problem was (as it didn't seem to be grsecurity). dmesg suggested it might be the video driver scanning the RAM, and the initrd suggested it might be /dev/shm. The only indication on the host during the guest freeze is "watch -n 1 free" showing the amount of RAM in use slowly increasing.

Then I decided to boot System Rescue CD on gentoo-sources (my desktop). It booted with no delays on a 4GB VM and only allocated about 512MB, which is when I realised that something in the grsecurity patch is different even when grsecurity is disabled in the kernel config.

First I looked in the kvm section, which pointed to a grsecurity modification in virt/kvm/kvm_main.c: in __kvm_set_memory_region, access_ok is changed to access_ok_noprefault. When I looked at the access_ok_noprefault and access_ok functions, I realised that the vanilla access_ok has been renamed to access_ok_noprefault and extra security checks have been added to access_ok, but it looks like they only run if size is not a multiple of PAGE_SIZE.

After reverting access_ok to the vanilla version, and even with a lot of grsecurity options enabled, a 4GB VM booting the System Rescue CD guest boots up and only uses 512MB on the host, just like on gentoo-sources.
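The host-side symptom described above (memory in use climbing while nothing in the guest has used the RAM yet) can be reproduced in miniature from user space. The following is only a rough analogy, assuming the guest RAM is ordinary lazily backed anonymous memory on a Linux host; the helper names map_lazy, touch_pages and resident_pages are made up for this sketch and are not kernel or qemu functions. It uses mincore(2) to count resident pages before and after one byte per page is written.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map len bytes of anonymous memory; nothing is backed by RAM yet.
 * Returns NULL on failure. (Hypothetical helper for this sketch.) */
void *map_lazy(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Write one byte per page, which forces the kernel to back every page.
 * This stands in for what a prefaulting check does to a large range. */
void touch_pages(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = addr;

    for (size_t off = 0; off < len; off += page)
        p[off] = 1;
}

/* Count the pages of [addr, addr + len) that mincore(2) reports as
 * resident. addr must be page aligned. Returns -1 on error. */
long resident_pages(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t npages = (len + page - 1) / page;
    unsigned char *vec = malloc(npages);
    long count = 0;

    if (vec == NULL)
        return -1;
    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }
    for (size_t i = 0; i < npages; i++)
        count += vec[i] & 1;   /* bit 0 set means the page is resident */
    free(vec);
    return count;
}
```

Calling resident_pages on a fresh map_lazy region should report few or no resident pages on a typical Linux host, and calling it again after touch_pages should report all of them, which mirrors the slow rise seen in "watch -n 1 free".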
okay passing this upstream. you may want to see if the bug is still there in the latest 4.6.4-r2 = grsecurity-3.1-4.6.4-201607192040
the prefaulting mechanism was introduced in the 2.6.31 patch (around 2009.09) and it prefaults pages for ranges bigger than PAGE_SIZE (and not when the size isn't a multiple of it).

as for fixing the performance impact, we can certainly convert the culprit access_ok check to its non-prefaulting variant, but first we need to find it. can you try the following patch and see if it helps? note that it may not be the final fix as these functions seem to be called from a bunch of places and we may not want to omit prefaulting for all of them (e.g., i can add a size limit to prefaulting).

--- a/drivers/vhost/vhost.c	2016-05-22 01:55:48.127364746 +0200
+++ b/drivers/vhost/vhost.c	2016-07-21 22:51:46.920527852 +0200
@@ -597,7 +597,7 @@
 		    a + (unsigned long)log_base > ULONG_MAX)
 		return 0;
-	return access_ok(VERIFY_WRITE, log_base + a,
+	return access_ok_noprefault(VERIFY_WRITE, log_base + a,
 			 (sz + VHOST_PAGE_SIZE * 8 - 1) / VHOST_PAGE_SIZE / 8);
 }
@@ -615,7 +615,7 @@
 		unsigned long a = m->userspace_addr;
 		if (m->memory_size > ULONG_MAX)
 			return 0;
-		else if (!access_ok(VERIFY_WRITE, (void __user *)a,
+		else if (!access_ok_noprefault(VERIFY_WRITE, (void __user *)a,
 				    m->memory_size))
 			return 0;
 		else if (log_all && !log_access_ok(log_base,
It looks like that vhost patch fixed the problem. I've also added WARN_ON(size >= 67108864) to access_ok in my new kernel build. Without the vhost patch, that warning fires in the vq_memory_access_ok function.
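The sizes passed to the two access_ok calls touched by the vhost patch differ by several orders of magnitude, which fits the WARN_ON result. A small worked sketch of the arithmetic, assuming VHOST_PAGE_SIZE is 0x1000 (4 KiB); the SKETCH_ constants and the function names below are invented for illustration, and only the size expression itself is taken from the patch context:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB vhost pages (VHOST_PAGE_SIZE == 0x1000). */
#define SKETCH_VHOST_PAGE_SIZE 0x1000ULL

/* Threshold used in the WARN_ON experiment: 67108864 bytes = 64 MiB. */
#define SKETCH_WARN_THRESHOLD 67108864ULL

/* Size passed to access_ok in the first hunk: the dirty log bitmap,
 * one bit per vhost page. Same expression as in the patch context. */
uint64_t log_check_bytes(uint64_t sz)
{
    return (sz + SKETCH_VHOST_PAGE_SIZE * 8 - 1) / SKETCH_VHOST_PAGE_SIZE / 8;
}

/* Size passed to access_ok in the second hunk: the whole memory region
 * (m->memory_size), i.e. potentially all of the guest's RAM. */
uint64_t region_check_bytes(uint64_t memory_size)
{
    return memory_size;
}

/* Would WARN_ON(size >= 67108864) fire for this size? */
bool warn_would_fire(uint64_t size)
{
    return size >= SKETCH_WARN_THRESHOLD;
}
```

Under these assumptions a 32 GiB region needs only a 1 MiB log bitmap check, well below the 64 MiB threshold, while the region check covers the full 32 GiB and trips the warning.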
thanks, can you revert the previous patch and try this one then? it should omit prefaulting above a certain size (subject to tweaks if it's still a noticeable impact).

--- a/arch/x86/include/asm/uaccess.h	2016-05-24 02:26:14.570357804 +0200
+++ b/arch/x86/include/asm/uaccess.h	2016-07-21 23:57:38.964780165 +0200
@@ -99,7 +99,7 @@
 	unsigned long __size = size;					\
 	unsigned long __addr = (unsigned long)addr;			\
 	bool __ret_ao = __range_not_ok(__addr, __size, user_addr_max()) == 0;\
-	if (__ret_ao && __size) {					\
+	if (__ret_ao && __size < 256 * PAGE_SIZE) {			\
 		unsigned long __addr_ao = __addr & PAGE_MASK;		\
 		unsigned long __end_ao = __addr + __size - 1;		\
 		if (unlikely((__end_ao ^ __addr_ao) & PAGE_MASK)) {	\
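To make the effect of the size limit concrete, the condition in the patch above can be modelled in user space. This is a minimal sketch, not the kernel macro: it assumes 4 KiB pages, uses made-up SKETCH_ names and function names, and deliberately keeps the non-zero size test from the original condition, which is an assumption on my part rather than something the posted hunk shows.

```c
#include <stdbool.h>

/* Assumption: 4 KiB pages, as on x86. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))
/* Limit from the patch: 256 pages, i.e. 1 MiB with 4 KiB pages. */
#define SKETCH_PREFAULT_LIMIT (256 * SKETCH_PAGE_SIZE)

/* Same test as (__end_ao ^ __addr_ao) & PAGE_MASK in the patch context:
 * true when the first and last byte of the range sit on different pages.
 * Callers must pass size > 0. */
bool range_crosses_page(unsigned long addr, unsigned long size)
{
    unsigned long first = addr & SKETCH_PAGE_MASK;
    unsigned long last = addr + size - 1;

    return ((last ^ first) & SKETCH_PAGE_MASK) != 0;
}

/* Model of the patched condition: prefault only for ranges that are
 * non-empty (assumption, see above), below the limit, and that span
 * more than one page. */
bool would_prefault(unsigned long addr, unsigned long size)
{
    if (size == 0 || size >= SKETCH_PREFAULT_LIMIT)
        return false;
    return range_crosses_page(addr, size);
}
```

In this model a page-aligned 4096-byte range is never prefaulted, a 16-byte range straddling a page boundary is, and anything of 1 MiB or more (such as a multi-gigabyte guest memory region) skips prefaulting entirely.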
The problem doesn't occur with that new patch either.
is this fixed by now? i'd like to mark hardened-sources-4.7.6 as stable.
(In reply to Anthony Basile from comment #8)
> is this fixed by now? i'd like to mark hardened-sources-4.7.6 as stable.

yes.