Bug 421853 - With 16GB RAM all Kernel versions >= 3.3 and < 3.5rc1 show performance loss, exceeding a factor of 10, when writing large (2G) files compared to 3.2.16
Summary: With 16GB RAM all Kernel versions >= 3.3 and < 3.5rc1 show performance loss, ...
Status: RESOLVED UPSTREAM
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL: https://bugzilla.kernel.org/show_bug....
Whiteboard:
Keywords: PATCH
Depends on:
Blocks:
 
Reported: 2012-06-18 21:14 UTC by Norman Back
Modified: 2014-09-26 18:43 UTC (History)

See Also:
Package list:
Runtime testing required: ---


Attachments
VirtualBox logs (VBox.logs.tar.bz, 30.46 KB, application/x-bzip)
2012-06-18 21:20 UTC, Norman Back
vmware logs (vmware.logs.tar.bz, 52.66 KB, application/x-bzip)
2012-06-18 21:20 UTC, Norman Back
Kernel configs (kernel.configs.tar.bz, 24.08 KB, application/x-bzip)
2012-06-18 21:28 UTC, Norman Back
Backport small fix from 3.5-rc1 page-writeback.c (backport-global_dirty_limit-initial.patch, 562 bytes, patch)
2012-07-13 18:27 UTC, Brennan Shacklett
9ab6de8_partial_undo.patch (file_421853.txt, 966 bytes, text/plain)
2012-11-08 18:51 UTC, Tom Wijsman (TomWij) (RETIRED)

Description Norman Back 2012-06-18 21:14:02 UTC
All kernel versions >= 3.3 show a performance loss exceeding a factor of 10 when suspending VirtualBox and VMware VMs, compared to 3.2.16.

On a VM configured with 2G of memory, the 3.2.16 kernel achieves a suspend time of 11 seconds with VirtualBox and 30 seconds with VMware. With all versions >= 3.3, the suspend time is degraded by a factor > 10.

I have extracted the relevant lines from the attached logs below.
 
grep "SAVING" gentoo-sources-3.2.16-VBox.log 
00:05:55.873 Changing the VM state from 'SUSPENDED' to 'SAVING'. 
00:06:06.672 Changing the VM state from 'SAVING' to 'SUSPENDED'. 
11 seconds 
 
grep "SAVING" gentoo-sources-3.3.8-VBox.log                                                                                
00:01:33.437 Changing the VM state from 'SUSPENDED' to 'SAVING'. 
00:04:22.603 Changing the VM state from 'SAVING' to 'SUSPENDED'. 
169 seconds 
 
grep "SAVING" gentoo-sources-3.4.2-r1-VBox.log 
00:23:03.281 Changing the VM state from 'SUSPENDED' to 'SAVING'. 
00:25:09.107 Changing the VM state from 'SAVING' to 'SUSPENDED'. 
126 seconds 
 
grep "SAVING" gentoo-sources-3.4.2-r1-tickless-VBox.log 
00:15:00.492 Changing the VM state from 'SUSPENDED' to 'SAVING'. 
00:17:57.280 Changing the VM state from 'SAVING' to 'SUSPENDED' 
177 seconds 

 
VMware
egrep "vcpu-0.*Progress 0%|vcpu-0.*Progress 100%" gentoo-sources-3.2.16-vmware.log 
2012-06-17T17:05:42.417Z| vcpu-0| I120: Progress 0% (none) 
2012-06-17T17:06:12.434Z| vcpu-0| I120: Progress 100% (none) 
30 seconds 
 
egrep "vcpu-0.*Progress 0%|vcpu-0.*Progress 100%" gentoo-sources-3.3.8-vmware.log 
2012-06-17T16:03:52.644Z| vcpu-0| I120: Progress 0% (none) 
2012-06-17T16:15:00.690Z| vcpu-0| I120: Progress 100% (none) 
11 minutes 8 seconds
 
egrep "vcpu-0.*Progress 0%|vcpu-0.*Progress 100%" gentoo-sources-3.4.2-r1-vmware.log 
2012-06-18T19:33:46.340Z| vcpu-0| I120: Progress 0% (none) 
2012-06-18T19:41:59.220Z| vcpu-0| I120: Progress 100% (none) 
8 Minutes 13 seconds 
 
egrep "vcpu-0.*Progress 0%|vcpu-0.*Progress 100%" gentoo-sources-3.4.2-r1-tickless-vmware.log 
2012-06-17T18:19:22.483Z| vcpu-0| I120: Progress 0% (none) 
2012-06-17T18:35:38.001Z| vcpu-0| I120: Progress 100% (none) 
16 minutes 16 seconds 
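
As a cross-check, the durations quoted above can be recomputed from the two timestamps in each log excerpt. This is only a sketch: the helper name elapsed_seconds is made up for illustration, it expects plain HH:MM:SS.mmm values (for the VMware logs, the time portion after the "T"), and it assumes both timestamps fall on the same day.

```shell
# elapsed_seconds START END
# Hypothetical helper: prints END minus START in seconds.
# Both arguments are HH:MM:SS.mmm and are assumed to be on the same day.
elapsed_seconds() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, s, ":")
        split(b, e, ":")
        start = s[1] * 3600 + s[2] * 60 + s[3]
        end   = e[1] * 3600 + e[2] * 60 + e[3]
        printf "%.3f\n", end - start
    }'
}

# VirtualBox, gentoo-sources-3.2.16
elapsed_seconds 00:05:55.873 00:06:06.672   # prints 10.799
# VirtualBox, gentoo-sources-3.3.8
elapsed_seconds 00:01:33.437 00:04:22.603   # prints 169.166
```

Rounded to whole seconds, these give the 11 and 169 seconds reported for the two VirtualBox runs.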

Reproducible: Always

Steps to Reproduce:
1. With a kernel >= 3.3, suspend a VirtualBox or VMware VM configured with 2G memory
Actual Results:  
The suspend completes OK but takes more than 10 times as long as with kernel 3.2.16


Expected Results:  
Kernels >= 3.3 should not be slower than 3.2.16

emerge --info
Portage 2.2.0_alpha110 (default/linux/x86/10.0/desktop, gcc-4.5.3, glibc-2.14.1-r3, 3.2.16-gentoo-2 i686)
=================================================================
System uname: Linux-3.2.16-gentoo-2-i686-AMD_Phenom-tm-_II_X4_965_Processor-with-gentoo-2.1
Timestamp of tree: Mon, 18 Jun 2012 02:00:01 +0000
distcc 3.1 i686-pc-linux-gnu [disabled]
ccache version 3.1.7 [disabled]
app-shells/bash:          4.2_p20
dev-java/java-config:     2.1.11-r3
dev-lang/python:          2.6.8, 2.7.3-r2, 3.1.5, 3.2.3
dev-util/ccache:          3.1.7
dev-util/cmake:           2.8.8-r3
dev-util/pkgconfig:       0.26
sys-apps/baselayout:      2.1-r1
sys-apps/openrc:          0.9.9.3
sys-apps/sandbox:         2.5
sys-devel/autoconf:       2.13, 2.68
sys-devel/automake:       1.9.6-r3, 1.11.1
sys-devel/binutils:       2.21.1-r1
sys-devel/gcc:            4.3.6-r1, 4.4.7, 4.5.3-r2, 4.6.3
sys-devel/gcc-config:     1.7.3
sys-devel/libtool:        2.4.2
sys-devel/make:           3.82-r1
sys-kernel/linux-headers: 3.1 (virtual/os-headers)
sys-libs/glibc:           2.14.1-r3
Repositories: gentoo local ikelos
Installed sets: @system
ACCEPT_KEYWORDS="x86"
ACCEPT_LICENSE="*"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-march=k8-sse3 -mtune=amdfam10 -O2 -pipe -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /usr/share/gnupg/qualified.txt /usr/share/themes/oxygen-gtk/gtk-2.0"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.3/ext-active/ /etc/php/cgi-php5.3/ext-active/ /etc/php/cli-php5.3/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/splash /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=k8-sse3 -mtune=amdfam10 -O2 -pipe -fomit-frame-pointer"
DISTDIR="/mnt/portage.autofs/distfiles"
EMERGE_DEFAULT_OPTS="--with-bdeps=y"
FCFLAGS="-O2 -march=i686 -pipe"
FEATURES="assume-digests binpkg-logs buildpkg config-protect-if-modified distlocks ebuild-locks fixlafiles news parallel-fetch parallel-install parse-eapi-ebuild-head preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch"
FFLAGS="-O2 -march=i686 -pipe"
GENTOO_MIRRORS="http://mirrors.linuxant.fr/distfiles.gentoo.org/ http://mirror.qubenet.net/mirror/gentoo/ http://mirror.ovh.net/gentoo-distfiles/ http://gentoo.mneisen.org/ http://mirror.bytemark.co.uk/gentoo/ http://ftp.snt.utwente.nl/pub/os/linux/gentoo http://www.mirrorservice.org/sites/www.ibiblio.org/gentoo/ http://ftp.heanet.ie/pub/gentoo/ http://mirror.netcologne.de/gentoo/ http://gentoo.tiscali.nl/"
LANG="en_GB.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="en en_GB"
MAKEOPTS="-j12 --load-average=15.2"
PKGDIR="/mnt/portage.autofs/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp/portage/diamond"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage /var/lib/layman/ikelos"
SYNC="rsync://blue/gentoo-portage"
USE="3dnow X a52 aac accessibility acl acpi alsa amarok amr authdaemond bash-completion berkdb bluetooth branding bzip2 cairo caps cdda cdr cli consolekit cracklib crypt cups cxx dbus dri dts dv dvd dvdr dvdread emboss encode exif fam firefox flac fortran gdbm gif gpm gtk iconv ieee1394 ipv6 jpeg kde lcms ldap libnotify lm_sensors lock mad mmx mng modules mp3 mp4 mpeg mudflap mysql ncurses network nls nptl ofx ogg opengl openmp oss pam pango pcre pdf png policykit ppds pppd qt3 qt3support qt4 readline samba sasl sdl semantic-desktop session spell sse ssl startup-notification svg tcpd thunar tiff truetype udev udisks uk_bleb uk_rt unicode upower usb v4l v4l2 vorbis wxwidgets x264 x86 xcb xine xinerama xml xorg xulrunner xv xvid zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic auth_digest authn_anon authn_dbd authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock dbd deflate dir disk_cache env expires ext_filter file_cache filter headers ident imagemap include info log_config logio mem_cache mime mime_magic negotiation proxy proxy_ajp proxy_balancer proxy_connect proxy_http rewrite setenvif so speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" DRACUT_MODULES="lvm syslog" DVB_CARDS="usb-dib0700" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom 
oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LINGUAS="en en_GB" PHP_TARGETS="php5-3" PYTHON_TARGETS="python2_7" RUBY_TARGETS="ruby18 ruby19" SANE_BACKENDS="hp5590" USERLAND="GNU" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
Comment 1 Norman Back 2012-06-18 21:20:27 UTC
Created attachment 315719 [details]
VirtualBox logs
Comment 2 Norman Back 2012-06-18 21:20:51 UTC
Created attachment 315721 [details]
vmware logs
Comment 3 Norman Back 2012-06-18 21:23:57 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=806548 may be related, but I do not normally run a tickless system, so CONFIG_RCU_FAST_NO_HZ=y is not relevant.
Comment 4 Norman Back 2012-06-18 21:28:21 UTC
Created attachment 315723 [details]
Kernel configs
Comment 5 Norman Back 2012-06-18 22:07:50 UTC
Using:

app-emulation/virtualbox-bin-4.1.16 for all tests

app-emulation/vmware-workstation-8.0.3.703057 for all but kernel-3.4.2-r1-non-tickless which used app-emulation/vmware-workstation-8.0.4.744019-r1
Comment 6 Norman Back 2012-06-19 08:23:06 UTC
This looks like a disc write issue (/mnt/t is my VM partition):

( time dd if=/dev/zero of=/mnt/t/tmp/write-test.bin count=512K bs=4096 ) 2> 3.2.16-write-test.txt
cat 3.2.16-write-test.txt 
524288+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 35.202 s, 61.0 MB/s

real    0m35.465s
user    0m0.203s
sys     0m5.668s

( time dd if=/dev/zero of=/mnt/t/tmp/write-test.bin count=512K bs=4096 ) 2> 3.4.3-write-test.txt
cat 3.4.3-write-test.txt 
524288+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 457.979 s, 4.7 MB/s

real    7m37.985s
user    0m0.970s
sys     0m20.705s

cat /proc/mounts | grep /mnt/t
/etc/autofs/auto.vm /mnt/t autofs rw,relatime,fd=30,pgrp=5548,timeout=300,minproto=5,maxproto=5,direct 0 0
/dev/mapper/vgsda15-vmware1 /mnt/t ext4 rw,noatime,nodiratime,user_xattr,acl,commit=600,barrier=1,data=ordered 0 0
Comment 7 Norman Back 2012-06-19 14:48:21 UTC
OK, maybe it's just ext4, so let's try ntfs-3g


( time dd if=/dev/zero of=/mnt/s/Temp/write-test.bin count=512K bs=4096 ) 2> 3.2.16-ntfs-write-test.txt
cat 3.2.16-ntfs-write-test.txt
524288+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 46.5575 s, 46.1 MB/s

real    0m46.766s
user    0m0.241s
sys     0m9.967s

Looks OK. A bit slower than ext4 as expected


( time dd if=/dev/zero of=/mnt/s/Temp/write-test.bin count=512K bs=4096 ) 2> 3.4.3-ntfs-write-test.txt
norman@diamond ~/slow-virtual-suspend $ cat 3.4.3-ntfs-write-test.txt
524288+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 2745.52 s, 782 kB/s

real    45m45.708s
user    0m0.335s
sys     0m16.651s

Awful!!

So, not ext4 specific.
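
To put a number on the regression, the elapsed times that dd reports for the slow and the fast kernel can simply be divided. A minimal sketch, assuming a hypothetical helper named slowdown that is not part of any tool used in this report:

```shell
# slowdown SLOW_SECONDS FAST_SECONDS
# Hypothetical helper: prints how many times longer the slow run took.
slowdown() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

# ext4: 3.4.3 versus 3.2.16
slowdown 457.979 35.202     # prints 13.0
# ntfs-3g: 3.4.3 versus 3.2.16
slowdown 2745.52 46.5575    # prints 59.0
```

Both filesystems are well past the factor of 10 mentioned in the summary, with ntfs-3g hit considerably harder.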
Comment 8 Norman Back 2012-06-19 16:44:25 UTC
I've updated the title as I had thought this issue was virtualisation specific, but my tests above have shown that it is general.
Comment 9 Norman Back 2012-06-20 07:28:44 UTC
To test whether this is hardware or kernel config related I did a test on my zbox

uname -a
Linux zbox1 3.2.16-gentoo-2 #2 SMP PREEMPT Mon Jun 18 20:14:13 BST 2012 i686 Intel(R) Atom(TM) CPU D525 @ 1.80GHz GenuineIntel GNU/Linux

( time dd if=/dev/zero of=/mnt/mythstore/tmp/write-test.bin count=512K bs=4096 ) 2> 3.2.16-zbox1-ext4-write-test.txt
cat 3.2.16-zbox1-ext4-write-test.txt
524288+0 records in 
524288+0 records out 
2147483648 bytes (2.1 GB) copied, 27.5527 s, 77.9 MB/s 
 
real    0m27.713s 
user    0m0.332s 
sys     0m7.298s

( time dd if=/dev/zero of=/mnt/mythstore/tmp/write-test.bin count=512K bs=4096 ) 2> 3.4.3-zbox1-ext4-write-test.txt
cat 3.4.3-zbox1-ext4-write-test.txt
524288+0 records in 
524288+0 records out 
2147483648 bytes (2.1 GB) copied, 26.5223 s, 81.0 MB/s 
 
real    0m26.525s 
user    0m0.419s 
sys     0m7.556s

So the issue does not appear on my zbox. That might explain why I could not find any bugs matching this issue. I will try a 'make distclean', reconfigure the 3.4.3 kernel and retest. (Sigh)
Comment 10 Norman Back 2012-06-20 21:05:29 UTC
OK. Oh what a puzzle. I copied the failing base system to another box that has an identical motherboard and same hard drives. I performed write tests and was surprised to see normal performance with little difference between 3.2.16 & 3.4.3 kernels.

The only difference between the two boxes is some extra pci cards and Akasa AllInOne on the failing box. 

From lspci extra PCI cards
03:05.0 Communication controller: Conexant Systems, Inc. HCF 56k Data/Fax/Voice Modem (rev 08)
03:06.0 Multimedia audio controller: Creative Labs SB Audigy (rev 04)
03:06.1 Input device controller: Creative Labs SB Audigy Game Port (rev 04)
03:06.2 FireWire (IEEE 1394): Creative Labs SB Audigy FireWire Port (rev 04)

From lsusb extra USB devices
Bus 001 Device 002: ID 058f:6362 Alcor Micro Corp. Flash Card Reader/Writer

Any suggestions as to what to try next?
Comment 11 Norman Back 2012-06-21 17:14:28 UTC
Progress so far:
I tried removing the Audigy sound card, win modem and Akasa AllInOne, but it is still very slow on kernel 3.4.3.

I then looked carefully at any other differences between the failing box (diamond) and the good one (oval). Apart from possible BIOS settings, the only other difference is that diamond has 16GB RAM and oval only has 8GB.

So with not much hope I reduced diamond's RAM to 8GB and retested with kernel 3.4.3.

uname -a
Linux diamond 3.4.3-gentoo-3 #1 SMP PREEMPT Tue Jun 19 23:36:54 BST 2012 i686 AMD Phenom(tm) II X4 965 Processor AuthenticAMD GNU/Linux

( time dd if=/dev/zero of=/mnt/t/tmp/write-test.bin count=512K bs=4096 )
524288+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 28.1102 s, 76.4 MB/s

Reducing the RAM has mitigated the issue.

I will now update the subject to:
With 16GB RAM all Kernel versions >= 3.3 show performance loss, exceeding a factor of 10, when writing large (2G) files compared to 3.2.16
Comment 12 Norman Back 2012-06-21 19:58:34 UTC
Interestingly, adding mem=15G to the kernel boot parameters also seems to fix the issue.
Comment 13 Mike Pagano gentoo-dev 2012-06-22 18:47:30 UTC
This is a good amount of data. Can you please bring this upstream to http://bugzilla.kernel.org and post the link to the bug back here?
Comment 14 Norman Back 2012-06-22 21:08:46 UTC
Thanks.

For clarity, all the above tests were done with gentoo-sources kernels. 

I have now also tested with sys-kernel/vanilla-sources-3.4.3, it also fails.

I have just pulled the latest git kernel with a view to bisecting the issue and found that 3.5.0-rc3 writes at full speed, so it does not have this issue. I will continue with the bisect to locate the cause, but I am not sure of the value of posting this upstream as they already seem to have fixed it.
Comment 15 Mike Pagano gentoo-dev 2012-06-22 22:26:40 UTC
If you could identify the patch that fixes it, that would be awesome.
Comment 16 Norman Back 2012-06-24 21:08:07 UTC
I attempted "git bisect start 'v3.5-rc1' 'v3.4-rc7'" but the first try resulted in sound card code and the second something else. 

So I tried approaching from the other end with "git bisect start 'v3.4' 'v3.2-rc7'". This reliably terminated with log:

# bad: [76e10d158efb6d4516018846f60c2ab5501900bc] Linux 3.4
# good: [5f0a6e2d503896062f641639dacfe5055c2f593b] Linux 3.2-rc7
git bisect start 'v3.4' 'v3.2-rc7'
# bad: [8d3709f3dd41769338cc383bec23673fd1ce34e7] usb: host: xhci: use __ffs() instead of hardcoding shift
git bisect bad 8d3709f3dd41769338cc383bec23673fd1ce34e7
# good: [1c8106528aa6bf16b3f457de80df1cf7462a49a4] Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
git bisect good 1c8106528aa6bf16b3f457de80df1cf7462a49a4
# bad: [122804ecb59493fbb4d31b3ba9ac59faaf45276f] Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
git bisect bad 122804ecb59493fbb4d31b3ba9ac59faaf45276f
# bad: [f1db7afd917e54711798c64d78f8f5fb090f950d] mm/vmalloc.c: eliminate extra loop in pcpu_get_vm_areas error path
git bisect bad f1db7afd917e54711798c64d78f8f5fb090f950d
# bad: [541048a1d31399ccdda27346a37eae4a2ad55186] Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 541048a1d31399ccdda27346a37eae4a2ad55186
# bad: [4690dfa8cd66c37fbe99bb8cd5baa86102110776] Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze
git bisect bad 4690dfa8cd66c37fbe99bb8cd5baa86102110776
# good: [54c2c5761febcca46c8037d3a81612991e6c209a] Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
git bisect good 54c2c5761febcca46c8037d3a81612991e6c209a
# bad: [e01886ada28741d7cb2cfb3224e9caccfbc1a2d5] checkpatch: fix 'return is not a function' square bracket handling
git bisect bad e01886ada28741d7cb2cfb3224e9caccfbc1a2d5
# bad: [043bcbe5ec51e0478ef2b44acef17193e01d7f70] mm: test PageSwapBacked in lumpy reclaim
git bisect bad 043bcbe5ec51e0478ef2b44acef17193e01d7f70
# bad: [ab8fabd46f811d5153d8a0cd2fac9a0d41fb593d] mm: exclude reserved pages from dirtyable memory
git bisect bad ab8fabd46f811d5153d8a0cd2fac9a0d41fb593d
# good: [5f8aefd44e64ed2f6950a1dcc77309b7dd9979f4] mm: account reaped page cache on inode cache pruning
git bisect good 5f8aefd44e64ed2f6950a1dcc77309b7dd9979f4
# good: [c0a32fc5a2e470d0b02597b23ad79a317735253e] mm: more intensive memory corruption debugging
git bisect good c0a32fc5a2e470d0b02597b23ad79a317735253e
# good: [ad8a1b558e6c76fb53901956d3c8f29b82a4ccfa] fadvise: only initiate writeback for specified range with FADV_DONTNEED
git bisect good ad8a1b558e6c76fb53901956d3c8f29b82a4ccfa
# good: [25bd91bd27820d5971258cecd1c0e64b0e485144] vmscan: add task name to warn_scan_unevictable() messages
git bisect good 25bd91bd27820d5971258cecd1c0e64b0e485144

ab8fabd46f811d5153d8a0cd2fac9a0d41fb593d is the first bad commit
commit ab8fabd46f811d5153d8a0cd2fac9a0d41fb593d
Author: Johannes Weiner <jweiner@redhat.com>
Date:   Tue Jan 10 15:07:42 2012 -0800

    mm: exclude reserved pages from dirtyable memory

The compiled kernel at this patch commit point was extremely slow (25 minutes instead of 25 seconds).

I then reversed the patch on gentoo-sources-3.3.7 but it failed to compile. I found many other later patches rely on this patch and modify its functionality.

It rather looks like these patches are 'work in progress'. The patch functionality is still in v3.5-rc3.
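
Each bisect step above needs a kernel build, a reboot and a write test, so the good/bad verdict has to be entered by hand. A small helper can at least keep that verdict consistent from step to step. The following is only a sketch: the name classify_dd and the 120 second cut-off are assumptions for illustration, not something used in this report, and the parsing relies on the "copied, N s" part of the GNU dd summary line.

```shell
# classify_dd LIMIT_SECONDS
# Hypothetical helper: reads dd's summary on stdin and prints the verdict
# to pass to "git bisect" ("bad" if the run took longer than the limit).
classify_dd() {
    awk -v limit="$1" '
        match($0, /copied, [0-9.]+ s/) {
            # Skip the 8 characters of "copied, " and drop the trailing " s".
            secs = substr($0, RSTART + 8, RLENGTH - 10) + 0
            verdict = (secs > limit + 0) ? "bad" : "good"
            print verdict
        }'
}

echo '2147483648 bytes (2.1 GB) copied, 35.202 s, 61.0 MB/s' | classify_dd 120    # prints good
echo '2147483648 bytes (2.1 GB) copied, 457.979 s, 4.7 MB/s' | classify_dd 120    # prints bad
```

On a freshly booted test kernel this could be used as: git bisect "$(classify_dd 120 < write-test.txt)", where write-test.txt is a placeholder for the file holding dd's output.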
Comment 17 Norman Back 2012-06-25 19:16:48 UTC
Upstreamed to kernel as https://bugzilla.kernel.org/show_bug.cgi?id=43781
Comment 18 Brennan Shacklett 2012-07-12 20:46:43 UTC
Here is a thread on lkml discussing a similar issue: https://lkml.org/lkml/2012/6/13/508

The creator of the patch suggests running 'sysctl vm.highmem_is_dirtyable=1' as a workaround to gain more dirtyable memory, because the issue manifests itself in 32 bit systems with a high highmem to lowmem ratio.

Also, is CONFIG_CMA enabled in your 3.5 kernel config? If so, does 3.5 still perform well with CONFIG_CMA disabled?
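
Before trying the workaround it may help to see whether the running kernel exposes the setting at all, since it is normally only present on kernels built with highmem support. A minimal sketch, assuming the usual /proc/sys path for the sysctl; the function name show_highmem_dirtyable is made up for illustration:

```shell
# Assumed location of the sysctl under /proc/sys; it is absent on kernels
# without highmem support (for example x86_64).
knob=/proc/sys/vm/highmem_is_dirtyable

# Hypothetical helper: report the current value, or say that it is missing.
show_highmem_dirtyable() {
    if [ -r "$knob" ]; then
        echo "vm.highmem_is_dirtyable = $(cat "$knob")"
    else
        echo "vm.highmem_is_dirtyable is not available on this kernel"
    fi
}

show_highmem_dirtyable

# To apply the workaround suggested above (needs root):
#   sysctl vm.highmem_is_dirtyable=1
```

The setting does not survive a reboot unless it is also added to the system's sysctl configuration.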
Comment 19 Norman Back 2012-07-13 06:30:32 UTC
In the kernel as tested:

linux-3.5-rc3 # grep CONFIG_CMA .config
# CONFIG_CMA is not set
Comment 20 Norman Back 2012-07-13 11:00:12 UTC
Interestingly, on gentoo-sources-3.4.4, with my normal kernel param of vmalloc=256MB and the issue apparent:

VmallocTotal:     262144 kB
VmallocUsed:       59764 kB
VmallocChunk:     126164 kB
.....
HighTotal:      16001672 kB
HighFree:       13966212 kB
LowTotal:         598456 kB
LowFree:          371184 kB

If I reboot with vmalloc=192MB the issue is gone.
Comment 21 Norman Back 2012-07-13 13:56:28 UTC
Correction. With vmalloc=192MB the issue is gone immediately after reboot but returns after some minutes.
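
The figures in comment 20 show how lopsided the memory layout is on this box. A quick way to express that as a single number is the ratio of HighTotal to LowTotal. This is a sketch only: highmem_ratio is a hypothetical helper, and it expects /proc/meminfo style lines on stdin (the two fields exist only on 32 bit highmem kernels).

```shell
# highmem_ratio
# Hypothetical helper: reads /proc/meminfo style text on stdin and prints
# HighTotal divided by LowTotal, or nothing if LowTotal is missing.
highmem_ratio() {
    awk '$1 == "HighTotal:" { high = $2 }
         $1 == "LowTotal:"  { low = $2 }
         END { if (low > 0) printf "%.1f\n", high / low }'
}

# Values from comment 20 (vmalloc=256MB):
printf 'HighTotal:      16001672 kB\nLowTotal:         598456 kB\n' | highmem_ratio   # prints 26.7
```

On a live 32 bit system the same helper can be fed directly with: highmem_ratio < /proc/meminfo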
Comment 22 Brennan Shacklett 2012-07-13 18:27:17 UTC
Created attachment 318102 [details, diff]
Backport small fix from 3.5-rc1 page-writeback.c

This is a small fix to the page writeback system that made it into 3.5-rc1. Could you test to see if it improves the situation? (I tested compilation with gentoo-sources 3.4.4.)

Unfortunately there were a lot of changes to the memory management system between 3.4.4 and 3.5-rc1, so I'm having problems nailing down what fixed the regression (no test machine).
Comment 23 Norman Back 2012-07-14 13:21:02 UTC
(In reply to comment #22)
> Created attachment 318102 [details, diff]
> Backport small fix from 3.5-rc1 page-writeback.c
> 
> This is a small fix to the page writeback system that made it into 3.5-rc1,
> could you test to see if it improves the situation? (I tested compilation
> with gentoo-sources 3.4.4
> 
> Unfortunately there were a lot of changes to the memory management system
> between 3.4.4 to 3.5-rc1, so I'm having problems nailing down what fixed the
> regression (no test machine).

Applied patch to 3.4.4 and compiled OK.
Unfortunately the issue is not fixed:

( time dd if=/dev/zero of=/mnt/t/tmp/write-test.bin count=512K bs=4096 )
524288+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 213.322 s, 10.1 MB/s

real    3m33.603s
user    0m0.518s
sys     0m16.124s
Comment 24 Tom Wijsman (TomWij) (RETIRED) gentoo-dev 2012-11-07 22:41:20 UTC
1. Are you still experiencing this on recent 3.6.x kernels?

This reply mentions that 64 bit might not have this problem: 

https://lkml.org/lkml/2012/6/14/514

2. Perhaps worth considering to take a 64 bit kernel?

3. Since this worked prior to 3.3, can you find the last kernel that still worked?

This might allow you to do a git bisect to find the offending commit.

http://wiki.gentoo.org/wiki/Kernel_git-bisect
Comment 25 Norman Back 2012-11-08 15:07:08 UTC
(In reply to comment #24)
> 1. Are you still experiencing this on recent 3.6.x kernels?
> 
> This reply mentions that 64 bit might not have this problem: 
> 
> https://lkml.org/lkml/2012/6/14/514
> 
> 2. Perhaps worth considering to take a 64 bit kernel?
> 
> 3. Since this worked prior to 3.3, can you find the last kernel that still
> worked?
> 
> This might allow you to do a git bisect to find the offending commit.
> 
> http://wiki.gentoo.org/wiki/Kernel_git-bisect

1. You are correct, x86_64 does not have this issue. I have now migrated all of my 64bit capable systems to x86_64 with 3.6.x kernels. The remaining 32bit system is limited to 2G memory and so is not affected by this issue.
2. See 1 above.
3. The last kernel that worked was gentoo-sources-3.2.21

As indicated in https://bugs.gentoo.org/show_bug.cgi?id=421853#c16, I attempted to find the patch that fixed the issue but failed. During further normal use of 3.5.x 32bit kernels I very occasionally still experienced the issue when suspending VMs. This was not repeatable, but it indicated that the issue was not fixed.

I have not tried 3.6.x kernels in 32bit mode.
Comment 26 Tom Wijsman (TomWij) (RETIRED) gentoo-dev 2012-11-08 18:51:11 UTC
Created attachment 328986 [details]
9ab6de8_partial_undo.patch

This patch undoes only the change that the original patch introduces, without touching the variable that later patches might depend on, so it should in effect fix this problem.
Comment 27 Tom Wijsman (TomWij) (RETIRED) gentoo-dev 2012-11-08 18:53:42 UTC
Above patch might work, but since you switched we might not know whether it does.

Resolving this bug as NEEDINFO for now so that when someone else experiences this in the future the patch could be tried to see whether it works and tell us that.

Using 64 bit is of course also a handy solution, and isn't necessarily temporary...
Comment 28 Norman Back 2012-11-09 17:35:56 UTC
(In reply to comment #27)
> Above patch might work, but since you switched we might not know whether it
> does.
> 
> Resolving this bug as NEEDINFO for now so that when someone else experiences
> this in the future the patch could be tried to see whether it works and tell
> us that.
> 
> Using 64 bit is of course also a handy solution, and isn't necessarily
> temporary...

I found an old 32bit backup and temporarily restored it to a system with 16G memory.
I tested with sys-kernel/gentoo-sources-3.3.8-r1 and sys-kernel/gentoo-sources-3.4.11. Interestingly I needed to compile the nvidia driver and start X for the issue to present, even without the patch.

3.3.8-r1-write-test.txt
( time dd if=/dev/zero of=/mnt/e/2/tmp/write-test.bin count=512K bs=4096 )
524288+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 171.628 s, 12.5 MB/s
real    2m51.787s
user    0m0.410s
sys     0m11.581s

3.3.8-r1-patched-write-test.txt
( time dd if=/dev/zero of=/mnt/e/2/tmp/write-test.bin count=512K bs=4096 )
524288+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 22.8798 s, 93.9 MB/s
real    0m23.033s
user    0m0.188s
sys     0m4.677s

3.4.11-write-test.txt
( time dd if=/dev/zero of=/mnt/e/2/tmp/write-test.bin count=512K bs=4096 )
524288+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 306.309 s, 7.0 MB/s
real    5m6.676s
user    0m0.246s
sys     0m16.129s

3.4.11-patched-write-test.txt
( time dd if=/dev/zero of=/mnt/e/2/tmp/write-test.bin count=512K bs=4096 )
524288+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 23.8139 s, 90.2 MB/s
real    0m24.076s
user    0m0.178s
sys     0m4.197s

So the patch appears to fix the issue.
BTW, the patch was rejected, so I applied it by hand.
Comment 29 Tom Wijsman (TomWij) (RETIRED) gentoo-dev 2012-11-09 21:55:32 UTC
Mentioned that upstream, it's up to them whether they do something with it.

And also handy for others that come along this topic... Thanks for testing!
Comment 30 Mike Pagano gentoo-dev 2014-09-26 18:43:23 UTC
We'll watch this through the upstream bug.