Bug 773436 - sys-libs/glibc-2.32-r8: performance regression due to nss-systemd
Summary: sys-libs/glibc-2.32-r8: performance regression due to nss-systemd
Status: RESOLVED WORKSFORME
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal major
Assignee: Gentoo Toolchain Maintainers
Reported: 2021-02-28 16:55 UTC by Klaus Kusche
Modified: 2023-11-28 22:39 UTC
CC List: 9 users

Description Klaus Kusche 2021-02-28 16:55:29 UTC
After emerging glibc-2.32-r8, a "find / -xdev -nogroup -ls" on about 600000 files (all metadata cached) became slower by a factor of more than 50 (!!!):
From just 1.3 sec to 1 min 15 sec.

Other finds (e.g. a find -nouser) are not affected.

It turned out that emerging glibc "automagically" changed /etc/nsswitch.conf without asking or even informing me (proven by the file modification date):

It changed "group: files" to "group: files [SUCCESS=merge] systemd".
(no other lines were changed).

Reverting that change causes "find" to be as fast as before (1.3 sec).

Ebuilds should *never* automagically and silently change config files,
even less if the change has such serious consequences!
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-02-28 16:59:24 UTC
Let's try to remove the anger here. There's a legitimate reason for this change. Clearly, the intention was not to have serious consequences.

Now, let's get to debugging:
1) emerge --info please

2) systemd or OpenRC?

3) Which syscalls are taking longer if you run strace on the slower find command?
Comment 2 Klaus Kusche 2021-02-28 17:16:44 UTC
systemd.

Portage 3.0.16 (python 3.8.8-final-0, default/linux/amd64/17.1/no-multilib, gcc-10.2.0, glibc-2.32-r8, 5.11.2-gentoo x86_64)
=================================================================
System uname: Linux-5.11.2-gentoo-x86_64-Intel-R-_Core-TM-_i9-9980HK_CPU_@_2.40GHz-with-glibc2.2.5
KiB Mem:    49165524 total,  45661312 free
KiB Swap:   67108860 total,  67108860 free
Timestamp of repository gentoo: Sun, 28 Feb 2021 06:45:01 +0000
Head commit of repository gentoo: e9a076201762dfbbcffa27b1d685b7b286f6506b
Head commit of repository ace: ba22f1c3e817d36823e337c22d0a257d3f7f4c09

sh bash 5.1_p4
ld GNU gold (Gentoo 2.35.2 p1 2.35.2) 1.16
app-shells/bash:          5.1_p4::gentoo
dev-java/java-config:     2.3.1::gentoo
dev-lang/perl:            5.32.1::gentoo
dev-lang/python:          3.8.8::gentoo
dev-util/cmake:           3.19.6::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.7-r1::gentoo
sys-apps/sandbox:         2.20::gentoo
sys-devel/autoconf:       2.13-r1::gentoo, 2.69-r5::gentoo
sys-devel/automake:       1.13.4-r2::gentoo, 1.16.3-r1::gentoo
sys-devel/binutils:       2.35.2::gentoo
sys-devel/gcc:            10.2.0-r5::gentoo
sys-devel/gcc-config:     2.3.3::gentoo
sys-devel/libtool:        2.4.6-r6::gentoo
sys-devel/make:           4.3::gentoo
sys-kernel/linux-headers: 5.11::gentoo (virtual/os-headers)
sys-libs/glibc:           2.32-r8::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://rsync.de.gentoo.org/gentoo-portage
    priority: -1000
    sync-rsync-extra-opts: --new-compress
    sync-rsync-verify-max-age: 24
    sync-rsync-verify-jobs: 1
    sync-rsync-verify-metamanifest: yes

archive
    location: /usr/local/portage
    masters: gentoo

ace
    location: /var/lib/layman/ace
    sync-type: git
    sync-uri: https://github.com/ace13/overlay.git
    masters: gentoo
    priority: 50

ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA dlj-1.1 AdobeFlash-11.x Oracle-BCLA-JavaSE google-chrome googleearth Vivaldi FraunhoferFDK ms-teams-pre"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -mtune=native -msahf -flto -fuse-linker-plugin -O3 -fomit-frame-pointer -fsched-pressure -fgcse-after-reload -flive-range-shrinkage -fweb -ftracer -fivopts -ftree-loop-im -frename-registers -fstdarg-opt -maccumulate-outgoing-args -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=native -mtune=native -msahf -flto -fuse-linker-plugin -O3 -fomit-frame-pointer -fsched-pressure -fgcse-after-reload -flive-range-shrinkage -fweb -ftracer -fivopts -ftree-loop-im -frename-registers -fstdarg-opt -maccumulate-outgoing-args -pipe"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--quiet-build --quiet-fail --with-bdeps=y"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs binpkg-multi-instance collision-protect config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox keeptemp keepwork merge-sync multilib-strict network-sandbox news noclean parallel-fetch pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://mirror.netcologne.de/gentoo/ http://mirror.eu.oneandone.net/linux/distributions/gentoo/gentoo/ http://ftp.fau.de/gentoo http://distfiles.gentoo.org"
LANG="en_DE.iso885915"
LC_ALL="en_DE.iso885915"
LDFLAGS="-march=native -mtune=native -msahf -flto -fuse-linker-plugin -O3 -fomit-frame-pointer -fsched-pressure -fgcse-after-reload -flive-range-shrinkage -fweb -ftracer -fivopts -ftree-loop-im -frename-registers -fstdarg-opt -maccumulate-outgoing-args -pipe"
LINGUAS="en"
MAKEOPTS="-j16"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_EXTRA_OPTS="--new-compress"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/portage"
USE="10bit 12bit 64bit 8bit X \ a52 aac adobe-cff alsa alt-svc amd64 apng applet archive arping asm ass avx brotli bzip2 cacert cairo cdda cdparanoia clang cli clipboard clockdiff contrast cpu-flags-x86-rdrand cube cups curl custom-cflags custom-optimization cxx dav1d dbus dbusmenu dconf debug-frame default-gold dell demosaic detex devfs-compat device-mapper-only dga divx dns dns-over-tls dot dri dri3 drm dts dvd dvdnav dvdr dvi dvipdfm eapol-test egl eme-free encode epspdf evdev exif expat extra faad fdk ffmpeg fftw flac fontconfig fonts foomaticdb fts3 g3dvl gallium gbm gdk-pixbuf gegl gif gimp glamor gles gles1 gles2 glib glibc-omitfp gmp gold graphics graphviz gs gstreamer gtk gtk3 gudev gui gusb hardlink harfbuzz highbitdepth hostonly hpack-tools hpn hsts htmlreport http http2 hugepages hwaccel hwdb iconv icu ilbc imagemagick inotify javascript jbig jemalloc jemalloc3 jit jpeg jpeg2k jumbo-build keybinder keyrecord kms kpathsea kvazaar lasi latex latex3 lazy-lock lcdfilter lcms leaps-timezone lensfun libaom libdrm libevent libglvnd libgtop libilbc libkms libmpv libnotify libopts libsamplerate libtirpc libtommath libwww lightning llvm llvm-gcc llvm-shared-libs lto lz4 lzma lzo mad matroska metalink metric midi minizip mmap mms mmx mmxext mng modern-top mp3 mpeg mpfr mta mtp mudflap multicall native-extensions natspec ncat ncurses ndiff nghttp2 nping nptl nvme offensive ogg oldnet opengl openh264 openmax openmp openssl openvg optimize opus orc osmesa pam pango pcap pcre pcre16 pcre32 pdf pdfimport pkcs7 plugins png policykit postproc postscript ppds printsupport proprietary-codecs pstricks pth pulseaudio qmanifest quiche quicktime r600-llvm-compiler rar raw readline realmedia right_timezone rpc rsync-verify rtc rule_generator sanitize scanner schroedinger scope seccomp secure-delete session sha3 smp snappy sndfile sound speex split-usr sqlite sqlite3 srtp sse sse2 sse3 sse4 sse4_1 sse4_2 ssh ssl ssse3 svg symlink sync-plugin-portage system-av1 system-bootstrap 
system-bzip2 system-cairo system-cmark system-digest system-ffmpeg system-harfbuzz system-hunspell system-icu system-info system-jpeg system-jsoncpp system-libevent system-libvpx system-libwebp system-libyaml system-llvm system-lua system-mesa system-sqlite system-ssl system-webp system-zlib systemd sysv-utils t1lib texi2html theora threads thunar thunderbolt tiff tls-heartbeat tomsfastmath tools tracepath tremor truetype uchardet udev udisks uefi unicode unlock-notify unwind usb utils v4l v4l2 vaapi vdpau video vim-with-x vorbis vpx vulkan webkit2 webp wext widgets wifi wmf wmp wxwidgets x264 x265 xcb xetex xkb xmp xorg xpm xpresent xrandr xulrunner xv xvid xxhash zenmap zip zlib zstd" ABI_X86="64" ADA_TARGET="gnat_2018" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt rdrand sse sse2 sse3 sse4_1 sse4_2 ssse3" CURL_SSL="openssl" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="evdev libinput" KERNEL="linux" L10N="en" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 
lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer pdfimport" LLVM_TARGETS="AMDGPU BPF X86" LUA_SINGLE_TARGET="lua5-3" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-3 php7-4" POSTGRES_TARGETS="postgres10 postgres11" PYTHON_SINGLE_TARGET="python3_8" PYTHON_TARGETS="python3_8" RUBY_TARGETS="ruby27" SANE_BACKENDS="epson" USERLAND="GNU" VIDEO_CARDS="radeonsi amdgpu" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, INSTALL_MASK, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS

strace of slow find:
~: strace -cw find / -xdev -nogroup -ls
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 67.73   62.628076          50   1250864           epoll_wait
  5.52    5.102167           2   2187918           close
  4.34    4.008899           2   1857384           epoll_ctl
  3.24    2.996429           2   1108929           openat
  3.16    2.922834           1   1497669           rt_sigprocmask
  2.37    2.188437           1   1108927           fstat
  1.90    1.758272           2    719741           getdents64
  1.77    1.635193           2    748846           read
  1.67    1.548686           2    644298           newfstatat
  1.55    1.435484           1    748834           lseek
  1.17    1.080568           3    309564           connect
  0.98    0.903370           2    309564           sendto
  0.92    0.851199           2    309564           socket
  0.90    0.833226           2    322172     12608 recvfrom
  0.76    0.706084           2    309564           epoll_create1
  0.75    0.697059           2    309565           timerfd_create
  0.73    0.675707           2    309603           timerfd_settime
  0.53    0.489833           1    250786           fcntl
  0.01    0.005985           3      1959           brk
  0.00    0.000123           3        40           mmap
  0.00    0.000060          59         1           execve
  0.00    0.000040           3        13           mprotect
  0.00    0.000020           5         4           munmap
  0.00    0.000006           5         1           mremap
  0.00    0.000004           2         2           ioctl
  0.00    0.000003           1         2           rt_sigaction
  0.00    0.000003           2         1         1 access
  0.00    0.000003           2         1           fstatfs
  0.00    0.000003           2         1           sysinfo
  0.00    0.000002           2         1           fchdir
  0.00    0.000002           1         1           futex
  0.00    0.000002           1         1           uname
  0.00    0.000002           1         1           prlimit64
  0.00    0.000002           1         1           set_tid_address
  0.00    0.000002           1         1           arch_prctl
  0.00    0.000002           1         1           getpid
  0.00    0.000002           1         1           set_robust_list
  0.00    0.000002           1         1           gettid
------ ----------- ----------- --------- --------- ----------------
100.00   92.467789           6  14305826     12609 total

strace of fast find:
~: strace -cw find / -xdev -nogroup -ls
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 17.53    1.619756           2    794871           close
 17.40    1.607508           2    644575           openat
 16.43    1.517884           2    644297           newfstatat
 13.64    1.260556           2    594056           read
 13.31    1.230066           1    644574           fstat
 12.60    1.164386           1    594051           lseek
  5.29    0.488413           1    250786           fcntl
  3.80    0.351411           3    100613           getdents64
  0.00    0.000112           3        35           brk
  0.00    0.000066          66         1           execve
  0.00    0.000064           3        18           mmap
  0.00    0.000042           7         6           mprotect
  0.00    0.000012           6         2           munmap
  0.00    0.000005           2         2           ioctl
  0.00    0.000003           3         1         1 access
  0.00    0.000003           3         1           sysinfo
  0.00    0.000003           3         1           fstatfs
  0.00    0.000002           2         1           fchdir
  0.00    0.000002           2         1           arch_prctl
  0.00    0.000002           1         1           uname
------ ----------- ----------- --------- --------- ----------------
100.00    9.240298           2   4267893         1 total
Comment 3 Mike Gilbert gentoo-dev 2021-02-28 17:42:09 UTC
(In reply to Klaus Kusche from comment #0)
> Ebuilds should *never* automagically and silently change config files,
> even less if the change has such serious consequences!

nsswitch.conf probably got replaced because you have FEATURES="config-protect-if-modified" enabled, and you have not made any modifications to nsswitch.conf yourself.
Comment 4 Sergei Trofimovich (RETIRED) gentoo-dev 2021-02-28 18:00:43 UTC
>  0.98    0.903370           2    309564           sendto
>  0.92    0.851199           2    309564           socket
>  0.90    0.833226           2    322172     12608 recvfrom

That's a lot of overhead for a file stat. We should not have enabled extra nss modules without extra guards.
Comment 5 Mike Gilbert gentoo-dev 2021-02-28 18:03:37 UTC
Could you please report the nss-systemd performance issue upstream and give us the link so I can track it?

https://github.com/systemd/systemd/issues
Comment 6 Klaus Kusche 2021-02-28 18:08:59 UTC
(In reply to Mike Gilbert from comment #3)
> (In reply to Klaus Kusche from comment #0)
> > Ebuilds should *never* automagically and silently change config files,
> > even less if the change has such serious consequences!
> 
> nsswitch.conf probably got replaced because you have
> FEATURES="config-protect-if-modified" enabled, and you have not made any
> modifications to nsswitch.conf yourself.

Now that's interesting.

At the time I initially installed glibc (many years ago!),
I definitely heavily modified nsswitch.conf!
(because at that time, the default nsswitch.conf had more sources set by default for each category, and I reduced that to "files" except for hosts).

It might be the case that after my installation, the default nsswitch.conf 
became identical to what I've initially set manually,
but that would be a very unlikely surprise.
Comment 7 Mike Gilbert gentoo-dev 2021-02-28 18:19:45 UTC
Yeah, it seems more likely that your nsswitch.conf file got replaced with the upstream default config at some point, but I can only speculate on how/when that might have happened.

If you have system backups, you could check to see what the file looked like before installing glibc-2.32-r8.
Comment 8 Klaus Kusche 2021-02-28 19:05:56 UTC
(In reply to Mike Gilbert from comment #5)
> Could you please report the nss-systemd performance issue upstream and give
> us the link so I can track it?
> 
> https://github.com/systemd/systemd/issues

Not today. I'll try tomorrow.
Comment 9 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-02-28 19:08:38 UTC
(In reply to Klaus Kusche from comment #8)
> (In reply to Mike Gilbert from comment #5)
> > Could you please report the nss-systemd performance issue upstream and give
> > us the link so I can track it?
> > 
> > https://github.com/systemd/systemd/issues
> 
> Not today. I'll try tomorrow.

Please make clear that with GOLD + the strange CFLAGS/LDFLAGS, it's not a supported configuration.
Comment 10 Andreas K. Hüttel archtester gentoo-dev 2021-02-28 19:10:45 UTC
(In reply to Mike Gilbert from comment #5)
> Could you please report the nss-systemd performance issue upstream and give
> us the link so I can track it?
> 
> https://github.com/systemd/systemd/issues

It might make sense to reproduce the problem with, ehm, somewhat more "normal" CFLAGS and LDFLAGS first.
Comment 11 Klaus Kusche 2021-02-28 19:15:35 UTC
(In reply to Sergei Trofimovich from comment #4)
> >  0.98    0.903370           2    309564           sendto
> >  0.92    0.851199           2    309564           socket
> >  0.90    0.833226           2    322172     12608 recvfrom
> 
> That's a lot of overhead for a file stat. We should not have enabled extra
> nss modules without extra guards.

What I find even more interesting is that we are talking about a "find" 
over about 600000 files in both cases. With nsswitch "files", the counts 
of the most frequent syscalls roughly agree with the number of files.

But with "files systemd",
we have twice as many fstat, openat and epoll_wait than we have files,
three times as many epoll_ctl, and 3.5 times as many close.
What is that piece of code doing per file?
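
The per-file ratios above can be recomputed from the "strace -c" summaries with a short awk wrapper. This is only a sketch: the helper name calls_per_file is made up for illustration, and the file count (about 600000) is the approximate figure from this report, not an exact one.

```shell
# calls_per_file FILES: read a "strace -c" summary on stdin and print, for
# every syscall, how many calls were made per file.
# (Hypothetical helper name; FILES is whatever file count you assume.)
calls_per_file() {
    LC_ALL=C awk -v files="$1" '
        # Data rows start with the "% time" number; field 4 is the call count
        # and the last field is the syscall name. Skip the "total" row.
        $1 ~ /^[0-9.]+$/ && $NF != "total" {
            printf "%s %.2f\n", $NF, $4 / files
        }
    '
}
```

Fed the slow run's summary with FILES=600000 (for example "calls_per_file 600000 < slow.txt", where slow.txt was written by "strace -cw -o slow.txt ..."), this gives about 2.08 epoll_wait calls and 3.65 close calls per file.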
Comment 12 Klaus Kusche 2021-02-28 19:21:29 UTC
(In reply to Sam James from comment #9)
> (In reply to Klaus Kusche from comment #8)
> > (In reply to Mike Gilbert from comment #5)
> > > Could you please report the nss-systemd performance issue upstream and give
> > > us the link so I can track it?
> > > 
> > > https://github.com/systemd/systemd/issues
> > 
> > Not today. I'll try tomorrow.
> 
> Please make clear that with GOLD + the strange CFLAGS/LDFLAGS, it's not a
> supported configuration.

I think gold is automatically replaced with bfd by the glibc ebuild?

Is -flto supported for glibc?

-fomit-frame-pointer is as far as I know.

All the other flags are optimizations which should not make any difference.
Comment 13 Sergei Trofimovich (RETIRED) gentoo-dev 2021-02-28 19:28:27 UTC
(In reply to Klaus Kusche from comment #11)
> (In reply to Sergei Trofimovich from comment #4)
> > >  0.98    0.903370           2    309564           sendto
> > >  0.92    0.851199           2    309564           socket
> > >  0.90    0.833226           2    322172     12608 recvfrom
> > 
> > That's a lot of overhead for a file stat. We should not have enabled extra
> > nss modules without extra guards.
> 
> What I find even more interesting is that we are talking about a "find" 
> over about 600000 files in both cases. With nsswitch "files", the counts 
> of the most frequent syscalls roughly agree with the number of files.
> 
> But with "files systemd",
> we have twice as many fstat, openat and epoll_wait than we have files,
> three times as many epoll_ctl, and 3.5 times as many close.
> What is that piece of code doing per file?

NSS works by loading the shared libraries named in nsswitch.conf into the binary on the first call to a glibc function that relies on NSS, usually something like getpwnam(). glibc then redirects the getpwnam() call to the hooks in /usr/lib/libnss_systemd.so.2.

/usr/lib/libnss_systemd.so.2 defines an implementation of the hook:

$ nm -D /usr/lib/libnss_systemd.so.2 | fgrep getpwnam
00003b30 T _nss_systemd_getpwnam_r
         U getpwnam_r@GLIBC_2.1.2

Depending on the implementation, it could connect to a remote host on every call, or even parse every single file locally as a fallback. Judging by the syscall counts, it "just" connects for every file on your system.

Worth looking at what /usr/lib/libnss_systemd.so.2 actually does.
Comment 14 Sergei Trofimovich (RETIRED) gentoo-dev 2021-02-28 19:34:00 UTC
Do you happen to have many files with UIDs not resolvable via /etc/passwd on your system? (like chroots or containers for foreign systems with more users in chroots). I think that's the case when the fallback should kick in.
Comment 15 Klaus Kusche 2021-02-28 19:43:37 UTC
(In reply to Sergei Trofimovich from comment #14)
> Do you happen to have many files with UIDs not resolvable via /etc/passwd on
> your system? (like chroots or containers for foreign systems with more users
> in chroots). I think that's the case when the fallback should kick in.

We are talking about groups, not users.
Users have not been affected by the upgrade, 
they still just resolve to "files", not "systemd"

For groups, the groups all my files belong to
are resolvable by /etc/group: "find / -xdev -nogroup -ls"
produces no output at all, i.e. finds no file with unknown group,
no matter if "group: files" or "group: files [SUCCESS=merge] systemd"
is used.

So systemd should not be needed / contacted at all, because all groups find asks for are resolvable by /etc/group.
Comment 16 Sergei Trofimovich (RETIRED) gentoo-dev 2021-02-28 20:02:40 UTC
(In reply to Klaus Kusche from comment #15)
> (In reply to Sergei Trofimovich from comment #14)
> > Do you happen to have many files with UIDs not resolvable via /etc/passwd on
> > your system? (like chroots or containers for foreign systems with more users
> > in chroots). I think that's the case when the fallback should kick in.
> 
> We are talking about groups, not users.
> Users have not been affected by the upgrade, 
> they still just resolve to "files", not "systemd"
> 
> For groups, the groups all my files belong to
> are resolvable by /etc/group: "find / -xdev -nogroup -ls"
> produces no output at all, i.e. finds no file with unknown group,
> no matter if "group: files" or "group: files [SUCCESS=merge] systemd"
> is used.
> 
> So systemd should not be needed / contacted at all, because all groups find
> asks for are resolvable by /etc/group.

I think `SUCCESS=merge` means systemd will always be contacted for group resolution. From `man nsswitch.conf`:

           merge       [SUCCESS=merge]  is  used between two database entries.
                       When a group is located in the first of the  two  group
                       entries,  processing  will continue on to the next one.
                       If the group is also found in the next entry  (and  the
                       group name and GID are an exact match), the member list
                       of the second entry will be added to the  group  object
                       to be returned.  Available since glibc 2.24.  Note that
                       merging will not be done for getgrent(3) nor  will  du-
                       plicate  members  be pruned when they occur in both en-
                       tries being merged.
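
As a rough model of the rule quoted above (not glibc's actual implementation), the sketch below walks a service list and prints which sources would be queried for a single group lookup. The function name consulted_sources and its second argument (the sources assumed to contain the group) are invented for illustration; only the default "stop on success" action and [SUCCESS=merge] are modelled.

```shell
# consulted_sources SPEC FOUND_IN
#   SPEC:     right-hand side of an nsswitch.conf line,
#             e.g. "files [SUCCESS=merge] systemd"
#   FOUND_IN: space-separated sources assumed to contain the group
# Prints the sources that would be queried, in order.
# The body runs in a subshell, so no variables or options leak out.
consulted_sources() (
    spec=$1
    found_in=" $2 "
    consulted=""
    hit=0
    set -f    # keep "[SUCCESS=merge]" from being expanded as a glob
    for tok in $spec; do
        if [ "$tok" = "[SUCCESS=merge]" ]; then
            # A successful lookup no longer ends the search.
            hit=0
            continue
        fi
        # Default action on success is "return": stop before the next source.
        if [ "$hit" -eq 1 ]; then
            break
        fi
        consulted="$consulted $tok"
        case $found_in in
            *" $tok "*) hit=1 ;;
        esac
    done
    printf '%s\n' "${consulted# }"
)
```

For a group present in /etc/group, consulted_sources 'files [SUCCESS=merge] systemd' files prints "files systemd", while consulted_sources 'files systemd' files prints only "files". In this model the extra query happens even when the group is already resolved by "files".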

Here is a simple benchmark on tmpfs:

    # mkdir -p /tmp/bm && cd /tmp/bm
    # touch `seq 1 10000`
    # time find . -xdev -nogroup -ls >/dev/null
    real    0m0,134s
    user    0m0,019s
    sys     0m0,044s
    # chgrp bin `seq 1 10000`
    # time find . -xdev -nogroup -ls >/dev/null
    real    0m5,471s
    user    0m0,639s
    sys     0m0,696s
    # chgrp 123456 `seq 1 10000`
    # time find . -xdev -nogroup -ls >/dev/null
    real    0m2,348s
    user    0m0,682s
    sys     0m0,414s

For comparison, ls probably has its own resolution cache:
    # time ls -l >/dev/null
    real    0m0,057s
    user    0m0,016s
    sys     0m0,009s
Comment 17 Mike Gilbert gentoo-dev 2021-02-28 20:15:54 UTC
groups are always looked up using nss-systemd by design.

Users defined via the systemd interface will not exist in /etc/passwd, and their group memberships will not exist in /etc/group.

The [SUCCESS=merge] action comes directly from the upstream documentation for nss-systemd.

https://www.freedesktop.org/software/systemd/man/nss-systemd.html
Comment 18 Klaus Kusche 2021-03-01 08:57:24 UTC
(In reply to Mike Gilbert from comment #17)
> groups are always looked up using nss-systemd by design.
>
> Users defined via the systemd interface will not exist in /etc/passwd, and
> their group memberships will not exist in /etc/group.

What are those systemd-defined users and groups, 
where do they come from, when and for what purpose are they created?
Do I have such users and groups on a "plain" desktop system
as long as I don't use containers etc.,
i.e. do I need "systemd" name resolution on such a system?

> The [SUCCESS=merge] action comes directly from the upstream documentation
> for nss-systemd.
> 
> https://www.freedesktop.org/software/systemd/man/nss-systemd.html

Emerging glibc added "[SUCCESS=merge] systemd" 
to *group* resolution in nsswitch.conf,
but it did *not* add "systemd" to *user* resolution.

The documentation mentioned by you talks about users *and* groups,
and it shows "systemd" added to *both*.

Did emerge forget to modify "users"?
If I don't have / don't check for systemd-defined users,
I see absolutely no reason to check for systemd-defined group membership?
Comment 19 Mike Gilbert gentoo-dev 2021-03-01 13:21:40 UTC
The nsswitch.conf file installed by glibc adds “system” to both the “group” and “passed” lines.
Comment 20 Mike Gilbert gentoo-dev 2021-03-01 16:31:58 UTC
(In reply to Mike Gilbert from comment #19)
> The nsswitch.conf file installed by glibc adds “system” to both the “group”
> and “passed” lines.

My iPad mangled this. It should say:

The nsswitch.conf file installed by glibc has "systemd" on both the "group" and "passwd" lines.

> group:      files [SUCCESS=merge] systemd
> ...
> passwd:     files systemd

If your copy of the file is missing the latter, I'm not sure how that happened.
Comment 21 Mike Gilbert gentoo-dev 2021-03-01 16:37:11 UTC
(In reply to Klaus Kusche from comment #18)
> What are those systemd-defined users and groups, 
> where do they come from, when and for what purpose are they created?
> Do I have such users and groups on a "plain" desktop system
> as long as I don't use containers etc.,
> i.e. do I need "systemd" name resolution on such a system?

A couple of examples:

1. Dynamic users: http://0pointer.net/blog/dynamic-users-with-systemd.html
2. homed users: https://systemd.io/HOME_DIRECTORY/

We added systemd to the default NSS config so that these features will work "out-of-the-box", on the assumption that it would not cause problems.

If you don't use these features, you are welcome to remove the systemd module from your NSS configuration.

If we don't get some resolution on this performance problem, it is possible we might revert the NSS config change, or make it conditional on some USE flag.
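
For anyone who wants to drop the module, a sketch like the following prints a copy of the config with the systemd entries removed from the "group" and "passwd" lines. It assumes the exact entries shown in this report ("[SUCCESS=merge] systemd" and a plain "systemd"); the function name strip_systemd is made up, and nothing is written back to /etc.

```shell
# strip_systemd FILE: print FILE with the nss-systemd entries removed from
# the "group" and "passwd" lines; all other lines are passed through.
# (Hypothetical helper; it only writes to stdout.)
strip_systemd() {
    sed -E '/^(group|passwd):/ {
        # "files [SUCCESS=merge] systemd" -> "files"
        s/[[:space:]]*\[SUCCESS=merge\][[:space:]]+systemd//
        # "files systemd" -> "files"
        s/[[:space:]]+systemd([[:space:]]|$)/\1/
    }' "$1"
}
```

Reviewing the result first, for example with "strip_systemd /etc/nsswitch.conf | diff /etc/nsswitch.conf -", avoids touching the live file until the change looks right.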
Comment 22 Klaus Kusche 2021-03-01 17:46:53 UTC
(In reply to Mike Gilbert from comment #20)
> The nsswitch.conf file installed by glibc has "systemd" on both the "group"
> and "passwd" lines.
> 
> > group:      files [SUCCESS=merge] systemd
> > ...
> > passwd:     files systemd
> 
> If your copy of the file is missing the latter, I'm not sure how that
> happened.

Sorry, my mistake, emerge did add "systemd" to both.

But as the performance consequences of "systemd" on user IDs seem to be zero, I didn't notice that "systemd" was added to "passwd", too.
Comment 23 Klaus Kusche 2021-03-02 17:30:36 UTC
(In reply to Andreas K. Hüttel from comment #10)
> (In reply to Mike Gilbert from comment #5)
> > Could you please report the nss-systemd performance issue upstream and give
> > us the link so I can track it?
> > 
> > https://github.com/systemd/systemd/issues
> 
> It might make sense to reproduce the problem with, ehm, somewhat more
> "normal" CFLAGS and LDFLAGS first.

Tried a build with -O1 only.

Didn't change anything, same problem.
Comment 24 Mikle Kolyada (RETIRED) archtester Gentoo Infrastructure gentoo-dev Security 2021-03-02 17:49:06 UTC
FTR: I can't reproduce something like that, find speed did not change for me at all
Comment 25 Klaus Kusche 2021-03-02 18:39:18 UTC
(In reply to Mike Gilbert from comment #5)
> Could you please report the nss-systemd performance issue upstream and give
> us the link so I can track it?
> 
> https://github.com/systemd/systemd/issues

https://github.com/systemd/systemd/issues/18846
Comment 26 Andreas K. Hüttel archtester gentoo-dev 2021-03-17 20:28:12 UTC
For now I suggest you just undo the change manually in the config file (since you seem not to need it).

Apart from that, let's keep the bug open and see whether anyone else makes similar observations.
Comment 27 Ben Kohler gentoo-dev 2021-04-09 18:42:07 UTC
I can reproduce this on my work desktop with openrc.
Comment 28 Ben Kohler gentoo-dev 2021-04-09 18:49:59 UTC
I should be more specific-- I'm able to reproduce *a* problem introduced by these new nsswitch entries, but probably not the exact same issue.

Various commands which seem to be doing some group lookups (emerge, eselect news read, su -, more) are hanging for multiples of 30 seconds.  Sometimes 30s, sometimes 90s, sometimes 300s.
Comment 29 Sergei Trofimovich (RETIRED) gentoo-dev 2021-04-09 20:28:01 UTC
Can you share an example 'strace -r -f <cmd>' log? That should show where it gets stuck.

Be careful not to log too much if you type passwords there.
Comment 30 Michael Hofmann 2021-04-10 02:43:32 UTC
I just updated to glibc-2.32-r8 - and I also see the change in nsswitch.conf.

Here is what I did:

cd /etc
cp nsswitch.conf nsswitch.conf.orig
touch nssswitch.conf
echo "=sys-libs/glibc-2.32*" >>/etc/portage/package.accept_keywords
emerge --update --deep --changed-use @world
diff nsswitch.conf nsswitch.conf.orig

56c56
< group:      files [SUCCESS=merge] systemd
---
> group:      files
63c63
< passwd:     files systemd
---
> passwd:     files
Comment 31 Michael Hofmann 2021-04-10 03:18:33 UTC
I can confirm that the two systemd entries in nsswitch.conf slow down 'find' considerably.

Command: time find /var -xdev -nogroup -ls 

Result without systemd entries: real: 0m0.729s
Result *with*  systemd entries: real: 0m29.004s

It seems that 0050-Gentoo-Enable-nss-systemd-in-nsswitch.conf.patch causes some trouble.
Comment 32 Mikle Kolyada (RETIRED) archtester Gentoo Infrastructure gentoo-dev Security 2021-04-10 10:53:09 UTC
Can anyone test without [SUCCESS=merge]? As it is not really mandatory as far as I remember
Comment 33 Michael Hofmann 2021-04-10 13:44:00 UTC
Command: time find /var -xdev -nogroup -ls 

Result *without* systemd entries: real: 0m0.593s
Result *with*    systemd entries: real: 0m27.062s
Result *with*    systemd entries, but without [SUCCESS=merge]: real: 0m0.596s

'[SUCCESS=merge]' seems to be the culprit.
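
For anyone who wants to measure the lookup cost without find, here is a minimal Python sketch. It assumes the slowdown comes from the getgrgid() calls that find -nogroup presumably makes. The function name time_group_lookups and the gid 2000000000 (picked as one that is unlikely to have a group entry) are my own choices for illustration, not anything taken from glibc or systemd.

```python
import grp
import time


def time_group_lookups(gid: int, repeats: int = 1000) -> float:
    """Return the average number of seconds per getgrgid() call for gid."""
    start = time.perf_counter()
    for _ in range(repeats):
        try:
            grp.getgrgid(gid)
        except KeyError:
            # No group entry for this gid; the lookup still went through
            # every service configured for "group:" in nsswitch.conf.
            pass
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    # gid 0 normally exists in /etc/group; 2000000000 is assumed not to exist.
    for gid in (0, 2000000000):
        avg = time_group_lookups(gid)
        print(f"gid {gid}: {avg * 1e6:.1f} microseconds per lookup")
```

Running it once per nsswitch.conf variant should show whether the per-lookup time changes in the same way as the find timings.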
Comment 34 Mikle Kolyada (RETIRED) archtester Gentoo Infrastructure gentoo-dev Security 2021-04-10 13:50:10 UTC
(In reply to Michael Hofmann from comment #33)
> Command: time find /var -xdev -nogroup -ls 
> 
> Result *without* systemd entries: real: 0m0.593s
> Result *with*    systemd entries: real: 0m27.062s
> Result *with*    systemd entries, but without [SUCCESS=merge]: real: 0m0.596s
> 
> '[SUCCESS=merge]' seems to be the culprit.

Yes, because SUCCESS=merge means 'ALSO check systemd's nss database' (== spend additional time). I think we should drop the merge option, as it is not mandatory. If an entry is found in the files database, the search stops immediately; otherwise the next database is tried. There is literally no use case where openrc users would want to access systemd's nss, while systemd can check both if needed.
Comment 35 Mikle Kolyada (RETIRED) archtester Gentoo Infrastructure gentoo-dev Security 2021-04-10 13:56:14 UTC
i.e. I propose plain stacking:

...
group:      files systemd
            ^^^^^
            openrc users
            stop here...
...
passwd:     files systemd
            ^^^^^
            and here
...
Comment 36 Mike Gilbert gentoo-dev 2021-04-10 14:11:36 UTC
(In reply to Mikle Kolyada from comment #32)
> Can anyone test without [SUCCESS=merge]? As it is not really mandatory as
> far as I remember

Please see comment 17 for reasoning on why this might not be a smart idea.
Comment 37 Mikle Kolyada (RETIRED) archtester Gentoo Infrastructure gentoo-dev Security 2021-04-10 14:20:07 UTC
(In reply to Mike Gilbert from comment #36)
> (In reply to Mikle Kolyada from comment #32)
> > Can anyone test without [SUCCESS=merge]? As it is not really mandatory as
> > far as I remember
> 
> Please see comment 17 for reasoning on why this might not be a smart idea.

There is no downside to leaving merge disabled; upstream examples are not guidelines.

systemd-defined users will not exist in /etc/passwd, /etc/shadow, etc. (i.e. in the 'files' database). That is why the lookup must pick the systemd database (which we put next).
Comment 38 Mikle Kolyada (RETIRED) archtester Gentoo Infrastructure gentoo-dev Security 2021-04-10 14:33:16 UTC
(In reply to Mike Gilbert from comment #17)
> groups are always looked up using nss-systemd by design.
> 
> Users defined via the systemd interface will not exist in /etc/passwd, and
> their group memberships will not exist in /etc/group.
> 
> The [SUCCESS=merge] action comes directly from the upstream documentation
> for nss-systemd.
> 
> https://www.freedesktop.org/software/systemd/man/nss-systemd.html

But with users defined via systemd interfaces you still need to use shadow, because systemd-homed cannot replace your root account; root is still defined in the plain files structure and co-exists with systemd successfully without merge.
Comment 39 Mike Gilbert gentoo-dev 2021-04-10 14:45:54 UTC
Consider a systemd-managed user 'foo' who is a member of the 'bar' group.

This 'foo' user will not exist in the members list of the 'bar' group as defined in /etc/group.

With "group: files [SUCCESS=merge] systemd" in nsswitch.conf, "getent group bar" will look like this:

bar:x:123:foo

With "group: files systemd" in nsswitch.conf, "getent group bar" will look like this:

bar:x:123:
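
To make the difference concrete, here is a toy model of the two configurations in Python. It is only a sketch of the behaviour described above, not glibc's actual implementation. The dictionaries FILES and SYSTEMD and the functions lookup() and getent_line() are hypothetical names, and the data is just the foo/bar example.

```python
from typing import Dict, List, Optional, Tuple

# (name, gid, members), mirroring the fields of a "getent group" line
Group = Tuple[str, int, List[str]]

# Hypothetical contents of the two sources for the example above
FILES: Dict[str, Group] = {"bar": ("bar", 123, [])}
SYSTEMD: Dict[str, Group] = {"bar": ("bar", 123, ["foo"])}


def lookup(name: str, merge: bool) -> Tuple[Optional[Group], int]:
    """Return (entry, number of sources consulted) for a group name."""
    consulted = 0
    result: Optional[Group] = None
    for source in (FILES, SYSTEMD):
        consulted += 1
        entry = source.get(name)
        if entry is None:
            continue  # not found here, try the next source
        if result is None:
            result = (entry[0], entry[1], list(entry[2]))
            if not merge:
                break  # plain stacking: the first hit ends the search
        else:
            # merge: add members the earlier source did not list
            for member in entry[2]:
                if member not in result[2]:
                    result[2].append(member)
    return result, consulted


def getent_line(group: Group) -> str:
    """Format an entry the way "getent group" prints it."""
    name, gid, members = group
    return f"{name}:x:{gid}:{','.join(members)}"
```

In this model the merge case consults both sources even though files already had a hit, which would be consistent with the extra time measured in comment 33.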
Comment 40 Andreas K. Hüttel archtester gentoo-dev 2021-04-10 14:52:29 UTC
This is a configuration file. As such, it's intended to be adapted by the admin, and we can only provide a sane default setting.

My suggestion would be that I add a comment to the default file along the lines of 

# If you encounter slowdowns of file operations and do not use systemd-generated
# users and groups, you can disable the corresponding lookups by replacing
# these two lines with, e.g.,
# group:      files
# passwd:     files
Comment 41 Mike Gilbert gentoo-dev 2021-04-10 14:56:06 UTC
(In reply to Andreas K. Hüttel from comment #40)

That sounds like a reasonable next step to me.
Comment 42 Andreas K. Hüttel archtester gentoo-dev 2021-04-10 15:11:42 UTC
Done, will be in the next patchset.

# If you encounter slowdowns of file operations and do not use
# systemd-generated users and groups, you can disable the corresponding
# lookups by replacing the group and passwd lines with, e.g.,
# group:    files
# passwd:   files
# See also https://bugs.gentoo.org/773436
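
For admins who want to script the change described in that comment, a rough Python sketch follows. The function name drop_systemd is hypothetical, and it assumes any action in front of systemd is a single token such as [SUCCESS=merge] with nothing else following systemd on the line. It works on the file contents as a string, so reading and writing /etc/nsswitch.conf (and keeping a backup) is left to the caller.

```python
import re


def drop_systemd(text: str) -> str:
    """Remove the systemd service from the group and passwd lines."""
    out = []
    for line in text.splitlines():
        match = re.match(r"^(group|passwd)(:\s*)(.*)$", line)
        if match:
            kept = []
            for token in match.group(3).split():
                if token == "systemd":
                    # A bracketed action right before systemd (for example
                    # [SUCCESS=merge]) only decided whether systemd was
                    # consulted, so it is dropped together with it.
                    if kept and kept[-1].startswith("["):
                        kept.pop()
                    continue
                kept.append(token)
            line = match.group(1) + match.group(2) + " ".join(kept)
        out.append(line)
    # Preserve the trailing newline if the input had one
    return "\n".join(out) + ("\n" if text.endswith("\n") else "")
```

Commented-out lines and all other databases are left untouched.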
Comment 43 Ben Kohler gentoo-dev 2021-04-10 15:12:20 UTC
I may need to open an additional bug, and I won't be able to do much testing on it until Monday, but on my openrc machine where I had problems, it was absolutely crippling, not just that an operation on 600k files was measurably slower.

When this is flaring up, it takes over 5 minutes for me to get a new login shell.

An emerge --info took 2.5 minutes to print its result.
Comment 44 Andreas K. Hüttel archtester gentoo-dev 2021-04-10 15:27:52 UTC
(In reply to Ben Kohler from comment #43)
> I may need to open an additional bug, and I won't be able to do much testing
> on it until Monday, but on my openrc machine where I had problems, it was
> absolutely crippling, not just that an operation on 600k files was
> measurably slower.
> 
> When this is flaring up, it takes over 5 minutes for me to get a new login
> shell.
> 
> An emerge --info took 2.5 minutes to print its result.

It looks like some other factor is involved here as well. Yes, let's make a separate bug please, only for the case where this slows things down so much that the system hangs.
(Multiples of 25 sec would suggest a D-Bus timeout, for example.)