Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 926056 - ixgbe Detected Tx Unit Hang under load
Summary: ixgbe Detected Tx Unit Hang under load
Status: UNCONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages (show other bugs)
Hardware: All Linux
: Normal normal (vote)
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-03-02 23:17 UTC by Vjaceslavs Klimovs
Modified: 2024-03-08 22:59 UTC (History)
1 user (show)

See Also:
Package list:
Runtime testing required: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Vjaceslavs Klimovs 2024-03-02 23:17:38 UTC
Under heavy load, reliably, network adapter resets:

[  355.762722] ixgbe 0000:41:00.0 enp65s0f0: Detected Tx Unit Hang 
                 Tx Queue             <0>
                 TDH, TDT             <ab>, <ab>
                 next_to_use          <ab>
                 next_to_clean        <7fe6>
               tx_buffer_info[next_to_clean]
                 time_stamp           <1000031f8>
                 jiffies              <100003600>
[  355.762747] ixgbe 0000:41:00.0 enp65s0f0: tx hang 3 detected on queue 0, resetting adapter
[  355.762755] ixgbe 0000:41:00.0 enp65s0f0: initiating reset due to tx timeout
[  355.762805] ixgbe 0000:41:00.0 enp65s0f0: Reset adapter

the problem does not occur without the load. This started happening only after upgrading to kernel 6.6, no problem on kernel 6.1. 

gearbox /var/log # lspci | grep -i net
41:00.0 Ethernet controller: Intel Corporation Ethernet Controller X550 (rev 01)
41:00.1 Ethernet controller: Intel Corporation Ethernet Controller X550 (rev 01)

emerge --info:

gearbox /var/log # emerge --info
Portage 3.0.61 (python 3.11.8-final-0, default/linux/amd64/17.1/no-multilib/hardened, gcc-13, glibc-2.38-r10, 6.6.19-gentoo x86_64)
=================================================================
System uname: Linux-6.6.19-gentoo-x86_64-AMD_EPYC_7302P_16-Core_Processor-with-glibc2.38
KiB Mem:   131771976 total, 115209788 free
KiB Swap:  134217720 total, 134217720 free
Timestamp of repository gentoo: Sat, 02 Mar 2024 17:30:00 +0000
Head commit of repository gentoo: f2178b9fdbd159f71d3daa1c7d47b83ce0083d40
sh bash 5.1_p16-r6
ld GNU ld (Gentoo 2.41 p5) 2.41.0
app-misc/pax-utils:        1.3.7::gentoo
app-shells/bash:           5.1_p16-r6::gentoo
dev-build/autoconf:        2.71-r6::gentoo
dev-build/automake:        1.16.5-r2::gentoo
dev-build/cmake:           3.27.9::gentoo
dev-build/libtool:         2.4.7-r2::gentoo
dev-build/make:            4.4.1-r1::gentoo
dev-build/meson:           1.3.1-r1::gentoo
dev-lang/perl:             5.38.2-r2::gentoo
dev-lang/python:           3.11.8_p1::gentoo, 3.12.2_p1::gentoo
sys-apps/baselayout:       2.14-r2::gentoo
sys-apps/openrc:           0.53::gentoo
sys-apps/sandbox:          2.38::gentoo
sys-devel/binutils:        2.41-r5::gentoo
sys-devel/binutils-config: 5.5::gentoo
sys-devel/gcc:             13.2.1_p20240113-r1::gentoo
sys-devel/gcc-config:      2.11::gentoo
sys-kernel/linux-headers:  6.6::gentoo (virtual/os-headers)
sys-libs/glibc:            2.38-r10::gentoo
Repositories:

gentoo
    location: /var/db/repos/gentoo
    sync-type: rsync
    sync-uri: rsync://pulley.lan/gentoo-portage
    priority: -1000
    volatile: False
    sync-rsync-verify-max-age: 3
    sync-rsync-verify-jobs: 1
    sync-rsync-extra-opts: 
    sync-rsync-verify-metamanifest: no

vklimovs
    location: /usr/local/portage/vklimovs
    masters: gentoo
    volatile: True

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="@FREE"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /etc/stunnel/stunnel.conf /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O2"
DISTDIR="/var/cache/distfiles"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GDK_PIXBUF_MODULE_FILE GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR XDG_STATE_HOME"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs buildpkg-live config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="https://pulley.lan:8080"
INSTALL_MASK="*.la /usr/share/qemu/firmware/*"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LEX="flex"
LINGUAS="en"
MAKEOPTS="-j33"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
SHELL="/bin/bash"
USE="acl amd64 bzip2 caps cet cli crypt dri fortran gdbm hardened iconv ipv6 kerberos ldap libtirpc modules-sign ncurses nls openmp pam pcre pic pie readline sasl seccomp split-usr ssl ssp test-rust unicode verify-sig vhosts xattr xtpax zlib" ABI_X86="64" ADA_TARGET="gnat_2021" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_anon authn_dbm authn_file authz_dbm authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir env expires ext_filter file_cache filter headers include info log_config logio mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="cpu interface ipmi memory network sensors smart syslog" CPU_FLAGS_X86="aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt rdrand sha sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 ntrip navcom oceanserver oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 tsip tripmate tnt ublox" GRUB_PLATFORMS="efi-64" INPUT_DEVICES="libinput" KERNEL="linux" L10N="en" LCD_DEVICES="bayrad cfontz glk hd44780 lb216 lcdm001 mtxorb text" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" NGINX_MODULES_HTTP="access auth_request autoindex brotli fastcgi gzip proxy rewrite uwsgi" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php8-1" POSTGRES_TARGETS="postgres15" PYTHON_SINGLE_TARGET="python3_11" PYTHON_TARGETS="python3_11" RUBY_TARGETS="ruby31" UWSGI_PLUGINS="syslog" VIDEO_CARDS="amdgpu fbdev intel nouveau radeon radeonsi vesa dummy" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipp2p iface geoip fuzzy condition tarpit sysrq proto logmark ipmark dhcpmac delude chaos account"
Unset:  ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EMERGE_DEFAULT_OPTS, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, LC_ALL, LD, LFLAGS, LIBTOOL, MAKE, MAKEFLAGS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PYTHONPATH, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS
Comment 1 Vjaceslavs Klimovs 2024-03-02 23:54:14 UTC
Upgrading to latest firmware using
https://www.intel.com/content/www/us/en/download/19358/non-volatile-memory-nvm-update-utility-for-intel-ethernet-network-adapter-x550-series.html
did not change the results

gearbox ~ # ethtool -i enp65s0f0
driver: ixgbe
version: 6.6.19-gentoo
firmware-version: 0x80001733, 1.1876.0
expansion-rom-version: 
bus-info: 0000:41:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
Comment 2 Mike Pagano gentoo-dev 2024-03-08 22:59:26 UTC
Couple of things you can try.

Test with the latest 6.7.X kernel, to see if it was resolved already.

Do a git bisect from the last working 6.1.X kernel to the first non-working one