I recently installed a new Radeon R9 380 and use the amdgpu driver. My desktop suddenly freezes and no input (mouse, keyboard) is possible. I can still log in via SSH from another box and run some commands, but I do not see any error messages. dmesg, .xsession-errors, Xorg.0.log and journalctl are all clean. If I try to shut down via SSH, I am kicked off the SSH session (obviously), but the box is not powered off. Somewhere during the shutdown process the box gets stuck. All the time I still see the frozen desktop. The only option is to forcefully power off the PC by pressing the power button for 3 seconds. After a reboot there are no errors in any log. I predominantly use amd64 stable with some exceptions, because amdgpu needs llvm 3.9 and this in turn depends on mesa 13.x.y, both of which are still marked as ~amd64.
@x11, would you be able to give a hand here?
Some new information (probably not very helpful): I re-compiled the kernel and built all graphics-related code as modules instead of building it directly into the kernel. My hope was that I could recover from a freeze by unloading the amdgpu module, or at least get a glimpse of what went wrong when I tried to unload it. After the next freeze occurred I tried the following steps:
(1) Logged in from another box via SSH
(2) systemctl rescue
(3) lsmod still reported that amdgpu was used by 36 processes, although no X11-related processes were running anymore; the monitor still showed the frozen screen
(4) rmmod amdgpu killed the machine entirely; after that even my SSH session was stuck, i.e. the shell never returned to its prompt, and any attempt to log in via SSH a second time failed
(5) Only remaining option: forcefully power off the PC
After reboot: No error messages anywhere
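For anyone repeating the check in step (3), here is a small sketch that pulls the use count out of lsmod output. The helper name module_use_count is made up for illustration. Note that the third lsmod column is the module's reference count, so it does not have to equal the number of running processes.

```shell
#!/bin/sh
# module_use_count NAME (hypothetical helper): read lsmod-style text on
# stdin and print the "Used by" reference count (third column) for NAME.
module_use_count() {
    awk -v mod="$1" '$1 == mod { print $3 }'
}

# Typical use over SSH (commented out so sourcing this file has no side effects):
# lsmod | module_use_count amdgpu
```

A count that stays above zero after the display manager has stopped suggests something still holds a reference to the module, which would fit the failed rmmod in step (4).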
Somewhere else somebody suspected a GPU hang that is likely caused by LLVM or Mesa. He asked me to set the environment variable GALLIUM_DDEBUG="pipelined 2000" for the Xorg process and for the compositor, which is kwin in my case, so that the radeonsi driver might detect the hang and dump some information about it into a file in ~/ddebug_dumps/. How and where do I do that? I use systemd as my init system and the active service file is sddm.service. Presumably I need to modify some unit files, but offhand I have no idea which ones.
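One possible approach, as a sketch and not a confirmed recipe: a systemd drop-in for sddm.service can set the variable for the display manager, and a Plasma 5 environment script can set it for the user session that starts kwin. The helper names and the file name gallium-ddebug are made up for illustration. Whether Xorg actually inherits the variable from sddm is an assumption that should be verified on the running system.

```shell
#!/bin/sh
# write_sddm_dropin DIR (hypothetical helper): create a systemd drop-in
# in DIR that sets GALLIUM_DDEBUG for the service the drop-in belongs to.
write_sddm_dropin() {
    mkdir -p "$1" || return 1
    cat > "$1/gallium-ddebug.conf" <<'EOF'
[Service]
Environment="GALLIUM_DDEBUG=pipelined 2000"
EOF
}

# write_plasma_env DIR (hypothetical helper): create a script in DIR that
# exports the variable; Plasma 5 sources *.sh files from
# ~/.config/plasma-workspace/env at session start.
write_plasma_env() {
    mkdir -p "$1" || return 1
    cat > "$1/gallium-ddebug.sh" <<'EOF'
export GALLIUM_DDEBUG="pipelined 2000"
EOF
}

# Assumed usage (first line as root), left commented out on purpose:
# write_sddm_dropin /etc/systemd/system/sddm.service.d
# write_plasma_env "$HOME/.config/plasma-workspace/env"
# systemctl daemon-reload && systemctl restart sddm.service
```

After the restart, tr '\0' '\n' < /proc/PID/environ | grep GALLIUM (with PID replaced by the process ID of Xorg or kwin) shows whether the variable actually reached each process.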
I have the same situation on my laptop with integrated Intel video: random freezes. I work with an additional monitor, and the freezes happen more often when I switch processes with ALT+TAB or with the mouse in the upper-left corner.

emerge --info
Portage 2.3.3 (python 3.4.5-final-0, default/linux/amd64/13.0/desktop/plasma, gcc-5.4.0, glibc-2.23-r3, 4.9.7-gentoo x86_64)
=================================================================
System uname: Linux-4.9.7-gentoo-x86_64-Intel-R-_Core-TM-_i5-4210M_CPU_@_2.60GHz-with-gentoo-2.3
KiB Mem: 11733756 total, 1817988 free
KiB Swap: 10234900 total, 10234900 free
Timestamp of repository gentoo: Thu, 02 Feb 2017 11:00:01 +0000
sh bash 4.4_p11
ld GNU ld (Gentoo 2.27 p1.0) 2.27
app-shells/bash: 4.4_p11::gentoo
dev-java/java-config: 2.2.0-r3::gentoo
dev-lang/perl: 5.24.1_rc4::gentoo
dev-lang/python: 2.7.12::gentoo, 3.4.5::gentoo
dev-util/cmake: 3.7.2::gentoo
dev-util/pkgconfig: 0.29.1::gentoo
sys-apps/baselayout: 2.3::gentoo
sys-apps/openrc: 0.23.2::gentoo
sys-apps/sandbox: 2.10-r3::gentoo
sys-devel/autoconf: 2.13::gentoo, 2.69-r2::gentoo
sys-devel/automake: 1.13.4-r1::gentoo, 1.14.1-r1::gentoo, 1.15-r2::gentoo
sys-devel/binutils: 2.27::gentoo
sys-devel/gcc: 5.4.0-r2::gentoo
sys-devel/gcc-config: 1.8-r1::gentoo
sys-devel/libtool: 2.4.6-r2::gentoo
sys-devel/make: 4.2.1::gentoo
sys-kernel/linux-headers: 4.9::gentoo (virtual/os-headers)
sys-libs/glibc: 2.23-r3::gentoo
Repositories:
gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /usr/share/gnupg/qualified.txt /usr/share/themes/oxygen-gtk/gtk-2.0 /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=native -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/ http://trumpetti.atm.tut.fi/gentoo/ http://gd.tuwien.ac.at/opsys/linux/gentoo/ http://mirrors.linuxant.fr/distfiles.gentoo.org/ http://ftp.fi.muni.cz/pub/linux/gentoo/"
LANG="en_US.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="X a52 aac acl acpi alsa amd64 berkdb bindist bluetooth branding bzip2 cairo cdda cdr cli consolekit cracklib crypt cups cxx dbus declarative dri dts dvd dvdr emboss encode exif fam firefox flac fortran gdbm gif glamor gpm gtk iconv icu ipv6 jpeg kde kipi lcms ldap libnotify mad mmx mng modules mp3 mp4 mpeg multilib ncurses nls nptl ogg opengl openmp pam pango pcre pdf phonon plasma png policykit ppds qml qt3support qt4 qt5 readline samba sdl seccomp semantic-desktop session spell sse sse2 ssl startup-notification svg tcpd tiff truetype udev udisks unicode upower usb vorbis widgets wxwidgets x264 xattr xcb xcomposite xinerama xml xscreensaver xv xvid zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput keyboard mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-6" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7 python3_4" RUBY_TARGETS="ruby21" USERLAND="GNU" VIDEO_CARDS="intel" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset: CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
Also, I can switch to a console with CTRL+ALT+F1, and only restarting the X process works for me. The session is restarted and I can work again. Reboot works only if I unmount all mounted devices and close the open shell (on CTRL+ALT+F1).
As written in comment #3, I have already created some LLVM/Mesa crash dumps and reported the problem upstream: https://bugs.freedesktop.org/show_bug.cgi?id=98874 However, since the bug was assigned to the radeonsi group, there has been no progress. Perhaps someone else could poke upstream.
Upstream driver bugs like this are really outside the capabilities of a distro to fix.