Bug 410631 - x11-drivers/ati-drivers-12.3 DMA bug with intel_iommu
Status: RESOLVED TEST-REQUEST
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Luca Barbato
Reported: 2012-04-03 08:37 UTC by Hanspeter Spalinger
Modified: 2015-07-10 17:13 UTC


Attachments
dmesg output (dmesg.txt,68.75 KB, text/plain)
2012-04-03 08:37 UTC, Hanspeter Spalinger
emerge --info (emerge.info,14.55 KB, text/plain)
2012-04-03 08:37 UTC, Hanspeter Spalinger

Description Hanspeter Spalinger 2012-04-03 08:37:06 UTC
Created attachment 307591 [details]
dmesg output

On my HP Elitebook 8560p the ATI drivers fail with a DMA error, leaving Xorg in an unkillable state:

root      3798 99.0  0.1  81152 12844 tty7     Rs+  10:24   2:03 /usr/bin/Xorg :0 -br -verbose -logverbose 7 -auth /var/run/gdm/auth-for-gdm-Gqu8wg/database -nolisten tcp vt7

Only a reboot helps.
The workaround is to disable the Intel IOMMU, either by passing "intel_iommu=off" on the kernel command line or by not building Intel IOMMU support into the kernel.
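
To confirm that the workaround is actually in effect on a running system, the kernel command line can be inspected. The following is only a minimal sketch: the function name `intel_iommu_disabled` is made up for illustration, and it assumes the option is passed as `intel_iommu=off` (possibly inside a comma-separated option list) and that `/proc/cmdline` is readable.

```python
def intel_iommu_disabled(cmdline: str) -> bool:
    """Return True if the given kernel command line contains intel_iommu=off.

    Hypothetical helper for illustration; it only looks at the boot
    parameters and does not check how the kernel itself was configured.
    """
    for token in cmdline.split():
        key, _, value = token.partition("=")
        # Treat the value as a comma-separated option list, e.g. "on,igfx_off".
        if key == "intel_iommu" and "off" in value.split(","):
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            current = f.read()
    except OSError:
        # Not on Linux, or /proc is not mounted.
        current = ""
    print("intel_iommu=off present:", intel_iommu_disabled(current))
```

A kernel built without Intel IOMMU support will report `False` here even though the IOMMU is not in use, so this check only covers the command-line variant of the workaround.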

This happens with ati-drivers 12.2 and 12.3 (I did not test older versions).
It happens on kernels 3.2.12-gentoo and 3.3.0-rc7. I ran into a different problem with 3.4, so I have not been able to test it there so far.

This does not happen with the open source driver.
Relevant part of dmesg (the complete output is attached):

[   36.563299] [fglrx] ATIF platform detected with notification ID: 0xd0
[   36.812625] fglrx_pci 0000:01:00.0: irq 60 for MSI/MSI-X
[   36.813111] [fglrx] Firegl kernel thread PID: 3922
[   36.813287] [fglrx] Firegl kernel thread PID: 3923
[   36.813472] [fglrx] Firegl kernel thread PID: 3924
[   36.813612] [fglrx] IRQ 60 Enabled
[   36.990811] [fglrx] Gart USWC size:1280 M.
[   36.990813] [fglrx] Gart cacheable size:508 M.
[   36.990816] [fglrx] Reserved FB block: Shared offset:0, size:1000000 
[   36.990817] [fglrx] Reserved FB block: Unshared offset:f8fd000, size:403000 
[   36.990819] [fglrx] Reserved FB block: Unshared offset:3fff4000, size:c000 
[   37.000546] DRHD: handling fault status reg 3
[   37.000550] DMAR:[DMA Read] Request device [01:00.0] fault addr 22485e000 
[   37.000551] DMAR:[fault reason 02] Present bit in context entry is clear
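
For anyone checking their own logs for the same fault, here is a minimal sketch that pulls the DMAR fault details out of dmesg text. It assumes the message format shown in the excerpt above (kernel 3.2); other kernel versions may word these lines differently, and the names `parse_dmar_faults`, `FAULT_RE` and `REASON_RE` are made up for illustration.

```python
import re

# Patterns modelled on the two DMAR lines in this report (assumed format).
FAULT_RE = re.compile(
    r"DMAR:\[DMA (Read|Write)\] Request device \[([0-9a-f:.]+)\] "
    r"fault addr ([0-9a-f]+)"
)
REASON_RE = re.compile(r"DMAR:\[fault reason (\d+)\] (.+)")


def parse_dmar_faults(log: str) -> list:
    """Collect DMAR faults, pairing each fault line with the reason line after it."""
    faults = []
    for line in log.splitlines():
        m = FAULT_RE.search(line)
        if m:
            faults.append({
                "access": m.group(1),
                "device": m.group(2),
                "addr": int(m.group(3), 16),
                "reason": None,
            })
            continue
        m = REASON_RE.search(line)
        # Attach the reason to the most recent fault that has none yet.
        if m and faults and faults[-1]["reason"] is None:
            faults[-1]["reason"] = (int(m.group(1)), m.group(2).strip())
    return faults


SAMPLE = """\
[   37.000546] DRHD: handling fault status reg 3
[   37.000550] DMAR:[DMA Read] Request device [01:00.0] fault addr 22485e000 
[   37.000551] DMAR:[fault reason 02] Present bit in context entry is clear
"""

for fault in parse_dmar_faults(SAMPLE):
    print(fault)
```

The device reported in the fault, 01:00.0, is the same PCI address that appears in the `fglrx_pci 0000:01:00.0` line, which is what ties the DMA fault to the graphics card.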

After some waiting I get this oops:
[  216.732920] [fglrx] ASIC hang happened
[  216.732923] Pid: 3798, comm: Xorg Tainted: P           O 3.2.12-gentoo #1
[  216.732924] Call Trace:
[  216.732947]  [<ffffffffa02c1719>] KCL_DEBUG_OsDump+0x9/0x10 [fglrx]
[  216.732964]  [<ffffffffa02ceacc>] firegl_hardwareHangRecovery+0x1c/0x50 [fglrx]
[  216.732993]  [<ffffffffa0369919>] ? _ZN4Asic9WaitUntil15ResetASICIfHungEv+0x9/0x10 [fglrx]
[  216.733022]  [<ffffffffa03698bc>] ? _ZN4Asic9WaitUntil15WaitForCompleteEv+0x9c/0xf0 [fglrx]
[  216.733050]  [<ffffffffa03644be>] ? _ZN15ExecutableUnits10CPRingIdleE15idle_WaitMethod12_QS_CP_RING_+0x11e/0x1e0 [fglrx]
[  216.733078]  [<ffffffffa036434c>] ? _ZN15ExecutableUnits7PM4idleE15idle_WaitMethod+0x4c/0x90 [fglrx]
[  216.733105]  [<ffffffffa0363eb6>] ? _ZN15ExecutableUnits9assertPM4Eb+0x56/0x70 [fglrx]
[  216.733133]  [<ffffffffa036e1b9>] ? _ZN8AsicR6009assertPM4Eb+0x39/0x80 [fglrx]
[  216.733158]  [<ffffffffa033c983>] ? CMMQS_Initialize_WA+0x183/0x1b0 [fglrx]
[  216.733177]  [<ffffffffa02ee3c2>] ? firegl_cmmqs_init+0x642/0xb80 [fglrx]
[  216.733193]  [<ffffffffa02d14d4>] ? firegl_init_iommu+0x94/0x170 [fglrx]
[  216.733211]  [<ffffffffa02ed616>] ? firegl_cmmqs_createdriver+0x96/0x1a0 [fglrx]
[  216.733214]  [<ffffffff810582c2>] ? capable+0x12/0x20
[  216.733232]  [<ffffffffa02ed580>] ? firegl_uvd_destroy+0x4e0/0x4e0 [fglrx]
[  216.733247]  [<ffffffffa02ca62d>] ? firegl_ioctl+0x1ed/0xf30 [fglrx]
[  216.733256]  [<ffffffffa02bba29>] ? ip_firegl_unlocked_ioctl+0x9/0x10 [fglrx]
[  216.733259]  [<ffffffff81117a4e>] ? do_vfs_ioctl+0x8e/0x500
[  216.733261]  [<ffffffff81106be0>] ? vfs_write+0x120/0x160
[  216.733263]  [<ffffffff81117f0a>] ? sys_ioctl+0x4a/0x80
[  216.733266]  [<ffffffff81401cbb>] ? system_call_fastpath+0x16/0x1b
[  216.733269] pubdev:0xffffffffa055ce40, num of device:1 , name:fglrx, major 8, minor 95. 
[  216.733270] device 0 : 0xffff88022f370000 .
[  216.733272] Asic ID:0x6760, revision:0x3c, MMIOReg:0xffffc900118c0000.
[  216.733273] FB phys addr: 0xc0000000, MC :0xf00000000, Total FB size :0x40000000.
[  216.733275] gart table MC:0xf0f8fd000, Physical:0xcf8fd000, size:0x402000.
[  216.733277] mc_node :FB, total 1 zones
[  216.733278]     MC start:0xf00000000, Physical:0xc0000000, size:0xfd00000.
[  216.733279]     Mapped heap -- Offset:0x0, size:0xf8fd000, reference count:1, mapping count:0,
[  216.733281]     Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
[  216.733282]     Mapped heap -- Offset:0xf8fd000, size:0x403000, reference count:1, mapping count:0,
[  216.733284] mc_node :INV_FB, total 1 zones
[  216.733285]     MC start:0xf0fd00000, Physical:0xcfd00000, size:0x30300000.
[  216.733286]     Mapped heap -- Offset:0x302f4000, size:0xc000, reference count:1, mapping count:0,
[  216.733288] mc_node :GART_USWC, total 3 zones
[  216.733289]     MC start:0x40100000, Physical:0x0, size:0x50000000.
[  216.733290]     Mapped heap -- Offset:0x0, size:0x2000000, reference count:1, mapping count:0,
[  216.733292] mc_node :GART_CACHEABLE, total 3 zones
[  216.733293]     MC start:0x10400000, Physical:0x0, size:0x2fd00000.
[  216.733294]     Mapped heap -- Offset:0x0, size:0x200000, reference count:1, mapping count:0,
[  216.733296]     Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
[  216.733298] GRBM : 0xa0003828, SRBM : 0x200000c0 .
[  216.733300] CP_RB_BASE : 0x401000, CP_RB_RPTR : 0x10 , CP_RB_WPTR :0x10.
[  216.733302] CP_IB1_BUFSZ:0x0, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x0.
[  216.733304] last submit IB buffer -- MC :0x0. Can't found mapped physical page for this MC .
[  216.733305] Dump the trace queue.
[  216.733306] End of dump

As I did not find any bug reporting tool on ATI's website, I am reporting this here so that other people with the same problem can find it.
Comment 1 Hanspeter Spalinger 2012-04-03 08:37:52 UTC
Created attachment 307593 [details]
emerge --info
Comment 2 Jeroen Roovers (RETIRED) gentoo-dev 2012-04-03 20:43:50 UTC
Please post your output of `emerge -vpq x11-drivers/ati-drivers' in a comment.
Comment 3 Hanspeter Spalinger 2012-04-03 20:50:01 UTC
lisa ~ # emerge -vpq x11-drivers/ati-drivers
[ebuild   R   ] x11-drivers/ati-drivers-12.3  USE="modules (multilib) qt4 -debug -pax_kernel -static-libs"
Comment 4 Enrico Tagliavini 2012-04-04 07:55:57 UTC
Not a Gentoo bug. I will add an ewarn about this to the ebuild and resolve this bug as upstream once that is done.

Thank you for the report.

By the way, if you want, you can report this to the unofficial AMD bugzilla at http://ati.cchtml.com/ and add a "See Also" link here.

I will also try to reproduce the issue on my computer, but I doubt I will be able to. I already have Intel IOMMU enabled as a module and it doesn't get autoloaded, so I guess I don't have the hardware to test with.
Comment 5 Enrico Tagliavini 2012-04-26 18:04:13 UTC
I changed my mind. I will leave the bug open. That is far better than two lines of ewarn. Googling the error should quickly point here.

Btw I just uploaded 12.4 in the X11 overlay if you want to test it.
Comment 6 Hanspeter Spalinger 2012-04-27 09:48:14 UTC
The same problem with 12.4 :-(
Comment 7 Manuel Rüger (RETIRED) gentoo-dev 2015-07-10 17:13:31 UTC
Closing this bug, as 12.2 is no longer available in the tree. Please reopen if you can reproduce this with a recent version.