Bug 485836 - =sys-kernel/gentoo-sources-3.11.0 - btrfs errors in fs/btrfs/inode.c record_one_backref
Status: RESOLVED OBSOLETE
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-09-24 09:32 UTC by Liam Dennehy
Modified: 2019-07-14 11:11 UTC
CC List: 1 user

See Also:
Package list:
Runtime testing required: ---


Attachments
Kernel Config 3.11 (kernelconfig,74.34 KB, text/plain)
2013-09-24 09:34 UTC, Liam Dennehy
Obligatory emerge --info (emerge-info.txt,4.66 KB, text/plain)
2013-09-24 09:36 UTC, Liam Dennehy
/proc/self/mounts (mounts.txt,2.30 KB, text/plain)
2013-09-24 11:39 UTC, Liam Dennehy
dmesg output (dmesg,197.83 KB, text/plain)
2013-09-24 11:45 UTC, Liam Dennehy
Upstream patch (Btrfs-reset-ret-in-record_one_backref.patch,1009 bytes, patch)
2013-09-25 11:17 UTC, emil karlson
BTRFS oops (IMG_20131004_104005.png,33.31 KB, image/png)
2013-10-07 09:19 UTC, Liam Dennehy

Description Liam Dennehy 2013-09-24 09:32:54 UTC
The system is running with periodic snapshots, and is taking snaps at 5min, hourly, daily, weekly and monthly intervals. A cron job creates new snaps and cleans expired snaps.
No apparent loss of functionality, system instability or filesystem corruption.


Reproducible: Sometimes

Steps to Reproduce:
1. Linux on btrfs root filesystem, separate subvolume
2. Other factors correlate weakly, but user activity seems to be the trigger (zero instances overnight)

Actual Results:  
Frequent kernel oops, indicating:
WARNING: CPU: 3 PID: 22121 at fs/btrfs/inode.c:2206 record_one_backref+0x3c9/0x440()

Expected Results:  
Kernel should continue to service the filesystem without oops

The system runs a variety of workloads (tomcat, mysql). Errors do not coincide with any particular task. They appear to be absent overnight when no clients are active, although snapshots also contain far fewer changes at that time.
The issue also seems to have occurred on the previous 3.10.10 kernel.
Kernel error messages do not coincide directly with the cron snap tasks, but this may be relevant.
sys-fs/btrfs-progs-0.20_rc1_p358
sys-kernel/gentoo-sources-3.11.0
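
To judge how often the warning fires, the kernel log can be tallied by warning site. The following is a minimal sketch, not part of the original report: the regular expression is modeled only on the single WARNING line quoted above, and the function name `count_warnings` is hypothetical.

```python
import re
from collections import Counter

# Pattern modeled on the WARNING line quoted in this report; other kernel
# versions may format the line differently (assumption).
WARN_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) (?P<func>\w+)\+"
)


def count_warnings(lines):
    """Count WARNING entries per (source file, line number, function)."""
    sites = Counter()
    for text in lines:
        match = WARN_RE.search(text)
        if match:
            site = (match["file"], int(match["line"]), match["func"])
            sites[site] += 1
    return sites
```

Feeding it the lines of a saved dmesg dump, for example `count_warnings(open("dmesg"))`, would give one counter entry per distinct warning site, which makes it easy to see whether `record_one_backref` is the only source.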
Comment 1 Liam Dennehy 2013-09-24 09:34:00 UTC
Created attachment 359346 [details]
Kernel Config 3.11
Comment 2 Liam Dennehy 2013-09-24 09:36:18 UTC
Created attachment 359348 [details]
Obligatory emerge --info
Comment 3 Liam Dennehy 2013-09-24 11:39:19 UTC
Created attachment 359352 [details]
/proc/self/mounts

Output of /proc/self/mounts
Comment 4 Liam Dennehy 2013-09-24 11:45:51 UTC
Created attachment 359354 [details]
dmesg output
Comment 5 emil karlson 2013-09-25 11:17:50 UTC
Created attachment 359424 [details, diff]
Upstream patch
Comment 6 emil karlson 2013-09-25 11:23:28 UTC
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=50f1319cb5f7690e4d9de18d1a75ea89296d0e53

Linking the upstream commit, as my patch attachment apparently did not include the whole email with the commit name.
Comment 7 Liam Dennehy 2013-09-26 14:38:40 UTC
Thanks Emil, I've built a new kernel with this patch incorporated (wow, small) and I'll let it run for a few days, hopefully with zero indications.
Comment 8 Liam Dennehy 2013-09-30 10:40:25 UTC
After a day and a half, the referenced errors are absent. However, the server oopsed on the same day the patched kernel was booted. Nothing was captured in /var/log/messages because the /var partition went read-only, so there are no further details.

Will continue to monitor for the remainder of the week.
Comment 9 Liam Dennehy 2013-10-07 09:19:41 UTC
Created attachment 360296 [details]
BTRFS oops

Second instance of a kernel oops since applying the recommended patch. I am fairly certain both oopses are the same, but I did not note the details of the first instance.

Apologies for the photo, but login fails, so I cannot run dmesg, and this error doesn't make it through syslog-ng to disk (which happens to be on the volume being snapped).
Comment 10 Mike Pagano gentoo-dev 2013-11-05 01:12:12 UTC
Can you please try to reproduce with 3.12?
Comment 11 Liam Dennehy 2013-11-15 10:49:57 UTC
After two days on 3.12.0-gentoo, the previously described log errors have not appeared. The occasional (previously very frequent) lines of:

kernel: [782682.259647] BTRFS debug (device dm-1): unlinked 2 orphans

also seem to have disappeared entirely.

Will continue to monitor for another week before considering this resolved.
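
While monitoring, the orphan messages can be summed per hour of uptime to confirm they have really stopped. This is a rough sketch, assuming the log lines keep the format of the single example quoted above; the function name `orphans_per_hour` is hypothetical, and the bracketed timestamp is taken to be seconds since boot.

```python
import re
from collections import Counter

# Pattern modeled on the "unlinked N orphans" line quoted in this comment.
ORPHAN_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] BTRFS debug \(device (?P<dev>[^)]+)\): "
    r"unlinked (?P<count>\d+) orphans"
)


def orphans_per_hour(lines):
    """Sum unlinked-orphan counts per (device, hour since boot)."""
    totals = Counter()
    for text in lines:
        match = ORPHAN_RE.search(text)
        if match:
            hour = int(float(match["ts"]) // 3600)
            totals[(match["dev"], hour)] += int(match["count"])
    return totals
```

An empty result for the hours since the 3.12 kernel was booted would support the observation that these lines have disappeared.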
Comment 12 Mike Pagano gentoo-dev 2014-01-29 18:55:17 UTC
It's been two months. If things are still not working, please comment here and I will reopen.