Bug 187095 - >=sys-kernel/suspend2-sources-2.6.20 - Oops and VFS problem while resuming
Summary: >=sys-kernel/suspend2-sources-2.6.20 - Oops and VFS problem while resuming
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: Highest major
Assignee: Alon Bar-Lev (RETIRED)
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-07-30 06:59 UTC by gto_la
Modified: 2007-08-31 07:28 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
Screenshot step1: Resume does not work (img_1032-cut.jpg,291.26 KB, image/jpeg)
2007-08-03 15:01 UTC, gto_la
Screenshot step2: noresume does not help at once (img_1039-cut.jpg,261.88 KB, image/jpeg)
2007-08-03 15:02 UTC, gto_la
Screenshot step1 on other machine: Resume does not work (image/jpeg,244.52 KB, image/jpeg)
2007-08-05 08:53 UTC, gto_la

Description gto_la 2007-07-30 06:59:07 UTC
On several machines, resuming with suspend2-sources-2.6.20 up to and
including suspend2-sources-2.6.22 causes an Oops #0000 or similar.

"Not tainted" and "VLI" are the main keywords I wrote down. The
probability of this failure seems to be at least 60%!

When this happens, a reset and reboot with the option "noresume"
(=noresume2 in my kernel configuration) is only the first step of
system recovery. That boot again forces me to press the reset button,
because the root fs seems to have vanished: since I have IDE drives,
block-special (3,8) is my root fs /dev/hda8, but the kernel always
(in that situation, and then only once) complains about some
"unknown-block(3,8)".

On another system (where the root fs is /dev/hda9) it happily lists
all available partitions, including the very one it complains about:
/dev/hda9!
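
The "unknown-block(3,8)" message names the root device by its major and minor numbers. As a minimal sketch, assuming the legacy IDE numbering (major 3 for hda/hdb, major 22 for hdc/hdd, 64 minors per disk), the pair can be mapped back to a device name as follows. The helper name ide_device_name is hypothetical and the table covers only the first two IDE channels.

```python
# Sketch only: map a legacy IDE (major, minor) pair to its /dev name.
# Assumption: classic IDE numbering, first two channels only.
IDE_MAJORS = {3: ("hda", "hdb"), 22: ("hdc", "hdd")}


def ide_device_name(major, minor):
    """Return e.g. 'hda8' for (3, 8); raise ValueError for other numbers."""
    if major not in IDE_MAJORS or not 0 <= minor <= 127:
        raise ValueError(f"not a known IDE device number: ({major},{minor})")
    disk = IDE_MAJORS[major][minor // 64]  # each disk owns 64 minors
    partition = minor % 64                 # 0 means the whole disk
    return disk if partition == 0 else f"{disk}{partition}"


print(ide_device_name(3, 8))  # hda8
print(ide_device_name(3, 9))  # hda9
```

So (3,8) and (3,9) correspond to the two root partitions named in the description, /dev/hda8 and /dev/hda9.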
Comment 1 Alon Bar-Lev (RETIRED) gentoo-dev 2007-07-30 16:57:57 UTC
Hmmm...
Thanks for the report...
But I cannot help you... It seems an issue only upstream can solve.

I suggest you post a message to the list:
suspend2-devel@lists.suspend2.net

If you have a digital camera, you should take a picture before you post... It will allow Nigel to see what is going on.

Sorry...
Comment 2 gto_la 2007-08-03 15:01:06 UTC
Created attachment 126809
Screenshot step1: Resume does not work
Comment 3 gto_la 2007-08-03 15:02:19 UTC
Created attachment 126811
Screenshot step2: noresume does not help at once
Comment 4 Alon Bar-Lev (RETIRED) gentoo-dev 2007-08-03 16:39:48 UTC
Hello Nigel,
Any thoughts?
Comment 5 Nigel Cunningham 2007-08-03 22:10:20 UTC
Off the top of my head, failing to initialise LZF makes me wonder if the LZF code has been built as a module and not loaded prior to trying to resume. That said, I thought I tested only recently that this is handled properly.

Regarding the second issue, if we're managing to invalidate the image, the /dev node for the device the image header is using must exist. Could /dev/hda6 be missing from the filesystem in the initrd/initramfs?
Comment 6 Nigel Cunningham 2007-08-03 22:18:24 UTC
I should also ask, is there any chance of reproducing this using a kernel with debugging information (before it gets fixed)? I have zero chance of fixing the underlying problem without that.
Comment 7 gto_la 2007-08-04 15:38:35 UTC
No, suspend2 compression is NOT built as a module, AND
I do NOT use any initrd that could cause trouble through missing device files.
I am now compiling a new kernel with some debugging enabled,
but please let me know which options I should enable there!
Comment 8 Nigel Cunningham 2007-08-04 21:59:15 UTC
Is Cryptoapi LZF support built in?
Comment 9 gto_la 2007-08-05 08:45:41 UTC
You made half a point there: Cryptoapi LZF was not built at all, but it is
now built statically into the kernel... without much better results (at least 1 out of 4 suspend-resume cycles kills the running system).
As I suspected LZF to be part of the problem, I had already disabled suspend2 compression
once before, without better results.
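
Whether an option is built in, modular, or absent can be read from the kernel .config. The following is a minimal sketch, assuming the LZF option is called CONFIG_CRYPTO_LZF (the symbol name is an assumption, as is the sample text); the helper config_state is hypothetical.

```python
# Sketch only: report how a symbol is set in kernel .config text.
# Assumption: the symbol name CONFIG_CRYPTO_LZF and the sample below
# are illustrative, not taken from the reporter's system.
def config_state(config_text, symbol):
    """Return the value after '=' (e.g. 'y' or 'm'), or 'n' if unset/absent."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(symbol + "="):
            return line.split("=", 1)[1]
        if line == f"# {symbol} is not set":
            return "n"
    return "n"  # symbols missing from the file are treated as disabled


SAMPLE = """\
CONFIG_CRYPTO=y
# CONFIG_CRYPTO_LZF is not set
CONFIG_CRYPTO_DEFLATE=m
"""

print(config_state(SAMPLE, "CONFIG_CRYPTO_LZF"))      # n
print(config_state(SAMPLE, "CONFIG_CRYPTO_DEFLATE"))  # m
```

A result of "y" means the code is built into the kernel image, which is what resuming without an initrd would need; "m" means it exists only as a module.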
Comment 10 gto_la 2007-08-05 08:53:30 UTC
Created attachment 126924
Screenshot step1 on other machine: Resume does not work
Comment 11 Nigel Cunningham 2007-08-05 09:10:53 UTC
I'm leaving for a conference now, and won't be back until Friday evening GMT+10. I don't think I'll have web access, so please accept my apologies if I'm slow to reply. I'll grab your latest screenshot now; perhaps I'll be able to do some work on it while away.

Nigel
Comment 12 Nigel Cunningham 2007-08-05 09:14:52 UTC
Hmm. Before I go, I've looked at the screenshot. Do you have more than one swap device? If so, you need to have all swap devices accessible at resume time. Unlike [u]swsusp, Suspend2 will use all available swap when writing the image, so you can't assume that the image has only been stored on the device pointed to by resume=. If you don't want a swap partition/file to be used for writing the image, you'll need to swapoff it before starting the cycle.

See you Friday.
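
The swapoff advice above can be sketched as a small script: list every active swap area other than the one named by resume=, and print the swapoff command for each. This is a minimal sketch, assuming the usual /proc/swaps layout (one header line, then one area per line with the path in the first column). The helper name extra_swap_areas and the swap file paths in the sample are hypothetical.

```python
# Sketch only: find swap areas other than the resume device.
# Assumption: input follows the /proc/swaps layout; sample paths are made up.
def extra_swap_areas(proc_swaps_text, resume_device):
    """Return paths of active swap areas that are not the resume device."""
    lines = proc_swaps_text.splitlines()[1:]  # skip the header line
    areas = [line.split()[0] for line in lines if line.strip()]
    return [area for area in areas if area != resume_device]


SAMPLE = """\
Filename                Type        Size     Used  Priority
/dev/hda6               partition   1004052  0     -1
/mnt/ext3/swapfile      file        524280   0     -2
/mnt/reiser/swapfile    file        524280   0     -3
"""

for area in extra_swap_areas(SAMPLE, "/dev/hda6"):
    print("swapoff", area)  # run these before starting the suspend cycle
```

On a live system the text would come from reading /proc/swaps instead of SAMPLE.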
Comment 13 gto_la 2007-08-05 10:11:54 UTC
It is true that I have a main swap device and two swap files!
They reside on ext3 and reiserfs filesystems, which are on the same hard disk as the
swap device. Since all major filesystems are statically
built into the kernel, I don't think that this is a problem, but I
will run some tests without any secondary swap devices/files to make sure.

Comment 14 Alon Bar-Lev (RETIRED) gentoo-dev 2007-08-31 07:28:32 UTC
gto_la: Please reopen if you have more information.