Bug 849077 - sys-block/noflushd: no longer works correctly with kernel > 5.15 (potential data loss)
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo's Team for Core System packages
Keywords: PMASKED
Reported: 2022-06-01 19:11 UTC by Holger Hoffstätte
Modified: 2024-02-02 16:44 UTC

Description Holger Hoffstätte 2022-06-01 19:11:21 UTC
This is just a heads-up for everyone using noflushd, still the only
reliable solution to powering down rotational drives.

After moving from 5.15 to 5.18 I noticed that dirty writeback cache for any
*unstopped* devices (e.g. system root or other SSDs) would accumulate without
being written periodically; after some time I found that this was due to noflushd
running. It would still shut down the rotational drive(s) it was supposed to
suspend, but due to how it works (slightly fugly) and recent changes in the kernel,
writeback would no longer happen when noflushd was active. Manually performing sync
or stopping noflushd would bring things back to normal.
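
One way to watch for the accumulation described above is to poll the Dirty and Writeback counters in /proc/meminfo. The following is only a minimal sketch: the function names are made up for illustration, and it reports system-wide totals rather than per-device figures.

```python
def parse_meminfo_kb(text, fields=("Dirty", "Writeback")):
    """Return {field: value in kB} for the requested /proc/meminfo fields."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        # Exact name match, so "WritebackTmp" is not confused with "Writeback".
        if sep and name in fields:
            values[name] = int(rest.split()[0])
    return values


def read_dirty_kb(path="/proc/meminfo"):
    """Read the live counters (Linux only)."""
    with open(path) as f:
        return parse_meminfo_kb(f.read())
```

Calling read_dirty_kb() every few seconds with noflushd running and again with it stopped should show whether Dirty keeps growing without being written back.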

After some debugging I think I have found the reason (kernel >5.15 "cleanups"/changes
no longer performing fsync/syncfs on an fd of the raw block device), but need to
confirm this further. This might actually be an unintentional kernel bug,
though I don't think fsync on a raw bdev was ever _really_ supposed to work.
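
For reference, the operation in question (fsync on a file descriptor of the raw block device) looks roughly like the sketch below. This is not code from noflushd; the helper name is hypothetical, a device path such as /dev/sdX is a placeholder, and opening a real block device normally requires root.

```python
import os


def flush_block_device(path):
    """Open `path` read-only and call fsync() on the descriptor.

    `path` would be a block device node; any readable file works
    for trying the call itself.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)  # raises OSError if the kernel rejects the request
    finally:
        os.close(fd)
```

Whether this call actually triggers writeback on a given kernel is exactly what needs confirming; the sketch only shows the call sequence.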

Usually one could just move to using hdparm/sdparm suspend, but - as I just found
out - the firmware in my hard drive is batshit buggy and only accepts -B timer
values from 1-11; 12+ no longer works and never suspends the drive.

Unfortunately noflushd is abandoned on SourceForge, and only available
in CVS. I might try to rescue it to GitHub (help welcome, I have forgotten
all about CVS) but in the meantime this is just a warning, since overly large
amounts of accumulated dirty bytes can result in unexpectedly large data loss
when a power outage or system crash happens.


Reproducible: Always

Steps to Reproduce:
1. start noflushd on kernel >5.15
2. dirty writeback data is no longer written back to *unstopped* devices
Comment 1 Holger Hoffstätte 2022-06-02 07:08:18 UTC
According to Christoph Hellwig, fsync on a block device should
still work, yet whatever is going on in noflushd is now working
differently than before.

Potential replacements that are also less invasive/fragile as they
do not interfere with the kernel writeback mechanism:

- https://sourceforge.net/projects/hd-idle/
  Already in portage, but unmaintained. Lots of copies on GH.

- https://github.com/adelolmo/hd-idle
  An up-to-date & maintained rewrite in Go

- https://github.com/Piezoid/rust-idle
  An up-to-date & maintained rewrite in Rust
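
As far as I understand, tools of this kind leave writeback alone and instead watch per-device I/O counters, spinning a disk down once the counters stop changing. The sketch below illustrates that general idea using the /proc/diskstats layout; it is not taken from any of the projects above, the function names are hypothetical, and the actual spin-down command is deliberately left out.

```python
def parse_diskstats(text):
    """Map device name -> (reads completed, writes completed)."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Columns: major, minor, name, reads completed, reads merged,
        # sectors read, ms reading, writes completed, ...
        if len(parts) >= 8:
            stats[parts[2]] = (int(parts[3]), int(parts[7]))
    return stats


def idle_devices(previous, current):
    """Names whose counters did not change between two snapshots."""
    return sorted(
        name for name, counts in current.items()
        if previous.get(name) == counts
    )
```

A daemon would take a snapshot of /proc/diskstats at a fixed interval and only act on a device after it has shown up as idle for the whole configured timeout.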

Hope this helps.
Comment 2 Holger Hoffstätte 2023-09-09 18:19:42 UTC
Given that there are several alternatives, one of them already in the tree (hd-idle worked well), can we give last rites to noflushd?
Comment 3 Larry the Git Cow gentoo-dev 2023-12-31 10:59:54 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=e23d31bc532d7f2b15ef09b6fdc321cc1619f9f3

commit e23d31bc532d7f2b15ef09b6fdc321cc1619f9f3
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2023-12-31 10:58:23 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2023-12-31 10:59:50 +0000

    package.mask: Last rite sys-block/noflushd
    
    Bug: https://bugs.gentoo.org/849077
    Signed-off-by: Michał Górny <mgorny@gentoo.org>

 profiles/package.mask | 6 ++++++
 1 file changed, 6 insertions(+)
Comment 4 Larry the Git Cow gentoo-dev 2024-02-02 16:44:54 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=3adc0ce822a378279633e7d7127e0e83ad2c5812

commit 3adc0ce822a378279633e7d7127e0e83ad2c5812
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2024-02-02 16:41:15 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2024-02-02 16:41:15 +0000

    sys-block/noflushd: Remove last-rited pkg
    
    Closes: https://bugs.gentoo.org/849077
    Signed-off-by: Michał Górny <mgorny@gentoo.org>

 profiles/package.mask                     |  6 ------
 sys-block/noflushd/Manifest               |  1 -
 sys-block/noflushd/files/noflushd.confd   | 10 ----------
 sys-block/noflushd/files/noflushd.rc6     | 30 ------------------------------
 sys-block/noflushd/metadata.xml           | 11 -----------
 sys-block/noflushd/noflushd-2.8-r1.ebuild | 30 ------------------------------
 6 files changed, 88 deletions(-)