On some machines with high I/O or interrupt rates, some tools from procps fail with the message "Unknown HZ value!", which may cause problems in many scripts. See Debian bug 460331 for further explanation.

Diff for procps-3.2.8.ebuild:

--------------
--- procps-3.2.8.ebuild 2009-11-23 06:06:46.000000000 +0100
+++ procps-3.2.8.ebuild.new 2010-02-01 18:10:00.538338887 +0100
@@ -48,6 +48,11 @@
     if ! use n32 ; then
         epatch "${FILESDIR}"/${PN}-3.2.6-mips-n32_isnt_usable_on_mips64_yet.patch
     fi
+
+    # Patch to fix an error in procps with newer kernels to get rid
+    # of the message "Unknown HZ value!".
+    # See Debian bug 460331.
+    epatch "${FILESDIR}"/30_sysinfo_7numbers.dpatch
 }
 
 src_compile() {
--------------

The patch attached to the Debian bug report should be placed (unmodified) in files. I'm attaching the patch.
Created attachment 218096 [details, diff] patch from debian (30_sysinfo_7numbers.dpatch)
Created attachment 218098 [details, diff] the patch for the ebuild (procps-3.2.8.ebuild.diff)
Thank you for the report. Is there an upstream bug report or fix?
The patch is not in the repo at SourceForge, and there seems to be no active bug tracker. So someone should mail it to one of the mailing lists. Maybe the Debian people forgot to do that.
Same thing on a Marvell SheevaPlug:

 * Configuring kernel parameters ...Unknown HZ value! (68) Assume 100.
 [ ok ]
 * Cleaning /var/lock, /var/run ...Unknown HZ value! (67) Assume 100.
Unknown HZ value! (67) Assume 100.
Unknown HZ value! (67) Assume 100.
Unknown HZ value! (67) Assume 100.
 [ ok ]
...
Unknown HZ value! (85) Assume 100.

It's harder to see the messages when the system is idle, but they pop up on every boot.

sys-process/procps-3.2.8
I tried that patch, but the message still showed up. I found a different patch at https://bugs.launchpad.net/debian/+source/procps/+bug/364656, which I'm applying in the Efika overlay ( http://github.com/steev/efikamx ), and it appears to have fixed the issue for me. I had to rework it, though, as it wouldn't apply cleanly for some reason.
The problem is that procps reads some values from /proc/somewhere, and every time something changes in the kernel, procps needs to be changed too. That interface doesn't seem to be very stable. I've tried to send the maintainer an e-mail, but got no response. And the mailing list is full of spam. So I don't know if upstream exists. ;)
that's the whole point of picking AT_CLKTCK out of the ELF auxv the kernel provides. really, all __linux__ and __ELF__ systems should be preferring that over anything /proc/ has to say. so a better fix imo would be to first walk the stack and find the AT_CLKTCK value before even thinking of looking in /proc/.
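To illustrate the idea of asking the ELF auxiliary vector first, here is a minimal sketch. It assumes Linux with glibc 2.16 or later, where getauxval() from <sys/auxv.h> performs the lookup; procps 3.2.8 predates that call and walks the stack itself in find_elf_note(). The helper name clock_ticks_per_second() is made up for this example and is not part of procps.

```c
#include <sys/auxv.h>   /* getauxval(), AT_CLKTCK (Linux, glibc >= 2.16) */
#include <unistd.h>     /* sysconf(), _SC_CLK_TCK */

/* Hypothetical helper, not procps code: returns the kernel's clock tick
 * rate without reading anything from /proc. */
long clock_ticks_per_second(void)
{
    /* getauxval() returns 0 when the requested entry is not present. */
    unsigned long hz = getauxval(AT_CLKTCK);
    if (hz != 0)
        return (long)hz;

    /* Fallback that still avoids parsing /proc. */
    return sysconf(_SC_CLK_TCK);
}
```

Calling clock_ticks_per_second() from any program should give a positive value on such a system, typically 100.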
The Debian patch doesn't fix the problem on my system. I think I know why, too.

I've been looking into this and I think the problem we are all experiencing actually stems from something else entirely. The Debian patch addresses the issue by modifying the old_Hertz_hack() function. I've read the code in proc/sysinfo.c and it seems that old_Hertz_hack() is only meant to be there as a fallback. It's not even supposed to be used on modern (>2.4.0) Linux systems.

(In reply to comment #8)
> that's the whole point of picking AT_CLKTCK out of the ELF auxv the kernel
> provides. really, all __linux__ and __ELF__ systems should be preferring that
> over anything /proc/ has to say. so a better fix imo would be to first walk
> the stack and find the AT_CLKTCK value before even thinking of looking in
> /proc/.
>

By design, it should be doing that. But there's a problem. This is init_libproc() from proc/sysinfo.c:

static void init_libproc(void) __attribute__((constructor)) {
  ...
  if(linux_version_code > LINUX_VERSION(2, 4, 0)){
    Hertz = find_elf_note(AT_CLKTCK);
    if(Hertz!=NOTE_NOT_FOUND) return;
    fputs("2.4+ kernel w/o ELF notes? -- report this\n", stderr);
  }
  old_Hertz_hack();
}

The line that calls find_elf_note(AT_CLKTCK) _never_ gets executed. The function _always_ falls back on old_Hertz_hack(). The problem is that `linux_version_code` is still zero at this point because init_Linux_version(), which initialises it, hasn't been called yet. Why? Because, like init_libproc(), init_Linux_version() is declared with __attribute__((constructor)):

static void init_Linux_version(void) __attribute__((constructor));

These functions are both called automatically before main(), but because no priority is specified for either, they are being executed in the wrong order.

My suggestion is to change the function declarations and add a priority value to control the order of execution.
-static void init_libproc(void) __attribute__((constructor));
+static void init_libproc(void) __attribute__((constructor(100)));

-static void init_Linux_version(void) __attribute__((constructor));
+static void init_Linux_version(void) __attribute__((constructor(200)));
(In reply to comment #9)
> -static void init_libproc(void) __attribute__((constructor));
> +static void init_libproc(void) __attribute__((constructor(100)))
>
> -static void init_Linux_version(void) __attribute__((constructor));
> +static void init_Linux_version(void) __attribute__((constructor(200)));
>

Sorry, that would be backwards. Lower priority numbers run first, so this would be correct:

-static void init_libproc(void) __attribute__((constructor));
+static void init_libproc(void) __attribute__((constructor(200)));

-static void init_Linux_version(void) __attribute__((constructor));
+static void init_Linux_version(void) __attribute__((constructor(100)));
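The effect of constructor priorities can be sketched in isolation. The following is not the procps code: init_version(), init_lib(), version_code, used_auxv_branch, and the VERSION() macro are hypothetical stand-ins for the real names, and the kernel version is a hard-coded placeholder. It assumes GCC 4.3 or later (or a compatible compiler). Note that GCC documents priorities 0 to 100 as reserved for the implementation, so this sketch starts at 101.

```c
/* Hypothetical stand-in for LINUX_VERSION(); not taken from procps. */
#define VERSION(x, y, z) (0x10000 * (x) + 0x100 * (y) + (z))

static int version_code;      /* stays 0 until init_version() has run */
static int used_auxv_branch;  /* 1 if init_lib() saw a valid version  */

/* Lower priority numbers run first, so init_version() is guaranteed
 * to run before init_lib(), regardless of link order. */
static void init_version(void) __attribute__((constructor(101)));
static void init_lib(void) __attribute__((constructor(102)));

static void init_version(void)
{
    version_code = VERSION(2, 6, 32);  /* placeholder, not read from uname */
}

static void init_lib(void)
{
    /* Mirrors the shape of the check in init_libproc(): it only works
     * if the version has already been initialised. */
    if (version_code > VERSION(2, 4, 0))
        used_auxv_branch = 1;
}
```

Both constructors run before main(), so by the time main() starts, used_auxv_branch should be 1. Swapping the two priority values reproduces the failure described above, where the check sees a version code of zero.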
Created attachment 252457 [details, diff] patch: call init_Linux_version() before init_libproc()
I got a message like this on every startup after compiling procps with make 3.82. The default linking order is different from the one with make 3.81. Restoring it appears to take care of it. Here's one way of doing that: http://git.exherbo.org/?p=arbor.git;a=blobdiff;f=packages/sys-process/procps/files/procps-3.2.8-make-3.82.patch;h=b64693f1c620033b98e0fa0452ba6c091f586121;hp=e52fc375083980278c5f2b1490ee9d0fea52506c;hb=913d4b625c48c28fc4a06ffbba30e04006df5e44;hpb=a0f34f3cc9bf4795713defefdf47f4f4a0da2ae2
I think linking order determines the order in which constructor functions are called. So enforcing a particular linking order would indeed solve the problem, as would assigning priorities to the constructor functions.
that make patch is simply a hack that ignores the real issue. Chris's examination sounds pretty good/sane to me. the only sticking point would be whether upstream would accept prioritized constructors. but i dont see why they wouldnt. although, the way constructors are defined, all prioritized ones are executed before non-prioritized ones. so simply giving init_Linux_version() a priority value at all will guarantee it gets executed first. another way to address the issue would be to turn linux_version_code() into a function that cached its result so that it always returned the correct value.
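The cached-function alternative mentioned here could look roughly like the sketch below. It is only an illustration: the name linux_version_code_cached() and this LINUX_VERSION() definition are assumptions made for the example, not the procps implementation, and the static cache is not thread-safe.

```c
#include <stdio.h>        /* sscanf() */
#include <sys/utsname.h>  /* uname(), struct utsname */

/* Assumed encoding for this sketch; not copied from procps. */
#define LINUX_VERSION(x, y, z) (0x10000 * (x) + 0x100 * (y) + (z))

/* Hypothetical replacement for the linux_version_code variable:
 * the value is computed on first use, so no caller can ever observe
 * an uninitialised zero because of constructor ordering. */
int linux_version_code_cached(void)
{
    static int cached = -1;  /* -1 means "not computed yet" */

    if (cached < 0) {
        struct utsname uts;
        int x = 0, y = 0, z = 0;

        cached = 0;  /* result if uname() or parsing fails */
        if (uname(&uts) == 0 &&
            sscanf(uts.release, "%d.%d.%d", &x, &y, &z) >= 2) {
            if (z > 255)
                z = 255;  /* keep the patch level within its byte */
            cached = LINUX_VERSION(x, y, z);
        }
    }
    return cached;
}
```

With something like this, init_libproc() would compare linux_version_code_cached() against LINUX_VERSION(2, 4, 0) and get the right answer no matter which constructor happens to run first.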
Created attachment 254257 [details, diff] procps-3.2.8-linux-ver-init.patch slight tweak of Chris's patch. can people experiencing this bug try just this patch and see if it fixes things for them ?
(In reply to comment #15)
> Created an attachment (id=254257) [details]
> procps-3.2.8-linux-ver-init.patch
>
> slight tweak of Chris's patch. can people experiencing this bug try just this
> patch and see if it fixes things for them ?
>

That patch works for me.
Incidentally, I have already submitted my patch upstream (to Albert Cahalan via sourceforge) but there has been no response. Albert is the admin of procps on sf and the sole committer of code. But he hasn't committed anything for 181 days (as I write this). So I think this bug might not be fixed upstream for quite a while.

(In reply to comment #14)
> another way to address the issue would be to turn linux_version_code() into a
> function that cached its result so that it always returned the correct value.

That is a good idea, but if we're free to rethink everything, wouldn't it be better to just not use constructors at all? The init functions could just be called from main().
whichever patch gets merged doesnt matter to me as long as it gets fixed. but i think you're right that you might not get a response as he has been AFK from development for a while. the point of using constructors is that this is a library. calling it from main() would require every app that happens to use the library to be updated.
(In reply to comment #18)
> whichever patch gets merged doesnt matter to me as long as it gets fixed. but
> i think you're right that you might not get a response as he has been AFK from
> development for a while.

I've filed a report with the Debian bug tracking system. Craig Small is the maintainer of procps there. And there are some commits by a csmall on sourceforge. There is hope.

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=603759

> the point of using constructors is that this is a library. calling it from
> main() would require every app that happens to use the library to be updated.

I was thinking that apps could explicitly call init_libproc(), which could call init_Linux_version(). But I didn't realise that there were apps outside of procps that used libproc.
Created attachment 254687 [details, diff] debian already has a patch for this it also needs gnu-kbsd-version.patch
Created attachment 254689 [details, diff] above patch can't be applied without this one. But it has nothing to do with this bug.
yeah, i dont like their approach. it's fraught with problems if other constructors are written that need the linux code. i think yours makes more sense and is a lot cleaner. Alexander: you're the original reporter. can you please try the patch i posted ?
This cropped up today when I upgraded to procps-3.2.8-r1.
(In reply to comment #23)
> This cropped up today when I upgraded to procps-3.2.8-r1.
>

Patch fixes the issue.

This should be committed, as my original Googling told me that this error indicates a rootkit. We don't want people panicking, thinking that they have a rootkit, when it is a simple error.
(In reply to comment #24)
> (In reply to comment #23)
> > This cropped up today when I upgraded to procps-3.2.8-r1.
> >
>
> Patch fixes the issue.
>
> This should be committed, as my original Googling told me that this error
> indicates a rootkit.
> We don't want people panicking thinking that they have a rootkit when it is a
> simple error.
>

Indeed we don't. I'd like to congratulate you, sir, on your very British way of confirming a bug report. Doubt not that we shall go forth and fix this simple mistake.
I've tried procps-3.2.8-linux-ver-init.patch and it does fix the issue. I assume applying 30_sysinfo_7numbers.dpatch doesn't cause any harm, so maybe this could be applied too. Or just use the constructor fix until upstream has cleaned up the mess.
should be all set in procps-3.2.8-r2 then ... thanks Chris & Alexander
(In reply to comment #26)
> I assume applying 30_sysinfo_7numbers.dpatch doesn't cause any harm, so maybe
> this could be applied too.

It wouldn't do any harm, but it wouldn't do any good either. That patch adds support for newer kernels to a function that isn't used with newer kernels.

> Or just use the constructor-fix until upstream has cleaned up the mess.

It's not really a mess. And this bug wasn't entirely upstream's fault. It was caused by a change in GNU Make. As of version 3.82, GNU Make no longer sorts lists of file names when expanding automatic variables or wildcards. That change in behaviour revealed a problem in procps. Its constructors expected to be called in a certain order. So we specified the order.

Also, upstream is a man named Albert Cahalan. And he's gone fishing.
the bug lies wholly in procps. the fact that a newer version of make caused the bug to be exposed is purely incidental. the ELF spec is clear that unprioritized constructors may be run in any order the ldso feels like. if the glibc ldso randomly sorted the constructors at runtime and executed them, the bug would be exposed as well. or if the linker just happened to assemble the input objects into the output in a different order. so yes, this bug is entirely upstream's fault.
You're right, this bug actually is entirely upstream's fault. I wouldn't place any of the blame on GNU Make. When I said that this bug was "caused" by a change in Make, what I should have said was "revealed" or "triggered".
Nobody has to be blamed for anything. Shit happens ;) Errors as well, at least until fault-tolerant machines replace us. When I wrote "until upstream has cleaned up the mess" I didn't want to blame anyone; I just wanted to note that this bug has long been known and is still not fixed upstream.
(In reply to comment #31)
> Nobody has to be blamed for something. Shit happens ;) Errors as well, at least
> until fault tolerant machines replace us.

I like your philosophy.

> When I've written "until upstream has cleaned up the mess" I didn't want to
> blame someone, I just wanted to note, that this bug is long known not fixed
> upstream.

I was only trying to explain how this bug came to be a problem. I didn't mean to throw your words back at you.

Will someone adopt procps?

From: csmall@debian.org Craig Small
To: chrsclmn@gmail.com Chris Coleman
Date: Fri, 19 Nov 2010 11:22:04 +0000
Subject: Re: Bug#603759 closed by Craig Small <csmall@debian.org> (Bug#603759: fixed in procps 1:3.2.8-10)

On Thu, Nov 18, 2010 at 10:11:37PM +0000, Chris Coleman wrote:
> All seems quiet on procps.sf.net. Is Albert Cahalan still active? Do
> you send him your patches?

I don't think he is active, the times he's got patches he got them from the deb.

 - Craig
Created attachment 261747 [details, diff] rewrite to fix compilation with <gcc-4.3 Apparently assigning priorities to constructors wasn't possible until gcc-4.3. See bug #353630. I've rewritten the patch so that init_libproc() calls init_Linux_version().
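The approach described in this comment, one constructor calling the other initialiser directly, needs no priority support from the compiler. A minimal sketch of the pattern follows. It is not the attached patch: init_version(), init_lib(), version_code, used_auxv_branch, and VERSION() are hypothetical names, and the kernel version is a hard-coded placeholder.

```c
/* Hypothetical stand-in for LINUX_VERSION(); not taken from procps. */
#define VERSION(x, y, z) (0x10000 * (x) + 0x100 * (y) + (z))

static int version_code;      /* stays 0 until init_version() has run */
static int used_auxv_branch;  /* 1 if init_lib() saw a valid version  */

/* Plain function, no constructor attribute. The guard makes it safe
 * to call from more than one place. */
static void init_version(void)
{
    static int done;
    if (done)
        return;
    done = 1;
    version_code = VERSION(2, 6, 32);  /* placeholder, not read from uname */
}

/* Unprioritized constructor: accepted by compilers that reject
 * constructor(N), which per this comment includes gcc before 4.3. */
static void init_lib(void) __attribute__((constructor));

static void init_lib(void)
{
    init_version();  /* explicit call fixes the order */
    used_auxv_branch = (version_code > VERSION(2, 4, 0));
}
```

Because the ordering is expressed as an ordinary function call, it does not depend on link order or on how the dynamic loader sequences constructors.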
re-open due to comment #33
procps-3.2.8_p10-r1 should work fine