dev-util/systemtap-2.5 ebuild removes -Werror flag in src_prepare. That leads to unsuitable kernel configurations being accepted by the ebuild.

Chat exchange with jistone@redhat.com on #systemtap @ freenode:

<jistone> some of our autoconf tests depend on -Werror
<jistone> so if you run: stap -e 'probe process.begin { }' -u -p4 --disable-cache --vp 0005 |& grep autoconf-tracepoint
[21:16] <jistone> you should see lines like <omitted>
<bendlas> http://ur1.ca/i703f
<jistone> right, so you get just a warning
<jistone> I get: error: passing argument 1 of ‘tracepoint_probe_register’ from incompatible pointer type [-Werror]
<jistone> we need that explicit failure
<jistone> stap actually has a secret command line option to disable -Werror dynamically. it's secret because it breaks stuff like this
<jistone> via commit c72dd3c713cc2b21eacae39ce6898f8e5c14e0ad
<jistone> it's possible gentoo once ran into a bleeding-edge gcc that tripped on -Werror unnecessarily
<jistone> but we usually keep on top of that through fedora rawhide
<jistone> we'd rather see patches to fix the warning, instead of kill -Werror

Reproducible: Always

Steps to Reproduce:
~# stap -e 'probe process.begin { }' -u -p4 --disable-cache --vp 0005 |& grep autoconf-tracepoint

Actual Results:
warning: passing argument 1 of ‘tracepoint_probe_register’ from incompatible pointer type [enabled by default]

Expected Results:
error: passing argument 1 of ‘tracepoint_probe_register’ from incompatible pointer type [-Werror]

Please contact jistone@redhat.com for further information.
Amended from #systemtap:

<jistone> does gentoo let people stay on older kernels? might be relevant to mention that warning/error is only expected on 3.15+
<jistone> it might suffice to just exempt buildrun.cxx from the sed, if they want to keep the rest
Tests that expect -Werror to work for them are pretty retarded. What if $MY_GCC throws a warning for a reason that upstream didn't anticipate?
Maybe upstream would like to take a look at the resulting bug reports?
It's fine if you want to force out the -Werror in stap's own makefiles and configure scripts. We strive hard for warning-free code, and these days we build weekly development snapshots on Fedora rawhide that ought to catch even new compiler diagnostics early. But if you don't want that downstream, fine. The two build-failure bugs you cite were both in building stap itself, so relaxing here would be enough.

However, in the kernel feature tests for modules that stap compiles itself, it's necessary to get things exactly right. In this case we find an "incompatible pointer type" because tracepoint_probe_register() changed in kernel 3.15. Leaving this as just a warning leads to laughably wrong code. We're prepared to generate kernel modules for before and after this change, but we must detect it.

As mentioned in comment 1, it would suffice to just let buildrun.cxx have its -Werror. If you want to narrow that even further, the CHECK_BUILD line in compile_pass() is the particularly necessary piece.
To be absolutely clear, I'm not talking about -Werror when compiling buildrun.cxx. The code in buildrun.cxx invokes kernel module builds itself, and passes -Werror in the configuration therein. That's what we need to keep.
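To illustrate why the feature test needs an explicit failure, here is a minimal shell sketch. It is not stap's actual build logic (that lives in the kernel module build that buildrun.cxx sets up); `probe_feature` and `STAPCONF_EXAMPLE_FEATURE` are hypothetical names chosen for this example, and the compiler is passed in as a parameter.

```shell
# Sketch only: probe_feature and STAPCONF_EXAMPLE_FEATURE are hypothetical
# names, not SystemTap's real identifiers.
#
# Compile a test source with -Werror. Only if it compiles cleanly, print a
# #define recording that the kernel offers the API shape being tested.
probe_feature() {
    cc=$1        # compiler command, e.g. "gcc"
    src=$2       # C file that exercises the kernel API in question
    define=$3    # macro name to emit on success

    # With -Werror, an "incompatible pointer type" diagnostic becomes a hard
    # failure, so a changed function signature makes this command exit
    # non-zero and no #define is printed.
    if "$cc" -Werror -c "$src" -o /dev/null >/dev/null 2>&1; then
        printf '#define %s 1\n' "$define"
    fi
}
```

If -Werror is stripped from the compile command, the test file builds with only a warning on both old and new kernels, the #define is emitted unconditionally, and the generated module is built against the wrong signature.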
+1 here, I spent about a week trying to figure out why systemtap wasn't working on my machine, only to discover that it was caused by this (thanks jistone!)

Just to add some google juice for the next poor soul, a direct indicator of this bug is if you run

stap -e 'probe process("/bin/ls").function("main") { printf("hello\n"); }'

and you get a bunch of undefined symbols (in dmesg) like:

kernel: [401161.200950] stap_0f6f433748f4f487c8dd44fd546b89e_6721: Unknown symbol __tracepoint_sched_process_exit (err 0)
kernel: [401161.201012] stap_0f6f433748f4f487c8dd44fd546b89e_6721: Unknown symbol __tracepoint_sched_process_exec (err 0)
kernel: [401161.201041] stap_0f6f433748f4f487c8dd44fd546b89e_6721: Unknown symbol __tracepoint_sys_enter (err 0)
kernel: [401161.201055] stap_0f6f433748f4f487c8dd44fd546b89e_6721: Unknown symbol __tracepoint_sys_exit (err 0)
kernel: [401161.201067] stap_0f6f433748f4f487c8dd44fd546b89e_6721: Unknown symbol __tracepoint_sched_process_fork (err 0)

Then fork the ebuild to your overlay and remove the -Werror stripping and your tracepoints will work again.
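A quick way to check a kernel log for this symptom, as a rough sketch. The helper name `has_tracepoint_symptom` is made up for this example, and the pattern is only inferred from the log lines quoted in this comment, so it may miss differently formatted messages.

```shell
# Hypothetical helper: reads kernel log text on stdin and exits 0 if any
# line looks like a stap module failing to resolve a __tracepoint_* symbol.
has_tracepoint_symptom() {
    grep -Eq 'stap_[0-9a-f_]+: Unknown symbol __tracepoint_'
}

# Possible usage (needs permission to read the kernel log):
#   dmesg | has_tracepoint_symptom && echo "stap module hit unresolved tracepoints"
```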
Created attachment 415670
remove buildrun.cxx from -Werror patching

SystemTap really needs to keep -Werror in its buildrun.cxx source to properly detect differences in kernel configurations. The rest of the -Werror cases should be harmless to sed remove.
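For readers without the attachment at hand, the idea can be sketched roughly as below. This is not the attached patch and not the ebuild's actual sed call, neither of which is reproduced in this report; the function name `strip_werror_except_buildrun` is hypothetical, and `sed -i` assumes GNU sed.

```shell
# Hypothetical sketch: strip -Werror from every file under a source tree
# except buildrun.cxx, which must keep the flag for its kernel feature tests.
strip_werror_except_buildrun() {
    srcdir=$1
    # "! -name buildrun.cxx" exempts that one file from the in-place edit.
    find "$srcdir" -type f ! -name buildrun.cxx \
        -exec sed -i 's/-Werror//g' {} +
}
```

A real ebuild would more likely list the affected files explicitly rather than walk the whole tree; the walk is used here only to keep the example short.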
// A bit of obfuscation for Gentoo's sake.
// We *need* -Werror for stapconf to work correctly.
// https://bugs.gentoo.org/show_bug.cgi?id=522908
#define WERROR ("-W" "error")
(In reply to Josh Stone from comment #7)
> Created attachment 415670
> remove buildrun.cxx from -Werror patching
>
> SystemTap really needs to keep -Werror in its buildrun.cxx source to
> properly detect differences in kernel configurations. The rest of the
> -Werror cases should be harmless to sed remove.

Thanks. Applied to all ebuilds. Since you already worked around the problem for us, this only requires a stable keyword for ARM (see bug #634276), so I left out the revision bumps for the older versions that should go away soon.