| Summary: | sys-devel/gcc-14: unstable behaviour | | |
|---|---|---|---|
| Product: | Gentoo Linux | Reporter: | Roman <roma251078> |
| Component: | Stabilization | Assignee: | Gentoo Toolchain Maintainers <toolchain> |
| Status: | RESOLVED INVALID | | |
| Severity: | normal | CC: | arsen, roma251078 |
| Priority: | Normal | | |
| Version: | unspecified | | |
| Hardware: | All | | |
| OS: | Linux | | |
| See Also: | https://mirror.git.trinitydesktop.org/gitea/TDE/tqt3/issues/171 https://mirror.git.trinitydesktop.org/gitea/TDE/tqt3/issues/174 https://mirror.git.trinitydesktop.org/gitea/TDE/tqt3/issues/175 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116151 | | |
| Whiteboard: | | | |
| Package list: | | Runtime testing required: | --- |
| Bug Depends on: | | | |
| Bug Blocks: | 914580 | | |
| Attachments: | | | |
Description
Roman
2024-07-18 04:35:03 UTC
> This is already instability in the behavior of binary files.
I'm not sure what you mean by this.
Anyway, compiler bugs can and do exist, but it doesn't mean it's the case here. TDE isn't in ::gentoo and it's not easy for me to quickly install it and try to debug it. Please work with the TDE developers to make a smaller testcase.
That said, I suggest trying UBSAN first.
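As an illustration of the kind of defect UBSAN (GCC's `-fsanitize=undefined`) reports: it flags, among other things, null pointers passed to parameters that are declared non-null, which on glibc includes the pointer arguments of `memcpy`, even when the length is zero. The sketch below shows one way to guard such a call. `copy_bytes` is a hypothetical helper made up for this example; it is not TQt code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from TQt): copies n bytes from src to dst.
// The early return avoids calling memcpy with a null pointer when there
// is nothing to copy, which UBSAN would otherwise report.
inline void copy_bytes(void *dst, const void *src, std::size_t n)
{
    if (n == 0)
        return;                       // nothing to copy, pointers may be null
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, n);
}
```

Building the affected code with `-fsanitize=undefined` and running it is what produces the "runtime error" lines; a guard like this is only one possible fix and depends on what the caller intends.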
If UBSAN yields nothing, with regard to https://mirror.git.trinitydesktop.org/gitea/TDE/tqt3/issues/171#issuecomment-55279, you should start with:
* build it manually
* have a builddir with good/bad compilers (13/14)
* bisect by object files (copy bad objects one by one into the good dir, run make repeatedly)
* bisect by optimize pragma to narrow down function(s) (if it's an IPA issue, multiple might be involved, but hopefully it's not)

Then once you know the function, you can start to try e.g. emulate passing the args by printing whatever its args are in gdb then construct it in a standalone file.

(In reply to Sam James from comment #2)
> If UBSAN yields nothing, with regard to
> https://mirror.git.trinitydesktop.org/gitea/TDE/tqt3/issues/171#issuecomment-55279,
> you should start with:
> * build it manually
> * have a builddir with good/bad compilers (13/14)
> * bisect by object files (copy bad objects one by one into the good dir, run make repeatedly)
> * bisect by optimize pragma to narrow down function(s) (if it's an IPA issue, multiple might be involved, but hopefully it's not)
>
> Then once you know the function, you can start to try e.g. emulate passing
> the args by printing whatever its args are in gdb then construct it in a
> standalone file.

Thanks for the answer, but as I see it, there will still be a lot to figure out with tqt. At this point, already during the build, the following was revealed:
>/var/tmp/portage/dev-tqt/tqt-9999/work/tqt-9999/src/tools/tqstring.cpp:1906:12: runtime error: null pointer passed as argument 2, which is declared to never be null
I will need to address the build issues one by one.

np. Keep us posted and I can try to give some more interactive advice on IRC if needed too, as we're used to helping with debugging miscompilations.

(In reply to Sam James from comment #4)
> np. Keep us posted and I can try to give some more interactive advice on IRC if
> needed too, as we're used to helping with debugging miscompilations.

Ok.
For now I need to solve all the problems with "runtime error", then I will move on. I see several such errors there; once all of them are solved, the build will succeed.

In more detail: the problem manifests itself when building with the flags -O2 or -O3. With -O1 or -Oz it did not manifest itself. It can be concluded that the problem manifests itself when optimizing for speed.

could you attach a preprocessed version of the file with the apparently faulty loop? and point out the loop. you can generate that with either -E (which runs only the preprocessor) or -save-temps.

Created attachment 898009
File generated by preprocessor
File generated by preprocessor. See the for loop:
for (uint i = 0; i < TQFont::NScripts; ++i)
supported_scripts[i] = checkScript(i);
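Once a suspect loop like this has been located, the "bisect by optimize pragma" step suggested in comment #2 can be sketched as follows. GCC's `#pragma GCC optimize` (with `push_options`/`pop_options`) changes the optimization level for the functions it encloses, so functions can be dropped to -O1 one at a time until the misbehaviour disappears. The function names below are hypothetical stand-ins, not code from tqt, and the pragma is meant as a debugging aid only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for functions in the suspect object file.

#pragma GCC push_options
#pragma GCC optimize ("O1")   // functions up to pop_options are built at -O1
int sum_first(const int *v, std::size_t n)
{
    assert(v != nullptr || n == 0);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += v[i];
    return total;
}
#pragma GCC pop_options

// Built with whatever level was given on the command line (e.g. -O2).
int sum_squares(const int *v, std::size_t n)
{
    assert(v != nullptr || n == 0);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += v[i] * v[i];
    return total;
}
```

If the problem goes away when a given function is wrapped this way, that function (or something inlined into it) is a good candidate for a standalone test case.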
I don't know what's the point of looking at the preprocessor if the reason is optimization. I think you need to look at the assembler code here.

(In reply to Roman from comment #9)
> I don't know what's the point of looking at the preprocessor if the reason
> is optimization. I think you need to look at the assembler code here.

Because he can then check if there's obvious UB and also compile it locally with adjustments. If you just share assembly, there's not much that can be done with it.

But let me just say clearly, it's almost certainly UB, but if it isn't, it'll still almost certainly be something where a bad optimisation is done way before RTL (i.e. it won't be a "pure asm" thing). I've only ever seen a handful of actual insn issues in gcc for targets.

Also, to address the stuff on the bug: it's not going to be as simple as a for loop in isolation (obviously). Really, trying to reason about it like that and thinking it's just "one very serious bug" is misunderstanding how it works. Compiler bugs happen all the time -- they're just pretty rare in terms of the grand scheme of things, and also most of the time when people think there's a bug, it ends up being a bug in their code (UB manifesting).

(In reply to Sam James from comment #12)
> Compiler bugs happen all the time -- they're just pretty rare in terms of
> the grand scheme of things, and also most of the time when people think
> there's a bug, it ends up being a bug in their code (UB manifesting).

Quite possible. There is UB in the code, but it still needs to be sorted out. I am not a tqt developer, so it will take me a lot of time to fix all the errors by adding a fix to /etc/portage/patches/... Well, of course I can do this when I have enough free time.

Thanks for working on it & good luck. Let us know if you need our take on something too as you go.

(In reply to Roman from comment #9)
> I don't know what's the point of looking at the preprocessor if the reason
> is optimization.
> I think you need to look at the assembler code here.

because a preprocessed file is a self-contained TU I can pick apart and inspect :) I'm not trying to verify the assembly is wrong (that can be done by running it as you did), but that the input code is or that an optimization pass is

the ot_scripts array is seven elements short. please tell upstream to add:

static_assert (std::size (ot_scripts) == TQFont::NScripts, "not enough scripts");

... after its declaration and report back whether the misbehavior stopped. TIA

(In reply to Arsen Arsenović from comment #16)
> the ot_scripts array is seven elements short. please tell upstream to add:
>
> static_assert (std::size (ot_scripts) == TQFont::NScripts, "not enough scripts");
>
> ... after its declaration and report back whether the misbehavior stopped.
> TIA

Thanks. This is just what I needed. UBSAN shows the same thing:
>kernel/qfontengine_x11.cpp:2441:41: runtime error: index 48 out of bounds for type 'OTScripts [48]'
I see that there are no elements related to Unicode. Instead of TQFont::NScripts(55) I set 48, and everything worked. I will ask them why they did it this way. But they still have many shortcomings related to the code.

sure, and urge them to add such static checks to avoid errors in the future.

Created attachment 898735
A simple example
Looked at the last link.
If you split it into objects, the problem is reproduced not only on g++ but also on gcc. You can add optimization and select a compiler in the Makefile.
the optimization is correct because the program is invalid, I reported the missing diagnostic already. I've linked the bug in see also when I reported it

ah, sorry, I misread your message - I see that you saw the problem report now. but I'm unsure what you mean by "the problem is reproduced not only on g++ but also on gcc" though (the tarball you attached only invokes the C++ compiler, and building the code in it via the C compiler without exceptions results in a diagnostic)

(In reply to Arsen Arsenović from comment #22)
> ah, sorry, I misread your message - I see that you saw the problem report
> now. but I'm unsure what you mean by "the problem is reproduced not only on
> g++ but also on gcc" though (the tarball you attached only invokes the C++
> compiler, and building the code in it via the C compiler without exceptions
> results in a diagnostic)

If you compile such a file, then yes, the problem will only be reproduced with C++:
------------------
#include <stdio.h>

int g (int i)
{
  return i;
}

void f()
{
  int arr[50];
  for (int i = 0; i < 55; i++)
  {
    arr[i] = g (i);
    printf("%d\n",i);
  }
}

int main( int argc, char **argv)
{
  f();
}
-----------------
But with the example that I posted earlier, if you replace g++ with gcc, the problem will still be reproduced after the build.
Video, you can view it without downloading: https://uploadnow.io/f/S9tryy6
I checked it on both gcc-14 and gcc-13, the result is the same.

Sorry for the previous comment, I checked, if gcc encounters a .cpp file extension, it treats it as C++. Then yes, the problem only concerns C++.

(In reply to Roman from comment #24)
> Sorry for the previous comment, I checked, if gcc encounters a .cpp file
> extension, it treats it as C++. Then yes, the problem only concerns C++.

Indeed - but, as noted in the bug report, it also happens in C with exceptions enabled (which is supported).
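The static check suggested in comment #16 can be sketched as follows. This is a minimal illustration, assuming C++17 (for `std::size`); the `Script` enum, `ScriptTag` struct and `script_tags` table are hypothetical stand-ins for `TQFont::NScripts` and `ot_scripts`, not the real tqt declarations. Tying the table size to the enum count turns a too-short table into a compile error, and bounding the loop by the table's own size keeps every index in range.

```cpp
#include <cstddef>
#include <iterator>

// Hypothetical stand-in for the TQFont script enum; NScripts is the count.
enum Script { Latin, Greek, Cyrillic, NScripts };

// Hypothetical stand-in for the per-script table entry.
struct ScriptTag { unsigned tag; };

// One entry per script. Removing an entry makes the static_assert fail.
constexpr ScriptTag script_tags[] = { {0x6c61746e}, {0x6772656b}, {0x6379726c} };

static_assert(std::size(script_tags) == NScripts, "not enough scripts");

// A script counts as supported here if its table entry is non-zero.
constexpr bool check_script(std::size_t i)
{
    return script_tags[i].tag != 0;
}

// The loop bound comes from the table itself, so the index cannot run
// past the end of the array even if the enum grows later.
constexpr std::size_t count_supported()
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < std::size(script_tags); ++i)
        if (check_script(i))
            ++n;
    return n;
}
```

With a check like this in place, a mismatch between the table and the enum is reported when the file is compiled rather than showing up as misbehaviour at -O2.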