Bug 936237 - sys-devel/gcc-14: unstable behaviour
Summary: sys-devel/gcc-14: unstable behaviour
Status: RESOLVED INVALID
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Stabilization
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Toolchain Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: gcc-14
 
Reported: 2024-07-18 04:35 UTC by Roman
Modified: 2024-08-01 17:26 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Attachments
File generated by preprocessor (qfontengine_x11.ii.gz,344.31 KB, application/gzip)
2024-07-20 04:43 UTC, Roman
A simple example (gcc-test.tar.gz,466 bytes, application/gzip)
2024-08-01 05:17 UTC, Roman

Description Roman 2024-07-18 04:35:03 UTC
Using gcc-14.1.1_p20240622, I got this problem: https://mirror.git.trinitydesktop.org/gitea/TDE/tqt3/issues/171#issuecomment-55281
This is the first time I've encountered a case where the loop condition is not met. This is already instability in the behavior of binary files.

Reproducible: Always
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-07-18 04:39:14 UTC
> This is already instability in the behavior of binary files.

I'm not sure what you mean by this.

Anyway, compiler bugs can and do exist, but it doesn't mean it's the case here. TDE isn't in ::gentoo and it's not easy for me to quickly install it and try to debug it. Please work with the TDE developers to make a smaller testcase.

That said, I suggest trying UBSAN first.
Comment 2 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-07-18 05:02:36 UTC
If UBSAN yields nothing, with regard to https://mirror.git.trinitydesktop.org/gitea/TDE/tqt3/issues/171#issuecomment-55279, you should start with:
* build it manually
* have a builddir with good/bad compilers (13/14)
* bisect by object files (copy bad objects one by one into the good dir, run make repeatedly)
* bisect by optimize pragma to narrow down function(s) (if it's an IPA issue, multiple might be involved, but hopefully it's not)

Then once you know the function, you can start to try e.g. emulate passing the args by printing whatever its args are in gdb then construct it in a standalone file.
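A minimal sketch of the "bisect by optimize pragma" step, assuming GCC. The function names `sum_to` and `twice` are hypothetical placeholders for whatever functions are under suspicion; the pragmas and the attribute are GCC extensions that GCC documents as debugging aids, not something to ship.

```cpp
#include <cassert>

// Hypothetical suspect function. Everything between push_options and
// pop_options is compiled at -O1, regardless of the -O level given on the
// command line. If the misbehaviour disappears, the culprit (or the UB that
// the optimizer exposes) is likely in this region.
#pragma GCC push_options
#pragma GCC optimize ("O1")
int sum_to(int n)
{
    assert(n >= 0);  // precondition of this illustrative helper
    int total = 0;
    for (int i = 1; i <= n; ++i)
        total += i;
    return total;
}
#pragma GCC pop_options

// Attribute form: lowers the optimization level for one function only.
__attribute__((optimize("O1")))
int twice(int x)
{
    return 2 * x;
}
```

Moving the pragma pair from one function to the next and rebuilding each time narrows the search in the same way as swapping object files, just at function granularity.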
Comment 3 Roman 2024-07-18 08:55:49 UTC
(In reply to Sam James from comment #2)
> If UBSAN yields nothing, with regard to
> https://mirror.git.trinitydesktop.org/gitea/TDE/tqt3/issues/171#issuecomment-55279,
> you should start with:
> * build it manually
> * have a builddir with good/bad compilers (13/14)
> * bisect by object files (copy bad objects one by one into the good dir, run
> make repeatedly)
> * bisect by optimize pragma to narrow down function(s) (if it's an IPA
> issue, multiple might be involved, but hopefully it's not)
> 
> Then once you know the function, you can start to try e.g. emulate passing
> the args by printing whatever its args are in gdb then construct it in a
> standalone file.

Thanks for the answer, but as I see it, there will still be a lot to figure out with tqt. At this point, already during the build, the following was revealed:
>/var/tmp/portage/dev-tqt/tqt-9999/work/tqt-9999/src/tools/tqstring.cpp:1906:12: runtime error: null pointer passed as argument 2, which is declared to never be null

I will need to address the build issues one by one.
Comment 4 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-07-18 09:06:45 UTC
np. Keep us posted and I can try give some more interactive advice on IRC if needed too, as we're used to helping with debugging miscompilations.
Comment 5 Roman 2024-07-18 10:00:46 UTC
(In reply to Sam James from comment #4)
> np. Keep us posted and I can try give some more interactive advice on IRC if
> needed too, as we're used to helping with debugging miscompilations.

Ok. For now I need to fix all the "runtime error" problems, then I will move on. I see several such errors there; once all of them are fixed, the build should succeed.
Comment 6 Roman 2024-07-19 12:51:06 UTC
In more detail: the problem manifests itself when building with -O2 or -O3.
With -O1 or -Oz it did not manifest itself.
It can be concluded that the problem appears when optimizing for speed.
Comment 7 Arsen Arsenović gentoo-dev 2024-07-19 20:46:29 UTC
could you attach a preprocessed version of the file with the apparently faulty loop?  and point out the loop.

you can generate that with either -E (which runs only the preprocessor) or -save-temps.
Comment 8 Roman 2024-07-20 04:43:53 UTC
Created attachment 898009
File generated by preprocessor

File generated by preprocessor. See the for loop:
    for (uint i = 0; i < TQFont::NScripts; ++i)
        supported_scripts[i] = checkScript(i);
Comment 9 Roman 2024-07-20 04:46:41 UTC
I don't know what's the point of looking at the preprocessor if the reason is optimization. I think you need to look at the assembler code here.
Comment 10 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-07-20 04:49:57 UTC
(In reply to Roman from comment #9)
> I don't know what's the point of looking at the preprocessor if the reason
> is optimization. I think you need to look at the assembler code here.

Because he can then check if there's obvious UB and also compile it locally with adjustments. If you just share assembly, there's not much that can be done with it.
Comment 11 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-07-20 04:50:55 UTC
But let me just say clearly, it's almost certainly UB, but if it isn't, it'll still almost certainly be something where a bad optimisation is done way before RTL (i.e. it won't be a "pure asm" thing). I've only ever seen a handful of actual insn issues in gcc for targets.
Comment 12 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-07-20 04:52:08 UTC
Also, to address the stuff on the bug: it's not going to be as simple as a for loop in isolation (obviously). Really, trying to reason about it like that and thinking it's just "one very serious bug" is misunderstanding how it works.

Compiler bugs happen all the time -- they're just pretty rare in terms of the grand scheme of things, and also most of the time when people think there's a bug, it ends up being a bug in their code (UB manifesting).
Comment 13 Roman 2024-07-20 05:07:53 UTC
(In reply to Sam James from comment #12)
> Compiler bugs happen all the time -- they're just pretty rare in terms of
> the grand scheme of things, and also most of the time when people think
> there's a bug, it ends up being a bug in their code (UB manifesting).

Quite possible. There is UB in the code, but it still needs to be sorted out. I am not a tqt developer, so it will take me a lot of time to fix all the errors by adding fixes to /etc/portage/patches/... Well, of course I can do this when I have enough free time.
Comment 14 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-07-20 05:12:32 UTC
Thanks for working on it & good luck. Let us know if you need our take on something too as you go.
Comment 15 Arsen Arsenović gentoo-dev 2024-07-20 11:01:02 UTC
(In reply to Roman from comment #9)
> I don't know what's the point of looking at the preprocessor if the reason
> is optimization. I think you need to look at the assembler code here.

because a preprocessed file is a self-contained TU I can pick apart and inspect :)

I'm not trying to verify that the assembly is wrong (that can be done by running it, as you did), but whether the input code is wrong or an optimization pass is.
Comment 16 Arsen Arsenović gentoo-dev 2024-07-20 15:26:04 UTC
the ot_scripts array is seven elements short.  please tell upstream to add:

  static_assert (std::size (ot_scripts) == TQFont::NScripts, "not enough scripts");

... after its declaration and report back whether the misbehavior stopped.  TIA
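A small self-contained sketch of that check (C++17, for `std::size`). The enum, the `OTScript` struct, the three-entry table and `script_tag` are hypothetical stand-ins; the real `TQFont::NScripts` and `ot_scripts` in tqt3 are much larger.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>  // std::size (C++17)

// Hypothetical stand-in for the script enum; NScripts is the sentinel count.
enum Script { Latin, Greek, Cyrillic, NScripts };

// Hypothetical stand-in for one table entry (an OpenType script tag).
struct OTScript { unsigned tag; };

static const OTScript ot_scripts[] = {
    { 0x6c61746eu },  // 'latn'
    { 0x6772656bu },  // 'grek'
    { 0x6379726cu },  // 'cyrl'
};

// Compilation fails as soon as an enumerator is added without a matching
// table entry, instead of the mismatch surfacing as an out-of-bounds read.
static_assert(std::size(ot_scripts) == NScripts, "not enough scripts");

// With the static_assert in place, every index below NScripts is in bounds.
unsigned script_tag(Script s)
{
    assert(s < NScripts);
    return ot_scripts[s].tag;
}
```

Adding a fourth enumerator before `NScripts` without extending the table would turn the mismatch into a compile error at the `static_assert` line.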
Comment 17 Roman 2024-07-20 17:42:38 UTC
(In reply to Arsen Arsenović from comment #16)
> the ot_scripts array is seven elements short.  please tell upstream to add:
> 
>   static_assert (std::size (ot_scripts) == TQFont::NScripts, "not enough
> scripts");
> 
> ... after its declaration and report back whether the misbehavior stopped. 
> TIA

Thanks. This is just what I needed. UBSAN shows the same thing:
>kernel/qfontengine_x11.cpp:2441:41: runtime error: index 48 out of bounds for type 'OTScripts [48]'

I see that there are no elements related to Unicode. Instead of TQFont::NScripts(55) I set 48, and everything worked. I will ask them why they did it this way. But they still have many shortcomings related to the code.
Comment 19 Arsen Arsenović gentoo-dev 2024-07-20 17:59:16 UTC
sure, and urge them to add such static checks to avoid errors in the future.
Comment 20 Roman 2024-08-01 05:17:34 UTC
Created attachment 898735
A simple example

Looked at the last link.
If you split it into objects, the problem is reproduced not only on g++ but also on gcc. You can add optimization and select a compiler in the Makefile.
Comment 21 Arsen Arsenović gentoo-dev 2024-08-01 10:18:14 UTC
the optimization is correct because the program is invalid; I reported the missing diagnostic already.  I linked the bug in See Also when I reported it
Comment 22 Arsen Arsenović gentoo-dev 2024-08-01 10:21:34 UTC
ah, sorry, I misread your message - I see that you saw the problem report now.  but I'm unsure what you mean by "the problem is reproduced not only on g++ but also on gcc" though (the tarball you attached only invokes the C++ compiler, and building the code in it via the C compiler without exceptions results in a diagnostic)
Comment 23 Roman 2024-08-01 10:48:01 UTC
(In reply to Arsen Arsenović from comment #22)
> ah, sorry, I misread your message - I see that you saw the problem report
> now.  but I'm unsure what you mean by "the problem is reproduced not only on
> g++ but also on gcc" though (the tarball you attached only invokes the C++
> compiler, and building the code in it via the C compiler without exceptions
> results in a diagnostic)

If you compile such a file, then yes, the problem will only be reproduced with C++:
------------------
#include <stdio.h>

int g (int i)
{
        return i;
}

void f()
{
        int arr[50];
        for (int i = 0; i < 55; i++) {
                arr[i] = g (i);
                printf("%d\n",i);
        }
}

int main( int argc, char **argv)
{
        f();
}
-----------------

But with the example that I posted earlier, if you replace g++ with gcc, the problem is still reproduced after the build.
Video (you can view it without downloading):
https://uploadnow.io/f/S9tryy6

I checked it on both gcc-14 and gcc-13, the result is the same.
Comment 24 Roman 2024-08-01 11:22:35 UTC
Sorry for the previous comment. I checked: if gcc encounters a .cpp file extension, it treats the file as C++. So yes, the problem only concerns C++.
Comment 25 Arsen Arsenović gentoo-dev 2024-08-01 17:26:45 UTC
(In reply to Roman from comment #24)
> Sorry for the previous comment, I checked, if gcc encounters a .cpp file
> extension, it treats it as C++. Then yes, the problem only concerns C++.

Indeed - but, as noted in the bug report, it also happens in C with exceptions enabled (which is supported).
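For reference, a likely reading of why the loop in comment 23 does not stop at 55: writing `arr[50]` is undefined behaviour, so the optimizer may assume `i` never exceeds 49, which makes `i < 55` always true and lets it drop the check. A minimal corrected sketch (C++17) derives the loop bound from the array itself; the `written` counter and the return value are added here purely for illustration and are not part of the original example.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>  // std::size (C++17)

int g(int i)
{
    return i;
}

// Fills the array using its own size as the loop bound, so the bound cannot
// drift away from the declaration. Returns how many elements were written.
int f()
{
    int arr[50];
    int written = 0;
    for (std::size_t i = 0; i < std::size(arr); ++i) {
        arr[i] = g(static_cast<int>(i));
        ++written;
    }
    assert(arr[49] == 49);  // last valid element was filled
    return written;
}
```

Building the original example with `-fsanitize=undefined` reports the out-of-bounds index at run time, as UBSAN did for `ot_scripts` in comment 17.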