Bug 953931 - games-fps/alephone: enemies invisible when built with LTO
Status: CONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Games
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: lto
Reported: 2025-04-17 03:01 UTC by Sam James
Modified: 2025-04-20 06:32 UTC

See Also:
Package list:
Runtime testing required: ---

Description Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-17 03:01:59 UTC
Kangie found this and reported it upstream at https://github.com/Aleph-One-Marathon/alephone/issues/518.

I was helping csfore with it but it's a bit complex and it's better for me to take over, I think.

I'm just filing this downstream so I have somewhere to take notes.
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-17 05:53:50 UTC
OK, I can reproduce with master at 99ac83f5851bf28d8fb02a8d1074fea093be5d9e with:

autoreconf -fiv && ~/git/alephone/configure --without-ffmpeg CFLAGS="-O2 -ggdb3 -flto=auto" CXXFLAGS="-O2 -ggdb3 -flto=auto" LDFLAGS="-Wl,-O1" --disable-dependency-tracking && make -j$(nproc) -l$(nproc)
Source_Files/alephone -Q /usr/share/alephone-marathon

But note that it appears not to be deterministic! A good build with aliens may have aliens disappear after turning around. A good build may also stop working after either doing "new game" or closing/reopening. Make sure to check a build several times.

So, next steps (not necessarily in this order):
* Reduce iteration time by skipping the menu screen
* Maybe set some seed so aliens are immediately where we spawn, or at least in the same place every time
* Find an older GCC that works, maybe, for bisecting (useful even if it's not GCC's fault). Will need to disable Boost for this; not sure whether any other libs needing C++ can be disabled
Comment 2 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-19 07:32:50 UTC
-O1 -fstrict-aliasing -flto fails too
Comment 3 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-19 07:48:04 UTC
Clang's -fsanitize=type produces a tonne of output, but it's very new and a lot of the reports seem bogus, so it's not directly useful. It may be useful later once I've narrowed things down a bit.
Comment 4 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-20 01:09:09 UTC
Minor checkpoint, using these two is enough:
/home/sam/git/alephone/build-bad/Source_Files/GameWorld/libgameworld.a
/home/sam/git/alephone/build-bad/Source_Files/RenderMain/librendermain.a
Comment 5 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-20 01:35:22 UTC
(In reply to Sam James from comment #4)
> Minor checkpoint, using these two is enough:
> /home/sam/git/alephone/build-bad/Source_Files/GameWorld/libgameworld.a
> /home/sam/git/alephone/build-bad/Source_Files/RenderMain/librendermain.a

OK, reduced down to:
/home/sam/git/alephone/build-bad/Source_Files/GameWorld/world.o
/home/sam/git/alephone/build-bad/Source_Files/RenderMain/RenderPlaceObjs.o
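
The comments don't say how the reduction was done. One common approach is to link with most objects taken from a good build and only a subset from the bad one, shrinking that subset while the bug still reproduces. A minimal sketch of such a greedy reduction follows; `reduce_bad_objects` and the `is_bad` callback are hypothetical names, and the callback is assumed to do the relinking and the in-game check:

```cpp
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper, not part of any existing tool.
// is_bad(S) is assumed to link a binary that takes the objects in S from the
// bad build and every other object from the good build, then report whether
// the bug reproduces.
inline std::set<std::string> reduce_bad_objects(
    const std::vector<std::string> &all_objects,
    const std::function<bool(const std::set<std::string> &)> &is_bad)
{
    // Start with everything coming from the bad build.
    std::set<std::string> from_bad(all_objects.begin(), all_objects.end());

    for (const std::string &obj : all_objects) {
        // Try taking this one object from the good build instead.
        std::set<std::string> trial = from_bad;
        trial.erase(obj);
        if (is_bad(trial))
            from_bad = trial; // still reproduces, so obj was not needed
    }
    return from_bad;
}
```

Because the failure is not deterministic, `is_bad` would need to check each candidate several times before declaring it good.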
Comment 6 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-20 01:37:08 UTC
(In reply to Sam James from comment #5)
> (In reply to Sam James from comment #4)
> > Minor checkpoint, using these two is enough:
> > /home/sam/git/alephone/build-bad/Source_Files/GameWorld/libgameworld.a
> > /home/sam/git/alephone/build-bad/Source_Files/RenderMain/librendermain.a
> 
> OK, reduced down to:
> /home/sam/git/alephone/build-bad/Source_Files/GameWorld/world.o
> /home/sam/git/alephone/build-bad/Source_Files/RenderMain/RenderPlaceObjs.o

... corresponding to:
* https://github.com/Aleph-One-Marathon/alephone/blob/99ac83f5851bf28d8fb02a8d1074fea093be5d9e/Source_Files/GameWorld/world.cpp
* https://github.com/Aleph-One-Marathon/alephone/blob/99ac83f5851bf28d8fb02a8d1074fea093be5d9e/Source_Files/RenderMain/RenderPlaceObjs.cpp
Comment 7 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-20 01:38:47 UTC
(In reply to Sam James from comment #6)
> https://github.com/Aleph-One-Marathon/alephone/blob/
> 99ac83f5851bf28d8fb02a8d1074fea093be5d9e/Source_Files/RenderMain/
> RenderPlaceObjs.cpp

build_aggregate_render_object_clipping_window has some maybe dodgy casts. It's had issues before: https://github.com/Aleph-One-Marathon/alephone/commit/f469c47a909aadb2b4ad28926fc39dd17fb94331
Comment 8 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-20 01:54:14 UTC
This doesn't help, though:
```
--- a/Source_Files/RenderMain/RenderVisTree.h
+++ b/Source_Files/RenderMain/RenderVisTree.h
@@ -44,7 +44,7 @@ Oct 13, 2000


 // Made pointers more general
-typedef byte *POINTER_DATA;
+typedef __attribute__((may_alias)) byte *POINTER_DATA;
 #define POINTER_CAST(x) ((POINTER_DATA)(x))
```
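
For reference, a minimal sketch of what `may_alias` is meant to permit, using hypothetical types unrelated to Aleph One. If `byte` is already a character type (an assumption, not checked here), accesses through it may alias anything regardless, which would be consistent with the typedef change above having no effect.

```cpp
#include <cstdint>

// Hypothetical type for illustration only; not taken from Aleph One.
// Reading a float through a plain uint32_t* is undefined behaviour under
// -fstrict-aliasing. The may_alias attribute (a GCC/Clang extension) marks
// accesses through this type as allowed to alias any other type.
typedef std::uint32_t __attribute__((may_alias)) aliasing_u32;

// Returns the raw bit pattern of a float via the aliasing-permitted type.
inline std::uint32_t float_bits(const float *f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    return *reinterpret_cast<const aliasing_u32 *>(f);
}
```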
Comment 9 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-20 04:44:49 UTC
```
--- a/Source_Files/RenderMain/RenderPlaceObjs.cpp
+++ b/Source_Files/RenderMain/RenderPlaceObjs.cpp
@@ -392,7 +392,8 @@ render_object_data *RenderPlaceObjsClass::build_render_object(
                                render_object->next_object= NULL;
                                if (object->parasitic_object!=NONE)
                                {
-                                       __builtin_printf("RenderPlaceObjsClass::build_render_object: object->parasitic_object!=NONE\n");
+                                       __builtin_printf("RenderPlaceObjsClass::build_render_object: object->parasitic_object!=NONE; aborting\n");
+                                       __asm__ volatile("int $0x03");

                                        render_object_data *parasitic_render_object;
                                        long_point3d parasitic_origin= transformed_origin;
```

This consistently fails on the *broken* builds. Using __builtin_abort() (or __builtin_trap()) didn't work out, as I guess it ends up not being the black box I was hoping for.
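
One plausible reading of the difference (a guess, not verified against the generated code): the compiler knows `__builtin_abort()` and `__builtin_trap()` never return and may restructure the surrounding code accordingly, while a volatile asm statement is an opaque side effect that falls through. A small sketch of wrapping the breakpoint in a helper; all names here are hypothetical, and the non-x86 fallback is an addition for portability:

```cpp
#include <csignal>

// Hypothetical helpers for illustration; not part of Aleph One.

// Counts SIGTRAP deliveries so the sketch can also run outside a debugger.
static volatile std::sig_atomic_t trap_count = 0;

static void on_sigtrap(int)
{
    trap_count = trap_count + 1;
}

// Breakpoint that the optimiser sees as an opaque statement which falls
// through, unlike the noreturn builtins.
inline void debug_break()
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile("int $0x03");
#else
    std::raise(SIGTRAP); // fallback for non-x86 targets
#endif
}

// Installs the handler, hits the breakpoint once, and reports whether
// execution resumed after it.
inline bool debug_break_resumes()
{
    std::signal(SIGTRAP, on_sigtrap);
    const std::sig_atomic_t before = trap_count;
    debug_break();
    return trap_count == before + 1;
}
```

Under gdb or rr the debugger receives the SIGTRAP first, as in the session in the next comment; the handler only matters when running without one.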
Comment 10 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-20 04:51:48 UTC
When we hit the trap:
```
RenderPlaceObjsClass::build_render_object: object->location.x=6368, object.location.y=9824
RenderPlaceObjsClass::build_render_object: before parasitic_object check
RenderPlaceObjsClass::build_render_object: object->parasitic_object!=NONE; aborting

Thread 1 received signal SIGTRAP, Trace/breakpoint trap.
RenderPlaceObjsClass::build_render_object (this=this@entry=0x563afe1405c0 <RenderPlaceObjs>, object=object@entry=0x563b053e3c90, floor_intensity=floor_intensity@entry=20521,
    ceiling_intensity=ceiling_intensity@entry=20521, Opacity=Opacity@entry=1, origin=origin@entry=0x0, rel_origin=rel_origin@entry=0x0)
    at /home/sam/git/alephone/Source_Files/RenderMain/RenderPlaceObjs.cpp:402
402                                             long_point3d parasitic_origin= transformed_origin;
(rr) bt
#0  RenderPlaceObjsClass::build_render_object (this=this@entry=0x563afe1405c0 <RenderPlaceObjs>, object=object@entry=0x563b053e3c90, floor_intensity=floor_intensity@entry=20521,
    ceiling_intensity=ceiling_intensity@entry=20521, Opacity=Opacity@entry=1, origin=origin@entry=0x0, rel_origin=rel_origin@entry=0x0)
    at /home/sam/git/alephone/Source_Files/RenderMain/RenderPlaceObjs.cpp:402
#1  0x0000563afdbf0a06 in RenderPlaceObjsClass::add_object_to_sorted_nodes (this=this@entry=0x563afe1405c0 <RenderPlaceObjs>, object=0x563b053e3c90,
    floor_intensity=floor_intensity@entry=20521, ceiling_intensity=ceiling_intensity@entry=20521, Opacity=Opacity@entry=1)
    at /home/sam/git/alephone/Source_Files/RenderMain/RenderPlaceObjs.cpp:862
#2  0x0000563afdbf1b59 in RenderPlaceObjsClass::build_render_object_list (this=this@entry=0x563afe1405c0 <RenderPlaceObjs>)
    at /home/sam/git/alephone/Source_Files/RenderMain/RenderPlaceObjs.cpp:145
#3  0x0000563afde8fe11 in render_view (view=0x563b0586e3c0, software_render_dest=0x563b08f06a30) at /home/sam/git/alephone/Source_Files/RenderMain/render.cpp:461
#4  0x0000563afdee3bbb in render_screen (ticks_elapsed=ticks_elapsed@entry=1) at /home/sam/git/alephone/Source_Files/RenderOther/screen.cpp:1448
#5  0x0000563afdd75f9e in idle_game_state (time=<optimized out>) at /home/sam/git/alephone/Source_Files/Misc/interface.cpp:1130
#6  0x0000563afdb733c4 in main_event_loop () at /home/sam/git/alephone/Source_Files/shell.cpp:808
#7  0x0000563afdb7d15f in main (argc=<optimized out>, argv=0x7ffe661a66d8) at /home/sam/git/alephone/Source_Files/main.cpp:
```
Comment 11 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-20 06:31:50 UTC
I'm leaving it for now, but:
```
--- a/Source_Files/RenderMain/RenderPlaceObjs.cpp
+++ b/Source_Files/RenderMain/RenderPlaceObjs.cpp
@@ -204,11 +204,15 @@ render_object_data *RenderPlaceObjsClass::build_render_object(

                        __builtin_printf("RenderPlaceObjsClass::build_render_object: temp_tfm_origin.x=%d, temp_tfm_origin.y=%d, transformed_origin.z=%d\n",
                                temp_tfm_origin.x, temp_tfm_origin.y, transformed_origin.z);
-
+                       __builtin_printf("RenderPlaceObjsClass::build_render_object: transforming...\n");
                        uint16 tfm_origin_flags;
                        transform_overflow_point2d(&temp_tfm_origin, (world_point2d *)&view->origin, view->yaw, &tfm_origin_flags);
                        long_vector2d *tfm_origin_ptr = (long_vector2d *)(&transformed_origin);
                        overflow_short_to_long_2d(temp_tfm_origin,tfm_origin_flags,*tfm_origin_ptr);
+                       __builtin_printf("RenderPlaceObjsClass::build_render_object: transformed\n");
+                        __builtin_printf("RenderPlaceObjsClass::build_render_object (after transform): temp_tfm_origin.x=%d, temp_tfm_origin.y=%d, transformed_origin.z=%d\n",
+                                temp_tfm_origin.x, temp_tfm_origin.y, transformed_origin.z);
+                        __builtin_printf("RenderPlaceObjsClass::build_render_object (after transform, 2): transformed_origin.x=%d\n", transformed_origin.x);
                }
```

seems to fix it, and the casts around transform_overflow_point2d/tfm_origin_ptr/overflow_short_to_long_2d look suspicious. Dropping the printfs breaks it again.

I sprinkled some further may_alias attributes around, but that didn't help.
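
If those casts do turn out to be the culprit (unconfirmed), one way to sidestep them is to go through properly typed temporaries rather than reinterpreting a 3D struct as a 2D one. A minimal sketch with stand-in struct layouts; the definitions below are assumptions for illustration, not copied from the Aleph One headers, and `to_point2d`/`store_xy` are hypothetical helpers:

```cpp
#include <cstdint>

// Stand-in layouts (assumed for this sketch, not taken from Aleph One).
struct world_point2d { std::int16_t x, y; };
struct world_point3d { std::int16_t x, y, z; };
struct long_vector2d { std::int32_t i, j; };
struct long_point3d  { std::int32_t x, y, z; };

// Instead of passing (world_point2d *)&p3, build a real world_point2d.
inline world_point2d to_point2d(const world_point3d &p)
{
    return world_point2d{p.x, p.y};
}

// Instead of writing through (long_vector2d *)&dst, let the callee fill a
// real long_vector2d and then copy its members into the 3D point.
inline void store_xy(long_point3d &dst, const long_vector2d &v)
{
    dst.x = v.i;
    dst.y = v.j; // dst.z is left untouched
}
```

Every access then goes through the object's declared type, so type-based alias analysis has nothing to misjudge.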
Comment 12 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-20 06:32:07 UTC
(In reply to Sam James from comment #11)
> I'm leaving it for now, but:

(Will return to it over next few days.)