Kangie found this and reported it upstream at https://github.com/Aleph-One-Marathon/alephone/issues/518. I was helping csfore with it but it's a bit complex and it's better for me to take over, I think. I'm just filing this downstream so I have somewhere to take notes.
OK, I can reproduce with master at 99ac83f5851bf28d8fb02a8d1074fea093be5d9e with:

autoreconf -fiv && ~/git/alephone/configure --without-ffmpeg CFLAGS="-O2 -ggdb3 -flto=auto" CXXFLAGS="-O2 -ggdb3 -flto=auto" LDFLAGS="-Wl,-O1" --disable-dependency-tracking && make -j$(nproc) -l$(nproc)
Source_Files/alephone -Q /usr/share/alephone-marathon

But note that it appears to not be deterministic! A good build with aliens may have aliens disappear after turning around. A good build may also no longer work after either doing "new game" or closing/reopening. Make sure to check a build several times.

So, next steps (not necessarily in this order):
* Reduce iteration time by skipping the menu screen
* Maybe set some seed so aliens are immediately where we spawn, or at least in the same place every time
* Find an older GCC that works, maybe, for bisecting (useful even if it's not GCC's fault). Will need to disable Boost for this; not sure if any other libs needing C++ can be disabled
-O1 -fstrict-aliasing -flto fails too
Clang's -fsanitize=type produces a tonne of output, but it's very new and a lot of the reports seem bogus, so it's not directly useful. It may be useful later once I've narrowed things down a bit.
Minor checkpoint, using these two is enough:
/home/sam/git/alephone/build-bad/Source_Files/GameWorld/libgameworld.a
/home/sam/git/alephone/build-bad/Source_Files/RenderMain/librendermain.a
(In reply to Sam James from comment #4)
> Minor checkpoint, using these two is enough:
> /home/sam/git/alephone/build-bad/Source_Files/GameWorld/libgameworld.a
> /home/sam/git/alephone/build-bad/Source_Files/RenderMain/librendermain.a

OK, reduced down to:
/home/sam/git/alephone/build-bad/Source_Files/GameWorld/world.o
/home/sam/git/alephone/build-bad/Source_Files/RenderMain/RenderPlaceObjs.o
(In reply to Sam James from comment #5)
> (In reply to Sam James from comment #4)
> > Minor checkpoint, using these two is enough:
> > /home/sam/git/alephone/build-bad/Source_Files/GameWorld/libgameworld.a
> > /home/sam/git/alephone/build-bad/Source_Files/RenderMain/librendermain.a
>
> OK, reduced down to:
> /home/sam/git/alephone/build-bad/Source_Files/GameWorld/world.o
> /home/sam/git/alephone/build-bad/Source_Files/RenderMain/RenderPlaceObjs.o

... corresponding to:
* https://github.com/Aleph-One-Marathon/alephone/blob/99ac83f5851bf28d8fb02a8d1074fea093be5d9e/Source_Files/GameWorld/world.cpp
* https://github.com/Aleph-One-Marathon/alephone/blob/99ac83f5851bf28d8fb02a8d1074fea093be5d9e/Source_Files/RenderMain/RenderPlaceObjs.cpp
(In reply to Sam James from comment #6)
> https://github.com/Aleph-One-Marathon/alephone/blob/
> 99ac83f5851bf28d8fb02a8d1074fea093be5d9e/Source_Files/RenderMain/
> RenderPlaceObjs.cpp

build_aggregate_render_object_clipping_window has some maybe dodgy casts. It's had issues before: https://github.com/Aleph-One-Marathon/alephone/commit/f469c47a909aadb2b4ad28926fc39dd17fb94331
This doesn't help, though:
```
--- a/Source_Files/RenderMain/RenderVisTree.h
+++ b/Source_Files/RenderMain/RenderVisTree.h
@@ -44,7 +44,7 @@ Oct 13, 2000
 // Made pointers more general
-typedef byte *POINTER_DATA;
+typedef __attribute__((may_alias)) byte *POINTER_DATA;
 #define POINTER_CAST(x) ((POINTER_DATA)(x))
```
```
--- a/Source_Files/RenderMain/RenderPlaceObjs.cpp
+++ b/Source_Files/RenderMain/RenderPlaceObjs.cpp
@@ -392,7 +392,8 @@ render_object_data *RenderPlaceObjsClass::build_render_object(
 		render_object->next_object= NULL;
 		if (object->parasitic_object!=NONE)
 		{
-			__builtin_printf("RenderPlaceObjsClass::build_render_object: object->parasitic_object!=NONE\n");
+			__builtin_printf("RenderPlaceObjsClass::build_render_object: object->parasitic_object!=NONE; aborting\n");
+			__asm__ volatile("int $0x03");
 			render_object_data *parasitic_render_object;
 			long_point3d parasitic_origin= transformed_origin;
```
This consistently fails on the *broken* builds. Using __builtin_abort() (or __builtin_trap()) didn't work out, as I guess it ends up not being the black box I was hoping for.
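A likely reason for the difference, sketched below: __builtin_abort() and __builtin_trap() are noreturn, so the compiler may treat everything after them as unreachable and optimise the surrounding code differently, while a volatile asm (or an opaque call that returns) keeps the fall-through path alive. The example is a standalone miniature, not code from Aleph One: debug_break, on_sigtrap and check_object are hypothetical names, NONE is assumed to be -1, and raise(SIGTRAP) stands in for the x86-only `int $0x03`.

```cpp
#include <csignal>
#include <cstdio>

// Counts SIGTRAP deliveries so the demo also works outside a debugger.
static volatile std::sig_atomic_t trap_count = 0;

// Hypothetical handler: just record that the breakpoint fired.
extern "C" void on_sigtrap(int) { trap_count = trap_count + 1; }

// Hypothetical helper. raise(SIGTRAP) is a POSIX stand-in for
// __asm__ volatile("int $0x03"), which only assembles on x86.
// Unlike __builtin_trap(), it is not noreturn.
inline void debug_break() { std::raise(SIGTRAP); }

// Hypothetical miniature of the check in build_render_object.
int check_object(int parasitic_object)
{
    const int NONE = -1; // assumption: stand-in for the engine's NONE
    if (parasitic_object != NONE)
    {
        std::printf("parasitic_object!=NONE; breaking\n");
        debug_break();
        return 1; // still reachable, so the fall-through code is kept
    }
    return 0;
}
```

Outside a debugger, install the handler first with std::signal(SIGTRAP, on_sigtrap), otherwise the default action terminates the process. Under gdb or rr the signal stops at the debug_break() call and execution can be continued from there.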
When we hit the trap:
```
RenderPlaceObjsClass::build_render_object: object->location.x=6368, object.location.y=9824
RenderPlaceObjsClass::build_render_object: before parasitic_object check
RenderPlaceObjsClass::build_render_object: object->parasitic_object!=NONE; aborting

Thread 1 received signal SIGTRAP, Trace/breakpoint trap.
RenderPlaceObjsClass::build_render_object (this=this@entry=0x563afe1405c0 <RenderPlaceObjs>, object=object@entry=0x563b053e3c90, floor_intensity=floor_intensity@entry=20521, ceiling_intensity=ceiling_intensity@entry=20521, Opacity=Opacity@entry=1, origin=origin@entry=0x0, rel_origin=rel_origin@entry=0x0) at /home/sam/git/alephone/Source_Files/RenderMain/RenderPlaceObjs.cpp:402
402	long_point3d parasitic_origin= transformed_origin;
(rr) bt
#0  RenderPlaceObjsClass::build_render_object (this=this@entry=0x563afe1405c0 <RenderPlaceObjs>, object=object@entry=0x563b053e3c90, floor_intensity=floor_intensity@entry=20521, ceiling_intensity=ceiling_intensity@entry=20521, Opacity=Opacity@entry=1, origin=origin@entry=0x0, rel_origin=rel_origin@entry=0x0) at /home/sam/git/alephone/Source_Files/RenderMain/RenderPlaceObjs.cpp:402
#1  0x0000563afdbf0a06 in RenderPlaceObjsClass::add_object_to_sorted_nodes (this=this@entry=0x563afe1405c0 <RenderPlaceObjs>, object=0x563b053e3c90, floor_intensity=floor_intensity@entry=20521, ceiling_intensity=ceiling_intensity@entry=20521, Opacity=Opacity@entry=1) at /home/sam/git/alephone/Source_Files/RenderMain/RenderPlaceObjs.cpp:862
#2  0x0000563afdbf1b59 in RenderPlaceObjsClass::build_render_object_list (this=this@entry=0x563afe1405c0 <RenderPlaceObjs>) at /home/sam/git/alephone/Source_Files/RenderMain/RenderPlaceObjs.cpp:145
#3  0x0000563afde8fe11 in render_view (view=0x563b0586e3c0, software_render_dest=0x563b08f06a30) at /home/sam/git/alephone/Source_Files/RenderMain/render.cpp:461
#4  0x0000563afdee3bbb in render_screen (ticks_elapsed=ticks_elapsed@entry=1) at /home/sam/git/alephone/Source_Files/RenderOther/screen.cpp:1448
#5  0x0000563afdd75f9e in idle_game_state (time=<optimized out>) at /home/sam/git/alephone/Source_Files/Misc/interface.cpp:1130
#6  0x0000563afdb733c4 in main_event_loop () at /home/sam/git/alephone/Source_Files/shell.cpp:808
#7  0x0000563afdb7d15f in main (argc=<optimized out>, argv=0x7ffe661a66d8) at /home/sam/git/alephone/Source_Files/main.cpp:
```
I'm leaving it for now, but:
```
--- a/Source_Files/RenderMain/RenderPlaceObjs.cpp
+++ b/Source_Files/RenderMain/RenderPlaceObjs.cpp
@@ -204,11 +204,15 @@ render_object_data *RenderPlaceObjsClass::build_render_object(
 		__builtin_printf("RenderPlaceObjsClass::build_render_object: temp_tfm_origin.x=%d, temp_tfm_origin.y=%d, transformed_origin.z=%d\n",
 			temp_tfm_origin.x, temp_tfm_origin.y, transformed_origin.z);
-
+		__builtin_printf("RenderPlaceObjsClass::build_render_object: transforming...\n");
 		uint16 tfm_origin_flags;
 		transform_overflow_point2d(&temp_tfm_origin, (world_point2d *)&view->origin, view->yaw, &tfm_origin_flags);
 		long_vector2d *tfm_origin_ptr = (long_vector2d *)(&transformed_origin);
 		overflow_short_to_long_2d(temp_tfm_origin,tfm_origin_flags,*tfm_origin_ptr);
+		__builtin_printf("RenderPlaceObjsClass::build_render_object: transformed\n");
+		__builtin_printf("RenderPlaceObjsClass::build_render_object (after transform): temp_tfm_origin.x=%d, temp_tfm_origin.y=%d, transformed_origin.z=%d\n",
+			temp_tfm_origin.x, temp_tfm_origin.y, transformed_origin.z);
+		__builtin_printf("RenderPlaceObjsClass::build_render_object (after transform, 2): transformed_origin.x=%d\n", transformed_origin.x);
 	}
```
seems to fix it, and the casts there look suspicious for transform_overflow_point2d/tfm_origin_ptr/overflow_short_to_long_2d. Dropping the printfs breaks it again. I sprinkled some further may_alias but that didn't help.
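One experiment that would take the tfm_origin_ptr cast out of the picture is to fill a real long_vector2d temporary and copy its fields back, instead of viewing the long_point3d through a long_vector2d pointer. This is a minimal sketch under assumptions, not a proposed patch: the struct layouts below are guesses (three 32-bit fields x, y, z and two 32-bit fields i, j), and widen_stub is a hypothetical stand-in for overflow_short_to_long_2d that ignores the overflow flags.

```cpp
#include <cstdint>

// Assumed, simplified layouts; not copied from the Aleph One headers.
struct long_vector2d { int32_t i, j; };
struct long_point3d  { int32_t x, y, z; };
struct world_point2d { int16_t x, y; };

// Hypothetical stand-in for overflow_short_to_long_2d: it only widens
// the 16-bit coordinates and does not model the overflow flags.
void widen_stub(const world_point2d& src, long_vector2d& dst)
{
    dst.i = src.x;
    dst.j = src.y;
}

// Writes the x/y of a long_point3d via a genuine long_vector2d object,
// so no long_point3d storage is accessed through a long_vector2d lvalue.
void set_xy_without_cast(long_point3d& point, const world_point2d& src)
{
    long_vector2d tmp{};
    widen_stub(src, tmp);
    point.x = tmp.i;
    point.y = tmp.j; // point.z is left untouched
}
```

If the bad build starts behaving with a change of this shape in place of the cast, that would point at the type punning rather than at the printfs merely perturbing code generation.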
(In reply to Sam James from comment #11)
> I'm leaving it for now, but:

(Will return to it over next few days.)