The app-emulation/dosemu 1.4.1_pre20130107-r5 application builds as expected but segfaults immediately at runtime. This looks identical to bug 297173, which was fixed in dosemu-1.4.1_pre20091009. I'm not sure when the regression occurred, and that ebuild is no longer in the git repository for me to check against.

Digging into this a bit with the GNU debugger I see:

```
$ gdb /usr/bin/dosemu.bin
(gdb) run
...
Program received signal SIGSEGV, Segmentation fault.
0x00005555555a9268 in alloc_mapping ()
(gdb) bt
#0 0x00005555555a9268 in alloc_mapping ()
#1 0x000055555562d12a in low_mem_init ()
#2 0x000055555557c1a0 in main ()
```

The dosemu.bin executable is attempting to reserve memory locations 0-640k. This fails under anything like a modern Linux kernel regardless of architecture, as the kernel reserves the first 64k of address space for security reasons (vm.mmap_min_addr). There were three ways around this:

1. `echo 0 > /proc/sys/vm/mmap_min_addr`, which has security implications, but will enable a non-root user to use dosemu.
2. Run dosemu as root.
3. Patch around the problem.

I tried options 1 and 2; both still result in a segfault. This issue is not architecture specific.

I'm not sure what happened to the patch in 1.4.1_pre20091009 that fixed this problem. As near as I can tell there hasn't been any activity on the dosemu SourceForge repository since 2007. However, there are dozens of patches floating around the dosemu mailing list. I see (real) traffic on the mailing list as recently as 2019 and even more recent spam. I strongly suspect the mailing list is where the fix to this issue originally came from, but I can't find the specific patch to prove this.

I glanced at a few other distros to see if (a) they still package dosemu and (b) it works with /proc/sys/vm/mmap_min_addr > 0. It does appear that it's still being packaged by other distros, and dosemu works with modern /proc/sys/vm/mmap_min_addr defaults on those platforms. So, a fix does indeed exist.
Unfortunately, in every case I checked there was roughly 200k of accumulated patches merged into a single diff file. My C skills aren't what they could be, but even if they were, that's a lot to digest.

At this point, I'm a bit out of my depth and I'm not sure how to proceed. At a minimum, I think this package should be masked on amd64 and x86 until this issue gets resolved.
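For reference, the low-address restriction described above can be checked with a small standalone program. This is only a minimal sketch and not dosemu code: the function names `read_mmap_min_addr` and `try_map_page_zero` are made up for illustration. It reads vm.mmap_min_addr and then attempts a fixed mapping at address 0, which the kernel should refuse for an unprivileged process when the limit is above 0.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: read vm.mmap_min_addr, or return -1 if unreadable. */
long read_mmap_min_addr(void)
{
    FILE *f = fopen("/proc/sys/vm/mmap_min_addr", "r");
    long value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Hypothetical helper: try to map one page at address 0.
 * Returns 0 if the mapping succeeded, otherwise the errno from mmap(). */
int try_map_page_zero(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;

    assert(page > 0);
    /* MAP_FIXED asks for exactly address 0 instead of treating it as a hint. */
    p = mmap((void *)0, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED)
        return errno;
    munmap(p, (size_t)page);
    return 0;
}
```

On a typical desktop kernel I would expect `try_map_page_zero()` to return EPERM for a normal user while `read_mmap_min_addr()` reports a value above 0; that is an expectation, not something measured here.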
Which distros still package it?
(In reply to Sam James from comment #1)
> Which distros still package it?

Arch and its derivatives do. I tested it on Manjaro, where it's in the main repository rather than the AUR. That said, I suspect they're the last major distro to package it.

It looks like dosemu was dropped by Debian some time ago. I can find it in their package repos under "oldoldstable" but nothing newer. The last Ubuntu release with a dosemu package was 18.04 LTS; newer versions do not have it. It's not part of Fedora anymore either. I can find *third party* RPMs of dosemu on rpmfinder for the current Fedora release, so someone is making the effort to package it -- but who knows if it actually works.

I really don't want to say it, because dosemu has been around almost as long as Linux itself, but realistically app-emulation/dosemu should probably be given last rites. DOSBox has gotten better over the years and has effectively replaced it. If people were missing dosemu, someone would have noticed this issue long before I did.

Also, there is a revivified fork of dosemu <https://github.com/dosemu2/dosemu2>. I'll dig into dosemu2 and see if it can eventually provide a 1:1 replacement for the original. It's in a pre-release state, so that's unlikely to be tomorrow.
Created attachment 863257 [details]
gdb disassembly of segfaulting function

TLDR: This is acting like a dosemu compatibility problem with gcc >= 10. I can work around it by compiling dosemu with gcc 9.5.0, although I'm not yet sure if 9.5.0 needs USE=-pie (after unforcing it) or not. I'll probably look into it some more in the next couple of weeks...

==========

This started showing up for me yesterday, as I was rebuilding world.
- (Literally "as": I happened to run dosemu a few times during the rebuild, and dosemu worked before the rebuild got to dosemu, and stopped working after it got past dosemu.)
- Before this, I think my dosemu install was last built in Nov 2019, according to /var/log/emerge.log.
- It is possible no one has noticed it broken because dosemu hasn't been updated (or needed to be rebuilt) for years, rather than because no one is using it...
- I was rebuilding world because for unrelated reasons I finally decided to enable gcc USE=pie on the relevant machine, which I had disabled back when I switched to profile=17, years ago. (The rebuild also corresponds to an upgrade to gcc 11, and I'm pretty sure there were more gcc upgrades between 2019 and now.)

Additional details:
- The (1) /proc and (2) as-root workarounds don't work for me either. I'm pretty sure this bug is a new / different problem, even if it has similar symptoms to, and is near the same code as, bug 297173...
- Adding -no-pie to CFLAGS, CXXFLAGS, and LDFLAGS does not resolve the issue.
- If I install gcc 9.5.0 with USE=-pie, and use that to build/install dosemu, that works fine (no segfault).
- gcc-10.4.1_p20230426-r1 also causes the dosemu segfault, regardless of whether gcc has USE=pie or USE=-pie, and with or without -no-pie in the *FLAGS when compiling dosemu.
- Similar with gcc 11, although I haven't tried as many variations of that. (And I haven't tried gcc 12 at all.)
- I have not tried gcc 9 with USE=pie, although I might get around to trying that eventually, just as a data point.
- TANGENT (sanitize issue): I also tried to install gcc 8.5.0, but encountered some multiple definition errors in /usr/include/linux/mount.h related to gcc's sanitizer stuff. I suspect it might help to set USE=-sanitize for the old slot, but once I got things working with gcc 9.5.0, I didn't pursue 8 any further.

==========

When I run a bad dosemu.bin under gdb, it dies in the function alloc_mapping() in a sequence of assembly language instructions that reads as follows (I've also attached the whole function):

   0x000000000044904d <+221>: call 0x448940 <mprotect_mapping>
   0x0000000000449052 <+226>: test $0x200,%r12d
   0x0000000000449059 <+233>: je 0x44901c <alloc_mapping+172>
   0x000000000044905b <+235>: cmpb $0x0,0x1afc06(%rip) # 0x5f8c68 <debug+1960>
   0x0000000000449062 <+242>: jne 0x449138 <alloc_mapping+456>
=> 0x0000000000449068 <+248>: mov %r14,0xd7829(%rip) # 0x520898 <lowmem_base>
   0x000000000044906f <+255>: add $0x8,%rsp
   0x0000000000449073 <+259>: mov %r14,%rax
   0x0000000000449076 <+262>: pop %rbp
   0x0000000000449077 <+263>: pop %r12
   0x0000000000449079 <+265>: pop %r13
   0x000000000044907b <+267>: pop %r14
   0x000000000044907d <+269>: ret

Given that it apparently calls mprotect() (mprotect_mapping()?) and then dies the first time it tries to write to RAM after that ("lowmem_base"), it almost suggests the mprotect() is messing up the permissions on dosemu's own global variables. However, if I run dosemu.bin under "strace", the mprotect() looks fine. The last few lines of the strace are below.
Various addresses change slightly each run (address space randomization?), but are never near the 0x449000 / 0x520898 of the code/variables above:

---- CUT ("strace dosemu.bin") ----
openat(AT_FDCWD, "/dev/shm/dosemu_6417", O_RDWR|O_CREAT|O_TRUNC|O_NOFOLLOW|O_CLOEXEC, 0600) = 4
unlink("/dev/shm/dosemu_6417") = 0
ftruncate(4, 0) = 0
ftruncate(4, 17891328) = 0
mmap(NULL, 17891328, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) = 0x7fa5f4bde000
mprotect(0x7fa5f4bde000, 17891328, PROT_READ|PROT_WRITE|PROT_EXEC) = -1 EACCES (Permission denied)
close(4) = 0
mmap(NULL, 4096, PROT_NONE, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = 0x7fa5f65f3000
mremap(0x7fa5f65f3000, 0, 4096, MREMAP_MAYMOVE) = 0x7fa5f6008000
munmap(0x7fa5f65f3000, 4096) = 0
munmap(0x7fa5f6008000, 4096) = 0
mmap(NULL, 1114112, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = 0x7fa5f4ace000
mprotect(0x7fa5f4ace000, 1114112, PROT_READ|PROT_WRITE) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x520898} ---
+++ killed by SIGSEGV +++
Segmentation fault
---- CUT ----

==========

I'm pausing research for now, but may look deeper into this sometime in the next couple of weeks (unless someone resolves it before then, although that seems unlikely). Possible next steps: look at the source code, try gcc 9 with USE=pie, search for a recent patch that might help with gcc 10 support, etc...
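One detail in the strace output may be worth spelling out: si_code=SEGV_ACCERR means the faulting address is mapped but the attempted access is not permitted, as opposed to SEGV_MAPERR for an address that is not mapped at all. The sketch below is not dosemu code. It is a hypothetical standalone probe (all names are invented here) that writes to a page mapped PROT_READ and returns the si_code the kernel delivers with the resulting SIGSEGV.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;
static volatile sig_atomic_t last_si_code;

/* Record why the fault happened, then jump back out of the faulting store. */
static void on_segv(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    last_si_code = info->si_code;
    siglongjmp(recover_point, 1);
}

/* Hypothetical probe: write to a read-only page.
 * Returns the si_code of the SIGSEGV, 0 if no fault occurred,
 * or -1 if the setup itself failed. */
int probe_readonly_write(void)
{
    long page = sysconf(_SC_PAGESIZE);
    struct sigaction sa, old;
    volatile char *p;
    int result;

    assert(page > 0);
    p = mmap(NULL, (size_t)page, PROT_READ,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap((void *)p, (size_t)page);
        return -1;
    }

    if (sigsetjmp(recover_point, 1) == 0) {
        p[0] = 1;               /* store into a PROT_READ page: should fault */
        result = 0;             /* only reached if the store did not fault */
    } else {
        result = last_si_code;  /* reached via siglongjmp from the handler */
    }

    sigaction(SIGSEGV, &old, NULL);
    munmap((void *)p, (size_t)page);
    return result;
}
```

If `probe_readonly_write()` returns SEGV_ACCERR, that is the same code the trace shows for si_addr=0x520898. That would be consistent with lowmem_base living in a read-only page, although this is an inference from the trace rather than something the trace proves.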
Thanks! I was scratching my head with this one. Forgive me if this is a red herring, but I wonder if this issue overlaps with bug 870412 and/or bug 880545 mentioned in https://wiki.gentoo.org/wiki/Modern_C_porting.
Created attachment 863892 [details, diff]
Patch fixes dosemu startup crash when compiled with gcc >=10

I just wrote this patch that fixes the segfault when deposited in /etc/portage/patches/app-emulation/dosemu-1.4.1_pre20130107-r5/dosemu-crashWriteToCastGlobal.patch before re-emerging dosemu. It works with gcc 9, 10, 11, and 12 (at least).

Dosemu was writing to a couple of "const" global variables by casting away the "const", which is undefined behavior that breaks in gcc >=10. More information is in this related bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1866474

It should be trivial to add this to the PATCHES in the ebuild instead of using epatch_user. (Maybe the ebuild should be revbumped? Not sure.) However, I have noticed there are 4 moderately recent build error bugs listed in bugzilla, although they don't seem to affect me. Maybe there are missing dependencies or something in the ebuild? See bug#881149, bug#883627, bug#886119 and bug#894220.

----

I'll probably try submitting this upstream soon, although considering how dead it seems, I don't have a lot of hope. (It would be nice if they released a 1.4.2 version with various accumulated patches for modern versions of build tools to tide us over until dosemu2 is actually released, but the existing evidence suggests that is unlikely.)

(Tangentially, the USE=pie vs USE=-pie for gcc concern in my previous comment is irrelevant: It doesn't matter to dosemu.)
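To illustrate the class of bug described in this comment, here is a minimal sketch of writing to a const object through a cast, next to a conforming alternative. The names (`base_const`, `base_storage`, `set_base_broken`, `set_base`, `get_base`) are hypothetical stand-ins and are not taken from dosemu or from the attached patch, which may well fix the problem differently.

```c
#include <assert.h>
#include <stddef.h>

/* Problematic pattern: the object itself is defined const... */
static char *const base_const = NULL;

/* ...and later modified by casting the const away. This is undefined
 * behavior in C. A compiler is free to place base_const in read-only
 * memory, in which case the store below can fault at runtime.
 * Shown for illustration only; nothing in this sketch calls it. */
void set_base_broken(char *addr)
{
    *(char **)&base_const = addr;
}

/* Conforming alternative: keep the object writable and hand out
 * read access through an accessor instead of a const declaration. */
static char *base_storage = NULL;

void set_base(char *addr)
{
    assert(addr != NULL);
    base_storage = addr;
}

char *get_base(void)
{
    return base_storage;
}
```

The design point is that `const` on the definition is a promise to the compiler, so any code that needs to assign the variable once at startup has to see a non-const definition.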
Thanks for the help! I'll test your patch and see if I can get dosemu to build and to run as expected. I'll build it from a minimal container to see if I can shake loose any missing dependencies. I'm thinking an ebuild revision is in order. With regard to the upstream, in addition to opening a bug (if you even can) I'd suggest providing the patches to the dosemu mailing list. Even if the project's revision control system isn't seeing commits, other people can see the fix.
Minor update: I submitted the patch upstream at https://sourceforge.net/p/dosemu/patches/129/ on June 16. I got a couple of automated emails immediately, but no real responses. One of the automated emails was about how it was being held from "dosemu-notify" list until a moderator approves it. Given the dosemu1.x vs 2 situation, I suspect there might not even be a moderator left...
Runs successfully when built with sys-devel/gcc-13.2.1_p20230826 and using Matthew's patches.
It's been 9 months and it doesn't seem like anyone has looked at Matthew's patch on SourceForge. While it would be nice to get an official v1.4.2 that includes the patch, it seems like nobody is home. Since the patch does work and `<=sys-devel/gcc:10` has been masked, could we go ahead and include the patch in the Gentoo repository?
Can you remind me in say, a week, if I haven't looked at this? I'm absolutely swamped atm but I can try review it in a week or so. (Maybe two if you'd be so kind.) thanks
(In reply to Sam James from comment #10)
> Can you remind me in say, a week, if I haven't looked at this? I'm
> absolutely swamped atm but I can try review it in a week or so. (Maybe two
> if you'd be so kind.)
>
> thanks

Hi Sam, here's the reminder you requested.

I had a look over the diff of the patch myself and it's a few simple changes. It contains some explicit casting and declaration changes that I assume should always have been there for standard C compliance; older versions of GCC just used to guess the implicit parts of this code correctly. The actual logic remains untouched.