Summary: | x11-base/xorg-server-1.6.2-r1: SIGSEGV with sys-apps/hal-0.5.13 | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Martin von Gagern <Martin.vGagern> |
Component: | Current packages | Assignee: | Gentoo X packagers <x11> |
Status: | RESOLVED FIXED | ||
Severity: | minor | CC: | niteblade |
Priority: | High | ||
Version: | 2008.0 | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Attachments: |
Handle errors for libhal_find_device_by_capability
Handle errors for libhal_find_device_by_capability Handle and free errors for libhal_find_device_by_capability 0001-config-add-HAL-error-checks.patch |
Description
Martin von Gagern
2009-07-22 20:06:09 UTC
Don't use HAL 0.5.13 _at all_, upgrade to 0.5.13-r1 and make sure the config files are updated as well. I didn't package.mask it without reason. Created attachment 198837 [details, diff] Handle errors for libhal_find_device_by_capability I wrote this bug not because hal-0.5.13 is broken, but because I consider x11-base/xorg-server-1.6.2-r1 to be broken as well. Yes, updating to hal-0.5.13-r1 including the new config does solve the issue. Yes, if I had synced the pmask to my system, I'd never emerged 0.5.13 and therefore never encountered this problem. BUT applications should in my opinion not crash, whatever copnfiguration files say. They may fail to start, but they should complain and give an indication as to what went wrong. The only difference between hal 0.5.13 and 0.5.13-r1 is the config, so I could update and then manually (and deliberately) edit the config in order to reproduce this issue. The backtrace occurs in the X server itself, not some dynamically linked library, so the hal version doesn't directly affect the instructions around the point of failure. Loking at http://cgit.freedesktop.org/xorg/xserver/tree/config/hal.c?id=6f1aff5a2b45bc2985081abc240a8fed37170386#n506 I see these lines: devices = libhal_find_device_by_capability(info->hal_ctx, "input", &num_devices, &error); /* FIXME: Get default devices if error is set. */ for (i = 0; i < num_devices; i++) device_added(info->hal_ctx, devices[i]); So xorg-server is completely ignoring any error indicated by the hal library. The HAL library, on the other hand, seems to return NULL in some error scenarios, setting the error but not num_devices. Unless num_devices is accidentially initialized to a non-positive number, the loop body actually does get executed in case of an error, even though the array start is NULL. The FIXME comment does indicate that there is at least some awareness upstream that this does require some more work. So resolve this UPSTREAM if you consider this a bug to be addressed by xorg devs, or resolve it WONTFIX if you don't care about crashes due to broken configs, or apply my patch and mark it FIXED. REOPENing and awaiting a resolution different from the current TEST-REQUEST. Alternatives UPSTREAM, WONTFIX or FIXED as outlined above. Created attachment 198841 [details, diff]
Handle errors for libhal_find_device_by_capability
I should have used dbus_error_is_set, as far as I can tell from other, similar code. I'll recompile my xorg server with this patch tomorrow, and see if I can get useful error messages from it even with broken hal config.
Indeed, I'll see what upstream thinks about this. Thanks Created attachment 198883 [details, diff]
Handle and free errors for libhal_find_device_by_capability
OK, tested this, and got the following two messages in case of a broken config:
(EE) config/hal: couldn't find input device: org.freedesktop.DBus.Error.AccessDenied (Rejected send message, 1 matched rules; type="method_call", sender=":1.11" (uid=0 pid=1122 comm="/var/tmp/portage/x11-base/xorg-server-1.6.2-r1/ima") interface="org.freedesktop.Hal.Manager" member="FindDeviceByCapability" error name="(unset)" requested_reply=0 destination="org.freedesktop.Hal" (uid=0 pid=901 comm="/usr/sbin/hald --use-syslog --verbose=no ")))
process 1122: arguments to dbus_move_error() were incorrect, assertion "(dest) == NULL || !dbus_error_is_set ((dest))" failed in file dbus-errors.c line 278.
This is normally a bug in some application using the D-Bus library.
D-Bus not built with -rdynamic so unable to print a backtrace
After this Xorg aborted using SIGABRT.
So the first is just the kind of error message I'd have expected all along. The second is probably due to the fact that a dbus error has to be freed before it may be passed to receive another error message. My coded didn't do this, but neither did the code I copied from.
So here is another patch, preventively freeing any errors that might have occurred. As an alternative, one could free errors directly after catching them, which would seem slightly cleaner to me, but would largely increase the footprint of this patch. So I wrote it this way.
With this patch in place, my Xorg no longer crashed in the presence, but continued running after printing the first error message about the denied dbus access. So one ends up with a running X server without any input devices at all. This might be bad, as it might leave many users without means to recover the situation, as Ctrl+Alt+F# don't seem to work either. On the other hand, this might be intended behaviour in cases where input devices will be plugged in and autoconfigured at a later time. From a usability point of view I'd wish for a fatal error message as the default, but an option to disable it. I guess all of this "no input devices" handling is a diferent issue, though.
(In reply to comment #2) > Yes, updating to hal-0.5.13-r1 including the new config does solve the issue. > > BUT applications should in my opinion not crash, whatever copnfiguration files > say. They may fail to start, but they should complain and give an indication as > to what went wrong. > > The only difference between hal 0.5.13 and 0.5.13-r1 is the config, so I could > update and then manually (and deliberately) edit the config in order to > reproduce this issue. The backtrace occurs in the X server itself, not some > dynamically linked library, so the hal version doesn't directly affect the > instructions around the point of failure. Upgrading to hal-0.15.3-r1 was solve the Backtrace error about X on my laptop. Thx. Created attachment 199329 [details, diff]
0001-config-add-HAL-error-checks.patch
Martin, could I get you to try this patch?
Thanks :)
(In reply to comment #8) > Created an attachment (id=199329) [edit] > 0001-config-add-HAL-error-checks.patch > > Martin, could I get you to try this patch? Tried it, gives me a nive error message and no crash, neither segfault nor abort. So it works for me. Is this patch intended for both Gentoo patchset and upstream? (In reply to comment #9) > gives me a nive error message and no crash s/nive/nice/ Yes, I'm going to send this patch to upstream (and hopefully, they'll ack it). Thanks for testing I've pushed the patches to git master and I've nominated them for inclusion in the 1.6 branch. In the mean time, they're in 1.6.2.901. Closing. Thanks *** Bug 271833 has been marked as a duplicate of this bug. *** |