Okay, after hours of debugging my code, it seems this is not a source code problem, so it's time to investigate further ;) I'm CCing a few devs who helped me with testing.

Somehow pthreads cancellation on recent glibcs does not work as expected, while it works mostly fine on at least a clean 2006.0 stage3 and on Tester's stable system.

The URL refers to IBM's guide to pthreads, which is usually quite precise; the example code does not compile out of the box because check.h is missing, so I'll attach a ready-to-compile test source. The test should print some lines, then output "Main completed" and exit; on an affected system it gets stuck before that.

As I said, the test works fine on a cleanly unpacked 2006.0 stage3, should work on a current stable amd64 system if I understood correctly what Tester is using, works fine on FreeBSD and NetBSD, behaves inconsistently on OSX (but does not get stuck), and also works fine on an embedded Linux (kernel 2.4, not sure which version of glibc).

Although that's just example code, I have a fairly complex piece of software (unfortunately I can't provide the full sources) that relies on pthread cancellation working properly; on a 2006.0 stage3 that passes the above test, my code also works as intended. The same broken behaviour happens on Fedora Core, although Tester said it works fine on RHEL.

Does anyone have a clue?
Created attachment 83204 [details] test-cancel.c
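For readers who do not have the attachment at hand, here is a minimal sketch of the kind of test described in the report. It is not the attached test-cancel.c and not the IBM example; the names worker, run_cancel_test and the CANCEL_DEMO_MAIN macro are made up for illustration. It assumes the worker thread reaches a cancellation point through sleep() (required by POSIX) or printf() (optional in POSIX).

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical worker: loops forever and only stops when cancelled. */
static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        printf("Worker: still running\n"); /* printf() may be a cancellation point */
        sleep(1);                          /* sleep() is a cancellation point */
    }
    return NULL; /* not reached */
}

/* Returns 0 if the worker was cancelled and joined, nonzero otherwise.
 * On an affected system this call would be expected to hang instead. */
int run_cancel_test(void)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return 1;

    sleep(1); /* give the worker time to print something */

    if (pthread_cancel(tid) != 0)
        return 2;
    if (pthread_join(tid, &result) != 0)
        return 3;
    if (result != PTHREAD_CANCELED)
        return 4; /* thread ended, but not through cancellation */

    printf("Main completed\n");
    return 0;
}

#ifdef CANCEL_DEMO_MAIN /* hypothetical macro, only for a standalone build */
int main(void)
{
    return run_cancel_test();
}
#endif
```

For a standalone program, something like `gcc -pthread -DCANCEL_DEMO_MAIN sketch.c` should do (the file name is arbitrary). Checking the join result against PTHREAD_CANCELED distinguishes a real cancellation from a thread that merely returned.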
For those who care: it's a gcc 4.x bug. It replaces the last printf() with a puts(); -fno-builtin-printf fixes this testcase on my fc4 machine. That said, puts() still should not freeze, and if I replace the printf() calls with direct calls to puts(), it also doesn't break. So it seems the gcc builtin function does something incompatible with glibc.
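To illustrate the transformation mentioned above, here is a small sketch (the function name print_done is hypothetical and not taken from the testcase). When the format string is a constant that ends in a newline, contains no conversions, and the return value is unused, GCC's printf builtin may emit a call to puts() instead.

```c
#include <stdio.h>

/* Hypothetical helper, not from the attached testcase. */
int print_done(void)
{
    /* Constant string, trailing '\n', result ignored:
     * GCC may compile this as puts("Main completed"). */
    printf("Main completed\n");

    /* Flush so the line is visible immediately; returns 0 on success. */
    return fflush(stdout);
}
```

One way to check is to compile to assembly with `gcc -S` and look for a call to puts in the output, then repeat with `-fno-builtin-printf` and compare. Whether the substitution happens depends on the GCC version and options.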
The problem still appears in my code even with -fno-builtin. Tomorrow I'll try to build gcc 3.4.6 on my system and see whether my code works with that. Note that GCC 4.0 on OSX behaves correctly (after kito defined a symbol to make OSX follow SUSv3).
I don't think this needs to be pursued any further...