I've been testing net-im/telegram-desktop-4.0.2 for a day. The previous version - 3.6.1-r1 - was stable and calls would never be interrupted.
With 4.0.2, I get a crash after a few minutes in call, with the following error message:
===
free(): invalid next size (fast)
Thread 74 "tgc-net" received signal SIGABRT, Aborted.
===
The stack trace prints:
===
#0 0x00007ffff368856c in () at /lib64/libc.so.6
#1 0x00007ffff363ca02 in raise () at /lib64/libc.so.6
#2 0x00007ffff3627469 in abort () at /lib64/libc.so.6
#3 0x00007ffff367c888 in () at /lib64/libc.so.6
#4 0x00007ffff36921ca in () at /lib64/libc.so.6
#5 0x00007ffff3693dc5 in () at /lib64/libc.so.6
#6 0x00007ffff36968df in free () at /lib64/libc.so.6
#7 0x0000555557e94227 in ()
#8 0x0000555557e9429d in ()
#9 0x00007ffff74afe95 in rtc::Thread::Dispatch(rtc::Message*) () at /usr/lib64/libtg_owt.so.0
#10 0x00007ffff74af1b7 in rtc::Thread::ProcessMessages(int) () at /usr/lib64/libtg_owt.so.0
#11 0x00007ffff74af294 in rtc::Thread::PreRun(void*) () at /usr/lib64/libtg_owt.so.0
#12 0x00007ffff368684a in () at /lib64/libc.so.6
#13 0x00007ffff3709cec in () at /lib64/libc.so.6
===
I've attached a strace log as well.
USE flags: X dbus hunspell screencast spell.
I'm using KDE Plasma (X11) with Pulseaudio, on 5.15.59-gentoo kernel.
Thanks!
Created attachment 800497 [details] emerge --info output
Created attachment 800499 [details] strace log
Managed to get more symbols in:
===
#0 0x00007ffff368856c in () at /lib64/libc.so.6
#1 0x00007ffff363ca02 in raise () at /lib64/libc.so.6
#2 0x00007ffff3627469 in abort () at /lib64/libc.so.6
#3 0x00007ffff367c888 in () at /lib64/libc.so.6
#4 0x00007ffff36921ca in () at /lib64/libc.so.6
#5 0x00007ffff3693dc5 in () at /lib64/libc.so.6
#6 0x00007ffff36968df in free () at /lib64/libc.so.6
#7 0x0000555557e94227 in rtc::RefCountedNonVirtual<webrtc::PendingTaskSafetyFlag>::Release() const (this=0x7fff2418bbe0) at /usr/include/tg_owt/api/ref_counted_base.h:80
#8 rtc::RefCountedNonVirtual<webrtc::PendingTaskSafetyFlag>::Release() const (this=0x7fff2418bbe0) at /usr/include/tg_owt/api/ref_counted_base.h:68
#9 rtc::scoped_refptr<webrtc::PendingTaskSafetyFlag>::~scoped_refptr() (this=0x7fff2418bad8, __in_chrg=<optimized out>) at /usr/include/tg_owt/api/scoped_refptr.h:103
#10 webrtc::ScopedTaskSafety::~ScopedTaskSafety() (this=0x7fff2418bad8, __in_chrg=<optimized out>) at /usr/include/tg_owt/rtc_base/task_utils/pending_task_safety_flag.h:122
#11 tgcalls::ReflectorPort::~ReflectorPort() (this=0x7fff2418b410, __in_chrg=<optimized out>) at /usr/src/debug/net-im/telegram-desktop-4.0.2/tdesktop-4.0.2-full/Telegram/ThirdParty/tgcalls/tgcalls/v2/ReflectorPort.cpp:137
#12 0x0000555557e9429d in tgcalls::ReflectorPort::~ReflectorPort() (this=0x7fff2418b410, __in_chrg=<optimized out>) at /usr/src/debug/net-im/telegram-desktop-4.0.2/tdesktop-4.0.2-full/Telegram/ThirdParty/tgcalls/tgcalls/v2/ReflectorPort.cpp:137
#13 0x00007ffff74afe95 in rtc::Thread::Dispatch(rtc::Message*) (this=0x7fff14002500, pmsg=0x7fffd798c790) at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:700
#14 0x00007ffff74af1b7 in rtc::Thread::ProcessMessages(int) (this=this@entry=0x7fff14002500, cmsLoop=cmsLoop@entry=-1) at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:1144
#15 0x00007ffff74af294 in rtc::Thread::Run() (this=0x7fff14002500) at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:890
#16 rtc::Thread::PreRun(void*) (pv=0x7fff14002500) at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:879
#17 0x00007ffff368684a in () at /lib64/libc.so.6
#18 0x00007ffff3709cec in () at /lib64/libc.so.6
===
I've heard about this recently too, but forgot about it. Thanks for actually submitting a bug report. It's hard for me to test calls as I don't regularly call and don't want to bother people for one. As for the bug itself, I'd like to wait for 4.1.x to be merged, with some luck it's already been solved upstream. Mind testing when that happens?
Oh, and if you have time maybe try to do the following:
MYCMAKEARGS="-DBUILD_SHARED_LIBS=OFF" emerge -1 media-libs/tg_owt
emerge net-im/telegram-desktop
Maybe it's some optimization shenanigans that let it work on the official build.
"-DBUILD_SHARED_LIBS=OFF" yields a few compilation errors with libtg_voip - I think it expects dynamic libraries at some point, but I haven't dug deep into it. I'm happy to test 4.1.x when it's available!
Tested version 4.1.1 and it's affected by the same crash, which still seems to be inside media-libs/tg_owt.
To replicate it:
- Call anyone
- In my case crash occurs at around the first minute mark
Backtrace:
===
#0 0x00007ffff368856c in () at /lib64/libc.so.6
#1 0x00007ffff363ca02 in raise () at /lib64/libc.so.6
#2 0x00007ffff3627469 in abort () at /lib64/libc.so.6
#3 0x00007ffff367c888 in () at /lib64/libc.so.6
#4 0x00007ffff36921ca in () at /lib64/libc.so.6
#5 0x00007ffff3693dc5 in () at /lib64/libc.so.6
#6 0x00007ffff36968df in free () at /lib64/libc.so.6
#7 0x00007ffff74af53f in rtc::Thread::QueuedTaskHandler::OnMessage(rtc::Message*) (this=<optimized out>, msg=<optimized out>) at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:1027
#8 0x00007ffff74afe95 in rtc::Thread::Dispatch(rtc::Message*) (this=0x7ffe944554c0, pmsg=0x7ffefd7f9790) at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:700
#9 0x00007ffff74af1b7 in rtc::Thread::ProcessMessages(int) (this=this@entry=0x7ffe944554c0, cmsLoop=cmsLoop@entry=-1) at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:1144
#10 0x00007ffff74af294 in rtc::Thread::Run() (this=0x7ffe944554c0) at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:890
#11 rtc::Thread::PreRun(void*) (pv=0x7ffe944554c0) at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:879
#12 0x00007ffff368684a in () at /lib64/libc.so.6
#13 0x00007ffff3709cec in () at /lib64/libc.so.6
===
I can't really help with C++, but it all points to this function here:
===
media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc
void Thread::QueuedTaskHandler::OnMessage(Message* msg) {
  RTC_DCHECK(msg);
  auto* data = static_cast<ScopedMessageData<webrtc::QueuedTask>*>(msg->pdata);
  std::unique_ptr<webrtc::QueuedTask> task(data->Release());
  // Thread expects handler to own Message::pdata when OnMessage is called
  // Since MessageData is no longer needed, delete it.
  delete data; // !! CRASH OCCURS HERE !!
  // QueuedTask interface uses Run return value to communicate who owns the
  // task. false means QueuedTask took the ownership.
  if (!task->Run())
    task.release();
}
===
Thanks!
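For readers less familiar with C++, the ownership convention described in the comments of the function above (Run() returning false means the task keeps itself alive) can be sketched in isolation. This is only an illustration of that convention, not tg_owt code and not a reproduction of the crash; TaskLike, OneShot, SelfOwned, Counters and RunAndMaybeDelete are hypothetical names made up for this sketch.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for webrtc::QueuedTask. Run() returns true when the
// caller should delete the task, false when the task took ownership of itself.
class TaskLike {
 public:
  virtual ~TaskLike() = default;
  virtual bool Run() = 0;
};

// Counts destructions so the effect of each path can be observed.
struct Counters {
  int destroyed = 0;
};

// A task that is done after one run: the caller is expected to delete it.
class OneShot : public TaskLike {
 public:
  explicit OneShot(Counters* c) : c_(c) {}
  ~OneShot() override { ++c_->destroyed; }
  bool Run() override { return true; }

 private:
  Counters* c_;
};

// A task that hands itself to someone else and asks the caller not to delete it.
class SelfOwned : public TaskLike {
 public:
  SelfOwned(Counters* c, TaskLike** parked) : c_(c), parked_(parked) {}
  ~SelfOwned() override { ++c_->destroyed; }
  bool Run() override {
    *parked_ = this;  // the new owner remembers the task
    return false;     // tell the caller to let go
  }

 private:
  Counters* c_;
  TaskLike** parked_;
};

// Mirrors the last two lines of OnMessage: unique_ptr deletes the task on
// scope exit unless Run() returned false and release() was called.
void RunAndMaybeDelete(std::unique_ptr<TaskLike> task) {
  assert(task);
  if (!task->Run())
    task.release();
}

// Number of tasks destroyed by the caller when Run() returns true.
int DestroyedWhenRunReturnsTrue() {
  Counters c;
  RunAndMaybeDelete(std::unique_ptr<TaskLike>(new OneShot(&c)));
  return c.destroyed;
}

// Number of tasks destroyed by the caller when Run() returns false.
int DestroyedWhenRunReturnsFalse() {
  Counters c;
  TaskLike* parked = nullptr;
  RunAndMaybeDelete(std::unique_ptr<TaskLike>(new SelfOwned(&c, &parked)));
  const int destroyed_by_caller = c.destroyed;
  delete parked;  // the new owner cleans up, so the sketch does not leak
  return destroyed_by_caller;
}
```

Note that the line marked as crashing in the backtrace is the earlier `delete data;`, so this convention by itself does not explain the abort.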
Debugging symbols on glibc may help too, but it sounds like it's a use after free. Please report it upstream.
Reported upstream: https://github.com/desktop-app/tg_owt/issues/106
Same problem here; downgrading to 3.6.1-r1 works.
The bug has been referenced in the following commit(s):
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=1bbc682ab1db76c4fa5b95433fc0032b16e6f54a
commit 1bbc682ab1db76c4fa5b95433fc0032b16e6f54a
Author: Esteve Varela Colominas <esteve.varela@gmail.com>
AuthorDate: 2022-10-01 10:37:52 +0000
Commit: Georgy Yakovlev <gyakovlev@gentoo.org>
CommitDate: 2022-10-03 22:51:37 +0000
net-im/telegram-desktop: Drop broken
Linked bug applies to these versions, and won't be fixed for these.
Bug: https://bugs.gentoo.org/866055
Signed-off-by: Esteve Varela Colominas <esteve.varela@gmail.com>
Closes: https://github.com/gentoo/gentoo/pull/27553
Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
net-im/telegram-desktop/Manifest | 2 -
.../files/tdesktop-4.0.2-fix-gcc12-cstdint.patch | 10 --
.../telegram-desktop/telegram-desktop-4.0.2.ebuild | 183 --------------------
.../telegram-desktop-4.1.1-r1.ebuild | 185 ---------------------
4 files changed, 380 deletions(-)
Created attachment 838065 [details] backtraces of telegram-desktop-4.3.4 with tg_owt-0_pre20220507
3.6.1-r1 is ancient, so after losing hope of ever seeing this magically fixed, I've attempted to debug this. It's a genuine headache of a bug: it's likely caused by tg_owt, but none of the debugging methods I've attempted help demystify the problem. There's no chance of getting upstream support on this one; after all, it's not reproducible in their builds.
I can't currently reproduce it in my host install, so I set this up in a clean gentoo prefix, roughly as follows:
USE=pulseaudio emerge -1 media-libs/openal
FEATURES="splitdebug compressdebug installsources" emerge -1 glibc
FEATURES="nostrip installsources" CFLAGS="-Og -ggdb -pipe" CXXFLAGS="-Og -ggdb -pipe" emerge -1 tg_owt
emerge --onlydeps telegram-desktop # Accept default autounmask-use values
CFLAGS="-Og -ggdb -pipe" CXXFLAGS="-Og -ggdb -pipe" ebuild <repo>/net-im/telegram-desktop/telegram-desktop/telegram-desktop-4.3.4.ebuild clean compile
And then with (host! not prefix) gdb:
set debug-file-directory <prefix>/usr/lib/debug
file <prefix>/var/tmp/portage/net-im/telegram-desktop/work/tdesktop-4.3.4-full_build/telegram-desktop
r
<crash>
bt
To reproduce this crash, I simply receive a call, and then press "accept". Sometimes it crashes before the call is initialized, sometimes it crashes after pressing the hangup button. More rarely it crashes during the call.
After trying this a couple of times, I've captured three different backtraces, attached in traces.tar.gz.
bt1.txt is a "malloc(): invalid size (unsorted)" crash. I have no idea what this means, but I have never seen malloc fail like this and I have plenty of memory on my system.
bt2.txt is more sensible, saying "free(): invalid next size (fast)". It crashes after trying to free a webrtc::PendingTaskSafetyFlag, but diagnosing this backtrace through all the layers of indirection is proving very hard for me.
bt3.txt is a "corrupted size vs. prev_size" crash in the memory allocator, which also happens during a malloc(), like in bt1.txt.
All of these point to some kind of memory corruption. Maybe an array is being overflowed that is optimized out in the release build?
So I rebuilt tg_owt, but this time with asan:
FEATURES="nostrip installsources" CFLAGS="-Og -ggdb -pipe -fsanitize=address" CXXFLAGS="-Og -ggdb -pipe -fsanitize=address" emerge -1 tg_owt
And ran:
LD_PRELOAD=<prefix>/usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/libasan.so <prefix>/var/tmp/portage/net-im/telegram-desktop/work/tdesktop-4.3.4-full_build/telegram-desktop
This spits out a new-delete-type-mismatch error, included in asan1.txt, regarding the size of webrtc::PendingTaskSafetyFlag between allocation and deallocation. Yet again, I'm failing to understand the issue at hand here, probably because I'm not well acquainted with C++ template semantics, so I have no idea what's going wrong here either.
Putting this up here in hopes maybe someone else can take a gander at the issue, either reproducing these steps or dumping backtraces of their own.
Created attachment 838067 [details] /etc/portage prefix configuration Putting my gentoo prefix configuration here as well, for full transparency.
After enabling -fsanitize=address on both packages, ASAN pointed out that webrtc::MutexImpl::OwnerRecord::OwnerRecord() is writing outside of its allocated bounds. I've stared at this class before in my debugging, but wasn't sure if the problem was there, so I skipped it.
Turns out the problem is there. Not in any function (thus my confusion), but in the ABI of the class. The ABI of this class can be changed by a preprocessor definition, and this definition directly depends on "-DNDEBUG". Compiling the class in tg_owt *with* the flag, and then using the header in telegram-desktop *without* the flag causes a mismatch in the ABI, so the class constructor in tg_owt writes past the area that was allocated for it in telegram-desktop. This is why linking it statically didn't fix the issue either.
"-DNDEBUG" is a flag that is set by default in CMake, by building with -DCMAKE_BUILD_TYPE=Release/RelWithDebInfo. However, for some reason, in gentoo's cmake.eclass, this never happens. This has caused issues before, as tg_owt isn't very happy running without -DNDEBUG set (https://bugs.gentoo.org/866055). As a result, this flag is enabled in tg_owt, but I never expected this to cause any issues if I didn't also enable it specifically in telegram-desktop. It's not normal to rely on user-supplied flags to determine the ABI in your header files.
Anyway, I'm glad this is finally solved. I'll do some final testing on my end, and PR it soon. I'd like to get rid of 3.6.1 as soon as possible, as it's ancient, incompatible with many things and likely riddled with bugs too. Maybe someone can help me pull some strings to stabilize this quicker?
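To make the mismatch more concrete, here is a minimal sketch of how a header-defined class can end up with two different sizes depending on a compile-time switch. It is not the tg_owt header: OwnerRecordLike, MutexLike and LayoutGapBytes are hypothetical names, the members are guesses chosen for illustration, and a bool template parameter stands in for the preprocessor switch so that both views of the "header" can coexist in one file.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a record whose members exist only when a
// debug-check switch is on. In a real header this would be an #if block.
template <bool kOwnerChecks>
struct OwnerRecordLike;

// View seen by a translation unit that compiles the checks out.
template <>
struct OwnerRecordLike<false> {};

// View seen by a translation unit that compiles the checks in.
template <>
struct OwnerRecordLike<true> {
  bool latch = false;
  unsigned long holder = 0;
};

// Hypothetical mutex wrapper that embeds the record by value, so its size
// depends on which view of the record the compiler saw.
template <bool kOwnerChecks>
struct MutexLike {
  int lock_word = 0;
  OwnerRecordLike<kOwnerChecks> owner;
};

using MutexWithChecks = MutexLike<true>;
using MutexWithoutChecks = MutexLike<false>;

// Bytes by which the two views disagree about the object's size. If one
// binary allocates the small view and another binary's constructor
// initializes the large view, these bytes land outside the allocation.
constexpr std::size_t LayoutGapBytes() {
  return sizeof(MutexWithChecks) - sizeof(MutexWithoutChecks);
}

// Runtime check that the two views really differ on this platform.
void CheckViewsDiffer() {
  assert(LayoutGapBytes() > 0);
}
```

With a real preprocessor switch the two views never meet in one translation unit, so neither the compiler nor the linker can notice the disagreement. That fits the observation above that static linking did not help.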
Created attachment 838377 [details] overflow reported by -fsanitize=address
The bug has been closed via the following commit(s):
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=fe4cfd2ba2efef2336196921b72952c966537e2f
commit fe4cfd2ba2efef2336196921b72952c966537e2f
Author: Esteve Varela Colominas <esteve.varela@gmail.com>
AuthorDate: 2022-11-30 00:12:54 +0000
Commit: Georgy Yakovlev <gyakovlev@gentoo.org>
CommitDate: 2022-12-02 22:48:26 +0000
net-im/telegram-desktop: Fix call issue
Fixes an issue regarding ABI incompatibility, that would cause the application to crash during calls.
Closes: https://bugs.gentoo.org/866055
Thanks-to: Matteo Pacini <m+gentoo@matteopacini.me>
Signed-off-by: Esteve Varela Colominas <esteve.varela@gmail.com>
Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
...egram-desktop-4.3.4.ebuild => telegram-desktop-4.3.4-r1.ebuild} | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
Additionally, it has been referenced in the following commit(s):
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=11bcc0a2885e17d50c089306c3e57dda4907bdb2
commit 11bcc0a2885e17d50c089306c3e57dda4907bdb2
Author: Esteve Varela Colominas <esteve.varela@gmail.com>
AuthorDate: 2022-11-30 00:18:33 +0000
Commit: Georgy Yakovlev <gyakovlev@gentoo.org>
CommitDate: 2022-12-02 22:48:27 +0000
media-libs/tg_owt: Minor comment change
Reflect findings in dependent package
Bug: https://bugs.gentoo.org/866055
Signed-off-by: Esteve Varela Colominas <esteve.varela@gmail.com>
Closes: https://github.com/gentoo/gentoo/pull/28478
Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
media-libs/tg_owt/tg_owt-0_pre20220507.ebuild | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=7dc63ffb431a37d2e1ef9ab06efcd83307a719a8
commit 7dc63ffb431a37d2e1ef9ab06efcd83307a719a8
Author: Esteve Varela Colominas <esteve.varela@gmail.com>
AuthorDate: 2022-11-30 00:16:44 +0000
Commit: Georgy Yakovlev <gyakovlev@gentoo.org>
CommitDate: 2022-12-02 22:48:27 +0000
net-im/telegram-desktop: Drop broken
Serious bugfix in new package that happened in this version as well, not going to resolve and test it in this version.
Bug: https://bugs.gentoo.org/866055
Signed-off-by: Esteve Varela Colominas <esteve.varela@gmail.com>
Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
net-im/telegram-desktop/Manifest | 1 -
.../telegram-desktop/telegram-desktop-4.2.4.ebuild | 204 ---------------------
2 files changed, 205 deletions(-)