Bug 866055 - net-im/telegram-desktop-4.0.2 - crash during a call in the first few minutes
Summary: net-im/telegram-desktop-4.0.2 - crash during a call in the first few minutes
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Esteve Varela Colominas
URL:
Whiteboard:
Keywords: PullRequest
Depends on:
Blocks:
 
Reported: 2022-08-22 16:26 UTC by Matteo Pacini
Modified: 2022-12-12 12:18 UTC
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info output (emerge_info.txt,6.58 KB, text/plain)
2022-08-22 16:26 UTC, Matteo Pacini
strace log (strace.log.bz2,137.83 KB, application/x-bzip)
2022-08-22 16:28 UTC, Matteo Pacini
backtraces of telegram-desktop-4.3.4 with tg_owt-0_pre20220507 (traces.tar.gz,7.14 KB, application/gzip)
2022-11-29 15:40 UTC, Esteve Varela Colominas
/etc/portage prefix configuration (configs.tar.gz,912 bytes, application/gzip)
2022-11-29 15:42 UTC, Esteve Varela Colominas
overflow reported by -fsanitize=address (asan_overflow.txt,17.18 KB, text/plain)
2022-11-30 00:06 UTC, Esteve Varela Colominas

Description Matteo Pacini 2022-08-22 16:26:14 UTC
I've been testing net-im/telegram-desktop-4.0.2 for a day.
The previous version - 3.6.1-r1 - was stable and calls would never be interrupted.

With 4.0.2, I get a crash after a few minutes in call, with the following error message:

===
free(): invalid next size (fast)
Thread 74 "tgc-net" received signal SIGABRT, Aborted.
===

The stack trace prints:

===
#0  0x00007ffff368856c in  () at /lib64/libc.so.6
#1  0x00007ffff363ca02 in raise () at /lib64/libc.so.6
#2  0x00007ffff3627469 in abort () at /lib64/libc.so.6
#3  0x00007ffff367c888 in  () at /lib64/libc.so.6
#4  0x00007ffff36921ca in  () at /lib64/libc.so.6
#5  0x00007ffff3693dc5 in  () at /lib64/libc.so.6
#6  0x00007ffff36968df in free () at /lib64/libc.so.6
#7  0x0000555557e94227 in  ()
#8  0x0000555557e9429d in  ()
#9  0x00007ffff74afe95 in rtc::Thread::Dispatch(rtc::Message*) () at /usr/lib64/libtg_owt.so.0
#10 0x00007ffff74af1b7 in rtc::Thread::ProcessMessages(int) () at /usr/lib64/libtg_owt.so.0
#11 0x00007ffff74af294 in rtc::Thread::PreRun(void*) () at /usr/lib64/libtg_owt.so.0
#12 0x00007ffff368684a in  () at /lib64/libc.so.6
#13 0x00007ffff3709cec in  () at /lib64/libc.so.6
===

I've attached a strace log as well.

USE flags: X dbus hunspell screencast spell.

I'm using KDE Plasma (X11) with Pulseaudio, on 5.15.59-gentoo kernel.
Thanks!
Comment 1 Matteo Pacini 2022-08-22 16:26:46 UTC
Created attachment 800497 [details]
emerge --info output
Comment 2 Matteo Pacini 2022-08-22 16:28:05 UTC
Created attachment 800499 [details]
strace log
Comment 3 Matteo Pacini 2022-08-22 19:18:07 UTC
Managed to get more symbols in:

===
#0  0x00007ffff368856c in  () at /lib64/libc.so.6
#1  0x00007ffff363ca02 in raise () at /lib64/libc.so.6
#2  0x00007ffff3627469 in abort () at /lib64/libc.so.6
#3  0x00007ffff367c888 in  () at /lib64/libc.so.6
#4  0x00007ffff36921ca in  () at /lib64/libc.so.6
#5  0x00007ffff3693dc5 in  () at /lib64/libc.so.6
#6  0x00007ffff36968df in free () at /lib64/libc.so.6
#7  0x0000555557e94227 in rtc::RefCountedNonVirtual<webrtc::PendingTaskSafetyFlag>::Release() const (this=0x7fff2418bbe0)
    at /usr/include/tg_owt/api/ref_counted_base.h:80
#8  rtc::RefCountedNonVirtual<webrtc::PendingTaskSafetyFlag>::Release() const (this=0x7fff2418bbe0)
    at /usr/include/tg_owt/api/ref_counted_base.h:68
#9  rtc::scoped_refptr<webrtc::PendingTaskSafetyFlag>::~scoped_refptr() (this=0x7fff2418bad8, __in_chrg=<optimized out>)
    at /usr/include/tg_owt/api/scoped_refptr.h:103
#10 webrtc::ScopedTaskSafety::~ScopedTaskSafety() (this=0x7fff2418bad8, __in_chrg=<optimized out>)
    at /usr/include/tg_owt/rtc_base/task_utils/pending_task_safety_flag.h:122
#11 tgcalls::ReflectorPort::~ReflectorPort() (this=0x7fff2418b410, __in_chrg=<optimized out>)
    at /usr/src/debug/net-im/telegram-desktop-4.0.2/tdesktop-4.0.2-full/Telegram/ThirdParty/tgcalls/tgcalls/v2/ReflectorPort.cpp:137
#12 0x0000555557e9429d in tgcalls::ReflectorPort::~ReflectorPort() (this=0x7fff2418b410, __in_chrg=<optimized out>)
    at /usr/src/debug/net-im/telegram-desktop-4.0.2/tdesktop-4.0.2-full/Telegram/ThirdParty/tgcalls/tgcalls/v2/ReflectorPort.cpp:137
#13 0x00007ffff74afe95 in rtc::Thread::Dispatch(rtc::Message*) (this=0x7fff14002500, pmsg=0x7fffd798c790)
    at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:700
#14 0x00007ffff74af1b7 in rtc::Thread::ProcessMessages(int) (this=this@entry=0x7fff14002500, cmsLoop=cmsLoop@entry=-1)
    at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:1144
#15 0x00007ffff74af294 in rtc::Thread::Run() (this=0x7fff14002500)
    at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:890
#16 rtc::Thread::PreRun(void*) (pv=0x7fff14002500)
    at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:879
#17 0x00007ffff368684a in  () at /lib64/libc.so.6
#18 0x00007ffff3709cec in  () at /lib64/libc.so.6
===
Comment 4 Esteve Varela Colominas 2022-08-22 20:58:24 UTC
I've heard about this recently too, but forgot about it. Thanks for actually submitting a bug report. It's hard for me to test calls as I don't regularly call and don't want to bother people for one.

As for the bug itself, I'd like to wait for 4.1.x to be merged; with some luck it's already been solved upstream. Mind testing when that happens?
Comment 5 Esteve Varela Colominas 2022-08-22 21:11:44 UTC
Oh, and if you have time maybe try to do the following:
    MYCMAKEARGS="-DBUILD_SHARED_LIBS=OFF" emerge -1 media-libs/tg_owt
    emerge net-im/telegram-desktop

Maybe it's some optimization shenanigans that let it work on the official build.
Comment 6 Matteo Pacini 2022-08-23 10:52:04 UTC
"-DBUILD_SHARED_LIBS=OFF" yields a few compilation errors with libtg_voip - I think it expects dynamic libraries at some point, but I haven't dug deep into it.

I'm happy to test 4.1.x when it's available!
Comment 7 Matteo Pacini 2022-08-23 21:33:39 UTC
Tested version 4.1.1 and it's affected by the same crash, which still seems to be inside media-libs/tg_owt.

To replicate it:
- Call anyone
- In my case the crash occurs at around the first minute mark

Backtrace:

===
#0  0x00007ffff368856c in  () at /lib64/libc.so.6
#1  0x00007ffff363ca02 in raise () at /lib64/libc.so.6
#2  0x00007ffff3627469 in abort () at /lib64/libc.so.6
#3  0x00007ffff367c888 in  () at /lib64/libc.so.6
#4  0x00007ffff36921ca in  () at /lib64/libc.so.6
#5  0x00007ffff3693dc5 in  () at /lib64/libc.so.6
#6  0x00007ffff36968df in free () at /lib64/libc.so.6
#7  0x00007ffff74af53f in rtc::Thread::QueuedTaskHandler::OnMessage(rtc::Message*)
    (this=<optimized out>, msg=<optimized out>)
    at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:1027
#8  0x00007ffff74afe95 in rtc::Thread::Dispatch(rtc::Message*) (this=0x7ffe944554c0, pmsg=0x7ffefd7f9790)
    at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:700
#9  0x00007ffff74af1b7 in rtc::Thread::ProcessMessages(int) (this=this@entry=0x7ffe944554c0, cmsLoop=cmsLoop@entry=-1)
    at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:1144
#10 0x00007ffff74af294 in rtc::Thread::Run() (this=0x7ffe944554c0)
    at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:890
#11 rtc::Thread::PreRun(void*) (pv=0x7ffe944554c0)
    at /usr/src/debug/media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc:879
#12 0x00007ffff368684a in  () at /lib64/libc.so.6
#13 0x00007ffff3709cec in  () at /lib64/libc.so.6
===

I can't really help with C++, but it all points to this function here:

===
media-libs/tg_owt-0_pre20220507/tg_owt-10d5f4bf77333ef6b43516f90d2ce13273255f41/src/rtc_base/thread.cc

void Thread::QueuedTaskHandler::OnMessage(Message* msg) {
  RTC_DCHECK(msg);
  auto* data = static_cast<ScopedMessageData<webrtc::QueuedTask>*>(msg->pdata);
  std::unique_ptr<webrtc::QueuedTask> task(data->Release());
  // Thread expects handler to own Message::pdata when OnMessage is called
  // Since MessageData is no longer needed, delete it.
  delete data; // !! CRASH OCCURS HERE !!

  // QueuedTask interface uses Run return value to communicate who owns the
  // task. false means QueuedTask took the ownership.
  if (!task->Run())
    task.release();
}
===

Thanks!
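
For readers unfamiliar with the ownership convention in the quoted function, here is a minimal sketch of it. The `Task`, `OneShotTask`, `SelfOwnedTask`, `Counters` and `RunTask` names are hypothetical and are not part of tg_owt; only the rule "Run() returning false means the task took ownership" comes from the comments in the quoted source. This illustrates the convention only and makes no claim about where the corruption originates.

```cpp
#include <cassert>
#include <memory>

// Hypothetical task interface mirroring the convention in the quoted code:
// Run() returns true if the caller should delete the task, false if the
// task has taken ownership of itself.
class Task {
 public:
  virtual ~Task() = default;
  virtual bool Run() = 0;
};

// Counts destructor calls so that ownership can be observed.
struct Counters {
  int destroyed = 0;
};

class OneShotTask : public Task {
 public:
  explicit OneShotTask(Counters* c) : c_(c) {}
  ~OneShotTask() override { ++c_->destroyed; }
  bool Run() override { return true; }  // caller keeps ownership and deletes

 private:
  Counters* c_;
};

class SelfOwnedTask : public Task {
 public:
  explicit SelfOwnedTask(Counters* c) : c_(c) {}
  ~SelfOwnedTask() override { ++c_->destroyed; }
  bool Run() override { return false; }  // task claims ownership of itself

 private:
  Counters* c_;
};

// Same shape as the tail of the quoted handler: run the task, then give up
// ownership if the task asked for it. Returns the raw pointer when the task
// kept itself alive, nullptr when it was destroyed here.
Task* RunTask(std::unique_ptr<Task> task) {
  if (!task->Run())
    return task.release();
  return nullptr;  // the unique_ptr destroys the task when it goes away
}
```

Passing a `OneShotTask` destroys it as soon as the call completes, while a `SelfOwnedTask` survives and has to be deleted by whoever it handed itself to.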
Comment 8 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-08-23 21:50:37 UTC
Debugging symbols on glibc may help too, but it sounds like it's a use after free. Please report it upstream.
Comment 9 Matteo Pacini 2022-08-23 23:29:36 UTC
Reported upstream: https://github.com/desktop-app/tg_owt/issues/106
Comment 10 vowstar 2022-08-26 03:40:30 UTC
Same problem here; downgrading to 3.6.1-r1 works.
Comment 11 Larry the Git Cow gentoo-dev 2022-10-03 22:52:13 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=1bbc682ab1db76c4fa5b95433fc0032b16e6f54a

commit 1bbc682ab1db76c4fa5b95433fc0032b16e6f54a
Author:     Esteve Varela Colominas <esteve.varela@gmail.com>
AuthorDate: 2022-10-01 10:37:52 +0000
Commit:     Georgy Yakovlev <gyakovlev@gentoo.org>
CommitDate: 2022-10-03 22:51:37 +0000

    net-im/telegram-desktop: Drop broken
    
    Linked bug applies to these versions, and won't be fixed for these.
    
    Bug: https://bugs.gentoo.org/866055
    Signed-off-by: Esteve Varela Colominas <esteve.varela@gmail.com>
    Closes: https://github.com/gentoo/gentoo/pull/27553
    Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>

 net-im/telegram-desktop/Manifest                   |   2 -
 .../files/tdesktop-4.0.2-fix-gcc12-cstdint.patch   |  10 --
 .../telegram-desktop/telegram-desktop-4.0.2.ebuild | 183 --------------------
 .../telegram-desktop-4.1.1-r1.ebuild               | 185 ---------------------
 4 files changed, 380 deletions(-)
Comment 12 Esteve Varela Colominas 2022-11-29 15:40:15 UTC
Created attachment 838065 [details]
backtraces of telegram-desktop-4.3.4 with tg_owt-0_pre20220507

3.6.1-r1 is ancient, so after losing hope of ever seeing this magically fixed, I've attempted to debug this. It's a genuine headache of a bug: it's likely caused by tg_owt, but none of the debugging methods I've attempted help demystify the problem. There's no chance of getting upstream support on this one; after all, it's not reproducible in their builds.

I can't currently reproduce it in my host install, so I set this up in a clean gentoo prefix, roughly as follows:

    USE=pulseaudio emerge -1 media-libs/openal
    FEATURES="splitdebug compressdebug installsources" emerge -1 glibc
    FEATURES="nostrip installsources" CFLAGS="-Og -ggdb -pipe" CXXFLAGS="-Og -ggdb -pipe" emerge -1 tg_owt
    emerge --onlydeps telegram-desktop  # Accept default autounmask-use values
    CFLAGS="-Og -ggdb -pipe" CXXFLAGS="-Og -ggdb -pipe" ebuild <repo>/net-im/telegram-desktop/telegram-desktop-4.3.4.ebuild clean compile

And then with (host! not prefix) gdb:

    set debug-file-directory <prefix>/usr/lib/debug
    file <prefix>/var/tmp/portage/net-im/telegram-desktop/work/tdesktop-4.3.4-full_build/telegram-desktop
    r
    <crash>
    bt

To reproduce this crash, I simply receive a call, and then press "accept". Sometimes it crashes before the call is initialized, sometimes it crashes after pressing the hangup button. More rarely it crashes during the call. After trying this a couple of times, I've captured three different backtraces, attached in traces.tar.gz.

bt1.txt is a "malloc(): invalid size (unsorted)" crash. I have no idea what this means; I have never seen malloc fail like this, and I have plenty of memory on my system.
bt2.txt is more sensible, saying "free(): invalid next size (fast)". It crashes after trying to free a webrtc::PendingTaskSafetyFlag, but diagnosing this backtrace through all the layers of indirection is proving very hard for me.
bt3.txt is a "corrupted size vs. prev_size" crash in the memory allocator, which also happens during a malloc(), like in bt1.txt.

All of these point to some kind of memory corruption, maybe an array is being overflowed that is optimized out in the release build?

So I rebuilt tg_owt, but this time with asan:

    FEATURES="nostrip installsources" CFLAGS="-Og -ggdb -pipe -fsanitize=address" CXXFLAGS="-Og -ggdb -pipe -fsanitize=address" emerge -1 tg_owt

And ran:

    LD_PRELOAD=<prefix>/usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/libasan.so <prefix>/var/tmp/portage/net-im/telegram-desktop/work/tdesktop-4.3.4-full_build/telegram-desktop

This spits out a new-delete-type-mismatch error, included in asan1.txt, regarding the size of webrtc::PendingTaskSafetyFlag between allocation and deallocation. Yet again, I'm failing to understand the issue at hand here, probably because I'm not well acquainted with C++ template semantics, so I have no idea what's going wrong here either.

Putting this up here in hopes maybe someone else can take a gander at the issue, either reproducing these steps or dumping backtraces of their own.
Comment 13 Esteve Varela Colominas 2022-11-29 15:42:06 UTC
Created attachment 838067 [details]
/etc/portage prefix configuration

Putting my gentoo prefix configuration here as well, for full transparency.
Comment 14 Esteve Varela Colominas 2022-11-30 00:04:29 UTC
After enabling -fsanitize=address on both packages, ASAN pointed out that webrtc::MutexImpl::OwnerRecord::OwnerRecord() is writing outside of its allocated bounds. I've stared at this class before in my debugging, but wasn't sure if the problem was there so I skipped it.

Turns out the problem is there. Not in any function (hence my confusion), but in the ABI of the class. The layout of this class is changed by a preprocessor definition, and this definition directly depends on "-DNDEBUG". Compiling the class in tg_owt *with* the flag, and then using the header in telegram-desktop *without* the flag, causes an ABI mismatch, so the class constructor in tg_owt writes past the area that was allocated for it in telegram-desktop. This is why linking it statically didn't fix the issue either.
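
The kind of mismatch described above can be sketched as follows. This is only an illustration under stated assumptions: `MutexLibraryView`, `MutexConsumerView`, their members, and the helper functions are hypothetical and do not reproduce the real webrtc class. In the real case a single class definition changes shape depending on a macro; two separately named structs are used here so that both views can coexist in one program without undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Layout as the library's translation units see the class (hypothetical).
struct MutexLibraryView {
  std::int32_t lock_word = 0;
  std::uint64_t owner_record = 0;  // bookkeeping member present in this view only
};

// Layout as the application sees the same class when it includes the header
// with different flags (hypothetical).
struct MutexConsumerView {
  std::int32_t lock_word = 0;
};

// True only if both sides agree on how much storage the object needs.
constexpr bool LayoutsAgree() {
  return sizeof(MutexLibraryView) == sizeof(MutexConsumerView);
}

// Number of bytes a constructor compiled against the library view would
// write beyond storage that was sized from the consumer view.
constexpr std::size_t OverrunBytes() {
  return sizeof(MutexLibraryView) > sizeof(MutexConsumerView)
             ? sizeof(MutexLibraryView) - sizeof(MutexConsumerView)
             : 0;
}

static_assert(!LayoutsAgree(), "the two views disagree on object size");
```

When the application allocates storage sized from its own view and the library's constructor then initialises every member of its larger view, the extra bytes land in adjacent heap memory, which fits the allocator aborts reported earlier in this bug.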

"-DNDEBUG" is a flag that is set by default in CMake, by building with -DCMAKE_BUILD_TYPE=Release/RelWithDebInfo. However, for some reason, in gentoo's cmake.eclass, this never happens. This has caused issues before, as tg_owt isn't very happy running without -DNDEBUG set (https://bugs.gentoo.org/866055). As a result, this flag is enabled in tg_owt, but I never expected this to cause any issues if I weren't to enable it specifically in telegram-desktop. It's not normal to rely on user-supplied flags to determine the ABI in your header files.
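
As a side note on why this flag matters per package: NDEBUG takes effect separately in every translation unit, so two packages built with different flags can each be self-consistent and still disagree with each other. A minimal sketch, with the hypothetical names `kNdebugDefined` and `AssertIsActive`, for checking what a given build actually sees:

```cpp
#include <cassert>

// Records whether NDEBUG was defined when this translation unit was compiled.
constexpr bool kNdebugDefined =
#ifdef NDEBUG
    true;
#else
    false;
#endif

// Returns true if assert() evaluates its argument in this translation unit.
// With NDEBUG defined, assert() expands to nothing and the flag stays false.
bool AssertIsActive() {
  bool evaluated = false;
  assert((evaluated = true));
  return evaluated;
}
```

Compiling this once with `-DNDEBUG` and once without gives opposite results from `AssertIsActive()`, which is the same switch that the header discussed above keys its class layout on.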

Anyway, I'm glad this is finally solved. I'll do some final testing on my end and open a PR soon. I'd like to get rid of 3.6.1 as soon as possible, as it's ancient, incompatible with many things and likely riddled with bugs too. Maybe someone can help me pull some strings to stabilize this quicker?
Comment 15 Esteve Varela Colominas 2022-11-30 00:06:35 UTC
Created attachment 838377 [details]
overflow reported by -fsanitize=address
Comment 16 Larry the Git Cow gentoo-dev 2022-12-02 22:49:12 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=fe4cfd2ba2efef2336196921b72952c966537e2f

commit fe4cfd2ba2efef2336196921b72952c966537e2f
Author:     Esteve Varela Colominas <esteve.varela@gmail.com>
AuthorDate: 2022-11-30 00:12:54 +0000
Commit:     Georgy Yakovlev <gyakovlev@gentoo.org>
CommitDate: 2022-12-02 22:48:26 +0000

    net-im/telegram-desktop: Fix call issue
    
    Fixes an issue regarding ABI incompatibility, that would cause the
    application to crash during calls.
    
    Closes: https://bugs.gentoo.org/866055
    Thanks-to: Matteo Pacini <m+gentoo@matteopacini.me>
    Signed-off-by: Esteve Varela Colominas <esteve.varela@gmail.com>
    Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>

 ...egram-desktop-4.3.4.ebuild => telegram-desktop-4.3.4-r1.ebuild} | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Additionally, it has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=11bcc0a2885e17d50c089306c3e57dda4907bdb2

commit 11bcc0a2885e17d50c089306c3e57dda4907bdb2
Author:     Esteve Varela Colominas <esteve.varela@gmail.com>
AuthorDate: 2022-11-30 00:18:33 +0000
Commit:     Georgy Yakovlev <gyakovlev@gentoo.org>
CommitDate: 2022-12-02 22:48:27 +0000

    media-libs/tg_owt: Minor comment change
    
    Reflect findings in dependent package
    
    Bug: https://bugs.gentoo.org/866055
    Signed-off-by: Esteve Varela Colominas <esteve.varela@gmail.com>
    Closes: https://github.com/gentoo/gentoo/pull/28478
    Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>

 media-libs/tg_owt/tg_owt-0_pre20220507.ebuild | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=7dc63ffb431a37d2e1ef9ab06efcd83307a719a8

commit 7dc63ffb431a37d2e1ef9ab06efcd83307a719a8
Author:     Esteve Varela Colominas <esteve.varela@gmail.com>
AuthorDate: 2022-11-30 00:16:44 +0000
Commit:     Georgy Yakovlev <gyakovlev@gentoo.org>
CommitDate: 2022-12-02 22:48:27 +0000

    net-im/telegram-desktop: Drop broken
    
    Serious bugfix in new package that happened in this version as well, not
    going to resolve and test it in this version.
    
    Bug: https://bugs.gentoo.org/866055
    Signed-off-by: Esteve Varela Colominas <esteve.varela@gmail.com>
    Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>

 net-im/telegram-desktop/Manifest                   |   1 -
 .../telegram-desktop/telegram-desktop-4.2.4.ebuild | 204 ---------------------
 2 files changed, 205 deletions(-)