Bug 788625 - x11-apps/igt-gpu-tools-1.25: segfault while dynamic loading with -Wl,-z,now or LD_BIND_NOW=1 (IFUNC refers to not yet initialized GOT)
Summary: x11-apps/igt-gpu-tools-1.25: segfault while dynamic loading with -Wl,-z,now o...
Status: CONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Gentoo X packagers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-05-06 18:05 UTC by Nekun
Modified: 2021-05-16 06:10 UTC
CC List: 3 users

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info (emerge-info,16.52 KB, text/plain)
2021-05-06 18:05 UTC, Nekun
Details
build log (buildlog,459.93 KB, text/plain)
2021-05-06 18:06 UTC, Nekun
Details
installed deps (deps,1.62 KB, text/plain)
2021-05-06 18:06 UTC, Nekun
Details
LD_DEBUG=all /usr/lib64/ld-linux.so.2 /usr/bin/intel_vbt_decode (lddebug.xz,301.16 KB, application/x-xz)
2021-05-06 18:07 UTC, Nekun
Details
igt-gpu-tools-1.25-avoid-plt.patch (igt-gpu-tools-1.25-avoid-plt.patch,476 bytes, patch)
2021-05-10 20:41 UTC, Sergei Trofimovich (RETIRED)
Details | Diff
Force lazy binding (always-lazybind.patch,868 bytes, patch)
2021-05-16 06:10 UTC, Nekun
Details | Diff

Description Nekun 2021-05-06 18:05:44 UTC
Created attachment 706389 [details]
emerge --info

Starting program: /lib64/ld-linux-x86-64.so.2 /usr/bin/intel_vbt_decode 

Program received signal SIGSEGV, Segmentation fault.
0x0000000000019ce6 in ?? ()
(gdb) bt
#0  0x0000000000019ce6 in ?? ()
#1  0x00007ffff7f2fee9 in ?? ()
#2  0x0000000000000009 in ?? ()
#3  0x00007ffff7fdee91 in elf_machine_rela (skip_ifunc=<optimized out>, reloc_addr_arg=<optimized out>, version=<optimized out>, sym=<optimized out>, reloc=0x7ffff7f1d638, map=0x7ffff7f96000) at ../sysdeps/x86_64/dl-machine.h:330
#4  elf_dynamic_do_Rela (skip_ifunc=<optimized out>, lazy=<optimized out>, nrelative=<optimized out>, relsize=<optimized out>, reladdr=<optimized out>, map=0x7ffff7f96000) at do-rel.h:137
#5  _dl_relocate_object (l=l@entry=0x7ffff7f96000, scope=<optimized out>, reloc_mode=<optimized out>, consider_profiling=<optimized out>, consider_profiling@entry=0) at dl-reloc.c:274
#6  0x00007ffff7fd6491 in dl_main (phdr=<optimized out>, phdr@entry=0x7ffff7fd2040, phnum=<optimized out>, phnum@entry=8, user_entry=user_entry@entry=0x7fffffffdc70, auxv=<optimized out>) at rtld.c:2345
#7  0x00007ffff7fec072 in _dl_sysdep_start (start_argptr=start_argptr@entry=0x7fffffffdd30, dl_main=dl_main@entry=0x7ffff7fd44e0 <dl_main>) at ../elf/dl-sysdep.c:252
#8  0x00007ffff7fd4049 in _dl_start_final (arg=0x7fffffffdd30) at rtld.c:506
#9  _dl_start (arg=0x7fffffffdd30) at rtld.c:599
#10 0x00007ffff7fd3058 in _start () from /lib64/ld-linux-x86-64.so.2

Last lines of the ld-linux debug output (see the full log in the attachment):
     30038:	symbol=igt_half_to_float;  lookup in file=/usr/bin/intel_vbt_decode [0]
     30038:	symbol=igt_half_to_float;  lookup in file=/usr/lib64/libigt.so.0 [0]
     30038:	binding file /usr/lib64/libigt.so.0 [0] to /usr/lib64/libigt.so.0 [0]: normal symbol `igt_half_to_float'


Also, it may be important to note that I use a hardened profile; I suppose the crash might be caused by its toolchain settings.
Comment 1 Nekun 2021-05-06 18:06:12 UTC
Created attachment 706392 [details]
build log
Comment 2 Nekun 2021-05-06 18:06:25 UTC
Created attachment 706395 [details]
installed deps
Comment 3 Nekun 2021-05-06 18:07:15 UTC
Created attachment 706398 [details]
LD_DEBUG=all /usr/lib64/ld-linux.so.2 /usr/bin/intel_vbt_decode
Comment 4 Nekun 2021-05-06 18:09:19 UTC
Comment on attachment 706395 [details]
installed deps

x11-apps/igt-gpu-tools-1.25:
 [  0]  x11-apps/igt-gpu-tools-1.25   
 [  1]  dev-libs/elfutils-0.183   
 [  1]  dev-libs/glib-2.66.7   
 [  1]  sys-apps/kmod-28   
 [  1]  sys-libs/libunwind-1.5.0-r1   
 [  1]  sys-libs/zlib-1.2.11-r3   
 [  1]  sys-process/procps-3.3.17   
 [  1]  virtual/libudev-232-r3   
 [  1]  x11-libs/cairo-1.16.0-r4   
 [  1]  x11-libs/libdrm-2.4.104   
 [  1]  x11-libs/libpciaccess-0.16   
 [  1]  x11-libs/pixman-0.40.0   
 [  1]  dev-libs/xmlrpc-c-1.51.06-r2   
 [  1]  sci-libs/gsl-2.5-r1   
 [  1]  media-libs/alsa-lib-1.2.3.2-r1   
 [  1]  x11-libs/libXrandr-1.5.2   
 [  1]  x11-libs/libX11-1.7.0   
 [  1]  x11-libs/libXext-1.3.4   
 [  1]  x11-libs/libXv-1.0.11-r2   
 [  1]  dev-libs/json-c-0.15   
 [  1]  dev-util/valgrind-3.16.1   
 [  1]  dev-util/gtk-doc-1.33.1-r4   
 [  1]  dev-python/docutils-0.16-r1   
 [  1]  dev-util/peg-0.1.18   
 [  1]  x11-base/xorg-proto-2020.1   
 [  1]  sys-devel/bison-3.7.3   
 [  1]  sys-devel/flex-2.6.4-r1   
 [  1]  dev-util/meson-0.56.2   
 [  1]  dev-util/ninja-1.10.1   
 [  1]  dev-util/meson-format-array-0
Comment 5 Matt Turner gentoo-dev 2021-05-10 16:08:11 UTC
hardened@: re-cc x11@ when you have something.
Comment 6 Sam James archtester gentoo-dev Security 2021-05-10 16:11:15 UTC
CCing toolchain@ as hardened is quite quiet at the moment.
Comment 7 Sergei Trofimovich (RETIRED) gentoo-dev 2021-05-10 17:17:05 UTC
Before we dig into why relocations are pointing to invalid memory for you, one question:

Is there any specific reason you are using an old gcc and old binutils? Stable gcc is 10, stable binutils is 2.35.

If you tarball all the binaries seen in the LD_DEBUG=all output (including /lib64/ld-linux-x86-64.so.2 and all .so files), I'll try to reproduce the crash locally and work out where the failure comes from.
Comment 8 Sergei Trofimovich (RETIRED) gentoo-dev 2021-05-10 18:12:15 UTC
LD_BIND_NOW=1 (or probably -Wl,-z,now) should be enough to trigger the failure on a vanilla system:

$ /lib64/ld-linux-x86-64.so.2 --library-path ../lib64/ ./intel_vbt_decode
usage: ./intel_vbt_decode --file=<rom_file> [--devid=<device_id>] [--panel-type=<panel_type>] [--all-panels] [--hexdump] [--block=<block_no>] [--header] [--describe] [--help]
$ LD_BIND_NOW=1 /lib64/ld-linux-x86-64.so.2 --library-path ../lib64/ ./intel_vbt_decode
Segmentation fault (core dumped)
Comment 9 Sergei Trofimovich (RETIRED) gentoo-dev 2021-05-10 19:10:06 UTC
`igt_half_to_float` is an ifunc which:

$ nm -D /usr/lib64/libigt.so.0 | fgrep to_flo
0000000000027cb0 i igt_half_to_float

means its resolver has a chance to be called before the rest of the relocations are set up, in order to pick the implementation of `igt_half_to_float`.

Ideally, IFUNC resolvers should not rely on any external relocations to resolve their symbols. Lazy binding probably makes it work by chance.

Resolver is called at https://github.com/freedesktop/xorg-intel-gpu-tools/blob/master/lib/igt_halffloat.c#L205

void igt_float_to_half(const float *f, uint16_t *h, unsigned int num)
	__attribute__((ifunc("resolve_float_to_half")));

static void (*resolve_float_to_half(void))(const float *f, uint16_t *h, unsigned int num)
{
	if (igt_x86_features() & F16C)
		return float_to_half_f16c;

	return float_to_half;
}

The igt_x86_features() is also simple: https://github.com/freedesktop/xorg-intel-gpu-tools/blob/master/lib/igt_x86.c#L105

unsigned igt_x86_features(void)
{
	unsigned max = __get_cpuid_max(BASIC_CPUID, 0);
	unsigned eax, ebx, ecx, edx;
	unsigned features = 0;
	unsigned extra = 0;

	if (max >= 1) {
		__cpuid(1, eax, ebx, ecx, edx);
...


`igt_x86_features()` is called via the PLT (I think that's the smoking gun, and an easy one to fix):

(gdb) disassemble resolve_half_to_float
Dump of assembler code for function resolve_half_to_float:
   0x0000000000027cb0 <+0>:	sub    $0x8,%rsp
   0x0000000000027cb4 <+4>:	call   0x19cf0 <igt_x86_features@plt>
   0x0000000000027cb9 <+9>:	lea    -0x70(%rip),%rdx        # 0x27c50 <half_to_float_f16c>
   0x0000000000027cc0 <+16>:	test   $0x2,%ah
   0x0000000000027cc3 <+19>:	lea    -0x33a(%rip),%rax        # 0x27990 <half_to_float>
   0x0000000000027cca <+26>:	cmovne %rdx,%rax
   0x0000000000027cce <+30>:	add    $0x8,%rsp
   0x0000000000027cd2 <+34>:	ret
End of assembler dump.

(gdb) disassemble 0x19cf0, 0x19cf0+10
Dump of assembler code from 0x19cf0 to 0x19cfa:
   0x0000000000019cf0 <igt_x86_features@plt+0>:	jmp    *0x70182(%rip)        # 0x89e78 <igt_x86_features@got.plt>
   0x0000000000019cf6 <igt_x86_features@plt+6>:	push   $0x1cc

(gdb) x/1a 0x89e78
0x89e78 <igt_x86_features@got.plt>:	0x19cf6 <igt_x86_features@plt+6>

And 0x19cf6 is the not-yet-relocated GOT value, which is exactly the address strace reports for the fault:

$ LD_BIND_NOW=1 strace /usr/bin/intel_vbt_decode |& fgrep SEGV
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x19cf6} ---
+++ killed by SIGSEGV (core dumped) +++
Comment 10 Sergei Trofimovich (RETIRED) gentoo-dev 2021-05-10 20:41:56 UTC
Created attachment 706845 [details, diff]
igt-gpu-tools-1.25-avoid-plt.patch

igt-gpu-tools-1.25-avoid-plt.patch is a hack to illustrate the point. It should be enough to make things work on amd64.

I don't expect it to be enough to keep the tests or x86 working. Ideally we need to inline the body of `igt_x86_features()` into the IFUNC resolver.
Comment 11 Sergei Trofimovich (RETIRED) gentoo-dev 2021-05-10 20:43:04 UTC
Back to x11@.
Comment 12 Nekun 2021-05-16 06:09:00 UTC
As a (temporary?) workaround, I suggest adding "-Wl,-z,lazy" to LDFLAGS.
Comment 13 Nekun 2021-05-16 06:10:21 UTC
Created attachment 708861 [details, diff]
Force lazy binding