Bug 938955 - net-dns/bind-9.18.29-r1 crashes during start when loading Samba DLZ
Status: UNCONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Patrick McLean
Reported: 2024-09-03 06:14 UTC by Krzysztof Olędzki
Modified: 2024-09-08 06:23 UTC
Description Krzysztof Olędzki 2024-09-03 06:14:11 UTC
I successfully upgraded to net-dns/bind-9.18.29 on some of my systems yesterday and noticed today that net-dns/bind-9.18.29-r1 is available.

One of the differences is that jemalloc is now enabled by default.

Sadly, it causes bind to crash during startup with a "free(): invalid pointer" error if Samba DLZ is used to provide support for Active Directory DNS:

dlz "AD DNS Zone" {
     database "dlopen /usr/lib/samba/bind9/dlz_bind9_18.so";
};

"USE=-jemalloc emerge -1v bind" fixes the issue.
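A one-off USE=... on the command line is not remembered by later emerges, so the workaround above could be made persistent with a package.use entry. This is a minimal sketch: disable_bind_jemalloc is a hypothetical helper, and the default path assumes /etc/portage/package.use is a directory (the file name "bind" is arbitrary).

```shell
# Hypothetical helper: append a "-jemalloc" entry for net-dns/bind to a
# package.use file. The default path is an assumption (package.use as a
# directory); pass another path as the first argument if your setup differs.
disable_bind_jemalloc() {
    use_file="${1:-/etc/portage/package.use/bind}"
    printf '%s\n' 'net-dns/bind -jemalloc' >> "$use_file"
}

# Typical use, as root:
#   disable_bind_jemalloc && emerge -1v bind
```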

Note that it *does not* seem to be a Gentoo-specific issue:
 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1074378
 https://bugzilla.redhat.com/show_bug.cgi?id=2278016
 https://bbs.archlinux.org/viewtopic.php?id=295995

I am not sure what the best fix is. One workaround seems to be setting LDB_MODULES_DISABLE_DEEPBIND, but perhaps we could disable the "jemalloc" flag if net-fs/samba is built with "addc"?
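For anyone who wants to try the LDB_MODULES_DISABLE_DEEPBIND workaround, a minimal sketch follows. It assumes ldb only checks whether the variable is present in the environment, so the value 1 is arbitrary; run_without_deepbind is a hypothetical wrapper name, not something shipped by Samba or BIND.

```shell
# Hypothetical wrapper: run a command with LDB_MODULES_DISABLE_DEEPBIND set
# in its environment. Assumption: ldb skips RTLD_DEEPBIND whenever the
# variable is set, whatever its value.
run_without_deepbind() {
    env LDB_MODULES_DISABLE_DEEPBIND=1 "$@"
}

# Example (foreground start for testing):
#   run_without_deepbind named -u named -f -g
```

For a permanent setup the variable would have to be exported by whatever starts named (init script or service unit); where exactly that goes depends on the init system and is not covered here.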
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-09-03 06:15:28 UTC
This is the stupid TLS model crap. I'll take a look.
Comment 2 Krzysztof Olędzki 2024-09-03 06:16:14 UTC
[Note: Cannot add https://bbs.archlinux.org/viewtopic.php?id=295995 as "see also" - "(...) is not a valid URL to a bug. See Also URLs should point to one of:"]
Comment 3 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-09-03 06:20:45 UTC
I think it's worse than the TLS model issue.
Comment 4 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-09-03 06:27:25 UTC
Florian (forgive the CC, I just know you've been working on the area a fair bit recently): do you know what Fedora plans to do to handle this?

I was somewhat under the impression that the situation had improved with external mallocs (up until now, I'd been aggressively disabling it or making it optional).
Comment 5 Larry the Git Cow gentoo-dev 2024-09-03 06:43:45 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=a943ac07947e6c6f48ddad61b7b5395fb1d65e68

commit a943ac07947e6c6f48ddad61b7b5395fb1d65e68
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2024-09-03 06:42:25 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2024-09-03 06:42:25 +0000

    profiles/base: mask net-dns/bind[jemalloc]
    
    Bug: https://bugs.gentoo.org/938955
    Signed-off-by: Sam James <sam@gentoo.org>

 profiles/base/package.use.mask | 4 ++++
 1 file changed, 4 insertions(+)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=6d9bd99c395afe5e409d55eb3375f1d9b7c4f57c

commit 6d9bd99c395afe5e409d55eb3375f1d9b7c4f57c
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2024-09-03 06:40:39 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2024-09-03 06:41:28 +0000

    net-dns/bind: don't enable jemalloc by default
    
    It was enabled by default in c5a1a089f3e4ce5b4203fb2b0b218a42c3f090bd but
    we get crashes with dlz + samba.
    
    A reliable BIND is more important than anything else.
    
    Bug: https://bugs.gentoo.org/938955
    Signed-off-by: Sam James <sam@gentoo.org>

 net-dns/bind/bind-9.18.29-r2.ebuild | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 6 Florian Weimer 2024-09-03 09:32:25 UTC
Could we get LD_DEBUG=all from the failing process?

The local search scope for the plugin should have jemalloc for libc, but that doesn't seem to happen for some reason.
Comment 7 Krzysztof Olędzki 2024-09-04 04:29:36 UTC
# LD_DEBUG=all named -u named -f -g -d 5 2>&1 | tee /tmp/named.log

(...)
     25584:     binding file /usr/lib64/libtevent.so.0 [0] to /usr/lib64/libc.so.6 [0]: normal symbol `close' [GLIBC_2.2.5]
     25584:     symbol=malloc;  lookup in file=named [0]
     25584:     symbol=malloc;  lookup in file=/usr/lib64/libjemalloc.so.2 [0]
     25584:     binding file /usr/lib64/libtdb.so.1 [0] to /usr/lib64/libjemalloc.so.2 [0]: normal symbol `malloc' [GLIBC_2.2.5]
free(): invalid pointer

The file is 146MB in size:

# wc /tmp/named.log
  1460043   9139496 154007322 /tmp/named.log

It compresses with xz -9 to 4.7MB, which is still above the 1MB attachment size limit on bugs.gentoo.org. Should I mail it, split it into 1MB parts, or attach it in some other way?
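One possible way to shrink such a log before attaching it is to keep only the allocator-related lookups and bindings. This is a sketch under assumptions: the symbol list (malloc, calloc, realloc, free) is a guess at what is relevant and was not requested in the thread, and filter_alloc_bindings is a hypothetical helper. The full log may still be needed for a complete picture of the search scopes.

```shell
# Hypothetical helper: read LD_DEBUG output on stdin and keep only the
# "symbol=..." lookup lines and "binding file ..." lines that refer to
# malloc, calloc, realloc, or free. All other lines are dropped.
filter_alloc_bindings() {
    grep -E "symbol=(malloc|calloc|realloc|free);|normal symbol \`(malloc|calloc|realloc|free)'"
}

# Example:
#   filter_alloc_bindings < /tmp/named.log | xz -9 > /tmp/named-alloc.log.xz
```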
Comment 8 Florian Weimer 2024-09-04 05:19:10 UTC
Never mind, this is a Samba bug:

#ifdef RTLD_DEEPBIND
        /*
         * use deepbind if possible, to avoid issues with different
         * system library variants, for example ldb modules may be linked
         * against Heimdal while the application may use MIT kerberos.
         *
         * See the dlopen manpage for details.
         *
         * One typical user is the bind_dlz module of Samba,
         * but symbol versioning might be enough...
         *
         * We need a way to disable this in order to allow the
         * ldb_*ldap modules to work with a preloaded socket wrapper.
         *
         * So in future we may remove this completely
         * or at least invert the default behavior.
        */
        if (deepbind_enabled) {
                dlopen_flags |= RTLD_DEEPBIND;
        }
#endif

RTLD_DEEPBIND should not be used.
Comment 9 Krzysztof Olędzki 2024-09-04 21:25:17 UTC
Thank you Florian!

Sadly, https://bugzilla.samba.org/show_bug.cgi?id=15643 is marked as "RESOLVED INVALID" with the comment "Closing this, the isc-bind code has been fixed (on Debian at least)".
Comment 10 Kyle Elbert 2024-09-08 06:23:10 UTC
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1074378#96 has a workaround patch from one of bind's developers.