Bug 724438 - sys-cluster/ceph-15.2.2 mgr locks up after upgrade from Nautilus
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal major
Assignee: Patrick McLean
Reported: 2020-05-21 10:30 UTC by Konstantin Agouros
Modified: 2020-05-28 19:09 UTC

Attachments
output of emerge --info (info.out,5.65 KB, text/plain)
2020-05-21 10:30 UTC, Konstantin Agouros

Description Konstantin Agouros 2020-05-21 10:30:51 UTC
Created attachment 640732
output of emerge --info

After upgrading from Nautilus to Octopus (15.2.2), the ceph-mgr process locks up.
After startup, strace shows an endless loop like this:

 futex(0x7f35cde700e0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f35cde700d8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1590056968, tv_nsec=655003000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f35cde700e0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f35cde700d8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1590056968, tv_nsec=660071000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f35cde700e0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f35cde700d8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1590056968, tv_nsec=665140000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f35cde700e0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f35cde700d8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1590056968, tv_nsec=670210000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f35cde700e0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f35cde700d8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1590056968, tv_nsec=675278000}, FUTEX_BITSET_MATCH_ANY^Cstrace: Process 204438 detached
 <detached ...>
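
The spacing of the timed futex waits in the strace output above can be measured with a short script. This is only a sketch for reading the trace: the helper names are made up for illustration, and the sample contains three of the wait lines copied from the output. The deadlines turn out to be about 5 ms apart, which happens to match CPython's default thread switch interval (`sys.getswitchinterval()` returns 0.005); that is an observation, not a diagnosis.

```python
import re

# Three of the timed futex waits, copied from the strace output above.
STRACE_SAMPLE = """\
futex(0x7f35cde700d8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1590056968, tv_nsec=655003000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f35cde700d8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1590056968, tv_nsec=660071000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f35cde700d8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1590056968, tv_nsec=665140000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
"""

# strace prints the absolute deadline as {tv_sec=..., tv_nsec=...}.
DEADLINE = re.compile(r"\{tv_sec=(\d+), tv_nsec=(\d+)\}")


def futex_deadlines_ns(text):
    """Return the absolute deadline of every timed futex wait, in nanoseconds."""
    return [int(sec) * 1_000_000_000 + int(nsec)
            for sec, nsec in DEADLINE.findall(text)]


def deadline_gaps_ms(text):
    """Return the gap between consecutive deadlines, in milliseconds."""
    deadlines = futex_deadlines_ns(text)
    return [(b - a) / 1e6 for a, b in zip(deadlines, deadlines[1:])]


if __name__ == "__main__":
    print(deadline_gaps_ms(STRACE_SAMPLE))
```

A steady gap like this suggests a thread that wakes on a fixed timeout and goes straight back to waiting, rather than one that is doing useful work.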

The mgr log file shows only:

2020-05-21T09:35:19.490+0200 7f35ccf75f00  0 ceph version 15.2.2 (0c857e985a29d90501a285f242ea9c008df49eb8) octopus (stable), process ceph-mgr, pid 204438
2020-05-21T09:35:19.509+0200 7f35ccf75f00  1 mgr[py] Loading python module 'alerts'
2020-05-21T09:35:19.606+0200 7f35ccf75f00  1 mgr[py] Loading python module 'balancer'
2020-05-21T09:35:19.679+0200 7f35ccf75f00  1 mgr[py] Loading python module 'cephadm'
2020-05-21T09:35:19.878+0200 7f35ccf75f00  1 mgr[py] Loading python module 'crash'
2020-05-21T09:35:19.952+0200 7f35ccf75f00  1 mgr[py] Loading python module 'dashboard'
2020-05-21T09:35:20.805+0200 7f35ccf75f00  1 mgr[py] Loading python module 'devicehealth'
2020-05-21T09:35:20.898+0200 7f35ccf75f00  1 mgr[py] Loading python module 'diskprediction_cloud'
2020-05-21T09:35:21.004+0200 7f35ccf75f00  1 mgr[py] Loading python module 'diskprediction_local'
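
Since the log simply stops, the last "Loading python module" line is the main clue about where startup stalled. A minimal sketch for pulling that name out of a log excerpt follows; the function name is hypothetical and the sample is the tail of the log above.

```python
import re

# Last three lines of the mgr log shown above.
MGR_LOG_TAIL = """\
2020-05-21T09:35:20.805+0200 7f35ccf75f00  1 mgr[py] Loading python module 'devicehealth'
2020-05-21T09:35:20.898+0200 7f35ccf75f00  1 mgr[py] Loading python module 'diskprediction_cloud'
2020-05-21T09:35:21.004+0200 7f35ccf75f00  1 mgr[py] Loading python module 'diskprediction_local'
"""

LOADING = re.compile(r"mgr\[py\] Loading python module '([^']+)'")


def last_module_loading(log_text):
    """Return the module named in the final 'Loading python module' line, or None."""
    names = LOADING.findall(log_text)
    return names[-1] if names else None


if __name__ == "__main__":
    print(last_module_loading(MGR_LOG_TAIL))  # diskprediction_local
```

Here the last module named is diskprediction_local, so the stall most likely happens while that module is being loaded.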

ceph health reports that no mgr is running.
Comment 1 Patrick McLean gentoo-dev 2020-05-28 18:05:00 UTC
I would guess that this is the issue: 
https://tracker.ceph.com/issues/43447

I guess I can add a diskprediction USE flag that forces <scipy-1.4, and removes diskprediction_local if it's off.
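
To see whether a given scipy version falls under the proposed <scipy-1.4 constraint, a rough check could look like the sketch below. The function names are hypothetical, and the parsing is a simplification that only handles plain dotted release numbers; it does not implement Portage's version comparison rules.

```python
def parse_version(version):
    """Turn '1.4.1' into (1, 4, 1); stop at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def satisfies_scipy_constraint(version):
    """True if the version is below 1.4, i.e. would match '<scipy-1.4'."""
    return parse_version(version) < (1, 4)


if __name__ == "__main__":
    for candidate in ("1.3.3", "1.4", "1.4.1"):
        print(candidate, satisfies_scipy_constraint(candidate))
```

On a system with Python 3.8 or later, the installed version string could be obtained with `importlib.metadata.version("scipy")` and passed to `satisfies_scipy_constraint`.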
Comment 2 Konstantin Agouros 2020-05-28 18:57:37 UTC
AFAIK scipy was pulled in by this update.
Comment 3 Larry the Git Cow gentoo-dev 2020-05-28 19:09:32 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=c8077590d4fd447e07508c7e511e9190920f0ca8

commit c8077590d4fd447e07508c7e511e9190920f0ca8
Author:     Patrick McLean <patrick.mclean@sony.com>
AuthorDate: 2020-05-28 19:08:53 +0000
Commit:     Patrick McLean <chutzpah@gentoo.org>
CommitDate: 2020-05-28 19:09:22 +0000

    sys-cluster/ceph-15.2.2-r1: Revbump, fix bugs #724508 and #724438
    
    Adds a "diskprediction" USE flag to enable diskprediction_local since it
    forces an old scipy (forcing off python3_8) (bug #724438)
    
    Fixes up the systemd unit and adds a tmpfiles entry (bug #724508)
    
    Closes: https://bugs.gentoo.org/724508
    Closes: https://bugs.gentoo.org/724438
    Copyright: Sony Interactive Entertainment Inc.
    Package-Manager: Portage-2.3.100, Repoman-2.3.22
    Signed-off-by: Patrick McLean <chutzpah@gentoo.org>

 .../{ceph-15.2.2.ebuild => ceph-15.2.2-r1.ebuild}  | 30 +++++++++++++++-------
 .../ceph/files/ceph-15.2.2-systemd-unit.patch      | 12 +++++++++
 sys-cluster/ceph/files/ceph-tmpfilesd              |  1 +
 sys-cluster/ceph/metadata.xml                      |  1 +
 4 files changed, 35 insertions(+), 9 deletions(-)