Bug 765463 - net-fs/openafs: RX communication failure between servers and clients
Summary: net-fs/openafs: RX communication failure between servers and clients
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All All
Importance: Normal critical with 1 vote
Assignee: Adam Feldman
URL: https://lists.openafs.org/pipermail/o...
Whiteboard:
Keywords:
Depends on: 765574
Blocks:
 
Reported: 2021-01-14 19:38 UTC by Cheyenne Wills
Modified: 2021-01-20 11:16 UTC (History)

See Also:
Package list:
Runtime testing required: ---


Attachments

Description Cheyenne Wills 2021-01-14 19:38:08 UTC
RX communication error due to an epoch rollover.

This problem is in all releases of 1.8 and affects both client and server communications.

Patches have been made available upstream for the master branch and are currently being ported back to the 1.8.x branch. 

A new upstream release 1.8.7 will contain the three patches that resolve this issue.

The 1.8.x branch gerrits are:

14493	rx: rx_InitHost do not overwrite RAND_bytes rx_nextCid
14494   rx: update_nextCid overflow handling is broken

A "cleanup" commit is currently being worked on.  This is not necessary to fix the problem, but it appears that it will be included with the forthcoming 1.8.7.

master gerrit: 14496 Remove overflow check from update_nextCid

Reproducible: Always




https://gerrit.openafs.org/14493
https://gerrit.openafs.org/14494
https://gerrit.openafs.org/14496
Comment 1 Cheyenne Wills 2021-01-14 22:46:31 UTC
The patches have been merged and tagged upstream as openafs-stable-1_8_7
Comment 2 Cheyenne Wills 2021-01-15 03:21:52 UTC
https://www.openafs.org/release/openafs-1.8.7.html
Comment 3 Cheyenne Wills 2021-01-15 11:18:11 UTC
Note that the delta between 1.8.6 and 1.8.7 is just the patches for this problem.  Also note that the three patches can easily be applied to any of the 1.8 levels, so it should be possible to add them on top of the existing 1.8.6 package.
Comment 4 Volkmar Glauche 2021-01-15 11:19:57 UTC
I can confirm that the existing ebuild for openafs 1.8.6 works to build openafs 1.8.7 (just renamed the ebuild without changing anything).
Comment 5 Cheyenne Wills 2021-01-15 15:55:42 UTC
This problem impacts all 1.8 RX communications (e.g. communication between the database servers, communication between the file servers and the database servers, and communication from the clients).

The problem occurs when a connection is initiated, which means that existing connections are not immediately impacted.

When upgrading, the recommendation is to update the database servers first, then the file servers, then the clients.
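
The recommended order can be sketched as a simple sort over an inventory of hosts. This is only an illustration: the host names and the `upgrade_plan` helper are hypothetical, and only the role ordering comes from the comment above.

```python
# Roles in the order recommended above: database servers, file servers, clients.
UPGRADE_ORDER = ["database server", "file server", "client"]


def upgrade_plan(hosts):
    """Sort (hostname, role) pairs into the recommended upgrade order.

    sorted() is stable, so hosts with the same role keep their input order.
    """
    rank = {role: index for index, role in enumerate(UPGRADE_ORDER)}
    return sorted(hosts, key=lambda host: rank[host[1]])


# Hypothetical inventory, listed in no particular order.
hosts = [
    ("ws1", "client"),
    ("fs1", "file server"),
    ("db1", "database server"),
    ("fs2", "file server"),
]

plan = upgrade_plan(hosts)
for name, role in plan:
    print(f"{name}: {role}")
```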
Comment 6 Larry the Git Cow gentoo-dev 2021-01-15 21:37:59 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=bf5eac4aac7fa7aa1a81598a9b4468e498355b0d

commit bf5eac4aac7fa7aa1a81598a9b4468e498355b0d
Author:     Andrew Savchenko <bircoph@gentoo.org>
AuthorDate: 2021-01-15 21:32:17 +0000
Commit:     Andrew Savchenko <bircoph@gentoo.org>
CommitDate: 2021-01-15 21:37:42 +0000

    net-fs/openafs: 1.8.7 version bump
    
    This update fixes critical bug in the generation of Rx connection IDs
    that prevent Rx clients started after 14 Jan 2021 08:25:36 AM UTC
    from being able to successfully make connections.
    
    Bug: https://bugs.gentoo.org/765463
    Package-Manager: Portage-3.0.12, Repoman-3.0.2
    Signed-off-by: Andrew Savchenko <bircoph@gentoo.org>

 net-fs/openafs/Manifest             |   3 +
 net-fs/openafs/openafs-1.8.7.ebuild | 343 ++++++++++++++++++++++++++++++++++++
 2 files changed, 346 insertions(+)
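
The cutoff time in the commit message can be checked with a little arithmetic. The sketch below converts it to Unix time and shows that it is exactly 0x60000000. It then assumes, as a simplification not stated in this report, that the connection ID seed is the start time shifted left by 2 bits and stored in 32 bits; under that assumption the seed's top bit first becomes set at this cutoff, which is consistent with the "epoch rollover" description. The names `ASSUMED_CID_SHIFT` and `seed_high_bit_set` are hypothetical.

```python
from datetime import datetime, timezone

# Cutoff quoted in the commit message above.
cutoff = datetime(2021, 1, 14, 8, 25, 36, tzinfo=timezone.utc)
cutoff_secs = int(cutoff.timestamp())

# ASSUMPTION: shift width used when deriving the seed; not stated in this report.
ASSUMED_CID_SHIFT = 2
INT32_MAX = 2**31 - 1


def seed_high_bit_set(start_secs, shift=ASSUMED_CID_SHIFT):
    """Return True if (start_secs << shift), truncated to 32 bits, exceeds INT32_MAX."""
    truncated = (start_secs << shift) & 0xFFFFFFFF
    return truncated > INT32_MAX


print(hex(cutoff_secs))
print(seed_high_bit_set(cutoff_secs - 1), seed_high_bit_set(cutoff_secs))
```

Under these assumptions, a process started one second before the cutoff gets a seed that still fits in a signed 32-bit integer, while one started at the cutoff does not.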
Comment 7 Andrew Savchenko gentoo-dev 2021-01-15 21:47:03 UTC
(In reply to Cheyenne Wills from comment #3)
> Note that the delta between 1.8.6 and 1.8.7 is just the the patches for this
> problem.  Also note that the three patches can easily be applied to any of
> the 1.8 levels, so it should be possible to add them on top of to the
> existing 1.8.6 package.

It's not that simple: in Gentoo we apply the openafs-stable-1_8_x branch on top of the release tarball. The main reason for this is the latest possible Linux kernel support (e.g. 1.8.7 supports 5.7 and the aforementioned branch supports 5.9), because Gentoo users tend to use the latest kernels. Aside from that, it contains quite a lot of bugfixes, which are also usually useful. There is quite a delta in openafs-stable-1_8_x since the previous snapshot in Gentoo.

(In reply to Volkmar Glauche from comment #4)
> I can confirm that the existing ebuild for openafs 1.8.6 works to build
> openafs 1.8.7 (just renamed the ebuild without changing anything).

I recommend using 1.8.7 from the tree, since it contains the latest snapshot of the openafs-stable-1_8_x branch and has many more bugfixes than the 1.8.7 you built as described.
Comment 8 Cheyenne Wills 2021-01-16 01:14:24 UTC
(In reply to Andrew Savchenko from comment #7)
> (In reply to Cheyenne Wills from comment #3)
> > Note that the delta between 1.8.6 and 1.8.7 is just the the patches for this
> > problem.  Also note that the three patches can easily be applied to any of
> > the 1.8 levels, so it should be possible to add them on top of to the
> > existing 1.8.6 package.
> 
> It's not that simple: in Gentoo we apply openafs-stable-1_8_x branch on top
> for the release tarball. The main reason for this is the latest possible
> Linux kernel support (e.g. 1.8.7 supports 5.7 and aforementioned branch
> supports 5.9), because Gentoo users tend to use latest kernels. Aside from
> that in contains quite a lot of bugfixes which are also usually useful.
> There is quite a delta in 
> openafs-stable-1_8_x since the previous snapshot in Gentoo.
> 


That's fine.  I was referring to the delta between the upstream tags openafs-stable-1_8_6 and openafs-stable-1_8_7, which is just the fix for this.

I would be a little cautious of using the tip of openafs-stable-1_8_x.  You might run into a problem with a partial solution as we (upstream) work through to the next release.  The tip of 1.8.x should always be usable, but there is at times a pending wish list of commits that are being pulled in from the master branch, so just grabbing the tip might mean getting just part of a larger update.

We (upstream) are working on the next stable 1.8 release.  This bug sidetracked the 1.8.7 work with the fix (the patches were pulled into 1.8.x following the same steps as for a security patch), so what was going to be 1.8.7 will now be 1.8.8.

One of the items pending is a fix for a build error with Linux 5.11-rc1.  I'm the one working on that patch, and I'm in the process of testing it.  (It's a simple patch, but testing it is a little more involved).
Comment 9 Andrew Savchenko gentoo-dev 2021-01-16 09:53:09 UTC
(In reply to Cheyenne Wills from comment #8)
> I would be a little cautious of using the tip of openafs-stable-1_8_x.  You
> might run into a problem with a partial solution as we (upstream) work
> through to the next release.  The tip of 1.8.x should always be usable, but
> there is at times a pending wish list of commits that are being pulled in
> from the master branch, so just grabbing the tip might mean getting just
> part of a larger update.

That's the risk we have to take. After all we have testing to filter that out before stable.
 
> We (upstream) are working on the next stable 1.8 release.  This bug
> sidetracked the 1.8.7 work with the fix (the way the patches were pulled
> into 1.8.x is the same steps as pulling in a security patch), so what was
> going to be 1.8.7 will now be 1.8.8.

Yes, this is understandable.

> One of the items pending is a fix for a build error with Linux 5.11-rc1. 
> I'm the one working on that patch, and I'm in the process of testing it. 
> (It's a simple patch, but testing it is a little more involved).

By the way, does openafs-stable-1_8_x work fine with kernel 5.10? From commit history I see 5.9 support, so I set this up as a limit and all >=5.10 users will be warned during build about unsupported kernel. If OpenAFS is known to work fine on 5.10, I can bump the limit.
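
A version limit of this kind can be sketched as a comparison of (major, minor) tuples. This is not the ebuild's actual mechanism; `parse_kernel_version`, `is_unsupported` and `MAX_SUPPORTED` are hypothetical names, and only the 5.9 limit comes from the comment above.

```python
# Highest kernel series treated as supported, per the comment above.
MAX_SUPPORTED = (5, 9)


def parse_kernel_version(release):
    """Return (major, minor) from a release string such as '5.10.7-gentoo'."""
    major, minor = release.split(".")[:2]
    # Keep only the leading digits of the minor part, so '11-rc1' becomes 11.
    digits = ""
    for ch in minor:
        if not ch.isdigit():
            break
        digits += ch
    return int(major), int(digits)


def is_unsupported(release, limit=MAX_SUPPORTED):
    """Return True if the kernel series is newer than the supported limit."""
    return parse_kernel_version(release) > limit


for release in ("5.9.16", "5.10.7-gentoo", "5.11-rc1"):
    status = "warn: unsupported kernel" if is_unsupported(release) else "ok"
    print(f"{release}: {status}")
```

Bumping the limit then amounts to changing one tuple, for example passing `limit=(5, 10)`.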
Comment 10 Cheyenne Wills 2021-01-16 19:46:29 UTC
(In reply to Andrew Savchenko from comment #9)
> (In reply to Cheyenne Wills from comment #8)
> > I would be a little cautious of using the tip of openafs-stable-1_8_x.  You
> > might run into a problem with a partial solution as we (upstream) work
> > through to the next release.  The tip of 1.8.x should always be usable, but
> > there is at times a pending wish list of commits that are being pulled in
> > from the master branch, so just grabbing the tip might mean getting just
> > part of a larger update.
> 
> That's the risk we have to take. After all we have testing to filter that
> out before stable.
>  

No problem, just as long as you are aware.  Watch the gerrit traffic
on gerrit.openafs.org for the openafs-stable-1_8_x branch.


Also be aware that there will be a 1.9.x at some point, that is going to be
a development branch to flesh out some new features that will be going into
the next general release of openafs (so expect some churn in the new features.)
In the meantime 1.8.x will continue to contain the current general release.

 
> By the way, does openafs-stable-1_8_x work fine with kernel 5.10? From
> commit history I see 5.9 support, so I set this up as a limit and all >=5.10
> users will be warned during build about unsupported kernel. If OpenAFS is
> known to work fine on 5.10, I can bump the limit.

OpenAFS 1.8.x should be fine on a Linux 5.10 kernel.  There were no changes needed
to build on 5.10.

Building against 5.11 so far has one hard build error.  Watch the master
branch on gerrit for the fix.
Comment 11 Andrew Savchenko gentoo-dev 2021-01-16 20:43:13 UTC
(In reply to Cheyenne Wills from comment #10)
> Also be aware that there will be a 1.9.x at some point, that is going to be
> a development branch to flesh out some new features that will be going into
> the next general release of openafs (so expect some churn in the new
> features.)
> In the meantime 1.8.x will continue to contain the current general release.

I was planning to stay on 1.8.x until 2.0.x is available. I'm not sure I'll be able to support 1.9.x.
  
> > By the way, does openafs-stable-1_8_x work fine with kernel 5.10? From
> > commit history I see 5.9 support, so I set this up as a limit and all >=5.10
> > users will be warned during build about unsupported kernel. If OpenAFS is
> > known to work fine on 5.10, I can bump the limit.
> 
> OpenAFS 1.8.x should be fine on a Linux 5.10 kernel.  There were no changes
> needed
> to build on 5.10.

Thanks. I'll update the version check.
Comment 12 Andrew Savchenko gentoo-dev 2021-01-20 11:16:14 UTC
Working version 1.8.7 is stable in the tree now.