RX communication error due to an epoch rollover.

This problem is in all releases of 1.8 and affects both client and server communications. Patches have been made available upstream for the master branch and are currently being ported back to the 1.8.x branch. A new upstream release, 1.8.7, will contain the three patches that resolve this issue.

The 1.8.x branch gerrits are:
14493 rx: rx_InitHost do not overwrite RAND_bytes rx_nextCid
14494 rx: update_nextCid overflow handling is broken

A "cleanup" commit is currently being worked on. This is not necessary to fix the problem, but it appears that it will be included with the forthcoming 1.8.7.

master gerrit:
14496 Remove overflow check from update_nextCid

Reproducible: Always

https://gerrit.openafs.org/14493
https://gerrit.openafs.org/14494
https://gerrit.openafs.org/14496
The patches have been merged and tagged upstream as openafs-stable-1_8_7
https://www.openafs.org/release/openafs-1.8.7.html
Note that the delta between 1.8.6 and 1.8.7 is just the patches for this problem. Also note that the three patches can easily be applied to any of the 1.8 levels, so it should be possible to add them on top of the existing 1.8.6 package.
I can confirm that the existing ebuild for openafs 1.8.6 works to build openafs 1.8.7 (just renamed the ebuild without changing anything).
This problem impacts all 1.8 RX communications (e.g. communication between the database servers, communication between the file servers and the database servers, and communication from the clients). The problem occurs when a connection is initiated, which means that existing connections are not immediately impacted. When upgrading, the recommendation is to update the database servers first, then the file servers, then the clients.
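For anyone scripting the rollout, the recommended order can be sketched as a simple sort by role. This is only an illustration: the host names and the role labels ("dbserver", "fileserver", "client") are hypothetical and not taken from any OpenAFS tooling.

```python
# Recommended upgrade order from the comment above:
# database servers first, then file servers, then clients.
# Role labels are hypothetical names chosen for this sketch.
UPGRADE_ORDER = ["dbserver", "fileserver", "client"]


def upgrade_plan(hosts):
    """Return host names sorted by upgrade priority, then alphabetically.

    `hosts` maps a host name to one of the roles in UPGRADE_ORDER.
    """
    rank = {role: index for index, role in enumerate(UPGRADE_ORDER)}
    ordered = sorted(hosts.items(), key=lambda item: (rank[item[1]], item[0]))
    return [name for name, _role in ordered]


if __name__ == "__main__":
    # Invented inventory, for illustration only.
    inventory = {
        "ws42": "client",
        "fs1": "fileserver",
        "db2": "dbserver",
        "db1": "dbserver",
    }
    for name in upgrade_plan(inventory):
        print(name)
```

A host that runs both database and file server processes would need to be classified by whichever role comes first; the sketch does not handle that case.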
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=bf5eac4aac7fa7aa1a81598a9b4468e498355b0d

commit bf5eac4aac7fa7aa1a81598a9b4468e498355b0d
Author:     Andrew Savchenko <bircoph@gentoo.org>
AuthorDate: 2021-01-15 21:32:17 +0000
Commit:     Andrew Savchenko <bircoph@gentoo.org>
CommitDate: 2021-01-15 21:37:42 +0000

    net-fs/openafs: 1.8.7 version bump

    This update fixes critical bug in the generation of Rx connection IDs
    that prevent Rx clients started after 14 Jan 2021 08:25:36 AM UTC from
    being able to successfully make connections.

    Bug: https://bugs.gentoo.org/765463
    Package-Manager: Portage-3.0.12, Repoman-3.0.2
    Signed-off-by: Andrew Savchenko <bircoph@gentoo.org>

 net-fs/openafs/Manifest             |   3 +
 net-fs/openafs/openafs-1.8.7.ebuild | 343 ++++++++++++++++++++++++++++++++++++
 2 files changed, 346 insertions(+)
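The commit message does not say why that exact second matters. A quick check shows that 14 Jan 2021 08:25:36 UTC is Unix time 0x60000000. The sketch below assumes, purely for illustration, that the initial connection ID was derived from the clock in seconds shifted left by 2 bits into a 32-bit field. Neither the shift width (`RX_CIDSHIFT = 2` here) nor that derivation is stated in this thread, and the real code may mix in other values. Under that assumption, the cutoff is the first second at which the top bit of the shifted value becomes set.

```python
from datetime import datetime, timezone

# Assumption for this sketch: connection IDs are shifted left by 2 bits.
RX_CIDSHIFT = 2

# The cutoff quoted in the commit message.
CUTOFF = datetime(2021, 1, 14, 8, 25, 36, tzinfo=timezone.utc)
CUTOFF_SECS = int(CUTOFF.timestamp())


def shifted_u32(secs, shift=RX_CIDSHIFT):
    """Shift a seconds value left and truncate it to 32 bits."""
    return (secs << shift) & 0xFFFFFFFF


def top_bit_set(value):
    """True if bit 31 is set, i.e. the value is negative as a signed 32-bit int."""
    return bool(value & 0x80000000)


if __name__ == "__main__":
    print(hex(CUTOFF_SECS))
    for secs in (CUTOFF_SECS - 1, CUTOFF_SECS):
        value = shifted_u32(secs)
        print(hex(secs), hex(value), top_bit_set(value))
```

One second before the cutoff the shifted value still fits in a signed 32-bit integer; at the cutoff it no longer does, which is consistent with the gerrit titles mentioning broken overflow handling in update_nextCid.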
(In reply to Cheyenne Wills from comment #3)
> Note that the delta between 1.8.6 and 1.8.7 is just the patches for this
> problem. Also note that the three patches can easily be applied to any of
> the 1.8 levels, so it should be possible to add them on top of the
> existing 1.8.6 package.

It's not that simple: in Gentoo we apply the openafs-stable-1_8_x branch on top of the release tarball. The main reason for this is the latest possible Linux kernel support (e.g. 1.8.7 supports 5.7 and the aforementioned branch supports 5.9), because Gentoo users tend to use the latest kernels. Aside from that, it contains quite a lot of bugfixes, which are also usually useful. There is quite a delta in openafs-stable-1_8_x since the previous snapshot in Gentoo.

(In reply to Volkmar Glauche from comment #4)
> I can confirm that the existing ebuild for openafs 1.8.6 works to build
> openafs 1.8.7 (just renamed the ebuild without changing anything).

I recommend that you use 1.8.7 from the tree, since it contains the latest snapshot of the openafs-stable-1_8_x branch and has many more bugfixes than the 1.8.7 you built as described.
(In reply to Andrew Savchenko from comment #7)
> (In reply to Cheyenne Wills from comment #3)
> > Note that the delta between 1.8.6 and 1.8.7 is just the patches for this
> > problem. Also note that the three patches can easily be applied to any of
> > the 1.8 levels, so it should be possible to add them on top of the
> > existing 1.8.6 package.
>
> It's not that simple: in Gentoo we apply the openafs-stable-1_8_x branch on
> top of the release tarball. The main reason for this is the latest possible
> Linux kernel support (e.g. 1.8.7 supports 5.7 and the aforementioned branch
> supports 5.9), because Gentoo users tend to use the latest kernels. Aside
> from that, it contains quite a lot of bugfixes, which are also usually
> useful. There is quite a delta in openafs-stable-1_8_x since the previous
> snapshot in Gentoo.

That's fine. I was referring to the delta between the upstream tags openafs-stable-1_8_6 and openafs-stable-1_8_7, which is just the fix for this.

I would be a little cautious about using the tip of openafs-stable-1_8_x. You might run into a problem with a partial solution as we (upstream) work through to the next release. The tip of 1.8.x should always be usable, but at times there is a pending wish list of commits being pulled in from the master branch, so just grabbing the tip might mean getting only part of a larger update.

We (upstream) are working on the next stable 1.8 release. This bug sidetracked the 1.8.7 work with the fix (the patches were pulled into 1.8.x using the same steps as for a security patch), so what was going to be 1.8.7 will now be 1.8.8.

One of the pending items is a fix for a build error with Linux 5.11-rc1. I'm the one working on that patch, and I'm in the process of testing it. (It's a simple patch, but testing it is a little more involved.)
(In reply to Cheyenne Wills from comment #8)
> I would be a little cautious of using the tip of openafs-stable-1_8_x. You
> might run into a problem with a partial solution as we (upstream) work
> through to the next release. The tip of 1.8.x should always be usable, but
> there is at times a pending wish list of commits that are being pulled in
> from the master branch, so just grabbing the tip might mean getting just
> part of a larger update.

That's the risk we have to take. After all, we have testing to filter that out before stable.

> We (upstream) are working on the next stable 1.8 release. This bug
> sidetracked the 1.8.7 work with the fix (the way the patches were pulled
> into 1.8.x is the same steps as pulling in a security patch), so what was
> going to be 1.8.7 will now be 1.8.8.

Yes, this is understandable.

> One of the items pending is a fix for a build error with Linux 5.11-rc1.
> I'm the one working on that patch, and I'm in the process of testing it.
> (It's a simple patch, but testing it is a little more involved).

By the way, does openafs-stable-1_8_x work fine with kernel 5.10? From the commit history I see 5.9 support, so I set this as the limit, and all >=5.10 users will be warned during the build about an unsupported kernel. If OpenAFS is known to work fine on 5.10, I can bump the limit.
(In reply to Andrew Savchenko from comment #9)
> (In reply to Cheyenne Wills from comment #8)
> > I would be a little cautious of using the tip of openafs-stable-1_8_x. You
> > might run into a problem with a partial solution as we (upstream) work
> > through to the next release. The tip of 1.8.x should always be usable, but
> > there is at times a pending wish list of commits that are being pulled in
> > from the master branch, so just grabbing the tip might mean getting just
> > part of a larger update.
>
> That's the risk we have to take. After all we have testing to filter that
> out before stable.

No problem, just as long as you are aware. Watch the gerrit traffic on gerrit.openafs.org for open changes on the openafs-stable-1_8_x branch.

Also be aware that there will be a 1.9.x at some point. It is going to be a development branch to flesh out some new features that will go into the next general release of openafs (so expect some churn in the new features). In the meantime, 1.8.x will continue to contain the current general release.

> By the way, does openafs-stable-1_8_x work fine with kernel 5.10? From
> commit history I see 5.9 support, so I set this up as a limit and all >=5.10
> users will be warned during build about unsupported kernel. If OpenAFS is
> known to work fine on 5.10, I can bump the limit.

OpenAFS 1.8.x should be fine on a Linux 5.10 kernel. There were no changes needed to build on 5.10.

Building against 5.11 so far has one hard build error. Watch the master branch on gerrit for the fix.
(In reply to Cheyenne Wills from comment #10)
> Also be aware that there will be a 1.9.x at some point, that is going to be
> a development branch to flesh out some new features that will be going into
> the next general release of openafs (so expect some churn in the new
> features.)
> In the meantime 1.8.x will continue to contain the current general release.

I was planning to stay on 1.8.x until 2.0.x is available. I'm not sure I'll be able to support 1.9.x.

> > By the way, does openafs-stable-1_8_x work fine with kernel 5.10? From
> > commit history I see 5.9 support, so I set this up as a limit and all >=5.10
> > users will be warned during build about unsupported kernel. If OpenAFS is
> > known to work fine on 5.10, I can bump the limit.
>
> OpenAFS 1.8.x should be fine on a Linux 5.10 kernel. There were no changes
> needed to build on 5.10.

Thanks. I'll update the version check.
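For reference, the kind of limit discussed above boils down to comparing the running kernel's major.minor version against the newest version known to build. The following is a minimal sketch of that comparison only, not the ebuild's actual check: the function names and the `max_supported` parameter are hypothetical, and the default of (5, 10) simply reflects the answer given in this thread.

```python
import re


def parse_kernel_version(release):
    """Extract (major, minor) from a release string such as '5.10.7-gentoo'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(release, max_supported=(5, 10)):
    """True if the kernel's major.minor does not exceed the supported limit."""
    return parse_kernel_version(release) <= max_supported


if __name__ == "__main__":
    # Example release strings, invented for illustration.
    for release in ("5.9.16", "5.10.7-gentoo", "5.11.0-rc1"):
        status = "ok" if is_supported(release) else "WARNING: unsupported kernel"
        print(release, status)
```

Comparing tuples rather than strings avoids the classic mistake of ordering "5.9" after "5.10".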
Working version 1.8.7 is stable in the tree now.