Summary: Hetzner server for releng stage builds (releng-administered)

Personnel involved: mattst88 (as releng lead), dilfridge, iamben, other releng members

Deadline: at soonest convenience

Amount:
64,26 € / month ongoing funding
70,21 € setup

Description:
Dedicated server for stage builds; amd64 and x86 natively, other arches such as riscv via qemu. Installation and administration will be performed by release engineering only; specifically, integration into LDAP and/or Puppet is not desired.

The above pricing corresponds to an AX51-NVMe server, https://www.hetzner.com/de/dedicated-rootserver/ax51-nvme
* AMD Ryzen™ 7 3700X octa-core Matisse (Zen2)
* 64 GB DDR4 ECC
* 2x 1TB NVMe SSD
* 1GB/s network connectivity guaranteed
* unlimited traffic

Rationale:
* The current machine is short on disk space, regularly breaking builds.
* The system essentially does not contain any persistent data, runs stock Gentoo-stable code, and is non-critical in case of a break of service.
* A "normal" Gentoo install, i.e. a regularly fully updated stable install, is better tested by developers, behaves more predictably, and is likely more secure.
* The current infra Puppet-based management is extremely inflexible and effectively kills productivity, since it makes manual interventions by the non-infra people who are using it pointlessly complex [1].
* Migration to a releng-administered machine will allow us to simplify and centralize the baroque contraption of scripts in the releng repository.

[1] Recent example:
* task: moving blueness' musl builds into standard amd64 stage build mechanism
* part 1, moderately tricky, read blueness' scripts, check if they do anything unusual, write specs from them: done in an hour.
* part 2, completely trivial, uploading the seed into the infra build host and installing a cron job there: involves days of debating with antarus and robbat2, the start of a releng-onboarding wiki document, and the obligatory "puppet is screwed up, we don't know why, and nobody has time to fix it" phase
Much discussion was had in #gentoo-trustees about this.

Approval in principle, provided that the administration of the hosts happens in a better way, especially one that will carry over when releng headcount changes over the years.

I'm going to create a wiki page to plan out how to do that administration (the host itself gets no puppet, no LDAP).
As infra person:
I think the concept is we order it (it takes 2-3 weeks to be fulfilled according to the vendor) and we can retrofit the administration once that is done; but the administration of the machine would not block purchase / fulfillment.

As trustees:
Trustees, please vote to approve or deny:
64.26 EUR/mo is ~$80/month (+/- exchange rate), or about $960/yr (not including the one-time setup fee).
I would like a base config to be done via puppet (users, management software updates, etc). But I think this is OK as is. I vote yes on approving this funding request.
Aye, but we need a better administration plan, like the one suggested by @Robbat2.
(In reply to Robin Johnson from comment #1)
> Much discussion was had in #gentoo-trustees about this.
>
> Approval in principle, provided that the administration of the hosts happens
> in a better way, esp that will be carried over when releng headcount changes
> over the years.
>
> I'm going to create a wiki page to plan out how to do that administration
> (the host itself gets no puppet, no LDAP).

Inspection, yes. Warning if things are outdated, fine.

Please expect strong pushback against any integration into automated administration. This, after all, is one of the main points of the proposal.

> Recent example:
> * task: moving blueness' musl builds into standard amd64 stage build mechanism
> * part 1, moderately tricky, read blueness' scripts, check if they do anything
>   unusual, write specs from them: done in an hour.
> * part 2, completely trivial, uploading the seed into the infra build host and
>   installing a cron job there: involves days of debating with antarus and
>   robbat2, the start of a releng-onboarding wiki document, and the obligatory
>   "puppet is screwed up, we don't know why, and nobody has time to fix it" phase

Or, to cite a fellow Gentoo developer:

"I feel like infra exists solely as a glorified puppet-shepherding Rube Goldberg machine."
Sorry, I muddled this a bit.

The options are either:
(a) approve the purchase for releng.
(b) don't approve the purchase for releng.

I'm not quite sure why the way the machine is administered plays a critical role in the funding, compared to past requests.

(1) We fund a bunch of stuff on AWS and we have limited visibility into its use; but that's deemed OK?
(2) We funded the sparc devbox recently (bought disks?) and we also have limited visibility into its use.

I'm happy to improve how these machines are administered; but again, it seems like an artificial barrier for this specific request.
Aye, approve the purchase for releng.
This motion passed (3/5 votes).
<hat type="infra">
mattst88, dilfridge: could you clarify if the location matters?

Finland HEL1 (Helsinki) (Price (monthly): € 54.00 / Setup (once): € 59.00)
Germany FSN1 (Falkenstein) (Price (monthly): € 59.00 / Setup (once): € 59.00)

If it doesn't matter, I'd pick the cheaper location.
</hat>
(In reply to Robin Johnson from comment #9) Doesn't matter to me either. The cheaper option works for me.
Either is fine for me too. So the cheaper option is best.
For posterity, I vote aye on funding.

Aside from this, I agree that a good administration plan needs to be designed and implemented. This does not imply "full management" of the box.

As a security team member: given this is a releng box, there are keys and other items on it, so a compromise could impact our signed stages.
demeter.amd64.dev.gentoo.org is online as of mid-June. Further Ansible work is required.