We use Gentoo to build clusters, so we have made a few additions to the boot system for cluster use. Basically, the diskless nodes and the server boot from the same system image. The nodes NFS-mount the server's / filesystem read-only and use local tmpfs for the directories that must be writable (/root, /var, /etc).

We modify three existing scripts and add a new one. We feel it would be nicer to keep our changes separate from the main scripts, but we are not sure how to do that, as many of them need to be part of the early boot and late shutdown sequences.

functions.sh: adds a checkserver function (which tells us whether we are on a node or on the server) and a constant (shmdir) used later to mount tmpfs. The checkserver function can also be used by services that run in both places with different configs.

halt.sh: many steps are skipped on nodes. It doesn't try to deactivate swap on nodes (they have no disk anyway), and it doesn't drop to a console on nodes if unmounting fails (the nodes are headless).

rc: does not try to activate swap on diskless nodes, and copies the content of the modifiable directories into tmpfs.

Finally, we add "switch", which switches runlevels based on the checkserver function.

Patches attached.
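For readers without the attachments, here is a rough sketch of what a checkserver-style test could look like. It is not the code from attachment 1782. It assumes that a node is any machine whose root filesystem is NFS-mounted, the optional mounts-file argument is added only so the function can be tried without booting a node, and the value given to shmdir is a made-up placeholder.

```shell
# Hypothetical sketch, NOT the attached patch.
# Assumption: a diskless node is a machine whose / is mounted over NFS.

# Placeholder value; the report only says a constant named shmdir exists.
shmdir="/mnt/.shm"

# checkserver [MOUNTS_FILE]
# Returns 0 on the server and 1 on a diskless node.
# MOUNTS_FILE defaults to /proc/mounts (an assumption of this sketch).
checkserver() {
    mounts_file="${1:-/proc/mounts}"
    # /proc/mounts fields: device mountpoint fstype options ...
    # Keep the last entry for "/", since later mounts shadow earlier ones
    # (for example the initial "rootfs" entry).
    root_fstype="$(awk '$2 == "/" { t = $3 } END { print t }' "$mounts_file")"
    case "$root_fstype" in
        nfs*) return 1 ;;   # root over NFS: we are on a node
        *)    return 0 ;;   # anything else: we are on the server
    esac
}

# Example use in a script that runs on both kinds of machine:
#   if checkserver; then
#       ... server-only step, e.g. activating swap ...
#   fi
```

A service that runs on both the server and the nodes could call the same function to choose between two config files.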
Created attachment 1782 [details, diff] checkserver function and shmdir constant
Created attachment 1783 [details, diff] shutdown customisations..
Created attachment 1784 [details, diff] bootup downloading in tmpfs...
Created attachment 1785 [details] switch boot script
Btw, more details on our website.. http://www.cerca.umontreal.ca/chp/en/projects/adelie/
i sorta cleaned up this stuff a while ago, and sent a unified diff to Az. i dunno where we are in the grand scheme of things with this. perhaps it is time to resync our efforts there? i imagine what you have and what we have are quite different at the moment.
Created attachment 5980 [details, diff] further improvements to /sbin/rc

This is a patch against rc-scripts-1.4.2.4 (from baselayout-1.8.5.4) that adds a few enhancements from our last cluster.

1. It works even if /var, /root and /etc are on their own partitions (previously a problem, since NFS does not cross mount points).
2. /etc/mtab is filled in properly.
3. fstab.node is copied if it is there.

Also, this system won't work if /usr is on its own partition on the server (we don't have a clean solution to that yet).
Added to cvs.
Here's a new version of our patches, against rc-scripts 1.4.2.9 (from baselayout 1.8.5.9). The new design allows much more flexibility and removes the SSI-specific stuff from the main scripts.

Instead of having fixed "node_default/server_default" runlevels, we use two new init parameters to specify the boot and default runlevels. So we have changed all fixed uses of "boot" to a new variable (BOOTLEVEL). We can also force a specific softlevel/runlevel from the kernel command line, allowing different nodes to boot from the same inittab file. A typical kernel command line might look like:

    linux bootlevel=boot.different softlevel=anotherlevel

The bootlevel is preserved and is used just like the current "boot" is used.

For conf.d, we changed things so that the scripts look for /etc/conf.d/{service}.{runlevel} before looking for the normal file. For this, we've added an add_suffix function in functions.sh which returns "file.suffix" if it exists, or just "file" otherwise.

The switch script can disappear, since with these changes the server is now a normal Gentoo system. The SSI-specific scripts that go with this will come in a later bug report (and are available on request).
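The two mechanisms described in this comment can be sketched roughly as follows. This is an illustration only, not the attached patches. The helper name get_cmdline_param is invented here, and the two-argument form of add_suffix is an assumption (the comment only describes its return value).

```shell
# Hypothetical sketch, NOT the attached patches.

# get_cmdline_param NAME [CMDLINE]
# Prints the value of the first NAME=value word on the kernel command
# line, or nothing if NAME is absent. CMDLINE defaults to /proc/cmdline.
get_cmdline_param() {
    name="$1"
    cmdline="${2:-$(cat /proc/cmdline)}"
    for word in $cmdline; do
        case "$word" in
            "$name"=*)
                printf '%s\n' "${word#*=}"
                return 0
                ;;
        esac
    done
    return 0
}

# add_suffix FILE SUFFIX
# Prints "FILE.SUFFIX" if that file exists, otherwise just "FILE".
add_suffix() {
    if [ -n "$2" ] && [ -e "$1.$2" ]; then
        printf '%s\n' "$1.$2"
    else
        printf '%s\n' "$1"
    fi
}

# Example use (service name "net" chosen only for illustration):
#   BOOTLEVEL="$(get_cmdline_param bootlevel)"
#   BOOTLEVEL="${BOOTLEVEL:-boot}"      # fall back to the stock "boot"
#   conf="$(add_suffix /etc/conf.d/net "$SOFTLEVEL")"
#   [ -e "$conf" ] && . "$conf"
```

Falling back to "boot" when no bootlevel= parameter is given keeps an unmodified kernel command line behaving like a stock system.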
Created attachment 13908 [details, diff] patch for functions.sh
Created attachment 13909 [details, diff] remove ssi specific stuff from halt.sh
Created attachment 13910 [details, diff] patch for rc
Created attachment 13911 [details, diff] patch for runscript.sh
Created attachment 14516 [details] rc.patch for the new no-tmpfs solutio
Created attachment 14517 [details] runscript.sh.patch for the new no-tmpfs solution
Created attachment 14518 [details] halt.sh.patch ...continuing... no-tmpfs stuff
Created attachment 14519 [details] functions.sh.patch no-tmpfs...
The last four patches are the "adaptation" of our previous patches to the new "no-tmpfs" init system in Gentoo. Since we were using quite different scripts for the boot runlevel (no "checkroot", for example), and since the new system hardcodes some of the boot services, we had to find a simple yet elegant solution. We chose to make the init scripts read "/etc/runlevels/LEVEL/critical" to find out which services belong to the boot runlevel. If this file is not present, the Gentoo hardcoded defaults are used. We still support the kernel command line parameters, as in our last "patch-attack".
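The lookup described above amounts to something like the sketch below. It is not the attached rc.patch. The function name, the one-service-per-line file format, and the contents of the default list are all assumptions made for illustration; the real defaults are whatever /sbin/rc hardcodes.

```shell
# Hypothetical sketch, NOT the attached patch.

# Illustrative stand-in for the defaults hardcoded in /sbin/rc.
DEFAULT_CRITICAL_SERVICES="checkroot hostname modules checkfs localmount"

# get_critical_services LEVEL [RUNLEVELS_DIR]
# Prints the space-separated critical services for LEVEL, read from
# RUNLEVELS_DIR/LEVEL/critical (assumed format: one service name per
# line), or the default list if that file is missing.
get_critical_services() {
    level="$1"
    dir="${2:-/etc/runlevels}"
    file="$dir/$level/critical"
    if [ -r "$file" ]; then
        services=""
        while read -r svc; do
            # Skip blank lines; join the names with single spaces.
            [ -n "$svc" ] && services="${services:+$services }$svc"
        done < "$file"
        printf '%s\n' "$services"
    else
        printf '%s\n' "$DEFAULT_CRITICAL_SERVICES"
    fi
}

# Example use:
#   for svc in $(get_critical_services "$BOOTLEVEL"); do
#       ... start $svc before the rest of the boot runlevel ...
#   done
```

A node runlevel without "checkroot" would then simply ship its own critical file, while the server keeps the stock behaviour.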
From a quick look, I'm impressed; it's a lot cleaner. I will review it in the next few days and merge it. If there are issues, I will get back to you.
*** Bug 23710 has been marked as a duplicate of this bug. ***
seems to be in portage.