| Summary: | get_version() in linux-info.eclass sets incorrect $KV_LOCAL | ||
|---|---|---|---|
| Product: | Gentoo Linux | Reporter: | Guenther Brunthaler <gb_about_gnu> |
| Component: | Eclasses | Assignee: | Gentoo Kernel Miscellaneous <kernel-misc> |
| Status: | RESOLVED DUPLICATE | ||
| Severity: | major | CC: | spatz |
| Priority: | High | ||
| Version: | unspecified | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Package list: | Runtime testing required: | --- | |
| Attachments: |
Dummy ebuild for printing out get_version() information for debugging
Dummy ebuild for printing out get_version() information for debugging
Dummy ebuild for printing out get_version() information for debugging
Patch fixing the problem |
||
There were some errors regarding slashes in my example of how things should work. Here is a corrected version:
test -n "${ROOT}" || ROOT=/
root=${ROOT%%/} # Remove one trailing slash, if any ("/" becomes empty).
. ${root}/etc/make.conf
test -n "${KERNEL_DIR}" || KERNEL_DIR=/usr/src/linux
test -n "${KBUILD_OUTPUT}" || KBUILD_OUTPUT=${KERNEL_DIR}
krelease=`cd "${root}/${KBUILD_OUTPUT}" && make kernelrelease | tail -n1`
kmod_install_dir=${root}/lib/modules/${krelease}
echo "Please install kernel modules into '${kmod_install_dir}'."
Is this issue true for all kernel versions?

I think that this is a duplicate of #328243, isn't it?

(In reply to comment #3)
> I think that this is a duplicate of #328243, isn't it?
Could be. However, my bug applies to a 2.6.30 kernel as well, and bug #328243 explicitly states 2.6.35 and later kernels. Maybe it's the same bug, but maybe not. If it indeed is, we can mark this one as a duplicate of course.

(In reply to comment #2)
> Is this issue true for all kernel versions?
Yes, I think so. At least I could not see a reason why linux-mod.eclass should make a distinction between 2.6.30 and other kernel versions.
The tricky thing is that this bug only affects people who use the CONFIG_LOCALVERSION feature of the kernel. For other configurations, the output of "make kernelrelease" and "make kernelversion" will always be the same, and thus the selected kernel module directory will "by coincidence" be the right one.

An explanation of the difference between the output of "make kernelversion" and "make kernelrelease", and how they affect the kernel module directory.
Let's say you installed sys-kernel/gentoo-sources-2.6.30-r8 like I did. If you configure that kernel and then run the following command in the kernel build directory
$ make kernelversion
it will display something like
make -C /usr/src/linux-2.6.30-gentoo-r8 O=/var/tmp/kernel-output/kbuild-xquad-10.218/. kernelversion
2.6.30-gentoo-r8
The last line here is the kernel *version*. Unless you have defined the CONFIG_LOCALVERSION parameter in the kernel configuration, "make kernelrelease" will display exactly the same output. If, however, you did set the CONFIG_LOCALVERSION parameter to some value,
$ grep ^CONFIG_LOCALVERSION .config
CONFIG_LOCALVERSION="-xquad-10.218"
then the output will differ:
$ make kernelrelease
make -C /usr/src/linux-2.6.30-gentoo-r8 O=/var/tmp/kernel-output/kbuild-xquad-10.218/. kernelrelease
2.6.30-gentoo-r8-xquad-10.218
As you can see, the output is then the same as from "make kernelversion", but with the string from the CONFIG_LOCALVERSION setting appended.
And why is this important? Because it is this output of "make kernelrelease" which will be hardcoded into the kernel and returned by "uname -r" once that kernel is running. This output is also used to locate the kernel's modules, which must be put beneath /lib/modules/`uname -r`/ as reported by the new kernel when it runs. At the time the kernel is built, a different kernel will typically be running, potentially returning something else for "uname -r". However, "make kernelrelease" already displays at build time what the new kernel will later return for "uname -r".

Renamed bug because I narrowed down the origin of the problem. The linux-mod.eclass is actually innocent - it's the get_version() function from the linux-info.eclass which seems to be the origin of the troubles.

Created attachment 241699 [details]
Dummy ebuild for printing out get_version() information for debugging
Put this ebuild beneath one of the directories in the PORTDIR_OVERLAY list (in /usr/local/portage/overlay/dev-libs/dummy/ on my box), then "cd" to that directory and run
$ FEATURES=digest ebuild dummy-1.0.ebuild clean unpack
This displayed the following output on my box:
>>> Creating Manifest for /usr/local/portage/overlay/dev-libs/dummy
 * checking ebuild checksums ;-) ... [ ok ]
 * checking auxfile checksums ;-) ... [ ok ]
 * checking miscfile checksums ;-) ... [ ok ]
 * CPV: dev-libs/dummy-1.0
 * REPO:
 * USE: amd64 elibc_glibc kernel_linux multilib userland_GNU
 * Determining the location of the kernel source code
 * Found kernel source directory:
 *     /usr/src/linux
 * Found kernel object directory:
 *     /var/tmp/kernel-output/target
 * Found sources for kernel version:
 *     2.6.30-gentoo-r8
>>> Unpacking source...
 * KERNEL_DIR = "/usr/src/linux"
 * KBUILD_OUTPUT = "/var/tmp/kernel-output/target"
 * KV_FULL = "2.6.30-gentoo-r8"
 * KV_MAJOR = "2"
 * KV_MINOR = "6"
 * KV_PATCH = "30"
 * KV_EXTRA = "-gentoo-r8"
 * KV_LOCAL = ""
 * KV_DIR = "/usr/src/linux"
 * KV_DIR_OUT = ""
>>> Source unpacked in /var/tmp/portage/dev-libs/dummy-1.0/work
Note that $KV_DIR_OUT and $KV_LOCAL are reported as empty, although $KBUILD_OUTPUT is set, and CONFIG_LOCALVERSION is also set in my case - see this:
$ grep ^CONFIG_LOCALVERSION /var/tmp/kernel-output/target/.config
CONFIG_LOCALVERSION="-xquad-10.218"
So, obviously get_version() is mistaken in reporting the above values.

Created attachment 241701 [details]
Dummy ebuild for printing out get_version() information for debugging
Added x86 keyword so that the ebuild can be used as described on x86 also.
Renamed bug from "get_version() in linux-info.eclass sets incorrect $KV_DIR_OUT and $KV_LOCAL" to "get_version() in linux-info.eclass sets incorrect $KV_LOCAL" because it turned out that $KV_DIR_OUT was the wrong variable name - the correct name is $KV_OUT_DIR, and it is set up correctly. Sorry! ;-)
So it's only $KV_LOCAL which is set up incorrectly.

Created attachment 241703 [details]
Dummy ebuild for printing out get_version() information for debugging
This ebuild produces the following output when run with the following command:
$ FEATURES=digest ebuild dummy-1.1.ebuild clean unpack
>>> Creating Manifest for /usr/local/portage/overlay/dev-libs/dummy
 * checking ebuild checksums ;-) ... [ ok ]
 * checking auxfile checksums ;-) ... [ ok ]
 * checking miscfile checksums ;-) ... [ ok ]
 * CPV: dev-libs/dummy-1.1
 * REPO:
 * USE: amd64 elibc_glibc kernel_linux multilib userland_GNU
 * Determining the location of the kernel source code
 * Found kernel source directory:
 *     /usr/src/linux
 * Found kernel object directory:
 *     /var/tmp/kernel-output/target
 * Found sources for kernel version:
 *     2.6.30-gentoo-r8
>>> Unpacking source...
 * KERNEL_DIR = "/usr/src/linux"
 * KBUILD_OUTPUT = "/var/tmp/kernel-output/target"
 * KV_FULL = "2.6.30-gentoo-r8"
 * KV_MAJOR = "2"
 * KV_MINOR = "6"
 * KV_PATCH = "30"
 * KV_EXTRA = "-gentoo-r8"
 * KV_LOCAL = ""
 * KV_DIR = "/usr/src/linux"
 * KV_OUT_DIR = "/var/tmp/kernel-output/target"
>>> Source unpacked in /var/tmp/portage/dev-libs/dummy-1.1/work
So everything seems to be fine except for $KV_LOCAL.

I have further analyzed what the eclass actually does in order to determine the value of CONFIG_LOCALVERSION.
This is as follows:
First, it changes the current directory (of a subshell) to the kernel build directory. In my case, that's
$ cd /var/tmp/kernel-output/target
There, it runs the following command and expects it to print out the value of CONFIG_LOCALVERSION:
$ bash /usr/src/linux/scripts/setlocalversion /usr/src/linux
However, on my box, this script prints out nothing - and nothing is also what $KV_LOCAL is set to, thus creating the problem.
All this happens in line 547 of the linux-info.eclass, which looks like this:
KV_LOCAL="${KV_LOCAL}$(cd ${KV_OUT_DIR} ; bash ${KV_DIR}/scripts/setloc 547 alversion ${KV_DIR})"
So, the actual problem is that linux-info.eclass expects the script scripts/setlocalversion in the linux source directory to print out the value of CONFIG_LOCALVERSION - but it just doesn't!
Sorry, line 547 from my last post has been mangled by "less"; the correct content of the line is:
KV_LOCAL="${KV_LOCAL}$(cd ${KV_OUT_DIR} ; bash ${KV_DIR}/scripts/setlocalversion ${KV_DIR})"
But the question remains: What is the purpose of the scripts/setlocalversion script in the kernel source directory, and is it really true that this script, when run, should print out the value of CONFIG_LOCALVERSION (plus maybe additional suffixes from git files)? It seems this is not the case.
In my opinion, the ebuild should not take any chances, and run a "make kernelrelease" instead of all the stuff it does.
This will always display the correct information: the details of how the kernel Makefile builds the release string can change at any time, but the Makefile will always know how to display the kernel release correctly when "make kernelrelease" is run.
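For comparison, the value that line 547 expects from scripts/setlocalversion can be read directly from the .config file in the kernel build directory. The following is only a minimal sketch: the function name config_localversion is hypothetical and not part of linux-info.eclass, and CONFIG_LOCALVERSION is just one component of the string that "make kernelrelease" prints, so this is not a full replacement for it.

```shell
# Hypothetical helper (not part of linux-info.eclass): print the value of
# CONFIG_LOCALVERSION from a kernel .config file, without the quotes.
# Prints nothing if the option is not set in that file.
config_localversion() {
    # $1: path to the .config file in the kernel build directory
    # The pattern requires ="..." right after the name, so the separate
    # option CONFIG_LOCALVERSION_AUTO is not matched.
    sed -n 's/^CONFIG_LOCALVERSION="\(.*\)"$/\1/p' "$1"
}
```

With the paths from the output above, this would be called as config_localversion /var/tmp/kernel-output/target/.config. It only needs read access to the file.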
I understand now why get_version() is *not* just running 'make kernelrelease': It does not work when running without write permissions. But the ebuild must be prepared to be run as user "portage", who has read access but no write access to the kernel build and source directories.
This explains the complicated way get_version() tries to do its thing. But unfortunately it does not do its job right. ;-)

I totally understand now why scripts/setlocalversion does not display anything on my box - the comments in the first lines of the script explain why:
# This scripts adds local version information from the version
# control systems git, mercurial (hg) and subversion (svn).
This means: scripts/setlocalversion only extracts version information from version control systems, but *not* from the .config file! Therefore, it ignores the CONFIG_LOCALVERSION entry completely.
Which means, get_version() does too little! It does not acquire all the information required to emulate the effect of "make kernelrelease".

Here is some wisdom from the Linux Makefile (version 2.6.30):
# Build the kernel release string
#
# The KERNELRELEASE value built here is stored in the file
# include/config/kernel.release, and is used when executing several
# make targets, such as "make install" or "make modules_install."
#
# The eventual kernel release string consists of the following fields,
# shown in a hierarchical format to show how smaller parts are concatenated
# to form the larger and final value, with values coming from places like
# the Makefile, kernel config options, make command line options and/or
# SCM tag information.
#
#   $(KERNELVERSION)
#     $(VERSION)                  eg, 2
#     $(PATCHLEVEL)               eg, 6
#     $(SUBLEVEL)                 eg, 18
#     $(EXTRAVERSION)             eg, -rc6
#   $(localver-full)
#     $(localver)
#       localversion*             (files without backups, containing '~')
#       $(CONFIG_LOCALVERSION)    (from kernel config setting)
#     $(localver-auto)            (only if CONFIG_LOCALVERSION_AUTO is set)
#       ./scripts/setlocalversion (SCM tag, if one exists)
#       $(LOCALVERSION)           (from make command line if provided)
#
# Note how the final $(localver-auto) string is included *only* if the
# kernel config option CONFIG_LOCALVERSION_AUTO is selected. Also, at the
# moment, only git is supported but other SCMs can edit the script
# scripts/setlocalversion and add the appropriate checks as needed.
So, obviously there is indeed *much* more to do than just running ./scripts/setlocalversion!
But fortunately, this string is already constructed and written to the file "include/config/kernel.release" when the kernel has been built. In this case, the Makefile uses a simpler sequence to actually do things:
# Read KERNELRELEASE from include/config/kernel.release (if it exists)
KERNELRELEASE = $(shell cat include/config/kernel.release 2> /dev/null)
KERNELVERSION = $(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)
All we have to do now is put the same logic into the eclass!

Created attachment 241709 [details, diff]
Patch fixing the problem
GOT IT! Attached is a patch for fixing the problem!
Apply it with the following command:
$ patch /usr/portage/eclass/linux-info.eclass linux-info_eclass-bug_331467.patch
With this patch applied, I successfully built ati-drivers and truecrypt. All the kernel modules are installed into the right directories now.
:-)
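The patch itself is attached rather than quoted in this thread. As a rough sketch of the idea only (this is not the attached patch), the Makefile's "read include/config/kernel.release if it exists" logic could look like this in plain shell. The function name kernel_release and its fallback argument are hypothetical; the fallback stands in for whatever version string the caller has already computed.

```shell
# Hypothetical helper, NOT the attached patch: print the kernel release
# string for a kernel build directory, using only read access.
kernel_release() {
    # $1: kernel build (output) directory
    # $2: fallback string to print when kernel.release is missing
    release_file="$1/include/config/kernel.release"
    if [ -r "$release_file" ]; then
        # The file holds the release string on a single line.
        head -n 1 "$release_file"
    else
        # Assumption: a kernel tree that was never built has no such file,
        # so fall back to the value supplied by the caller.
        printf '%s\n' "$2"
    fi
}
```

With the values shown earlier, kernel_release /var/tmp/kernel-output/target "2.6.30-gentoo-r8" would print the contents of kernel.release if the kernel has been built there, and the fallback otherwise. Unlike "make kernelrelease", this does not need write permissions.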
The command in the previous post got line-wrapped. Here it is again, manually broken into two lines:
$ patch /usr/portage/eclass/linux-info.eclass \
    linux-info_eclass-bug_331467.patch

Can someone test this patch with a kernel 2.4.37 or 2.4.37.5 please? It's in Portage, so it should be supported. And I have no idea whether 2.4's build system differs from 2.6's, and if so, how much.

Your tree is out of date so you didn't get the fixes from bug #323717. Please sync and retest.

*** This bug has been marked as a duplicate of bug 323717 *** |
When emerging a package which uses "inherit linux-mod" for installing Linux kernel modules, the output of "make kernelversion" seems to be used for determining into which subdirectory of /lib/modules the new kernel modules should be installed, rather than the output of "make kernelrelease" as it should be.
If the CONFIG_LOCALVERSION kernel configuration parameter has been used for configuring, the output of "make kernelversion" and "make kernelrelease" will be different.
In this case, the newly created kernel modules will be installed into an incorrect directory where the kernel will not find them, leading to all sorts of problems.

Reproducible: Always

Steps to Reproduce:
1. Build and install a kernel and put some text in the CONFIG_LOCALVERSION parameter, such as "-test".
2. Emerge a package which installs its kernel module using linux-mod.eclass, for instance ati-drivers.
3. Look where the ebuild has installed the kernel module: It will not be beneath /lib/modules/*-test/ as it should be, but instead beneath a directory with the same path name except that the "-test" suffix is missing from the basename, as if no CONFIG_LOCALVERSION had been defined.

Actual Results:
The kernel modules will be installed into the wrong directory.

Expected Results:
The kernel modules should be installed into the correct directory.

The correct sequence to find the directory where kernel modules are to be installed under Gentoo (as it has been done in the past, and according to my understanding) is as follows:
test -n "${ROOT}" || ROOT=/
. ${ROOT}/etc/make.conf
test -n "${KERNEL_DIR}" || KERNEL_DIR=usr/src/linux
test -n "${KBUILD_OUTPUT}" || KBUILD_OUTPUT=${KERNEL_DIR}
krelease=`cd "${ROOT}${KBUILD_OUTPUT}" && make kernelrelease | tail -n1`
kmod_install_dir=${ROOT}lib/modules/${krelease}
echo "Please install kernel modules into '${kmod_install_dir}'."
Whatever linux-mod.eclass actually does, it is no longer this.
Also note that this is a new bug, because linux-mod.eclass worked correctly until some time ago.
I suspect this bug might also be the cause of several other bugs, such as #182642, #194765, #212745, and #294251.
I have verified that at least two packages actually install into wrong directories because of this bug: ati-drivers-10.7 and truecrypt-4.3a.
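To make the mismatch from the steps above concrete, here is a small illustration that needs no kernel tree. The helper name module_dir is hypothetical, and the two strings are example values standing in for real "make kernelversion" output and a real CONFIG_LOCALVERSION setting.

```shell
# Hypothetical helper: build the module directory path for a given ROOT
# and kernel release string.
module_dir() {
    root=${1%/}    # drop one trailing slash, so that "/" becomes ""
    printf '%s\n' "${root}/lib/modules/$2"
}

kversion="2.6.30-gentoo-r8"   # example output of "make kernelversion"
localversion="-test"          # example CONFIG_LOCALVERSION value

# Directory chosen when only the kernel version is used:
module_dir / "${kversion}"
# Directory the running kernel searches (matches "uname -r"):
module_dir / "${kversion}${localversion}"
```

The two printed paths differ only by the "-test" suffix, which is exactly the difference described in step 3.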