The kernel modules installed by nvidia-kernel are not sysfs enabled (and therefore do not work with udev) if the kernel tree is using KBUILD_OUTPUT. I just confirmed this by compiling the modules against two kernel trees with identical configurations, one of which (the one that produced non-sysfs modules) had KBUILD_OUTPUT enabled. This is with nvidia-kernel-1.0.6106 and Portage 2.0.50-r9 (default-x86-2004.0, gcc-3.3.3, glibc-2.3.3.20040420-r0, 2.6.7-d1).
And here is the fix:

--- NVIDIA_kernel-1.0-6106-koutput-support.patch.bak 2004-07-26 02:44:23.582892643 +0900
+++ NVIDIA_kernel-1.0-6106-koutput-support.patch 2004-07-26 02:44:29.101289733 +0900
@@ -85,7 +85,7 @@
 select_makefile:
 --- conftest.sh.old 2004-07-01 13:54:41.750507456 +1000
 +++ conftest.sh 2004-07-01 13:55:08.910378528 +1000
-@@ -7,16 +7,18 @@
+@@ -7,16 +7,19 @@
  CC="$1"
  ISYSTEM=`$CC -print-file-name=include`
@@ -100,6 +100,7 @@
 --I $HEADERS -I $HEADERS/asm/mach-default \
 +-I $SOURCE_HEADERS -I $SOURCE_HEADERS/asm/mach-default \
 +-I $BUILT_HEADERS -I $BUILT_HEADERS/../include2/asm/mach-default \
++-I $BUILT_HEADERS/../include2
  -Wimplicit-function-declaration"
 -case "$3" in

The reason: conftest.sh was failing one of its tests, the one related to creating devices. The test failed because the compiler stopped with "no such file asm/posix_types.h". The file does exist at $KOUTPUT/include2/asm/posix_types.h, which is linked as appropriate, but include2 was not passed as a -I directory. I tested the above patch for the patch and it works.
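The failure above comes down to whether any of the -I directories actually contains asm/posix_types.h. A minimal sketch for checking that by hand, assuming a POSIX shell; the helper name find_header is hypothetical and not part of conftest.sh, and the example paths are the ones quoted in this report:

```shell
# find_header HEADER DIR...
# Walk the directories in order (the way the compiler walks its -I list),
# print the first one that contains HEADER and return 0.
# Return 1 if no directory provides it.
# Hypothetical helper for illustration; not part of conftest.sh.
find_header() {
    header=$1
    shift
    for dir in "$@"; do
        if [ -e "$dir/$header" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Example call (adjust the paths to your own trees):
# find_header asm/posix_types.h \
#     /usr/src/linux/include \
#     /var/tmp/linux-build/2.6.7-d1/include \
#     /var/tmp/linux-build/2.6.7-d1/include2
```

If the function returns 1 for the directories that conftest.sh passes, the test compile cannot succeed no matter what the kernel itself supports.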
Erm, can you provide a log of the error? And make sure that you are using the ebuilds in portage, because honestly I cannot make this happen. I tested about five different koutput / non-koutput kernel pairs before committing in the first place and didn't have one problem with koutput, because the kernel makefiles are meant to handle the includes.
What kind of log do you want? The problem is that conftest.sh fails to check for "class_simple_create". It silently fails to build the conftest.c program: all output is redirected to /dev/null, all traces of the output and source files are removed, and it returns the value that means class_simple_create is not available.

However, here is how I found the problem on my machine:

# cd /usr/portage/media-video/nvidia-kernel
# config-kernel --set-symlink 2.6.7 # this is a koutput-enabled tree
# ebuild nvidia-kernel-1.0.6106.ebuild clean unpack
# vim /var/tmp/portage/nvid*...../conftest.sh

I changed the compilation line to these two lines so that I can observe the output (attached below):

echo $CC $CFLAGS -c conftest$$.c > _out$$.cmd
$CC $CFLAGS -c conftest$$.c > _compile$$.log 2>&1

# ebuild nvidia-kernel-1.0.6106.ebuild compile
# strings /var/tmp/port..../nvidia.ko > /tmp/nvidia.strings.koutput
# config-kernel --set-symlink 2.6.7-d1 # the non-koutput equivalent of the running kernel
# repeat same steps as above

I first compared the output of the two "strings" commands:

# diff /tmp/nvidia/strings*
3815a3816
> [^_
4797a4799,4800
> nvidiactl
> nvidia%d
4885a4889
> NVRM: class_simple creation failed
4990a4995
> class_simple_create
5011a5017
> class_simple_device_remove
5032a5039
> class_simple_destroy
5049a5057
> class_simple_device_add

Then we can take a look at the _compile*.log and _out files.

# non-koutput version: two sets of outputs
.cmd:
gcc -D__KERNEL__ -Werror -nostdinc -isystem /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.3/include -I //usr/src/linux/include -I //usr/src/linux/include/asm/mach-default -I //usr/src/linux/include -I //usr/src/linux/include/../include2/asm/mach-default -Wimplicit-function-declaration -c conftest5296.c
.log: empty

# koutput version: three sets of outputs
.cmd:
gcc -D__KERNEL__ -Werror -nostdinc -isystem /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.3/include -I //usr/src/linux/include -I //usr/src/linux/include/asm/mach-default -I ///var/tmp/linux-build/2.6.7-d1/include -I ///var/tmp/linux-build/2.6.7-d1/include/../include2/asm/mach-default -Wimplicit-function-declaration -c conftest4638.c
.log: (head -n 15)
In file included from //usr/src/linux/include/linux/types.h:13,
from //usr/src/linux/include/linux/kobject.h:18,
from //usr/src/linux/include/linux/device.h:16,
from conftest4638.c:1:
//usr/src/linux/include/linux/posix_types.h:47:29: asm/posix_types.h: No such file or directory
In file included from //usr/src/linux/include/linux/kobject.h:18,
from //usr/src/linux/include/linux/device.h:16,
from conftest4638.c:1:
//usr/src/linux/include/linux/types.h:14:23: asm/types.h: No such file or directory
In file included from //usr/src/linux/include/linux/kobject.h:18,
from //usr/src/linux/include/linux/device.h:16,
from conftest4638.c:1:
//usr/src/linux/include/linux/types.h:18: error: syntax error before "__kernel_dev_t"
//usr/src/linux/include/linux/types.h:18: warning: data definition has no type or storage class
//usr/src/linux/include/linux/types.h:21: error: syntax error before "dev_t"
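The debugging trick used above (keeping the compile command and its output instead of discarding them) can be wrapped in a small helper. This is only a sketch assuming a POSIX shell; the name run_logged and the .cmd/.log file naming are illustrative and do not come from conftest.sh:

```shell
# run_logged PREFIX CMD [ARGS...]
# Write the command line to PREFIX.cmd and the command's combined
# stdout/stderr to PREFIX.log. The function's exit status is the exit
# status of the command itself, so callers can still branch on it.
# Hypothetical helper for illustration; not part of conftest.sh.
run_logged() {
    prefix=$1
    shift
    printf '%s\n' "$*" > "$prefix.cmd"
    "$@" > "$prefix.log" 2>&1
}

# Example call, mirroring the modified compile line:
# run_logged "_conftest$$" $CC $CFLAGS -c "conftest$$.c"
```

With the log kept on disk, a failed feature test shows the real compiler error (here, the missing asm/posix_types.h) instead of silently reporting the feature as unavailable.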
Georgi, you are a champ. I posted my previous comment while at uni and couldn't quite look at what you meant, but I've tested it at home now and I've just committed the thing into cvs. Wait until nvidia-kernel-1.0.6106-r1 hits the tree and it's all there :) Please report success / failure using this new version too.
Fixed. And also included in 6111 by nvidia.
Created attachment 43060: 1.0.6111-koutput.patch

The new ebuilds that have the sysfs feature included by nVidia do not work. Compiling the module for a 2.6.7 kernel creates a sysfs-unaware module. Compiling the module for a 2.6.10_ kernel creates an unloadable module (nvidia: Unknown symbol pci_find_class). The problem is the same as before: conftest.sh breaks the tests because it does not include the proper headers. The attached one-line patch fixes the problems.
... and reopening ...
$OUTPUT/include2/asm -> $HEADERS/include/asm, and there is a patch to fix the pci_find_class problem. It's been in there for a while, so make sure you are using the latest ebuild. On that matter, I'm about to commit 6629 as well, so you might want to test it.
I am not trying to be annoying or anything, but the fix you've committed *does not* work for me. The patch that I sent *does* work.

I used "ebuild nvidia-1.0.6111-r2.ebuild unpack" to unpack the source, I checked that conftest.sh does indeed contain "$HEADERS/include/asm" as you suggested, and I used "ebuild nvidia-1.0.6111-r2.ebuild compile" to compile the thing. The bunch of errors like

/var/tmp/linux-build/2.6.7-d1/include2/asm/mpspec.h:8: error: `MAX_MP_BUSSES' undeclared here (not in a function)
/var/tmp/linux-build/2.6.7-d1/include2/asm/mpspec.h:9: error: `MAX_MP_BUSSES' undeclared here (not in a function)
/var/tmp/linux-build/2.6.7-d1/include2/asm/mpspec.h:10: error: `MAX_MP_BUSSES' undeclared here (not in a function)
/var/tmp/linux-build/2.6.7-d1/include2/asm/mpspec.h:12: error: `MAX_MP_BUSSES' undeclared here (not in a function)
/var/tmp/linux-build/2.6.7-d1/include2/asm/mpspec.h:19: error: `MAX_APICS' undeclared here (not in a function)

did not look good. It didn't stop the compilation from completing, though. I ran "strings nvidia.ko | grep class" on the created module:

# strings nvidia.ko | grep class
04Jpci_find_class

Well, that doesn't look good. Then I changed "$HEADERS/include/asm" to "$OUTPUT/include2/asm/mach-default", ran ebuild ... compile, and what do you know:

# strings nvidia.ko | grep class
NVRM: class_simple creation failed
class_simple_create
04Jpci_find_class
class_simple_device_remove
class_simple_destroy
class_simple_device_add

I synced and did this 10 minutes ago. I'll wait for the new ebuild to see if it works (i.e. if nvidia fixed it upstream).
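The "strings nvidia.ko | grep class" check above can be turned into a yes/no test. A minimal sketch, assuming a grep that supports -a (GNU, BSD and busybox grep do); it uses grep -a directly rather than strings, which is a substitution made here, and the name has_symbols is hypothetical:

```shell
# has_symbols FILE SYMBOL...
# Return 0 only if every SYMBOL occurs as a literal byte string in FILE,
# return 1 as soon as one of them is missing.
# Hypothetical helper for illustration; not part of the ebuild.
has_symbols() {
    file=$1
    shift
    for sym in "$@"; do
        # -a: treat the binary as text; -F: match the name literally.
        grep -a -q -F -e "$sym" "$file" || return 1
    done
    return 0
}

# Example call against a freshly built module:
# has_symbols nvidia.ko class_simple_create class_simple_destroy \
#     class_simple_device_add class_simple_device_remove \
#     && echo "module was built with class_simple (sysfs) support"
```

This only shows that the names appear in the file, which is the same evidence the strings output gives; it does not prove the module loads.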
Just to make sure that we're speaking about the same thing, this is what happens here:

# ebuild nvidia-1.0.6111-r2.ebuild unpack
<snip output>
# cd /var/tmp/portage/..../src/nv
# sh conftest.sh gcc /usr/src/linux-2.6.7 \
/var/tmp/linux-build/2.6.7-d1 class_simple_create
0 <--- unexpected behavior
# sh conftest.sh gcc /usr/src/linux-2.6.7 \
/var/tmp/linux-build/2.6.7-d1 remap_range
<--- no output is unexpected behavior
# sed -e '20aCFLAGS="$CFLAGS -I$OUTPUT/include2/asm/mach-default"' \
-i conftest.sh
# sh conftest.sh gcc /usr/src/linux-2.6.7 \
/var/tmp/linux-build/2.6.7-d1 class_simple_create
1 <---- expected behavior
# sh conftest.sh gcc /usr/src/linux-2.6.7 \
> /var/tmp/linux-build/2.6.7-d1 remap_range
5 <---- expected behavior (I guess)
And to finish this off, I just tried 1.0-6629 from source. As it is, it did not even compile. When I added the $OUTPUT/include2/asm/mach-default fix, it compiled correctly. Without the fix, conftest.sh was failing the class_simple_create and the remap_page_range tests. It didn't compile on a 2.6.10 kernel, but I guess that's not relevant. At least conftest.sh was passing the proper tests when it had $OUTPUT/include2/...
Wait, OK, I see what you're saying. /usr/src/linux/include/asm doesn't get created as a symlink to the correct asm-* directory if you are using koutput. In that case the patch is valid; sorry, I kind of forgot about that. But as you are using 2.6.7 you should be using pci_find_class, as it wasn't until a later kernel that pci_find_class -> pci_get_class. Anyway, 6629 just went into cvs with these fixes, as did 6111-r3 (which is now the stable version). The fact that it doesn't work with 2.6.10 is basically mm-sources' fault (I think). Use the ebuilds in cvs, they work.
OK, I tested 1.0.6629 and it works fine on a 2.6.7 kernel.
Thanks, closing (hopefully for the last time now :))
Guys, it broke again (1.0.7664). I see the comment that the new ebuild is not complete or something, and I am guessing that the reason to comment out

# epatch ${NV_PATCH_PREFIX//7174/7167}-conftest-koutput-includes.patch

is that it does not apply, but please, please, pretty please fix it. The commented-out patch applies cleanly if you remove the second hunk, and since it's the first hunk that is important, it's absolutely trivial to get a *working* ebuild in the tree.
I have the same problem with 1.0.7667
OK, the new problem is not entirely related to the old bug. It seems that sysfs support disappeared completely in 7664, so there is nothing to fix there (or I am misreading something). However, the current problem is that 7664+ do not even *compile* on a KBUILD_OUTPUT kernel. The fix is the same, so I think you could just as well apply it. Just to make sure you have it in front of you, the fix is this:

--- ./conftest.sh 2004-11-07 12:20:02.776660256 +1100
+++ ./conftest.sh 2004-11-07 12:23:32.432787680 +1100
@@ -17,7 +17,7 @@
 if [ "$OUTPUT" != "$SOURCES" ]; then
 CFLAGS="$CFLAGS -I$OUTPUT/include2 -I$OUTPUT/include \
--I$HEADERS -I$HEADERS/asm/mach-default"
+-I$HEADERS -I$OUTPUT/include2/asm/mach-default"
 else
 CFLAGS="$CFLAGS -I$HEADERS -I$HEADERS/asm/mach-default"
 fi
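For reference, the include-path logic after the fix can be written as a standalone function. This is a sketch of the patched branch only, assuming a POSIX shell and assuming that HEADERS means $SOURCES/include (the thread does not show how conftest.sh sets it); the function name kernel_include_flags is hypothetical:

```shell
# kernel_include_flags SOURCES OUTPUT
# Print the -I flags for a kernel source tree (SOURCES) and its build
# directory (OUTPUT). When the two differ, KBUILD_OUTPUT is in use and
# the arch headers are taken from OUTPUT/include2 instead of HEADERS/asm.
# Assumption: HEADERS is SOURCES/include; not confirmed by the thread.
kernel_include_flags() {
    sources=$1
    output=$2
    headers=$sources/include
    if [ "$output" != "$sources" ]; then
        printf '%s\n' "-I$output/include2 -I$output/include -I$headers -I$output/include2/asm/mach-default"
    else
        printf '%s\n' "-I$headers -I$headers/asm/mach-default"
    fi
}

# Example call with the trees from this report:
# kernel_include_flags /usr/src/linux-2.6.7 /var/tmp/linux-build/2.6.7-d1
```

The point of the separate branch is that in a KBUILD_OUTPUT setup the asm symlink lives under the build directory, so a path built from the source tree alone points at nothing.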
Is this still broken in 1.0.8178?
No one responded to my request at the beginning of Jan. Marking Resolved, NEEDINFO.
Apologies for the late reply. Just checked that 8178-r3 works fine.