Bug 58294 - nvidia-kernel is not sysfs enabled on a KBUILD_OUTPUT-ed kernel
Summary: nvidia-kernel is not sysfs enabled on a KBUILD_OUTPUT-ed kernel
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All All
Importance: High normal
Assignee: X11 External Driver Maintainers
Reported: 2004-07-25 07:45 UTC by Georgi Georgiev
Modified: 2006-02-25 01:18 UTC

Attachments
1.0.6111-koutput.patch (1.0.6111-koutput.patch,389 bytes, patch)
2004-10-31 23:41 UTC, Georgi Georgiev
Description Georgi Georgiev 2004-07-25 07:45:31 UTC
The kernel modules installed by nvidia-kernel are not sysfs-enabled (and therefore do not work with udev) if the kernel tree uses KBUILD_OUTPUT. I just confirmed this by compiling the modules against two kernel trees with identical configurations; the only difference was that one of them (the one that produced non-sysfs modules) had KBUILD_OUTPUT enabled.

This is with nvidia-kernel-1.0.6106 and
Portage 2.0.50-r9 (default-x86-2004.0, gcc-3.3.3, glibc-2.3.3.20040420-r0, 2.6.7-d1)
Comment 1 Georgi Georgiev 2004-07-25 10:51:54 UTC
And here is the fix:

--- NVIDIA_kernel-1.0-6106-koutput-support.patch.bak    2004-07-26 02:44:23.582892643 +0900
+++ NVIDIA_kernel-1.0-6106-koutput-support.patch        2004-07-26 02:44:29.101289733 +0900
@@ -85,7 +85,7 @@
  select_makefile:
 --- conftest.sh.old    2004-07-01 13:54:41.750507456 +1000
 +++ conftest.sh        2004-07-01 13:55:08.910378528 +1000
-@@ -7,16 +7,18 @@
+@@ -7,16 +7,19 @@
  
  CC="$1"
  ISYSTEM=`$CC -print-file-name=include`
@@ -100,6 +100,7 @@
 --I $HEADERS -I $HEADERS/asm/mach-default \
 +-I $SOURCE_HEADERS -I $SOURCE_HEADERS/asm/mach-default \
 +-I $BUILT_HEADERS -I $BUILT_HEADERS/../include2/asm/mach-default \
++-I $BUILT_HEADERS/../include2
  -Wimplicit-function-declaration"
  
 -case "$3" in

The reason: conftest.sh was failing one of its tests, the one related to creating devices, because the compiler aborted with "no such file asm/posix_types.h". That file lives at $KOUTPUT/include2/asm/posix_types.h, which is symlinked as appropriate, but include2 was not passed as a -I directory.

I tested the above patch to the patch, and it works.
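
Editorial sketch of the check described above: given a header name and a list of candidate -I directories, report which directories actually contain that header. The function name find_header and the example paths are hypothetical; none of this is taken from conftest.sh.

```shell
#!/bin/sh
# find_header HEADER DIR...
# Hypothetical helper: print every DIR that contains HEADER,
# one directory per line. Prints nothing if no DIR has it.
find_header() {
    header="$1"
    shift
    for dir in "$@"; do
        if [ -e "$dir/$header" ]; then
            printf '%s\n' "$dir"
        fi
    done
}

# Example call (paths are assumptions, adjust to the local setup):
#   find_header asm/posix_types.h \
#       /usr/src/linux/include "$KOUTPUT/include" "$KOUTPUT/include2"
```

If the report above is accurate, such a call on a KBUILD_OUTPUT tree should list only the include2 directory, which is exactly the one that was missing from the compiler's search path.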
Comment 2 Andrew Bevitt 2004-07-25 22:17:07 UTC
Erm, can you provide a log of the error? Also, please make sure that you are using the ebuilds in portage, because honestly I cannot make this happen. I tested about five different koutput / non-koutput kernel pairs before committing in the first place and didn't have a single problem with koutput, because the kernel makefiles are meant to handle the includes.
Comment 3 Georgi Georgiev 2004-07-25 23:50:30 UTC
What kind of log do you want? The problem is that conftest.sh fails to check for "class_simple_create". It silently fails to build the conftest.c program: all output is redirected to /dev/null, it removes all traces of the output and source files, and it returns the value that means class_simple_create is not available.
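
Editorial sketch of a probe that does not fail silently: it runs whatever compile command it is given and keeps stdout and stderr in a log file instead of sending them to /dev/null. The name probe_compile and the log file name are hypothetical; this is not how conftest.sh itself is written.

```shell
#!/bin/sh
# probe_compile LOG COMMAND [ARGS...]
# Hypothetical helper: run COMMAND with its arguments, capture both
# stdout and stderr in LOG, and return COMMAND's exit status so the
# caller can still decide whether the feature is available.
probe_compile() {
    log="$1"
    shift
    "$@" > "$log" 2>&1
}

# Example call (compiler, flags and file names are assumptions):
#   if probe_compile conftest.log gcc $CFLAGS -c conftest.c; then
#       echo "test program compiled"
#   else
#       echo "compile failed, see conftest.log"
#   fi
```

Keeping the log around makes a missing header show up as a readable compiler error rather than as a feature that quietly appears to be unavailable.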

However, here is how I found the problem on my machine:

# cd /usr/portage/media-video/nvidia-kernel

# config-kernel --set-symlink 2.6.7 # this is a koutput-enabled tree
# ebuild nvidia-kernel-1.0.6106.ebuild clean unpack
# vim /var/tmp/portage/nvid*...../conftest.sh
I changed the compilation line to these two lines so that I could observe the output (the output is shown below):
        echo $CC $CFLAGS -c conftest$$.c > _out$$.cmd
        $CC $CFLAGS -c conftest$$.c > _compile$$.log 2>&1
# ebuild nvidia-kernel-1.0.6106.ebuild compile
# strings /var/tmp/port..../nvidia.ko > /tmp/nvidia.strings.koutput

# config-kernel --set-symlink 2.6.7-d1 # the non-koutput equivalent of the running kernel
# repeat same steps as above

I first compared the output of the two "strings" commands:
# diff /tmp/nvidia.strings*
3815a3816
>  [^_
4797a4799,4800
> nvidiactl
> nvidia%d
4885a4889
> NVRM: class_simple creation failed
4990a4995
> class_simple_create
5011a5017
> class_simple_device_remove
5032a5039
> class_simple_destroy
5049a5057
> class_simple_device_add

Then, we can take a look at those _compile*.log and _out files:
# non-koutput version: two sets of outputs

.cmd: gcc -D__KERNEL__ -Werror -nostdinc -isystem /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.3/include -I //usr/src/linux/include -I //usr/src/linux/include/asm/mach-default -I //usr/src/linux/include -I //usr/src/linux/include/../include2/asm/mach-default -Wimplicit-function-declaration -c conftest5296.c

.log: empty

# koutput version: three sets of outputs

.cmd: gcc -D__KERNEL__ -Werror -nostdinc -isystem /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.3/include -I //usr/src/linux/include -I //usr/src/linux/include/asm/mach-default -I ///var/tmp/linux-build/2.6.7-d1/include -I ///var/tmp/linux-build/2.6.7-d1/include/../include2/asm/mach-default -Wimplicit-function-declaration -c conftest4638.c

.log: (head -n 15)
In file included from //usr/src/linux/include/linux/types.h:13,
                 from //usr/src/linux/include/linux/kobject.h:18,
                 from //usr/src/linux/include/linux/device.h:16,
                 from conftest4638.c:1:
//usr/src/linux/include/linux/posix_types.h:47:29: asm/posix_types.h: No such file or directory
In file included from //usr/src/linux/include/linux/kobject.h:18,
                 from //usr/src/linux/include/linux/device.h:16,
                 from conftest4638.c:1:
//usr/src/linux/include/linux/types.h:14:23: asm/types.h: No such file or directory
In file included from //usr/src/linux/include/linux/kobject.h:18,
                 from //usr/src/linux/include/linux/device.h:16,
                 from conftest4638.c:1:
//usr/src/linux/include/linux/types.h:18: error: syntax error before "__kernel_dev_t"
//usr/src/linux/include/linux/types.h:18: warning: data definition has no type or storage class
//usr/src/linux/include/linux/types.h:21: error: syntax error before "dev_t"
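
Editorial sketch of the symbol comparison above, reduced to a yes/no check: does a built module mention all four class_simple_* names that appear only in the sysfs-enabled build? The function name has_sysfs_symbols is hypothetical, and it greps the file directly instead of going through "strings" as the comment above does.

```shell
#!/bin/sh
# has_sysfs_symbols FILE
# Hypothetical helper: succeed (exit 0) only if FILE contains every
# class_simple_* name seen in the diff above; fail on the first
# name that is missing.
has_sysfs_symbols() {
    for sym in class_simple_create class_simple_destroy \
               class_simple_device_add class_simple_device_remove; do
        grep -q -F -- "$sym" "$1" || return 1
    done
}

# Example call (the module path is an assumption):
#   has_sysfs_symbols nvidia.ko && echo "sysfs support compiled in"
```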
Comment 4 Andrew Bevitt 2004-07-26 06:14:55 UTC
Georgi, you are a champ. I posted my previous comment while at uni and couldn't quite look at what you meant, but I've tested it at home now and I've just committed the thing into CVS.

Wait until nvidia-kernel-1.0.6106-r1 hits the tree and it's all there :)

Report success / failure using this new version too please.
Comment 5 Andrew Bevitt 2004-08-05 18:13:01 UTC
Fixed.

And also included in 6111 by nvidia.
Comment 6 Georgi Georgiev 2004-10-31 23:41:47 UTC
Created attachment 43060 [details, diff]
1.0.6111-koutput.patch

The new ebuilds that have the sysfs feature included by nVidia do not work.

Compiling the module for a 2.6.7 kernel creates a sysfs unaware module.
Compiling the module for a 2.6.10_ kernel creates an unloadable module (nvidia:
Unknown symbol pci_find_class)

The problem is just as before: conftest.sh breaks tests because it does not
include the proper headers.

The attached one-line patch fixes the problems.
Comment 7 Georgi Georgiev 2004-10-31 23:42:17 UTC
... and reopening ...
Comment 8 Andrew Bevitt 2004-11-06 15:39:32 UTC
$OUTPUT/include2/asm -> $HEADERS/include/asm

Also, there is a patch to fix the pci_find_class problem; it's been in there for a while, so make sure you are using the latest ebuild.

On that matter, I'm about to commit 6629 as well, so you might want to test it.
Comment 9 Georgi Georgiev 2004-11-06 16:21:11 UTC
I am not trying to be annoying or anything, but the fix you've committed *does not* work for me. The patch that I sent *does* work.

I used "ebuild nvidia-1.0.6111-r2.ebuild unpack" to unpack the source, I checked that conftest.sh does indeed contain "$HEADERS/include/asm" as you suggested, I used "ebuild nvidia-1.0.6111-r2.ebuild compile" to compile the thing. The bunch of errors like

/var/tmp/linux-build/2.6.7-d1/include2/asm/mpspec.h:8: error: `MAX_MP_BUSSES' undeclared here (not in a function)
/var/tmp/linux-build/2.6.7-d1/include2/asm/mpspec.h:9: error: `MAX_MP_BUSSES' undeclared here (not in a function)
/var/tmp/linux-build/2.6.7-d1/include2/asm/mpspec.h:10: error: `MAX_MP_BUSSES' undeclared here (not in a function)
/var/tmp/linux-build/2.6.7-d1/include2/asm/mpspec.h:12: error: `MAX_MP_BUSSES' undeclared here (not in a function)
/var/tmp/linux-build/2.6.7-d1/include2/asm/mpspec.h:19: error: `MAX_APICS' undeclared here (not in a function)

did not look good. It didn't stop the compilation from completing, though. I ran "strings nvidia.ko | grep class" on the created module and

# strings nvidia.ko | grep class
04Jpci_find_class

Well, doesn't look good. Now, I changed "$HEADERS/include/asm" to "$OUTPUT/include2/asm/mach-default", ran ebuild ... compile, and what do you know:

# strings nvidia.ko | grep class
NVRM: class_simple creation failed
class_simple_create
04Jpci_find_class
class_simple_device_remove
class_simple_destroy
class_simple_device_add

I synced and did this 10 minutes ago.

I'll wait for the new ebuild to see if it works (i.e. if nvidia fixed it upstream).
Comment 10 Georgi Georgiev 2004-11-06 16:52:58 UTC
Just to make sure that we're speaking about the same thing.
This is what happens here:

# ebuild nvidia-1.0.6111-r2.ebuild unpack
<snip output>
# cd /var/tmp/portage/..../src/nv
# sh conftest.sh gcc /usr/src/linux-2.6.7 \
  /var/tmp/linux-build/2.6.7-d1 class_simple_create
0  <--- unexpected behavior
# sh conftest.sh gcc /usr/src/linux-2.6.7 \
  /var/tmp/linux-build/2.6.7-d1 remap_range
<--- no output is unexpected behavior
# sed -e '20aCFLAGS="$CFLAGS -I$OUTPUT/include2/asm/mach-default"' \
  -i conftest.sh
# sh conftest.sh gcc /usr/src/linux-2.6.7 \
  /var/tmp/linux-build/2.6.7-d1 class_simple_create
1  <---- expected behavior
# sh conftest.sh gcc /usr/src/linux-2.6.7 \
>   /var/tmp/linux-build/2.6.7-d1 remap_range
5  <---- expected behavior (I guess)
Comment 11 Georgi Georgiev 2004-11-06 17:09:59 UTC
And to finish this off, I just tried 1.0-6629 from source.
When I tried it as it is, it did not even compile.
When I added the $OUTPUT/include2/asm/mach-default fix, it compiled correctly.

Without the fix, conftest.sh was failing the class_simple_create and the remap_page_range tests.

It didn't compile on a 2.6.10 kernel, but I guess that's not relevant. At least conftest.sh was passing the proper tests when it had $OUTPUT/include2/...
Comment 12 Andrew Bevitt 2004-11-06 18:03:01 UTC
Wait, ok i see what you're saying.

/usr/src/linux/include/asm doesn't get created as a symlink to the correct asm-* directory if you are using koutput. In that case the patch is valid; sorry, I kind of forgot about that.

But as you are using 2.6.7 you should be using pci_find_class, as it wasn't until a later kernel that pci_find_class was replaced by pci_get_class.

Anyway, 6629 just went into CVS with these fixes, as did 6111-r3 (which is now the stable version). The fact that it doesn't work with 2.6.10 is basically mm-sources' fault (I think). Use the ebuilds in CVS; they work.
Comment 13 Georgi Georgiev 2004-11-07 23:12:30 UTC
OK, I tested 1.0.6629 and it works fine on a 2.6.7 kernel.
Comment 14 Andrew Bevitt 2004-11-09 17:43:32 UTC
Thanks, closing (hopefully for the last time now :))
Comment 15 Georgi Georgiev 2005-06-06 22:24:58 UTC
Guys, it broke again (1.0.7664). I see the comment that the new ebuild is not
complete or something, and I am guessing that the reason to comment out

        # epatch ${NV_PATCH_PREFIX//7174/7167}-conftest-koutput-includes.patch

is because it does not apply, but please, please, pretty please fix it. The
commented-out patch applies cleanly if you remove the second hunk, and since
it's the first hunk that is important, it's absolutely trivial to get a
*working* ebuild in the tree.
Comment 16 Giuliano Gagliardi 2005-07-09 03:13:02 UTC
I have the same problem with 1.0.7667
Comment 17 Georgi Georgiev 2005-07-09 07:08:07 UTC
OK, the new problem is not entirely related to the old bug. It seems that sysfs
support disappeared in 7664 completely, so there is no fixing of the problem (or
I am misreading something).

However, the current problem is that 7664+ do not even *compile* on a
KBUILD_OUTPUT kernel. The fix is the same, so I think you could as well apply it.

Just to make sure you have it before you, the fix is this:

--- ./conftest.sh       2004-11-07 12:20:02.776660256 +1100
+++ ./conftest.sh       2004-11-07 12:23:32.432787680 +1100
@@ -17,7 +17,7 @@

 if [ "$OUTPUT" != "$SOURCES" ]; then
     CFLAGS="$CFLAGS -I$OUTPUT/include2 -I$OUTPUT/include \
--I$HEADERS -I$HEADERS/asm/mach-default"
+-I$HEADERS -I$OUTPUT/include2/asm/mach-default"
 else
     CFLAGS="$CFLAGS -I$HEADERS -I$HEADERS/asm/mach-default"
 fi
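
Editorial sketch of the patched logic as a standalone function, so the two cases can be compared side by side. The function name kernel_include_flags is hypothetical, and deriving HEADERS as $SOURCES/include is an assumption: the hunk above does not show how conftest.sh sets HEADERS.

```shell
#!/bin/sh
# kernel_include_flags SOURCES OUTPUT
# Hypothetical helper: print the -I flags that the patched hunk
# above would append to CFLAGS.
# Assumption: HEADERS is SOURCES/include.
kernel_include_flags() {
    sources="$1"
    output="$2"
    headers="$sources/include"
    if [ "$output" != "$sources" ]; then
        # KBUILD_OUTPUT tree: mach-default is taken from the
        # include2 directory inside the output tree.
        printf '%s\n' "-I$output/include2 -I$output/include -I$headers -I$output/include2/asm/mach-default"
    else
        # In-tree build: everything lives under the source headers.
        printf '%s\n' "-I$headers -I$headers/asm/mach-default"
    fi
}

# Example call (paths taken from the earlier comments):
#   kernel_include_flags /usr/src/linux /var/tmp/linux-build/2.6.7-d1
```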
Comment 18 Kris Kersey (RETIRED) gentoo-dev 2006-01-04 07:03:10 UTC
Is this still broken in 1.0.8178?
Comment 19 Kris Kersey (RETIRED) gentoo-dev 2006-02-24 20:20:51 UTC
No one responded to my request from the beginning of January. Marking Resolved, NEEDINFO.
Comment 20 Georgi Georgiev 2006-02-25 01:18:23 UTC
Apologies for the late reply. Just checked that 8178-r3 works fine.