Bug 620422 - crosscompiling in x32 for x32_64: include/linux/skbuff.h:3155:15: error: unknown type name 'vgid'
Status: RESOLVED CANTFIX
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Linux bug wranglers
 
Reported: 2017-06-02 11:04 UTC by segmentation fault
Modified: 2019-01-12 19:10 UTC


Attachments
config file for x86_64 (.config,150.05 KB, text/plain)
2017-06-02 11:04 UTC, segmentation fault

Description segmentation fault 2017-06-02 11:04:30 UTC
Created attachment 474980 [details]
config file for x86_64

Strange...it all happens to me - I wonder why...

I am cross-compiling a 64-bit kernel on a 32-bit box. So far I did:

- rsync / to an external USB disc.
- mount the USB disc partition somewhere.
- mount /proc, /sys and /dev under the mountpoint.
- chroot to the USB disc. 
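
The preparation steps above might look roughly like the following. This is only a sketch, not the exact commands used in this report: the mountpoint /mnt/usb, the rsync options and the exclusion list are assumptions, and `run` is a hypothetical helper that only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. The mountpoint, rsync options and exclusions are assumptions.

# Print a command; execute it only when DRY_RUN=0 (the default is a dry run).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Copy the running system to TARGET (assumed to be the already mounted USB
# partition), mount the pseudo-filesystems there and enter the chroot.
prepare_chroot() {
    target=$1
    run rsync -aHAX \
        --exclude='/proc/*' --exclude='/sys/*' --exclude='/dev/*' \
        --exclude='/run/*' --exclude='/tmp/*' --exclude='/mnt/*' \
        / "$target/" || return 1
    run mount -t proc proc "$target/proc" || return 1
    run mount --rbind /sys "$target/sys" || return 1
    run mount --rbind /dev "$target/dev" || return 1
    run chroot "$target" /bin/bash
}
```

Calling `prepare_chroot /mnt/usb` prints the commands for review; `DRY_RUN=0 prepare_chroot /mnt/usb`, run as root, would execute them.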

In the chrooted environment, I did:

env-update
source /etc/profile
export PS1="(chroot) $PS1"

So far, it looks as if I am going to install Gentoo on the USB disc, following the standard installation guide. From now on, things get different though... ;-)

- chroot to the USB disc. From now on, all is done in the chrooted environment.
- Create a local overlay called 'crossdev'.
- emerge the crossdev package
- Run:
  crossdev -S -s1 -oO /usr/local/portage/overlays/crossdev --binutils 2.26.1 --gcc 4.9.3 --kernel 4.9.25 --libc 2.22-r4 -t x86_64

Output:
-------------------------------------------------------------------------------------------------------------------------------
 * crossdev version:      20151026
 * Host Portage ARCH:     x86
 * Target Portage ARCH:   amd64
 * Target System:         x86_64-pc-linux-gnu
 * Stage:                 1 (C compiler only)
 * ABIs:                  amd64

 * binutils:              binutils-2.26.1
 * gcc:                   gcc-4.9.3

 * CROSSDEV_OVERLAY:      /usr/local/portage/overlays/crossdev
 * PORT_LOGDIR:           /var/log/portage
 * PORTAGE_CONFIGROOT:    
 * Portage flags:         
  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  - 
 * leaving sys-devel/binutils in /usr/local/portage/overlays/crossdev
 * leaving sys-devel/gcc in /usr/local/portage/overlays/crossdev
 * leaving sys-kernel/linux-headers in /usr/local/portage/overlays/crossdev
 * leaving sys-libs/glibc in /usr/local/portage/overlays/crossdev
 * leaving sys-devel/gdb in /usr/local/portage/overlays/crossdev
 * leaving metadata/layout.conf alone in /usr/local/portage/overlays/crossdev
  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  - 
 * Log: /var/log/portage/cross-x86_64-pc-linux-gnu-binutils.log
 * Emerging cross-binutils ...                                                                                           [ ok ]
 * Log: /var/log/portage/cross-x86_64-pc-linux-gnu-linux-headers-quick.log
 * Emerging cross-linux-headers-quick ...                                                                                [ ok ]
 * Log: /var/log/portage/cross-x86_64-pc-linux-gnu-glibc-headers.log
 * Emerging cross-glibc-headers ...                                                                                      [ ok ]
 * Log: /var/log/portage/cross-x86_64-pc-linux-gnu-gcc-stage1.log
 * Emerging cross-gcc-stage1 ...                                                                                         [ ok ]


I then changed the /usr/src/linux/.config file (in the chrooted filesystem!) to have:

CONFIG_64BIT=y
# CONFIG_X86_32 is not set

and ran

make oldconfig

which, among other things, let me choose:

CONFIG_IA32_EMULATION=y

I then ran:

genkernel --kernel-config=/usr/src/linux/.config --kernel-cross-compile=x86_64-pc-linux-gnu --arch-override=x86_64 --clean all

This compiled a 64-bit bzImage and went on to compile the modules:

make -j2  CROSS_COMPILE="x86_64-pc-linux-gnu-" prepare 
...
make -j2  CROSS_COMPILE="x86_64-pc-linux-gnu-" bzImage
...
make -j2  CROSS_COMPILE="x86_64-pc-linux-gnu-" modules

Module compilation, however, stopped with the error:

  CC [M]  crypto/pcbc.o
In file included from ./include/crypto/algapi.h:18:0,
                 from crypto/pcbc.c:17:
./include/linux/skbuff.h:3155:15: error: unknown type name 'vgid'
 static inline void skb_copy_from_linear_data_offset(const struct sk_buff *skb,
               ^
./include/linux/skbuff.h: In function 'skb_copy_from_linear_data_offset':
./include/linux/skbuff.h:3160:1: warning: no return statement in function returning non-void [-Wreturn-type]
 }
 ^
scripts/Makefile.build:299: recipe for target 'crypto/pcbc.o' failed
make[1]: *** [crypto/pcbc.o] Error 1
Makefile:988: recipe for target 'crypto' failed
make: *** [crypto] Error 2

I don't see a 'vgid' in include/linux/skbuff.h, line 3155, so I am totally lost here...

I attach the .config file for the kernel I am trying to compile. This is the

/usr/src/linux/.config

file of the *chrooted* environment (where I also ran the genkernel and other commands above).

Some info on the 'parent' system (the 32-bit system that contains the mountpoint for the chrooted environment):

Portage 2.3.3 (python 3.4.3-final-0, hardened/linux/x86, gcc-4.9.3, glibc-2.22-r4, 4.9.25-gentoo i686)
=================================================================
System uname: Linux-4.9.25-gentoo-i686-Intel-R-_Pentium-R-_4_CPU_3.40GHz-with-gentoo-2.2
Timestamp of repository gentoo: Wed, 03 May 2017 08:15:01 +0000
sh bash 4.3_p48
ld GNU ld (Gentoo 2.25.1 p1.1) 2.25.1
app-shells/bash:          4.3_p48::gentoo
dev-java/java-config:     2.2.0-r3::gentoo
dev-lang/perl:            5.24.1-r1::gentoo
dev-lang/python:          2.7.10-r1::gentoo, 3.4.3-r1::gentoo
dev-util/cmake:           3.7.2::gentoo
dev-util/pkgconfig:       0.28-r2::gentoo
sys-apps/baselayout:      2.2::gentoo
sys-apps/openrc:          0.19.1::gentoo
sys-apps/sandbox:         2.10-r2::gentoo
sys-devel/autoconf:       2.13::gentoo, 2.69::gentoo
sys-devel/automake:       1.4_p6-r2::gentoo, 1.5-r2::gentoo, 1.6.3-r2::gentoo, 1.7.9-r3::gentoo, 1.8.5-r5::gentoo, 1.9.6-r4::gentoo, 1.10.3-r1::gentoo, 1.11.6-r1::gentoo, 1.12.6::gentoo, 1.13.4::gentoo, 1.14.1::gentoo, 1.15::gentoo
sys-devel/binutils:       2.24-r3::gentoo, 2.25.1-r1::gentoo, 2.26.1::gentoo
sys-devel/gcc:            4.3.6-r1::gentoo, 4.4.7::gentoo, 4.8.5::gentoo, 4.9.3::gentoo
sys-devel/gcc-config:     1.7.3::gentoo
sys-devel/libtool:        2.4.6::gentoo
sys-devel/make:           4.1-r1::gentoo
sys-kernel/linux-headers: 3.18::gentoo (virtual/os-headers)
sys-libs/glibc:           2.22-r4::gentoo

P.S. I guess that by now I have answered my own question - it all happens to me because nobody else would do something like all the above to get to a 64-bit kernel... :sigh:
Comment 1 segmentation fault 2017-06-02 11:06:59 UTC
BTW, crossdev compiled binutils 2.27, even though I told it to use 2.26. It just took the latest stable one...
Comment 2 Jonas Stein gentoo-dev 2017-06-10 12:58:34 UTC
It is sad to read that you have problems with the software. The situation seems to be a bit more complicated and requires some analysis.
We cannot help you efficiently via the bug tracker. The bug tracker is aimed at specific problems in ebuilds rather than at individual systems.

I have had very good experience on the gentoo IRC [1] with questions like this. Of course there are also forums and mailing lists [2,3].
I hope you understand that I will therefore close the bug here, and I wish you good luck on one of the mentioned channels [4].

Please try to find out, with the help of the community, which broken ebuild could be causing the problem, because otherwise we cannot assign the ticket.

Please reopen the ticket once you can provide an indication of an error in an ebuild.

[1] https://www.gentoo.org/get-involved/irc-channels/
[2] https://forums.gentoo.org/
[3] https://www.gentoo.org/get-involved/mailing-lists/all-lists.html
[4] https://www.gentoo.org/support/
Comment 3 segmentation fault 2019-01-12 14:28:50 UTC
More than 1.5 years later, I can finally report the resolution of this issue:

It was a faulty RAM module.

Such issues are hard to resolve, so let me give some info about how I was able to track this down:

BIG mistake on my side: I did not check the source code at the indicated position:

./include/linux/skbuff.h:3155:15: error: unknown type name 'vgid'

Where on earth do you see 'vgid' on line 3155 of include/linux/skbuff.h, at position 15? A healthy eye sees only this: 

void

Not 'vgid', but 'void'.

This alone should have set off the alarm bells - but it did not occur to me at that time to look at the code in the *work* directory of portage. That's the place where the source code was extracted in order to be prepared, compiled and installed. Had I done this, I would have seen

vgid

instead of 

void

This means: extract the source code to one place and you get the version with 'void'; extract it to another and you get an identical code tree, EXCEPT for one line that contains a flipped character (in this case: 'o' became 'g').
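
The 'o' to 'g' change is what a single flipped bit looks like: the two ASCII codes differ in exactly one bit. A small sketch to check this for any pair of characters (the function names `char_xor` and `single_bit_flip` are made up for illustration):

```shell
# Print the XOR of the ASCII codes of two single characters.
char_xor() {
    a=$(printf '%d' "'$1")
    b=$(printf '%d' "'$2")
    echo $(( a ^ b ))
}

# Succeed if exactly one bit differs, i.e. the XOR is a power of two.
single_bit_flip() {
    x=$(char_xor "$1" "$2")
    [ "$x" -ne 0 ] && [ $(( x & (x - 1) )) -eq 0 ]
}

char_xor o g    # 'o' is 0x6f, 'g' is 0x67; prints 8 (only bit 3 differs)
```

A garbled character that differs from the expected one by a power of two is a strong hint towards a hardware bit flip rather than a genuine typo in the sources.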

I extracted the code archive once to a hopefully good place (at this point there is no way to really know that the extracted-once copy is THE good one, but it turns out that this doesn't matter), then extracted it anew in a loop to another place, each time comparing the two copies with

diff -ruN /path/to/extracted-once/code /path/to/extracted-anew-in-a-loop/code

Over *hours*, this did not show any differences. Even after 50 loops, the copies were identical. Very frustrating... The error is not that easy to reproduce (of course it's not: you have to use all your RAM to have a chance of hitting the faulty bits, but I did not think that far at the time).
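
The extract-and-compare loop described above could be sketched as follows. This is an illustration, not the script actually used: the function name `extract_and_diff`, the temporary directory layout and the fixed number of runs are assumptions.

```shell
# Extract ARCHIVE once as a reference, then re-extract it RUNS times and
# compare each fresh copy with the reference using diff -ruN.
# Usage: extract_and_diff ARCHIVE RUNS
extract_and_diff() {
    archive=$1
    runs=$2
    work=$(mktemp -d) || return 1
    mkdir "$work/ref" || return 1
    tar -xf "$archive" -C "$work/ref" || return 1
    i=1
    status=0
    while [ "$i" -le "$runs" ]; do
        rm -rf "$work/new"
        mkdir "$work/new"
        tar -xf "$archive" -C "$work/new"
        # diff prints the differing lines and exits non-zero on any difference.
        if ! diff -ruN "$work/ref" "$work/new"; then
            echo "run $i: copies differ"
            status=1
        fi
        i=$((i + 1))
    done
    rm -rf "$work"
    return "$status"
}
```

For example, `extract_and_diff /path/to/archive.tar.xz 50` would mirror the 50 loops mentioned above; the function returns non-zero if any run showed a difference.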

The *breakthrough* came when I decided to test the *remaining* RAM with stress-ng, using the 'flip' memory checker (which specifically checks for flipped bits in RAM). You have to be very careful to give stress-ng enough RAM to test, so that all your physical RAM is utilized - but not more than that, or your machine will suffer from memory starvation, slow to a crawl or even hang!

Running the above extraction and diff inside an endless loop, while at the same time testing the free memory for flipping bits with stress-ng and its 'flip' strategy, revealed errors in both diff and stress-ng after half an hour of operation. diff started to spit out differences in exactly one line, with exactly one character flipped. Not every run of the loop produced differences, but each one that did showed a *different* character on a *different* line of a *different* source file in the code tree. And stress-ng was reporting flip errors now and then.
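
The exact stress-ng invocation is not given in this report. As a hedged sketch, the amount of memory to test could be derived from MemAvailable in /proc/meminfo. The helper name `flip_test_cmd`, the 80% default and the 3h timeout are assumptions; the stress-ng options shown (--vm, --vm-bytes, --vm-method flip, --verify, --timeout) are taken from its manual page and should be checked against the installed version.

```shell
# Print (not run) a stress-ng command line that applies the 'flip' method to
# PERCENT of the currently available memory, split across WORKERS processes.
# Usage: flip_test_cmd [PERCENT] [WORKERS] [MEMINFO]
flip_test_cmd() {
    percent=${1:-80}              # assumed safety margin, tune per machine
    workers=${2:-1}               # --vm-bytes applies per worker
    meminfo=${3:-/proc/meminfo}
    avail_kb=$(awk '/^MemAvailable:/ { print $2 }' "$meminfo")
    [ -n "$avail_kb" ] || return 1
    per_worker_kb=$(( avail_kb / 100 * percent / workers ))
    echo "stress-ng --vm $workers --vm-bytes ${per_worker_kb}k" \
         "--vm-method flip --verify --timeout 3h"
}
```

Printing the command first makes it possible to sanity-check the size before starting. On a 32-bit system a single worker cannot map more memory than its address space allows, so several workers may be needed to cover all physical RAM.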

LESSONS:

1. READ the source code where the error is supposed to occur. What do you SEE? Trust your eyes. Be suspicious.
2. To reproduce the error, do the operation that produced it inside a loop.
3. If the error does not surface, push for it by running a memory stress tool while the loop from 2. is running.
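
For lesson 1, a small helper can print the reported line from every copy of the source tree, including the one the compiler actually read. This is a sketch only; the function name `show_line_in_trees` and any tree paths passed to it are hypothetical.

```shell
# Print line LINE of FILE as found in each of the given source trees.
# Usage: show_line_in_trees FILE LINE TREE...
show_line_in_trees() {
    file=$1
    line=$2
    shift 2
    for tree in "$@"; do
        printf '%s: ' "$tree"
        sed -n "${line}p" "$tree/$file"
    done
}
```

With the error from this report, one would call it as `show_line_in_trees include/linux/skbuff.h 3155 /usr/src/linux /path/to/work/dir` and compare the lines by eye.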


From there, the action to be taken was clear: boot with only one of the three RAM modules, start the tests again and see whether errors appear within, say, 3 hours of operation. If not, chances are the module is good.

This way, I was able to identify the faulty RAM module and replace it. Everything has been rock solid since then. :-)