Summary: | media-libs/opencv-2.4.3 with USE=cuda : /opt/cuda/bin/nvcc compiles in /tmp which results in "No space left on device" | | |
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Daniel Troeder <daniel> |
Component: | Current packages | Assignee: | Andreas K. Hüttel <dilfridge> |
Status: | RESOLVED OBSOLETE | | |
Severity: | normal | CC: | amynka, juantxorena, kde, martin.dummer, tom |
Priority: | Normal | | |
Version: | unspecified | | |
Hardware: | AMD64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Package list: | | Runtime testing required: | --- |
Attachments: | media-libs:opencv-2.4.3:20121109-105730.log.bz2 | | |
Description
Daniel Troeder
2012-11-09 11:26:13 UTC
Created attachment 329004
media-libs:opencv-2.4.3:20121109-105730.log.bz2
compile log (bzip2'd)
How is this a bug if your space is running out? Sounds like a job for the system admin to resize the partition.

You should be using FEATURES='sandbox' to avoid compilations all over your running system.

Or what Samuli said. The temporary files of nvcc *should* go to $TEMP, which is set by portage to the ebuild temp dir. Why this does not work is a mystery to me; unfortunately I cannot test it since I've got gcc-4.7. :(

Do you actually have no space left on /tmp, or is this a "fake error" generated by the sandbox?

I can confirm this bug. The space runs out in /tmp, which is 1GB on my system. My FEATURES are:
FEATURES="assume-digests binpkg-logs compressdebug config-protect-if-modified distlocks ebuild-locks fakeroot fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch xattr"
so using "sandbox" does not help.

When pausing the emerge in the middle I can see a process like:
/opt/cuda/bin/nvcc /var/tmp/portage/media-libs/opencv-2.4.3/work/OpenCV-2.4.3/modules/gpu/src/cuda/pyr_down.cu ....
which starts a child process
cudafe --m64 --gnu_version=40603 -tused --no_remove_unneeded_entities --gen_c_file_name /tmp/tmpxft_00007873_00000000-6_pyr_down.compute_20.cudafe1.c --stub_file_name /tmp/tmpxft_00007873_00000000-6_pyr_down.compute_20.cudafe1.stu
and these files /tmp/tmpxft_<some-randomness> really exist in /tmp...
So someone has to convince "nvcc" to use a different temp directory - but a quick "nvcc --help" gave no hint to me. Any ideas (except the one from Comment #2)???

My /tmp is on a 512MB tmpfs. I grew it, and could compile. The peak disk usage by the compilation was 2.4 GB in /tmp (plus 2 GB in PORTAGE_TMPDIR).
There are dozens of files >30MB with names like
/tmp/tmpxft_00000808_00000000-27_row_filter.compute_20.cpp3.i
/tmp/tmpxft_00000808_00000000-30_row_filter.compute_12.cudafe2.gpu
/tmp/tmpxft_00000f28_00000000-11_column_filter.compute_11.ptx
/tmp/tmpxft_00001a54_00000000-3_element_operations.fatbin.c
/tmp/tmpxft_00000808_00000000-11_row_filter.compute_11.ptx
/tmp/tmpxft_00000f28_00000000-9_column_filter.compute_30.ptx
/tmp/tmpxft_00001a54_00000000-49_element_operations.compute_11.ii
that seem to be computer-generated C and CUDA source files.

PS: I enabled FEATURES="sandbox", but it didn't help; /tmp/tmpxft_* are still created.

Not a regression.

I confirm this bug. Also, I fail to see how this is not a regression, since I could compile opencv with the same USE flags in the past and now I can't. The fact that this may not be an important bug because it affects only people with a very specific configuration doesn't mean that it is not a regression.

I've finally compiled this. To do so, I had to mount /tmp somewhere else with more space, so that it didn't throw the error. Anyway, the compilation time was incredibly high. According to genlop, it lasted 3 hours and 46 minutes, whereas the older versions with the same USE flags and everything else the same took about 10 minutes (13 minutes for the longest one, the previous stable version, 2.3.1a-r1). There is definitely something fishy going on there.

I agree this is not a bug, at least not in Gentoo. The long compile time is a little weird, but I suspect it is due to compiling for all CUDA versions; see http://answers.opencv.org/question/5090/why-opencv-building-is-so-slow-with-cuda/ for example. But it is really up to the opencv developers to say whether it is a bug or not.

Creating files in /tmp is completely normal and OK. Maybe the ebuild should check if there is enough space, but it is hard to say what is "enough", as it depends mainly on your -jX setting in MAKEOPTS.

I suggest compiling with MAKEOPTS="-j1" for a smaller footprint.
If that is not enough, then - *sigh* - your /tmp just needs to be larger. Are the devs with me on this one, or am I way out of line here? :)

Cheers and pardon my English,
Tom

Compiling for every single possible GPU arch is quite stupid IMHO. If we're going to do this the Gentoo way, we should only compile what we need. According to the link that Tomáš gave us, there's an easy workaround for this. Maybe we should have some kind of USE flag, or variable, or something, to compile only the version we need. As for the other thing, the tmp problem, I dunno. I doubt that the devs would do something which isn't done upstream.

I agree, some USE_EXPAND flag would be nice here. One has to know one's hardware, but that is why we love Gentoo :)

(In reply to comment #11)
> I agree this is not a bug, at least not in Gentoo.
> ...
>
> Creating files in /tmp is completely normal and OK. Maybe the ebuild should
> check if there is enough space, but it is hard to say what is "enough", as
> it depends mainly on your -jX setting in MAKEOPTS.

I do not agree, and if you scroll back and read comment #4 and my analysis in #5 again, you should also be convinced that it IS a bug. To repeat in short: the nvcc compiler should use the content of the TEMP environment variable for its temp files, which is set by portage to TEMP=/var/tmp/portage/media-libs/opencv-2.4.3/temp (in my setup). But it ignores this and uses /tmp instead.

> I suggest compiling with MAKEOPTS="-j1" for a smaller footprint.

I will try this suggestion.

Well, I see your point, but I was treating it as a compiler. gcc without the -pipe option will IMHO do the same thing.

@Andreas, would it help to add the free-space check functionality of check-reqs.eclass here?

The version is not currently in the tree.
Amynka
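
For illustration only, here is a minimal sketch of the kind of free-space check discussed above, written in Python rather than as an eclass function. It is not how check-reqs.eclass works and it is not part of any ebuild. The helper names `free_bytes` and `has_enough_space` are hypothetical, and the 3 GiB threshold is just the 2.4 GB peak /tmp usage reported in this thread, rounded up.

```python
import shutil
import tempfile

GIB = 1024 ** 3


def free_bytes(path):
    """Return the bytes available on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def has_enough_space(path, required_bytes):
    """Return True if the filesystem holding `path` has at least `required_bytes` free."""
    return free_bytes(path) >= required_bytes


if __name__ == "__main__":
    # tempfile.gettempdir() consults TMPDIR, TEMP and TMP before falling
    # back to a platform default such as /tmp.
    tmp_dir = tempfile.gettempdir()

    # Assumption: 3 GiB, a rounded-up guess based on the 2.4 GB peak
    # reported in this thread. The real requirement depends on the
    # number of parallel jobs (-jX in MAKEOPTS).
    required = 3 * GIB

    if has_enough_space(tmp_dir, required):
        print(f"{tmp_dir}: enough free space for the build")
    else:
        missing = required - free_bytes(tmp_dir)
        print(f"{tmp_dir}: about {missing / GIB:.1f} GiB short of the assumed requirement")
```

As noted earlier in the thread, it is hard to say what "enough" is, so any fixed threshold like the one above is a guess and would need adjusting for the -jX setting in use.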