751382 – media-libs/opensubdiv[cuda]: requires <gcc-9

Summary: media-libs/opensubdiv[cuda]: requires <gcc-9

Status:	RESOLVED FIXED

Alias:	None

Product:	Gentoo Linux
Classification:	Unclassified
Component:	Current packages
Hardware:	All Linux

Importance:	Normal normal with 1 vote
Assignee:	Adrian

URL:
Whiteboard:
Keywords:	PullRequest

Duplicates (1):	819897
Depends on:
Blocks:

Reported:	2020-10-26 18:45 UTC by Alexey Korepanov
Modified:	2021-11-22 14:11 UTC
CC List:	7 users

See Also:	https://github.com/gentoo/gentoo/pull/18516 https://github.com/gentoo/gentoo/pull/18663 https://github.com/gentoo/gentoo/pull/22852
Package list:
Runtime testing required:	---

Description Alexey Korepanov 2020-10-26 18:45:38 UTC

Hi. With USE=cuda, opensubdiv requires a gcc older than 9 as the default compiler and otherwise aborts with an error message saying that cuda requires gcc < 9. This is outdated: cuda-11 has gcc-9 in its compatibility list.

pkg_pretend() {
	if use cuda; then
		[[ $(gcc-major-version) -gt 8 ]] && \
		eerror "USE=cuda requires gcc < 9. Run gcc-config to switch your default compiler" && \
		die "Need gcc version earlier than 9"
	fi
	[[ ${MERGE_TYPE} != binary ]] && use openmp && tc-check-openmp
}

Reproducible: Always

Comment 1 Adrian 2020-12-05 11:46:22 UTC

I agree that users of cuda-11 should be permitted to use later versions of gcc, but the fix is not so simple.

For example the graphics card in one of my systems only supports sm_30 which was deprecated in cuda 11. Cuda 10 will support sm_30, but only allows gcc-8. When I have selected gcc-9 with gcc-config on that system, opensubdiv fails to compile as nvcc fails with /opt/cuda/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!

Currently the requirements are
<=nvidia-cuda-toolkit-9.0  ( <=sys-devel/gcc-5.4 )
nvidia-cuda-toolkit-9.1 ( <=sys-devel/gcc-6.4 )
nvidia-cuda-toolkit-9.2 to 10.0 ( <=sys-devel/gcc-7.3 )
nvidia-cuda-toolkit-10.1 to 10.2 ( <=sys-devel/gcc-8.4 )
nvidia-cuda-toolkit-11.0 ( <=sys-devel/gcc-9.3 )
nvidia-cuda-toolkit-11.1 ( <=sys-devel/gcc-10.4 )
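
As an illustration only, the list above can be written as a small lookup function. The name max_gcc_major_for_cuda is hypothetical (it is not part of any eclass), only gcc major versions are compared, and toolkit 10.1 is assumed to accept gcc 8. Versions not covered by the list return a failure status instead of a guess.

```shell
# Hypothetical helper: print the newest gcc major version accepted by the
# given nvidia-cuda-toolkit version, following the list in this comment.
# Returns 1 for toolkit versions the list does not cover.
max_gcc_major_for_cuda() {
	local ver major minor n
	ver="$1"
	major=${ver%%.*}
	case ${ver} in
		*.*) minor=${ver#*.}; minor=${minor%%.*} ;;
		*)   minor=0 ;;
	esac
	# Fold "major.minor" into one integer so ranges are easy to compare.
	n=$(( major * 100 + minor ))
	if   [ "${n}" -le 900 ];  then echo 5
	elif [ "${n}" -eq 901 ];  then echo 6
	elif [ "${n}" -le 1000 ]; then echo 7
	elif [ "${n}" -le 1002 ]; then echo 8   # assumption: 10.1 accepts gcc 8
	elif [ "${n}" -eq 1100 ]; then echo 9
	elif [ "${n}" -eq 1101 ]; then echo 10
	else return 1
	fi
}
```

Every new toolkit release would need a new branch here, which is exactly the maintenance burden described in the next paragraph.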

Putting this testing in every ebuild that uses nvcc at build time is fragile and will be easily broken by updates to nvidia-cuda-toolkit. The user has already been warned about this in a message that appears when installing nvidia-cuda-toolkit.

I plan to remove the gcc version checking from pkg_pretend, as the problem is not with opensubdiv itself, but whether the currently selected version of gcc supports the installed version of nvidia-cuda-toolkit.

If users do require different gcc versions for different packages (e.g. chromium requires >= gcc 9.2, but nvidia-cuda-toolkit-10 requires gcc < 9), they can select the latest version with gcc-config, and then override the gcc version on a per-package basis using

/etc/portage/env/gcc-8.4.0.conf
PATH="/usr/x86_64-pc-linux-gnu/gcc-bin/8.4.0:${PATH}"
CC="/usr/x86_64-pc-linux-gnu/gcc-bin/8.4.0/gcc"
CXX="/usr/x86_64-pc-linux-gnu/gcc-bin/8.4.0/g++"

and

/etc/portage/package.env/oldgcc
dev-util/nvidia-cuda-toolkit gcc-8.4.0.conf
media-libs/opensubdiv gcc-8.4.0.conf
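
A minimal sketch of creating the two files above from a script, assuming the same layout and package list. The function name write_gcc_override and its arguments (a root directory to write under, the gcc version, and an optional CHOST defaulting to x86_64-pc-linux-gnu) are hypothetical. It fails if package.env already exists as a plain file rather than a directory.

```shell
# Hypothetical helper: write the per-package gcc override shown above.
#   $1 = root directory to write under (e.g. a scratch directory for a dry run)
#   $2 = gcc version, e.g. 8.4.0
#   $3 = CHOST (assumed default: x86_64-pc-linux-gnu)
write_gcc_override() {
	local root="$1" gccver="$2" chost="${3:-x86_64-pc-linux-gnu}"
	local bindir="/usr/${chost}/gcc-bin/${gccver}"

	mkdir -p "${root}/etc/portage/env" "${root}/etc/portage/package.env" || return 1

	# \${PATH} is escaped so that the literal text ${PATH} lands in the file.
	cat > "${root}/etc/portage/env/gcc-${gccver}.conf" <<EOF
PATH="${bindir}:\${PATH}"
CC="${bindir}/gcc"
CXX="${bindir}/g++"
EOF

	cat > "${root}/etc/portage/package.env/oldgcc" <<EOF
dev-util/nvidia-cuda-toolkit gcc-${gccver}.conf
media-libs/opensubdiv gcc-${gccver}.conf
EOF
}
```

Running it against a scratch directory first, for example write_gcc_override /tmp/portage-test 8.4.0, lets the result be inspected before anything under /etc/portage is touched.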

Comment 2 Alexey Korepanov 2020-12-05 11:53:06 UTC

I agree that this problem is outside the scope of opensubdiv. It should be solved in the cuda ebuild, if possible at all. Opensubdiv should not even make these checks.

Comment 3 Martin Rott 2021-01-21 09:35:44 UTC

Hello, 
to me it seems that the opensubdiv ebuild really shouldn't dictate which version of gcc I should have. This is between cuda and cuda-toolkit, and you should only depend on e.g. a specific version of cuda-toolkit if opensubdiv needs that (it probably doesn't). Removing the mentioned part of pkg_pretend helps.

Comment 4 Adrian 2021-01-21 10:33:35 UTC

Opensubdiv can work with any version of cuda, provided that the card supports it and the correct version of gcc is also used. But if the user selects an incompatible version using gcc-config even after building, then opensubdiv fails at runtime (for example when called from blender). There is a message warning about this in the nvidia-cuda-toolkit ebuild, as you mention.

In pull request 18663, I have removed the checks from package pretend which allows you to attempt to compile opensubdiv with the current gcc version.

I have placed checks and a warning message in src_configure so that if you attempt to use an unsupported version a message will be given to aid troubleshooting should the build fail.
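
The difference from the old pkg_pretend check is that the build is allowed to continue. The following is a rough sketch of that idea, not the code from the pull request: warn_if_gcc_too_new is a hypothetical name, and both version numbers are passed in as arguments. Inside an ebuild the first argument would come from gcc-major-version (as in the quoted pkg_pretend) and the message would go through ewarn rather than echo.

```shell
# Hypothetical sketch: warn, but do not abort, when the selected gcc is
# newer than the installed toolkit accepts.
#   $1 = major version of the currently selected gcc
#   $2 = newest gcc major version the toolkit accepts
warn_if_gcc_too_new() {
	local gcc_major="$1" max_major="$2"
	if [ "${gcc_major}" -gt "${max_major}" ]; then
		echo "WARNING: gcc ${gcc_major} is newer than the installed cuda toolkit accepts (gcc <= ${max_major})." >&2
		echo "WARNING: if nvcc fails, select an older gcc for this package." >&2
	fi
	# Always succeed: the message is a troubleshooting aid, not a gate.
	return 0
}
```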

In that PR I also add the ability to specify which compute capabilities to compile against, as that must also be compatible with the chosen nvidia-cuda-toolkit. I hope it gets merged soon as it allows you to optimise your system based on your actual graphics card. I would appreciate any feedback.
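
To illustrate what selecting compute capabilities amounts to on the nvcc side, here is a small sketch that turns a list of capabilities into -gencode options. The helper name gencode_flags is hypothetical and this is not how the pull request implements it. The option syntax itself (-gencode arch=compute_NN,code=sm_NN) is standard nvcc usage.

```shell
# Hypothetical helper: build nvcc -gencode options from a list of compute
# capabilities given without the dot (e.g. 30 for sm_30, 61 for sm_61).
gencode_flags() {
	local cc flags=""
	for cc in "$@"; do
		# Add a separating space only when flags is already non-empty.
		flags="${flags:+${flags} }-gencode arch=compute_${cc},code=sm_${cc}"
	done
	printf '%s\n' "${flags}"
}
```

For example, gencode_flags 30 would only be usable with a toolkit that still supports sm_30, which is the compatibility constraint mentioned above.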

Comment 5 Martin Rott 2021-01-21 11:34:07 UTC

Adrian, I needed to solve this, so I stumbled upon the version from the CG overlay and found interesting patches there that take care of (not only) the mentioned sm_30 issue. Maybe this can help: https://github.com/brothermechanic/cg/tree/master/media-libs/opensubdiv
Simply removing the gcc version check can easily end up in the same (or a different) sm_30 missing error as you stated before (something between nvcc and opensubdiv); pardon my former comment.

Comment 6 Joonas Niilola (gentoo-dev) 2021-03-08 14:35:16 UTC

Can we perhaps start thinking about turning the feature off if it can't be made to work with later GCCs? This is impossible to test. GCC-10 is already current stable!

Comment 7 Adrian 2021-03-10 00:11:31 UTC

Being able to use cuda for opensubdiv is important as having gpu accelerated subdivision surface rendering allows more rapid computation for creating smoothed curves which are used on almost every sculpted model, and is especially useful when creating cloth simulation for animation, or rendering large scenes in blender.

The issue is nvidia-cuda-toolkit's support for gcc, not opensubdiv which can work with new and old versions of gcc providing the user has an appropriate graphics card and version of the toolkit installed.

For example opensubdiv[cuda] can work with gcc 10, but it requires nvidia-cuda-toolkit 11, and this requires a graphics card of at least Nvidia Pascal architecture.

Users of older graphics cards will need to use previous versions of nvidia-cuda-toolkit, which are still in the tree, but only support up to gcc 8. As this includes my laptop I am ensuring that this configuration still works.

Comment 8 Kelly Hirai 2021-10-13 17:10:31 UTC

can we add a use flag like nogcccheck or usegcc8?

Comment 9 Adam Stylinski 2021-10-17 15:22:35 UTC

(In reply to Adrian from comment #7)
> Being able to use cuda for opensubdiv is important as having gpu accelerated
> subdivision surface rendering allows more rapid computation for creating
> smoothed curves which are used on almost every sculpted model, and is
> especially useful when creating cloth simulation for animation, or rendering
> large scenes in blender.
> 
> The issue is nvidia-cuda-toolkit's support for gcc, not opensubdiv which can
> work with new and old versions of gcc providing the user has an appropriate
> graphics card and version of the toolkit installed.
> 
> For example opensubdiv[cuda] can work with gcc 10, but it requires
> nvidia-cuda-toolkit 11, and this requires a graphics card of at least Nvidia
> Pascal architecture.
> 
> Users of older graphics cards will need to use previous versions of
> nvidia-cuda-toolkit, which are still in the tree, but only support up to gcc
> 8. As this includes my laptop I am ensuring that this configuration still
> works.

Minor correction, it requires Maxwell or greater cards.  Nvidia has not yet deprecated Maxwell, but Kepler and below have been basically deprecated by Nvidia upstream.  SM_50 is the generation I'm on, thankfully.  Though, it might be a while before I can get my hands on something better.

Comment 10 Sam James (archtester) 2021-10-24 03:25:44 UTC

*** Bug 819897 has been marked as a duplicate of this bug. ***

Comment 11 Larry the Git Cow (gentoo-dev) 2021-11-22 14:11:11 UTC

The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=fc0a2d9cd04c458e48543abea41bba7882913e93

commit fc0a2d9cd04c458e48543abea41bba7882913e93
Author:     Alexander Golubev <fatzer2@gmail.com>
AuthorDate: 2021-11-06 23:14:33 +0000
Commit:     Joonas Niilola <juippis@gentoo.org>
CommitDate: 2021-11-22 14:10:19 +0000

    media-libs/opensubdiv: use cuda eclass
    
    * Utilize cuda eclass and let it handle gcc selection instead of forcing
      an outdated version.
    * Add a fix to provide sane defaults when compiling against a recent
      enough CUDA versions.
    * Add an option to pass user-specified NVCCFLAGS and prevent cmake from
      overriding them.
    
    Closes: https://bugs.gentoo.org/744517
    Closes: https://bugs.gentoo.org/751382
    Signed-off-by: Alexander Golubev <fatzer2@gmail.com>
    Closes: https://github.com/gentoo/gentoo/pull/22852
    Signed-off-by: Joonas Niilola <juippis@gentoo.org>

 ...opensubdiv-3.4.4-add-CUDA11-compatibility.patch | 19 +++++
 media-libs/opensubdiv/opensubdiv-3.4.4-r2.ebuild   | 93 ++++++++++++++++++++++
 2 files changed, 112 insertions(+)
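
For readers unfamiliar with the cuda eclass, a minimal sketch of the first bullet point follows. It is not the contents of opensubdiv-3.4.4-r2.ebuild. It assumes the ebuild has "inherit cmake cuda" near the top and a "cuda" USE flag, and it leaves out the NVCCFLAGS handling in src_configure that the commit message also mentions.

```shell
# Sketch of an ebuild phase function; assumes "inherit cmake cuda" earlier
# in the ebuild. Only meaningful when run by Portage.
src_prepare() {
	cmake_src_prepare
	if use cuda; then
		# cuda_src_prepare comes from cuda.eclass. It sanitises NVCCFLAGS so
		# that nvcc is pointed at a gcc the installed toolkit supports,
		# instead of the ebuild hard-coding a gcc version check.
		cuda_src_prepare
	fi
}
```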