Bug 349814 - media-video/libav: Enable SPARC and ARM optimizations
Summary: media-video/libav: Enable SPARC and ARM optimizations
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages
Hardware: All Linux
Importance: High normal with 1 vote
Assignee: Gentoo Media-video project
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-12-26 18:28 UTC by Matt Turner
Modified: 2017-09-09 20:50 UTC
CC List: 8 users

See Also:
Package list:
Runtime testing required: ---


Attachments
enable neon use (ffmpeg-neon-use.patch,2.66 KB, patch)
2011-07-11 22:41 UTC, Yixun Lan
the default build log (ffmpeg_vfp.txt,38.53 KB, text/plain)
2011-07-12 09:58 UTC, Yixun Lan
the configure output (ffmpeg_config.log,191.11 KB, text/plain)
2011-07-12 10:00 UTC, Yixun Lan
ffmpeg 0.8 log (ffmpeg_config_0.8.log,195.31 KB, text/plain)
2011-07-12 10:15 UTC, Yixun Lan

Description Matt Turner gentoo-dev 2010-12-26 18:28:38 UTC
ffmpeg's configure script offers the following flags for ARM optimizations

  --disable-armv5te        disable armv5te optimizations
  --disable-armv6          disable armv6 optimizations
  --disable-armv6t2        disable armv6t2 optimizations
  --disable-armvfp         disable ARM VFP optimizations
  --disable-iwmmxt         disable iwmmxt optimizations
  --disable-mmi            disable MMI optimizations
  --disable-neon           disable neon optimizations

And this flag for SPARC optimizations

  --disable-vis            disable VIS optimizations

but none of these are used in the ffmpeg ebuilds. While not all of the ARM flags are terribly important, NEON certainly is. VIS for SPARC is important as well.
Comment 1 Alexis Ballier gentoo-dev 2010-12-27 00:53:59 UTC
you probably mean "disable" instead of "enable", right ?

and of course, patches for -9999 are very welcome :=)
Comment 2 Alex Buell 2010-12-27 11:01:19 UTC
# ebuild-style sketch: pass the VIS flag only when building on SPARC
if use sparc; then
  opts="${opts} --enable-vis"
fi

etc, something like that would be greatly appreciated.
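
A minimal sketch of that idea in plain shell, for illustration only. The helper name cpu_feature_args is hypothetical and not taken from any real ebuild; the feature names are the ones from the configure help text quoted in the description. Since that help text only lists --disable-* switches, the sketch leaves requested features to autodetection and turns off everything else. A real ebuild would presumably only look at the features relevant to the current arch.

```shell
#!/bin/sh
# Hypothetical helper (not from any real ebuild): given the CPU features
# the user wants, print a --disable-<feature> option for every feature
# from the configure help text that was NOT requested.
cpu_feature_args() {
    wanted=" $* "
    args=""
    for feature in armv5te armv6 armv6t2 armvfp iwmmxt neon vis; do
        case "$wanted" in
            *" $feature "*) ;;                      # requested: leave it to autodetection
            *) args="$args --disable-$feature" ;;   # not requested: switch it off
        esac
    done
    # Strip the leading space before printing.
    echo "${args# }"
}

# Example: keep NEON and VFP, disable the rest.
cpu_feature_args neon armvfp
```

The printed string could then be appended to whatever variable the ebuild passes to ./configure.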
Comment 3 Alexis Ballier gentoo-dev 2011-01-05 15:07:46 UTC
btw, imho, _arch teams_ should provide global useflags for their optimizations if they want them to be controlled by useflags; masked in base/ and unmasked on their arches. They should also specify if they want them enabled or disabled by default.
Comment 4 Raúl Porcel (RETIRED) gentoo-dev 2011-02-12 18:18:16 UTC
you sure this isn't autodetected already?
Comment 5 Siarhei Siamashka 2011-02-12 18:28:46 UTC
(In reply to comment #4)
> you sure this isn't autodetected already?

Still, adding support for a "neon" USE flag for ARM might be useful. Just to make the users feel warm and fuzzy about it. And also in order to be able to easily disable NEON for fun or for benchmarking purposes.
Comment 6 Yixun Lan archtester gentoo-dev 2011-07-11 22:41:58 UTC
Created attachment 279809
enable neon use

I'm using the armv7a-hardfloat-linux-gnueabi toolchain.
ffmpeg-0.7.1 can auto-detect NEON, but it also enables VFP, which causes a problem.
This patch explicitly enables NEON and also disables VFP.
Comment 7 Alexis Ballier gentoo-dev 2011-07-11 22:58:41 UTC
well, patches should go to -9999

in this version you should add this to the CPU_FEATURES (alphabetically sorted)

if there is a compatibility problem with some useflags, REQUIRED_USE should solve it


and again, arches haven't provided the use flags and their masks
Comment 8 Siarhei Siamashka 2011-07-12 07:10:48 UTC
(In reply to comment #6)
> ffmpeg-0.7.1 can auto detect neon, but also enable vfp, which will cause
> problem.

what is the problem with vfp?
Comment 9 Yixun Lan archtester gentoo-dev 2011-07-12 08:51:56 UTC
I was trying to add NEON support when I ran into this bug; see bug 374915.
Comment 10 Siarhei Siamashka 2011-07-12 09:24:13 UTC
(In reply to comment #9)
> I'm trying to add neon support, then encounter this bug, see 374915

USAT from bug 374915 is not a VFP instruction. If disabling VFP helps for some reason, that's certainly not a correct fix, but some side effect.
Comment 11 Yixun Lan archtester gentoo-dev 2011-07-12 09:58:06 UTC
Created attachment 279847
the default build log

This is the default build log (with enable-neon, disable-armvfp). Everything above is auto-detected by the build system, and it seems to enable all of them: neon, armvfp and vfpv3.
Comment 12 Yixun Lan archtester gentoo-dev 2011-07-12 10:00:35 UTC
Created attachment 279849
the configure output
Comment 13 Yixun Lan archtester gentoo-dev 2011-07-12 10:09:59 UTC
The attached logs (build.log, config.log) are from ffmpeg-0.8 (ffmpeg-0.7.1 has the same problem).

Snippet from config.log
---------------------------
vfpv3=yes
vfpv3_deps=armvfp
armvfp=yes
armvfp_deps=arm
neon=yes
neon_deps=arm
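
To pull the same information out of a longer log, something like the following could be used. This is only a sketch: enabled_features is a hypothetical helper name, and it assumes the log contains plain name=yes lines like the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: read config.log-style "name=value" lines on stdin
# and print only the names whose value is exactly "yes".
enabled_features() {
    sed -n 's/^\([a-z0-9_]*\)=yes$/\1/p'
}

# Fed with the excerpt above; the *_deps lines are skipped because
# their value is not "yes".
enabled_features <<'EOF'
vfpv3=yes
vfpv3_deps=armvfp
armvfp=yes
armvfp_deps=arm
neon=yes
neon_deps=arm
EOF
```

With a real log the call would be along the lines of "enabled_features < config.log".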


Snippet from build.log
----------------------
libavcodec/arm/dsputil_vfp.S:114: Error: selected processor does not support ARM mode `vmulge.f32 s27,s16,s27'
libavcodec/arm/dsputil_vfp.S:115: Error: selected processor does not support ARM mode `vmulge.f32 s28,s23,s28'
libavcodec/arm/dsputil_vfp.S:116: Error: selected processor does not support ARM mode `vldmdbgt r2!,{s4-s7}'
libavcodec/arm/dsputil_vfp.S:117: Error: selected processor does not support ARM mode `vmulge.f32 s29,s22,s29'
Comment 14 Yixun Lan archtester gentoo-dev 2011-07-12 10:15:13 UTC
Created attachment 279857
ffmpeg 0.8 log

The previous ffmpeg config.log was from -9999 with neon enabled and armvfp disabled.
Here is the correct one from ffmpeg-0.8 (auto-detected version).
Comment 15 Siarhei Siamashka 2011-07-12 12:13:52 UTC
This is a bug in either ffmpeg or toolchain (or even in both of them to some extent), which results in wrong cpu features detection and broken ffmpeg builds.

The 'configure' script from ffmpeg runs gcc with inline assembly for vfp support detection. And this test code snippet looks like this:

    void foo(void){ __asm__ volatile("fadds s0, s0, s0"); }

It is also important that this test code gets compiled with the '-mcpu=cortex-a8 -c' gcc flags (note that there is no '-mfpu' option). It builds fine, because hardfloat gcc already implicitly enables vfp and does not need a '-mfpu=' option. As another example, if we compile an empty C source file with 'gcc -S empty.c', we get the following result (note the '.fpu vfp' directive emitted there):

        .arch armv7-a
        .eabi_attribute 27, 3
        .eabi_attribute 28, 1
        .fpu vfp
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 2
        .eabi_attribute 30, 6
        .eabi_attribute 18, 4
        .file   "empty.c"
        .ident  "GCC: (Gentoo 4.5.2 p1.1, pie-0.4.5) 4.5.2"
        .section        .note.GNU-stack,"",%progbits

In any case, the end result is that ffmpeg detects vfp as supported. The problem shows up when ffmpeg actually tries to compile its assembly optimizations: they are not *.c files with inline assembly, but full assembly sources (*.S files). These *.S files do not contain any '.fpu' directives themselves, and hence fail to compile. Have a look at the following example:

$ gcc -v
Using built-in specs.
COLLECT_GCC=/usr/armv7a-hardfloat-linux-gnueabi/gcc-bin/4.5.2/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/armv7a-hardfloat-linux-gnueabi/4.5.2/lto-wrapper
Target: armv7a-hardfloat-linux-gnueabi
Configured with: /var/tmp/portage/sys-devel/gcc-4.5.2/work/gcc-4.5.2/configure --prefix=/usr --bindir=/usr/armv7a-hardfloat-linux-gnueabi/gcc-bin/4.5.2 --includedir=/usr/lib/gcc/armv7a-hardfloat-linux-gnueabi/4.5.2/include --datadir=/usr/share/gcc-data/armv7a-hardfloat-linux-gnueabi/4.5.2 --mandir=/usr/share/gcc-data/armv7a-hardfloat-linux-gnueabi/4.5.2/man --infodir=/usr/share/gcc-data/armv7a-hardfloat-linux-gnueabi/4.5.2/info --with-gxx-include-dir=/usr/lib/gcc/armv7a-hardfloat-linux-gnueabi/4.5.2/include/g++-v4 --host=armv7a-hardfloat-linux-gnueabi --build=armv7a-hardfloat-linux-gnueabi --disable-altivec --disable-fixed-point --without-ppl --without-cloog --disable-lto --with-float=hard --enable-nls --without-included-gettext --with-system-zlib --disable-werror --enable-secureplt --disable-multilib --enable-libmudflap --disable-libssp --enable-libgomp --enable-cld --with-python-dir=/share/gcc-data/armv7a-hardfloat-linux-gnueabi/4.5.2/python --enable-checking=release --disable-libgcj --with-arch=armv7-a --with-float=hard --enable-languages=c,c++,fortran --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --with-bugurl=http://bugs.gentoo.org/ --with-pkgversion='Gentoo 4.5.2 p1.1, pie-0.4.5'
Thread model: posix
gcc version 4.5.2 (Gentoo 4.5.2 p1.1, pie-0.4.5)

$ cat test.S
        .text
        .cpu cortex-a8
        test:
        fadds s0, s0, s0

$ gcc -c -mfpu=vfp test.S

$ gcc -c test.S
test.S: Assembler messages:
test.S:4: Error: selected processor does not support `fadds s0,s0,s0'

Unlike C with inline assembly, real assembly sources need either a '.fpu' directive or a '-mfpu=' option on the command line.
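
As a rough illustration of the '.fpu' directive route, here is a small shell sketch that inserts such a directive right after the '.cpu' line of an assembly source. add_fpu_directive is a hypothetical helper name, not something from ffmpeg or binutils; it only rewrites text and does not invoke the assembler, so whether the result assembles still has to be checked with the actual toolchain.

```shell
#!/bin/sh
# Hypothetical helper: copy assembly source from stdin to stdout and add
# a ".fpu <name>" line directly after every ".cpu" directive, reusing
# that line's indentation.
add_fpu_directive() {
    awk -v fpu="$1" '
        { print }
        /^[ \t]*\.cpu[ \t]/ {
            indent = $0
            sub(/\.cpu.*/, "", indent)   # keep only the leading whitespace
            print indent ".fpu " fpu
        }'
}

# Applied to the test.S example above.
add_fpu_directive vfp <<'EOF'
        .text
        .cpu cortex-a8
        test:
        fadds s0, s0, s0
EOF
```

The output is the same source with a '.fpu vfp' line following '.cpu cortex-a8'.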

The possible solutions (in no particular order) are:
1. Change the ffmpeg configure tests to use real assembly. In this case the results of the configure tests and the results of the real build will at least be more consistent.
2. Add explicit '.fpu vfp' / '.fpu neon' directives to the assembly sources.
3. Use the '--extra-cflags' ffmpeg configure option to enforce passing some usable '-mfpu=' to gcc and the assembler.
4. Tweak hardfloat binutils so that the assembler also implicitly assumes '.fpu vfp' without any command line options.
5. ... or any other fix/hack ....
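
For the '--extra-cflags' route, the invocation could be assembled along these lines. This is a dry-run sketch only: ffmpeg_configure_cmd is a hypothetical helper that prints the command instead of running it, and which '-mfpu=' value is actually appropriate (vfp, neon, ...) depends on the target and is an assumption here.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a configure command line that
# forces the given -mfpu= value through --extra-cflags. Any further
# arguments are appended unchanged.
ffmpeg_configure_cmd() {
    fpu="$1"
    shift
    printf './configure --extra-cflags=-mfpu=%s' "$fpu"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Example: force -mfpu=neon.
ffmpeg_configure_cmd neon
```

Printing the command first makes it easy to compare against what the ebuild really passes before changing anything.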

It's always a good idea to report the problem upstream and ask for advice, even if they decide to blame gcc/binutils in the end.
Comment 16 Matt Turner gentoo-dev 2011-08-17 04:54:53 UTC
(In reply to comment #7)
> and again, arches haven't provided the use flags and their masks

ARM's 'neon' and SPARC's 'vis' USE flags have been in place since at least March 20 2011.

(In reply to comment #0)
>   --disable-mmi            disable MMI optimizations

Confusingly, this is apparently for PlayStation2, not ARM, although it's in the middle of a list of ARM optimizations. So, ignore it.

WRT the other ARM flags, I can certainly see having iwmmxt and armvfp USE flags, but what about armv{5,6,7}*? Maybe that can be tied into armin76's proposed ARM subprofiles.

I'll write the patches after talking with armin76, but I'll need someone else to do the testing (probably armin76 ;-).
Comment 17 Alexis Ballier gentoo-dev 2011-09-06 18:00:16 UTC
(In reply to comment #16)
> (In reply to comment #7)
> > and again, arches haven't provided the use flags and their masks
> 
> ARM's 'neon' and SPARC's 'vis' USE flags have been in place since at least
> March 20 2011.
> 

neon & vis added to -9999 then
Comment 18 Matt Turner gentoo-dev 2011-09-21 16:09:42 UTC
The same should be done for libav, right?
Comment 19 Matt Turner gentoo-dev 2011-10-22 03:17:11 UTC
Added ARM's iwmmxt USE flag to profiles. It's ready to be added to ffmpeg and libav.
Comment 20 Alexis Ballier gentoo-dev 2011-11-09 13:01:20 UTC
(In reply to comment #19)
> Added ARM's iwmmxt USE flag to profiles. It's ready to be added to ffmpeg and
> libav.

done for ffmpeg-9999

I don't use libav, nor do I plan to in the near future, so you'll have to find someone interested or do it yourself :)
Comment 21 Luca Barbato gentoo-dev 2011-11-15 08:42:52 UTC
Libav has a patch ready to solve the problem by setting arch and fpu in the asm files, but we can't reproduce the issue yet. Could you please guide me through the process of reproducing it?
Comment 22 Luca Barbato gentoo-dev 2011-11-15 14:08:22 UTC
After a little fight with the new crossdev (not new enough to know about the cxx/nocxx trap) and another nuisance, I managed to reproduce the problem and investigate a little further. libav-9999 should be fine for everybody now; I'll drop another 0.8 prerelease soon.