ffmpeg's configure script offers the following flags for ARM optimizations:

  --disable-armv5te        disable armv5te optimizations
  --disable-armv6          disable armv6 optimizations
  --disable-armv6t2        disable armv6t2 optimizations
  --disable-armvfp         disable ARM VFP optimizations
  --disable-iwmmxt         disable iwmmxt optimizations
  --disable-mmi            disable MMI optimizations
  --disable-neon           disable neon optimizations

And this flag for SPARC optimizations:

  --disable-vis            disable VIS optimizations

but none of these are used in the ffmpeg ebuilds. While not all of the ARM flags are terribly important, NEON certainly is. And VIS for SPARC is important as well.
you probably mean "disable" instead of "enable", right? and of course, patches for -9999 are very welcome :=)
if sparc then
    $opts = $opts + "--enable-vis"
fi

etc, something like that would be greatly appreciated.
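The pseudocode above could be turned into something like the following shell sketch. It is not taken from the actual ffmpeg ebuild: `arch_opts` is a hypothetical helper name, and the `use` function here is only a stand-in for Portage's helper of the same name so that the snippet runs outside an ebuild. It passes the `--disable-*` forms because those are the options listed in the report, and passes nothing for a set flag so that configure's autodetection still applies.

```shell
#!/bin/sh
# Hypothetical stand-in for Portage's `use` helper (assumption, for
# illustration only): succeeds when the flag appears in $USE.
use() {
    case " ${USE} " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the configure options for the
# optimization USE flags discussed in this bug.
# A flag that is not set yields the matching --disable-* option;
# a flag that is set yields nothing, leaving detection to configure.
arch_opts() {
    opts=""
    for flag in neon vis; do
        use "$flag" || opts="${opts} --disable-${flag}"
    done
    # Strip the leading space left by the concatenation above.
    printf '%s\n' "${opts# }"
}
```

With `USE="neon"` this yields only the VIS option, which could then be appended to the configure invocation.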
btw, imho, _arch teams_ should provide global useflags for their optimizations if they want them to be controlled by useflags; masked in base/ and unmasked on their arches. They should also specify if they want them enabled or disabled by default.
you sure this isn't autodetected already?
(In reply to comment #4)
> you sure this isn't autodetected already?

Still, adding support for a "neon" USE flag for ARM might be useful, just to make the users feel warm and fuzzy about it, and also to be able to easily disable NEON for fun or for benchmarking purposes.
Created attachment 279809 [details, diff]
enable neon use

I'm using the armv7a-hardfloat-linux-gnueabi toolchain. ffmpeg-0.7.1 can auto-detect neon, but it also enables vfp, which causes a problem. This patch explicitly enables neon and also disables vfp.
well, patches should go to -9999.
In this version you should add this to the CPU_FEATURES (alphabetically sorted).
If there is a compatibility problem with some useflags, REQUIRED_USE should solve it.
And again, arches haven't provided the use flags and their masks.
(In reply to comment #6)
> ffmpeg-0.7.1 can auto detect neon, but also enable vfp, which will cause
> problem.

what is the problem with vfp?
I was trying to add neon support and then ran into this bug, see bug 374915
(In reply to comment #9)
> I'm trying to add neon support, then encounter this bug, see 374915

USAT from bug 374915 is not a VFP instruction. If disabling VFP helps for some reason, that's certainly not a correct fix, but some side effect.
Created attachment 279847 [details]
the default build log

This is the default build log (with enable-neon, disable-armvfp). Everything above is auto-detected by the build system, which seems to enable all of them: neon, armvfp and vfpv3.
Created attachment 279849 [details] the configure output
attached log (build.log, config.log) is from ffmpeg-0.8 (but same problem with ffmpeg-0.7.1)

the snip from config.log
---------------------------
vfpv3=yes
vfpv3_deps=armvfp
armvfp=yes
armvfp_deps=arm
neon=yes
neon_deps=arm

the snip from build.log
----------------------
libavcodec/arm/dsputil_vfp.S:114: Error: selected processor does not support ARM mode `vmulge.f32 s27,s16,s27'
libavcodec/arm/dsputil_vfp.S:115: Error: selected processor does not support ARM mode `vmulge.f32 s28,s23,s28'
libavcodec/arm/dsputil_vfp.S:116: Error: selected processor does not support ARM mode `vldmdbgt r2!,{s4-s7}'
libavcodec/arm/dsputil_vfp.S:117: Error: selected processor does not support ARM mode `vmulge.f32 s29,s22,s29'
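To compare what configure detected between two builds, a small filter over config.log may help. This is a minimal sketch, assuming config.log contains `name=yes` lines as in the snippet above; `enabled_fpu_features` is a hypothetical helper name, and the three feature names are the ones quoted in this comment.

```shell
#!/bin/sh
# Hypothetical helper: print the ARM FPU-related features that
# configure recorded as enabled. Reads config.log-style
# "name=value" lines on stdin and prints one feature name per line.
enabled_fpu_features() {
    awk -F= '($1 == "armvfp" || $1 == "vfpv3" || $1 == "neon") && $2 == "yes" { print $1 }'
}
```

It would be run as `enabled_fpu_features < config.log`. Lines such as `neon_deps=arm` are skipped because the name has to match exactly.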
Created attachment 279857 [details]
ffmpeg 0.8 log

The previous ffmpeg config.log is from 9999 with neon enabled && armvfp disabled. Here is the correct one from ffmpeg-0.8 (auto-detect version).
This is a bug in either ffmpeg or the toolchain (or even in both of them to some extent), which results in wrong CPU feature detection and broken ffmpeg builds.

The 'configure' script from ffmpeg runs gcc with inline assembly for VFP support detection. The test code snippet looks like this:

void foo(void){ __asm__ volatile("fadds s0, s0, s0"); }

It's also important that this test code gets compiled with the '-mcpu=cortex-a8 -c' gcc flags (note that there is no '-mfpu' option). It builds fine, because hardfloat gcc already implicitly enables VFP and does not need the '-mfpu=' option. As another example, if we compile an empty C source file with 'gcc -S empty.c', we get the following result (note the '.fpu vfp' directive emitted there):

	.arch armv7-a
	.eabi_attribute 27, 3
	.eabi_attribute 28, 1
	.fpu vfp
	.eabi_attribute 20, 1
	.eabi_attribute 21, 1
	.eabi_attribute 23, 3
	.eabi_attribute 24, 1
	.eabi_attribute 25, 1
	.eabi_attribute 26, 2
	.eabi_attribute 30, 6
	.eabi_attribute 18, 4
	.file "empty.c"
	.ident "GCC: (Gentoo 4.5.2 p1.1, pie-0.4.5) 4.5.2"
	.section .note.GNU-stack,"",%progbits

In any case, the end result is that VFP is detected as supported by ffmpeg. The problem shows up when ffmpeg actually tries to compile the assembly optimizations: they are not *.c files with inline assembly, but full assembly (*.S files). And these *.S files do not have any '.fpu' directives themselves, hence they fail to compile. Have a look at the following example:

$ gcc -v
Using built-in specs.
COLLECT_GCC=/usr/armv7a-hardfloat-linux-gnueabi/gcc-bin/4.5.2/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/armv7a-hardfloat-linux-gnueabi/4.5.2/lto-wrapper
Target: armv7a-hardfloat-linux-gnueabi
Configured with: /var/tmp/portage/sys-devel/gcc-4.5.2/work/gcc-4.5.2/configure --prefix=/usr --bindir=/usr/armv7a-hardfloat-linux-gnueabi/gcc-bin/4.5.2 --includedir=/usr/lib/gcc/armv7a-hardfloat-linux-gnueabi/4.5.2/include --datadir=/usr/share/gcc-data/armv7a-hardfloat-linux-gnueabi/4.5.2 --mandir=/usr/share/gcc-data/armv7a-hardfloat-linux-gnueabi/4.5.2/man --infodir=/usr/share/gcc-data/armv7a-hardfloat-linux-gnueabi/4.5.2/info --with-gxx-include-dir=/usr/lib/gcc/armv7a-hardfloat-linux-gnueabi/4.5.2/include/g++-v4 --host=armv7a-hardfloat-linux-gnueabi --build=armv7a-hardfloat-linux-gnueabi --disable-altivec --disable-fixed-point --without-ppl --without-cloog --disable-lto --with-float=hard --enable-nls --without-included-gettext --with-system-zlib --disable-werror --enable-secureplt --disable-multilib --enable-libmudflap --disable-libssp --enable-libgomp --enable-cld --with-python-dir=/share/gcc-data/armv7a-hardfloat-linux-gnueabi/4.5.2/python --enable-checking=release --disable-libgcj --with-arch=armv7-a --with-float=hard --enable-languages=c,c++,fortran --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --with-bugurl=http://bugs.gentoo.org/ --with-pkgversion='Gentoo 4.5.2 p1.1, pie-0.4.5'
Thread model: posix
gcc version 4.5.2 (Gentoo 4.5.2 p1.1, pie-0.4.5)

$ cat test.S
	.text
	.cpu cortex-a8
test:
	fadds s0, s0, s0

$ gcc -c -mfpu=vfp test.S
$ gcc -c test.S
test.S: Assembler messages:
test.S:4: Error: selected processor does not support `fadds s0,s0,s0'

Unlike C with inline assembly, real assembly sources need either a '.fpu' directive or a '-mfpu=' option on the command line.

The possible solutions (in random order) are:
1. Change the ffmpeg configure tests to use real assembly. In this case the results of the configure tests and the results of the real build will at least be more consistent.
2. Add explicit '.fpu vfp' / '.fpu neon' directives to the assembly sources.
3. Use the '--extra-cflags' ffmpeg configure option to enforce passing some usable '-mfpu=' to gcc and the assembler.
4. Tweak hardfloat binutils so that the assembler also implicitly assumes '.fpu vfp' without any command line options.
5. ... or any other fix/hack ....

It's always a good idea to report the problem upstream and ask for some advice, even if they decide to blame gcc/binutils in the end.
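Solution 1 could look roughly like the sketch below. This is not ffmpeg's actual configure code: `check_real_asm` is a hypothetical helper that assembles a single instruction from a standalone .S file instead of C with inline assembly, so the probe sees the same assembler defaults as the real *.S sources. It assumes `CC` and `CFLAGS` come from the environment (with `CC` falling back to gcc) and that `mktemp -d` is available.

```shell
#!/bin/sh
# Hypothetical probe (assumption, not ffmpeg's configure):
# returns 0 if "$CC $CFLAGS -c" can assemble the given instruction
# from a real assembly source file, non-zero otherwise.
check_real_asm() {
    insn=$1
    tmpdir=$(mktemp -d) || return 1
    # Write a minimal assembly source with no .fpu directive, like
    # the ffmpeg *.S files described above.
    printf '\t.text\nprobe:\n\t%s\n' "$insn" > "$tmpdir/probe.S"
    # CFLAGS is intentionally unquoted so that several options split.
    ${CC:-gcc} ${CFLAGS} -c -o "$tmpdir/probe.o" "$tmpdir/probe.S" 2>/dev/null
    status=$?
    rm -rf "$tmpdir"
    return "$status"
}
```

Calling `check_real_asm 'fadds s0, s0, s0'` with `CFLAGS=-mcpu=cortex-a8` should then fail on the toolchain shown above and succeed once `-mfpu=vfp` is added to `CFLAGS`, matching what the real build does.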
(In reply to comment #7)
> and again, arches haven't provided the use flags and their masks

ARM's 'neon' and SPARC's 'vis' USE flags have been in place since at least March 20 2011.

(In reply to comment #0)
> --disable-mmi disable MMI optimizations

Confusingly, this is apparently for PlayStation2, not ARM, although it's in the middle of a list of ARM optimizations. So, ignore it.

WRT the other ARM flags, I can certainly see having iwmmxt and armvfp USE flags, but what about armv{5,6,7}*? Maybe that can be tied into armin76's proposed ARM subprofiles. I'll write the patches after talking with armin76, but I'll need someone else to do the testing (probably armin76 ;-).
(In reply to comment #16)
> (In reply to comment #7)
> > and again, arches haven't provided the use flags and their masks
>
> ARM's 'neon' and SPARC's 'vis' USE flags have been in place since at least
> March 20 2011.
>

neon & vis added to -9999 then
The same should be done for libav, right?
Added ARM's iwmmxt USE flag to profiles. It's ready to be added to ffmpeg and libav.
(In reply to comment #19)
> Added ARM's iwmmxt USE flag to profiles. It's ready to be added to ffmpeg and
> libav.

done for ffmpeg-9999

I don't use libav, nor plan to in the near future, so you'll have to find someone interested or do it yourself :)
Libav has a patch ready to solve the problem by setting arch and fpu in the asm files, but we can't reproduce the issue yet. Could you please guide me through the process?
After a little fight with the new crossdev (not new enough to know about the cxx/nocxx trap) and another nuisance, I managed to reproduce the problem and investigate a little further. libav-9999 should be fine for everybody now, I'll drop another 0.8 prerelease soon.
iwmmxt has been removed

http://source.ffmpeg.org/?p=ffmpeg.git;a=commit;h=363bd1c62c1bcbac2dcb56f3dc47824f075888d2
https://bugs.gentoo.org/show_bug.cgi?id=408031