Summary: | itpp-3.10.3 (new bug-fix release) | | |
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Adam Piątyszek <ediap> |
Component: | New packages | Assignee: | Markus Dittrich (RETIRED) <markusle> |
Status: | VERIFIED FIXED | | |
Severity: | enhancement | CC: | markusle, sci |
Priority: | High | | |
Version: | 2006.0 | | |
Hardware: | All | | |
OS: | Linux | | |
URL: | http://itpp.sourceforge.net/ | | |
Whiteboard: | | | |
Package list: | | Runtime testing required: | --- |
Description
Adam Piątyszek
2006-07-11 06:44:11 UTC
Hi Adam,

Thank you very much for the update. I will bump the ebuild tonight and prune the old versions. I'll also send the amd64 folks another reminder regarding ~amd64.

With respect to other platforms: Do you have any sense of which ones (in addition to amd64) should work?

Regarding stabling itpp: Would 3.10.3 be a good candidate from your point of view? If so, then according to our guidelines, we'll wait 4 weeks after the ebuild is committed, and if there are no bugs/issues with it during this period I'll send a request to the x86 team for stabilization.

Thanks again,
Markus

(In reply to comment #1)
> With respect to other platforms: Do you have any sense of which
> ones (in addition to amd64) should work?

I think it should work on sparc, at least when using the GCC compiler (I am using a SPARC Blade 100 for testing this library during its development). ppc, ppc64, ppc macos and ia64 are my candidates as well; however, I know about one problem with configuration on Mac OS X, which is described here:
http://itpp.sourceforge.net/latest/installation.html#macosx

No idea about other architectures... But it might be worth giving them a try...

> Regarding stabling itpp: Would 3.10.3 be a good candidate
> from your point of view? If so, then according to our guidelines,
> we'll wait 4 weeks after the ebuild is committed and in case there
> are no bugs/issues with it during this period I'll send a request
> to the x86 team for stabilization.

I think so. The 3.10.0 release was published on March 15th, 2006, and the following releases only fixed identified bugs and installation/configuration problems... So in my opinion the library is quite stable now, even in Gentoo with various BLAS/LAPACK/FFT implementations.

Thanks for your prompt response!

/ediap

Hi Adam,

I just bumped itpp in portage and pruned the old versions. I've also made a note in my calendar and will request itpp-3.10.3 to be marked x86 in four weeks' time should there be no major bugs.
Finally, I've sent the amd64 folks another reminder about ~amd64. Once that's done (hopefully soon) we can then tackle the other arches.

Thanks,
Markus

Thanks!

/ediap

Dear Markus,

Yesterday a new Gentoo and IT++ user helped to find out that itpp-3.10.3, which I considered ready to be marked stable, does not detect the threaded blas-atlas library. It would probably work if --with-blas="-lblas -lpthread" were used (I haven't tested this, since I do not have a dual-core machine for my personal use).

Is the threaded blas-atlas a Gentoo devs' invention? What is the proper way of detecting whether libblas.* is a threaded library or not?

Thanks for your help with this issue :)

BR,
/ediap

Hi Adam,

As far as I know, atlas' setup script will build a threaded static libptf77blas.a if it detects a multiprocessor machine. The ebuild detects this and then also creates a threaded *.so. Hence, in this sense the threaded libs are atlas and not Gentoo specific.

As a matter of fact, I just built itpp-3.10.3 against the threaded libblas.so without any problems whatsoever on a dual P4 machine. So all seems to be well for me. Here's the info on this particular machine:

mcell5 dittrich # blas-config -p
Current profiles:
F77 BLAS: /usr/lib/blas/f77-threaded-ATLAS
C BLAS: /usr/lib/blas/
mcell5 dittrich # ls -la /usr/lib/libblas.so.0
lrwxrwxrwx 1 root root 32 Jul 24 21:34 /usr/lib/libblas.so.0 -> blas/threaded-atlas/libblas.so.0

Can you track down what exactly fails during configure? Is it by any chance similar to what's reported in bug #137877 (i.e. missing symbols, pthread_create)?

Thanks,
Markus

Hi Markus,

(In reply to comment #6)
> As far as I know atlas' setup script will build a threaded
> static libptf77blas.a if it detects a multiprocessor machine.
> The ebuild detects this and then also creates a threaded *.so.
> Hence, in this sense the threaded libs are atlas and not
> Gentoo specific.
>
> As a matter of fact, I just built itpp-3.10.3 against the threaded
> libblas.so without any problems whatsoever on a dual P4 machine.
> So all seems to be well for me.
[...]

That is good news.

> Can you track down what exactly fails during configure; is it by any
> chance similar to what's reported in bug #137877 (i.e. missing
> symbols, pthread_create).

It seems that this is the case. Here is a part of "config.log" from the user who reported the problem:

configure:20292: checking for sgemm_ in -lblas
configure:20325: g++ -o conftest conftest.cc -lblas -L/usr/lib/gcc/i686-pc-linux-gnu/3.4.6 -L/usr/lib/gcc/i686-pc-linux-gnu/3.4.6/../../../../i686-pc-linux-gnu/lib -L/usr/lib/gcc/i686-pc-linux-gnu/3.4.6/../../.. -lfrtbegin -lg2c -lm -lgcc_s >&5
/usr/lib/libatlas.so.0: undefined reference to `pthread_create'
/usr/lib/libatlas.so.0: undefined reference to `pthread_join'
collect2: ld returned 1 exit status

The user emerged stable blas-atlas (3.6.0-r1) and lapack-atlas (3.6.0) and then itpp-3.10.3 with the following USE flags: USE="blas cblas lapack debug". Because of these undefined references, configure could not find blas.

BR,
/ediap

(In reply to comment #6)
> As a matter of fact, I just built itpp-3.10.3 against the threaded
> libblas.so without any problems whatsoever on a dual P4 machine.
> So all seems to be well for me. Here's the info on this particular
> machine

Did you also check whether all tests pass? IT++ can be built even if blas cannot be detected, but with limited functionality.

/ediap

(In reply to comment #6)
> As far as I know atlas' setup script will build a threaded
> static libptf77blas.a if it detects a multiprocessor machine.
> The ebuild detects this and then also creates a threaded *.so.
> Hence, in this sense the threaded libs are atlas and not
> Gentoo specific.

One more thing. I found the "threads" USE flag and wonder if the blas-atlas/lapack-atlas and itpp ebuilds could use this flag for building threaded libraries.
BR,
/ediap

Hi Adam,

I had a somewhat deeper look into this and unfortunately I was too quick in my response last night, since I was assuming the configure step would bomb out if there were problems with blas/atlas. As a matter of fact, I have the same problem with pthread symbols not being found by libblas.so/libatlas.so; this is only an issue on the dual P4 box, and things are just fine on my other single-CPU test box.

In any case, in my opinion this is an issue with blas-atlas, not with itpp, and I'll have to investigate what is going wrong during the atlas configure step itself and possibly file a bug with upstream. The remedy for now would be appending -lpthread, but the shared objects really should find their symbols without it.

If I don't find a quick fix I'll probably open a bug to track this issue and will CC you on it.

Thanks,
Markus

Hi Adam,

I've just committed fixes to the 3.7.11[-r1] ebuilds of blas-atlas and lapack-atlas that will hopefully resolve these missing pthread_ symbol issues (at least they do for me). Could you please ask your user to re-emerge blas-atlas (in an hour or so when the mirrors have updated) and try emerging itpp again? Let's hope it works :)

Thanks,
Markus

(In reply to comment #10)
> In any case, in my opinion this is an issue with blas-atlas not with
> itpp and I'll have to investigate what is going wrong during the atlas
> configure step itself and possibly file a bug with upstream.
> The remedy for now would be appending -lpthread but the shared
> objects really should find their symbols without it.
>
> If I don't find a quick fix I'll probably open a bug to track this
> issue and will CC you on it.

Another problem appeared. The user reported some problems while emerging blas-atlas-3.7.11-r1, which you recommended for tests.
Here is a part of the error message:

MULADD=0, lat=64: kill file and rerun with higher reps; variation exceeds tolerence
MULADD=1, lat=1, mf=270.79
MULADD=1, lat=2, mf=340.71
MULADD=1, lat=3, mf=510.15
MULADD=1, lat=4, mf=659.25
MULADD=1, lat=5, mf=795.71
MULADD=1, lat=6, mf=998.44
MULADD=0, lat=1, mf=197.32
MULADD=0, lat=2, mf=397.49
MULADD=0, lat=3, mf=595.01
MULADD=0, lat=4, mf=786.04
MULADD=0, lat=5, mf=971.60
MULADD=0, lat=6, mf=1134.91
nreg=16, mflop = 1216.08 (peak 1134.91)
nreg=32, mflop = 1246.68 (peak 1134.91)
nreg=64, mflop = 1150.20 (peak 1134.91)
make[6]: *** [RunMulAdd] Error 255
make[6]: Leaving directory `/var/tmp/portage/blas-atlas-3.7.11-r1/work/ATLAS/tune/sysinfo/Linux_P4SSE3_2'
make[5]: *** [res/dMULADD] Error 2
make[5]: Leaving directory `/var/tmp/portage/blas-atlas-3.7.11-r1/work/ATLAS/tune/sysinfo/Linux_P4SSE3_2'
xsyssum: ../GetSysSum.c:69: getfpinfo0: Assertion `system(fnam) == 0' failed.
make[4]: *** [/var/tmp/portage/blas-atlas-3.7.11-r1/work/ATLAS/include/Linux_P4SSE3_2/atlas_dsysinfo.h] Aborted
make[4]: Leaving directory `/var/tmp/portage/blas-atlas-3.7.11-r1/work/ATLAS/tune/sysinfo/Linux_P4SSE3_2'
make[3]: *** [/var/tmp/portage/blas-atlas-3.7.11-r1/work/ATLAS/include/Linux_P4SSE3_2/atlas_dsysinfo.h] Error 2
make[3]: Leaving directory `/var/tmp/portage/blas-atlas-3.7.11-r1/work/ATLAS/src/auxil/Linux_P4SSE3_2'
make[2]: *** [IStage1] Error 2
make[2]: Leaving directory `/var/tmp/portage/blas-atlas-3.7.11-r1/work/ATLAS/bin/Linux_P4SSE3_2'

I recommended re-emerging blas-atlas, since it seems to be a timing problem, but I am not an expert on ATLAS errors. I will let you know as soon as we have more info about this.

BR,
/ediap

Hi Adam,

Yeah, unfortunately, these errors occur once in a while when emerging blas-atlas. I talked to upstream about this a while ago and there are some tips on the atlas web site on how to deal with them should they persist.
For me personally, they usually seem to pop up when my machine is doing other CPU-intensive stuff during blas-atlas' timing tests, which seems to confuse them sometimes. In any case, the -r1 ebuild only changed some things during linking; hence, if the user has been able to get through the compile with the previous version he should be able to do it with -r1 as well. Otherwise he could try some of the things that the atlas people recommend. Please keep us posted.

There's actually a new atlas release that has been out for a few weeks and might do much better with newer chips, but since upstream changed the build procedure quite significantly compared to their previous releases it will probably take a little while to get it into portage.

Thanks,
Markus
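
For anyone landing on this bug later: the config.log excerpt quoted earlier in the thread can be checked mechanically for the missing-pthread pattern. The following is only a rough sketch, not part of IT++, ATLAS or portage. The function names `missing_symbols` and `needs_pthread` are made up for illustration, and the regular expression assumes the GNU ld message format shown in that excerpt.

```python
import re
from collections import defaultdict

# Assumed GNU ld message format, as in the config.log excerpt from this bug:
#   /usr/lib/libatlas.so.0: undefined reference to `pthread_create'
UNDEFINED_REF = re.compile(
    r"^(?P<obj>\S+): undefined reference to [`'](?P<sym>[^`']+)'"
)


def missing_symbols(log_text):
    """Map each object named by the linker to the symbols it could not resolve."""
    found = defaultdict(set)
    for line in log_text.splitlines():
        match = UNDEFINED_REF.match(line.strip())
        if match:
            found[match.group("obj")].add(match.group("sym"))
    return dict(found)


def needs_pthread(log_text):
    """Return True if any unresolved symbol is a pthread_* function."""
    return any(
        sym.startswith("pthread_")
        for syms in missing_symbols(log_text).values()
        for sym in syms
    )


# Lines taken from the config.log excerpt quoted in this bug.
SAMPLE = """\
configure:20292: checking for sgemm_ in -lblas
/usr/lib/libatlas.so.0: undefined reference to `pthread_create'
/usr/lib/libatlas.so.0: undefined reference to `pthread_join'
collect2: ld returned 1 exit status
"""

if __name__ == "__main__":
    for obj, syms in missing_symbols(SAMPLE).items():
        print(obj, "is missing:", ", ".join(sorted(syms)))
    if needs_pthread(SAMPLE):
        print('Possible workaround: --with-blas="-lblas -lpthread"')
```

A positive result only suggests the workaround mentioned above (appending -lpthread); as noted in the thread, the shared objects really should resolve these symbols on their own.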