You will have noticed that I just marked sci-libs/root-4.02.00 '-sparc' because as it stands, it cannot build on sparc. There are two problems (everything relative to ${PORTAGE_TMPDIR}/portage.root-4.02.00/work/root): (1) unix/src/TUnixSystem.cxx does not know about sparc, and so the compile fails; (2) xrootd/src/xrootd/... does not know about sparc, and thus its configure-on-the-fly fails. The first of these is trivial to fix. The second is not hard, but it is ugly because as part of the unpack process, you must untar the xrootd/src/xrootd directory, patch it, and re-tar it so that the build will create it correctly a little later. Patch will be provided, but that takes a second entry for the bug because initial entry doesn't seem to allow attachments. With the patch incorporated, root can have ~sparc because it runs well enough for testing. Obligatory environment information: U2(2x400 SMP) + U60(2x450 SMP), kernel 2.4.27-sparc-r1 (gcc version 3.3.5-20050130 (Gentoo Hardened Linux 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1) two systems, using distcc).
Created attachment 54860 [details, diff]
When applied in .../portage/sci-libs/root, converts root-4.02.00 into a sparc-friendly ebuild

Apply the patch thus: In .../portage/sci-libs/root, execute

patch -p0 -b -z- < {path-to-patch}/root-4.02.00.patch

Resulting ebuild:
(1) Changes -sparc ==> ~sparc
(2) Enables USE=ruby (configure can't find /usr/lib/libruby18-static.a, but that's for another bug report.)
(3) patches unix/src/TUnixSystem.cxx to recognize sparc.
(4) un-tars xrootd/src/xrootd, patches three files to allow sparc and to use the OPT optimization flag;
(5) re-tars xrootd/src/xrootd-20041124-0752.src.tgz so that during the build, root's configure-on-the-fly for xrootd will know about sparc.

With these changes, root seems to do OK with the tutorials on sparc, and so could earn a ~sparc keyword.
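For anyone who wants to see the shape of steps (4) and (5) without reading the ebuild, here is a minimal sketch of the un-tar, modify, re-tar dance. It is not the code from the attachment: the helper name repack_tgz is made up for illustration, and it assumes a tar that understands -z and -C.

```shell
# repack_tgz TARBALL CMD [ARGS...]   (hypothetical helper, not from the ebuild)
# Unpack TARBALL into a scratch directory, run CMD inside the unpacked tree,
# then rebuild TARBALL in place from the modified tree.
# The body runs in a subshell, so its variables do not leak to the caller.
repack_tgz() (
    tarball=$1
    shift
    scratch=$(mktemp -d) || exit 1
    status=0
    tar -xzf "$tarball" -C "$scratch" &&
        ( cd "$scratch" && "$@" ) &&
        tar -czf "$tarball.new" -C "$scratch" . &&
        mv "$tarball.new" "$tarball" || status=1
    # Clean up the scratch tree whether or not the steps above succeeded.
    rm -rf "$scratch"
    exit "$status"
)
```

Assuming the xrootd part of the fix were kept in a separate patch file (the file name and the -p level below are placeholders), usage would look something like: repack_tgz xrootd-20041124-0752.src.tgz patch -p1 -i /path/to/xrootd-sparc.patch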
Could you please submit a bug report upstream? I've had some success getting the idiotic ARCH file in xrootd modified to support amd64/ppc, though it would be far nicer if the problem were never allowed to happen.
How are we doing on this old bug? Could you test the new root-5, now in the main tree, on sparc?
Still needs love:

ayanami ~ # emerge root
Calculating dependencies... done!
>>> Emerging (1 of 1) sci-physics/root-5.14.00b to /
 * root_v5.14.00b.source.tar.gz MD5 ;-) ... [ ok ]
 * root_v5.14.00b.source.tar.gz RMD160 ;-) ... [ ok ]
 * root_v5.14.00b.source.tar.gz SHA1 ;-) ... [ ok ]
 * root_v5.14.00b.source.tar.gz SHA256 ;-) ... [ ok ]
 * root_v5.14.00b.source.tar.gz size ;-) ... [ ok ]
 * Users_Guide_5_14.pdf MD5 ;-) ... [ ok ]
 * Users_Guide_5_14.pdf RMD160 ;-) ... [ ok ]
 * Users_Guide_5_14.pdf SHA1 ;-) ... [ ok ]
 * Users_Guide_5_14.pdf SHA256 ;-) ... [ ok ]
 * Users_Guide_5_14.pdf size ;-) ... [ ok ]
 * checking ebuild checksums ;-) ... [ ok ]
 * checking auxfile checksums ;-) ... [ ok ]
 * checking miscfile checksums ;-) ... [ ok ]
 * checking root_v5.14.00b.source.tar.gz ;-) ... [ ok ]
 * checking Users_Guide_5_14.pdf ;-) ... [ ok ]
 *
 * You may want to build ROOT with these non Gentoo extra packages:
 * AliEn, castor, Chirp, Globus, Monalisa, Oracle, peac,
 * PYTHIA, PYTHIA6, SapDB, SRP, Venus
 * You can use the EXTRA_CONF variable for this.
 * Example, for PYTHIA, you would do:
 * EXTRA_CONF="--enable-pythia --with-pythia-libdir=/usr/lib" emerge root
 *
>>> Unpacking source...
>>> Unpacking root_v5.14.00b.source.tar.gz to /var/tmp/portage/root-5.14.00b/work
>>> Unpacking Users_Guide_5_14.pdf to /var/tmp/portage/root-5.14.00b/work
unpack Users_Guide_5_14.pdf: file format not recognized. Ignoring.
>>> Source unpacked.
>>> Compiling source in /var/tmp/portage/root-5.14.00b/work/root ...
Attempts at guessing your architecture failed.
Please specify the architecture as the first argument.
Do './configure --help' for a list of avaliable architectures.

!!! ERROR: sci-physics/root-5.14.00b failed.
Call stack:
  ebuild.sh, line 1546: Called dyn_compile
  ebuild.sh, line 937: Called src_compile
  root-5.14.00b.ebuild, line 115: Called die

!!! configure failed
!!! If you need support, post the topmost build error, and the call stack if relevant.
I had a quick look at how to change it for sparc, although I don't have a sparc box. It seems that xrootd was modified and now supports linux sparc. But the main configure has a lot of arches/compiler flags. Could you try to see if the generic gcc works? You can pass it with EXTRA_CONF="linux" emerge root.
(In reply to comment #5) > I gave a quick look at how to change it for sparc, although I don't have a > sparc box. It seems that xrootd was modified and now supports linux sparc. But > the main configure has a lot of arches/compiler flags. Could you try to see if > the generic gcc work? You can pass it with EXTRA_CONF="linux" emerge root. > At first glance, that does not seem sufficient. I'll keep playing with it, though.
I'm attempting a build now. With this configure file, the architecture must come first, so to get it to configure, in the ebuild I had to make the configure statement start:

./configure \
    ${EXTRA_CONF} \
    etc.

Real problem is in the configure file, though. It is still missing a linux:sparc:*.*) entry for autodetection. Also, I don't know yet if arch=linux is correct for sparc, so I'm still playing.
It looks as if root-5 should build on sparc. Most of it does with arch=linux. There is one problem at least, and I'll try to look at it more closely: It wants to build netx, netx appears to require xrootd, but it does not want to build xrootd. So, netx fails. I'm going to try explicit --disable-xrootd (which should disable netx) and explicit --enable-xrootd to see what happens.
If I force --disable-xrootd in the ebuild, root-5.14.00b now appears to build OK on sparc. (That's with EXTRA_CONF="linux" emerge root and with the ebuild change mentioned in Comment #7.) I'll play with it some more on a more current system.
(In reply to comment #9) > If I force --disable-xrootd in the ebuild, root-5.14.00b now appears to build > OK on sparc. (That's with EXTRA_CONF="linux" emerge root and with the ebuild > change mentioned in Comment #7.) I'll play with it some more on a more current > system. > I tried this on SB1000-MP system. It appears that on sparc, for some reason --enable-xrootd does not actually cause root to build xrootd. This might be a problem in the configure file, I'll investigate a little more, but not much unless it's something obvious.
Found it, I think. Not only does the configure file not like sparc linux, neither does root/xrootd/src/xrootd/config/ARCHS. Same fixes as in Attachment 54860 [details, diff] (well, the only attachment on this bug) are required for root-5 configure & ARCHS file. Unfortunately, xrootd is distributed in the root source as a .tgz file, so applying the fix here seems tricky. If you know anyone upstream, I suppose you could copy that person on this bug and let upstream do what they like.
Bringing the summary in line with what we are talking about here. The much more relevant case --- root-5 --- begins at Comment 3. It would be great if upstream would explain how they would add an architecture like this:

linux:sparc*:*) arch=linuxsparcgcc ;;

and get it to build everything. I know we have to add linuxsparcgcc to config/ARCHS and provide a config/Makefile.linuxsparcgcc, and at that point we have a version which can autodetect a sparc-linux system. I don't know if that also will cause xrootd to start building, but I can't see anything at all that would lead me to believe so: I can't see anything that tells me that just using arch=linux shouldn't build xrootd. It looks to me that once you un-tar the xrootd tree, xrootd/Module.mk is going to go ahead and build it or die trying.
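To make the autodetection entry concrete, here is a rough sketch of how a case statement of that shape picks an arch. It is not root's configure: the function name guess_arch is made up, the os:machine:release form of the string is only inferred from the pattern quoted above, and the i?86 entry is there purely as an illustrative neighbour.

```shell
# guess_arch TRIPLET   (hypothetical helper, not taken from root's configure)
# TRIPLET is assumed to look like "os:machine:release", lower-cased.
guess_arch() {
    case "$1" in
        linux:sparc*:*) echo linux ;;   # the entry under discussion
        linux:i?86:*)   echo linux ;;   # illustrative neighbour only
        *)
            echo "Attempts at guessing your architecture failed." >&2
            return 1
            ;;
    esac
}
```

Under the same assumption about the string, a caller could build it with something like guess_arch "$(uname -s | tr 'A-Z' 'a-z'):$(uname -m):$(uname -r)". Because sparc* is a glob, both sparc and sparc64 machines fall into the same entry.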
I have a fix for the automatic system determination and failure; it's a small change to the ebuild and one-line patches to three files. However, some curious behavior has come up. On amd64, with MAKEOPTS='-j2' or MAKEOPTS='-j1' this package seems to build fine. With -j3, it seems to always fail with 'cannot find -lCore'. And indeed, at -j3 it does not get built. On sparc SB1000(1x750,1x900 multiprocessor), at -j3 -lCore always gets built, but -lCintex never gets built in time: I think I can say that for -j<anything>. (The xrootd libraries do build fine, though. :)) This could be a problem with how I put together Makefile.linuxsparcgcc, but the only difference between it and Makefile.linux is: "+OPTFLAGS = -O2 -mcpu=ultrasparc" (instead of '-O'). I don't know what that definition is used for: Every time I look, the build is using mine, anyway. Net result: (1) I have a possible fix (very small) to the original problem; (2) some sparc systems consistently cannot build -lCintex; (3) amd64 sometimes cannot build -lCore, but that is not a consistent failure. I'll probably attach the patch which makes this problem seem to go away in a bit (it looks like a new ebuild, but it's just the original with a little src_unpack() provided: We need to patch 6 of root's files (well, 5 and add a new one)).
Something occurs to me. As my previous comment shows, I added a new arch=linuxsparcgcc, but linux would do as well for that. I only need to know what kind of system I am on in xrootd/Module.mk, and I should really check to see who cares about it. That's an easy test (`uname -m` works as well for me for testing). Not a general fix because (apparently) of windows, but on my systems I don't care at this point. By the way, what I have done does not affect amd64; the ebuild change won't apply it to anything at all. amd64 fails nicely on its own. :)
Created attachment 109728 [details, diff]
Fix for the "unknown architecture" bug reported by gustavoz in Comment 4

This one-line patch adds sparc to the configure script so that it can continue: More concretely, it adds the definition:

linux:sparc*:*) arch=linux ;;

at the appropriate spot to root/configure. This is a complete fix. A better fix would be 'arch=linuxsparcgcc', but that requires a corresponding Makefile.linuxsparcgcc, and I am not quite sure what would be best (it's easy to make one: take, say, Makefile.linux and play with it). Since arch=linux works just fine, I'd prefer to wait for feedback, if any, before tailoring a sparc-specific one. This patch should be considered mandatory, and all it takes is: (1) put that little file in the root/files directory; (2) Create a src_unpack function, thus:

src_unpack() {
    unpack ${A}
    cd "${WORKDIR}"
    epatch "${FILESDIR}/<whatever-a-good-name-for-this-patch-is>"
}
Created attachment 109743 [details]
An example patch file and ebuild replacement showing the requirements for a complete fix for the initial complaint.

This file contains an example ebuild replacement and a patch file containing three one-line patches. This is a complete fix. However, I do not recommend using it, and if you read the ebuild, you will see why: Essentially, it contains the Attachment 109728 [details, diff] fix (which is fine), then it does a manual unpack (i.e., tar) on a second source file bundled within the master source (it is the source for xrootd), applies a couple of one-line patches, then repackages it (i.e., tar) so that the root build process can use it. I am not sure, but I suspect that, say, ciaranm would recommend it for a question on a new-developer quiz for "How many things are wrong in this ebuild?". What it does show, however, is how the bug must be fixed, by whatever acceptable means. Such as --- (1) Provide the xrootd part of the patch as a second source file and let src_unpack put it into the root/xrootd directory; (2) Patch the root/xrootd/Module.mk file to apply the patch whenever it is asked to unpack the file. That would be at line 75: right after 'touch $$etag ;\' you would have

patch <appropriate-options-to-apply-the-patch> ; \

For verifying the patch I didn't need that, and what I am providing makes brutally clear what is going on here. Plus it took about 5 minutes. :)
Let's review the bidding: (1) I don't have anything more to say on it, except that I am willing to work on a more proper fix than the one in Attachment 109743 [details]. I think Comment 16 explains a good solution (let the Module.mk fix its problems itself). (2) The -lCore problem is a problem with parallel makes for root on systems which are already busy, when you provide MAKEOPTS='-jxxx' with a value which can result in the make process using more CPUs than you have. That is what I was doing on the amd64, and I inadvertently reproduced it on sparc at home. I was building with -j4 on two systems using distcc. Now, some operations can't be distributed, so on occasion I'd have a -j4 build on a 2-processor system. That's OK, but it turns out that this ended up running at the same time as the daily cron runs (makewhatis, slocate, update-eix --- you get the picture). In fact, I just saw it here on a sparc because the system was already building things. Why -lCore? Because it's huge, I am sure. (3) I don't know what the -lCintex problem is. All my systems here see it, but I can't reproduce it at home. E.g., I saw it here on a U60 distributing to a U2. I ran the same test at home (build on a U60 distribute to U2) and that worked fine a couple times. On the third time I saw the -lCore problem, but I am sure that is a timing thing as described in point (2). I will point out, however, that my systems here (U60+U20, SB1000) are all completely current as to installed software (or perhaps more so: I run a lot of ~sparc stuff). The ones at home are not current (there, when portage wants to update things, I pick and choose.) I strongly suspect python (especially since Cintex seems to be building a lot of python stuff) but I have no evidence. The answer is probably in the build log, and I might chase through it sometime (it's huge).
More likely, I'll try to figure out how to ask the Makefile to build one specific module, do that for Cintex, and see if I can find any indication that it failed to build the library but apparently did not notice it. Suggestions from people who actually know something about building root are welcome. :)
I'll provide a bit more information. These failures in points (2) and (3) above are the same. When it complains about a missing library, it looks as if that library is being built, and the make process is trying to use it before the file is created. Why do I think this? Because I went into the build directory from a failure, and just entered 'make'. The first thing it did was finish building the libraries. Then I could finesse an install using ebuild. At the same time, my U2 distributing to the U60 built successfully. So the only dependable failure is this system: SB1000(1x900,1x750) asymmetric multiprocessor. I just think somehow the fact that the processors are running at different speeds is messing with the long sequence of linking at the end. I have no clue how to attack such a problem. (-j1 sounds good, but doesn't work because, I suppose, the operating system can choose to reassign CPUs.) Oh, yes, it seems to work just like it is supposed to.
Created attachment 109883 [details] Replacement patch and ebuild This version replaces an old-style test 'if use sparc; then' with a newer style 'if [[ ${ARCH} == sparc ]]; then' and so should work with all package manager candidates (pkgcore dislikes the former, I haven't had a chance to try with paludis). I've rethought my position a bit. Ugly as it is, src_unpack really does have to do just that and should not defer the required patching to a Module.mk file. Anyone reading this, if you have time, please do try the build. It should run to completion and work. If it does not, you are seeing the race condition described in Comment 14 (attempt to use a library before its build is complete). We need to know if this is a Makefile problem or a system problem, and if it is limited to 2 out of 6 systems for me (with random failures on the others if the stars are right). Me, I can cause it on both sparc and amd64 any time I wish, and on my (asymmetric) SB1000MP, it is a hard failure. If the attached patch (or equivalent) is applied, I am prepared to give this a ~sparc, but will not even consider a 'sparc' until I know the answer to the above question. And, of course, if the Makefile is implicated, until it is fixed. (However, upstream at CERN really should "bite the bullet" and add this three-line patch themselves. Why they haven't is beyond me.)
(In reply to comment #19) > > Anyone reading this, if you have time, please do try the build. It should run > to completion and work. If it does not, you are seeing the race condition > described in Comment 14 (attempt to use a library before its build is > complete). That's Comment 18, of course. And if you do see the failure, please check the log. It looks to me that not only hasn't the build for the missing library completed, but also it hasn't started yet. That is, its attempted use sort of sneaks in before the build (which should have been the next step).
This package cannot build reliably with a parallel make, and so unfortunately it looks as if the 'emake \' needs to be 'emake -j1 \'. With MAKEOPTS='-j1' this package builds just fine for me now, but on some systems (e.g., SB1000) it will never build in parallel.
(In reply to comment #19) > Created an attachment (id=109883) [edit] > > 'if [[ ${ARCH} == sparc ]]; then' and so should work with all package manager > candidates (pkgcore dislikes the former, I haven't had a chance to try with > paludis). > Paludis is fine with Attachment 109883 [details] (as are pkgcore & portage).
Created attachment 110992 [details] Parallel make friendly patch and replacement ebuild Parallel make seems to fail (if it fails at all) when it attempts to use -lCore before -lCore gets built. Replacement ebuild uses the rather ugly construct emake ... || emake -j1 ... || die ... As best as I can tell, this should always allow a mostly parallel make (highly desirable on my slower systems) but recover (if it needs to) and complete successfully. Experience from others highly desirable.
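For readers without the attachment handy, the fallback construct can be sketched outside an ebuild as below. This is only an illustration of the idea: run_make and build_with_fallback are made-up names, and plain make stands in for emake so the sketch does not depend on portage.

```shell
# run_make is a stand-in for emake so the sketch works outside portage.
run_make() {
    make "$@"
}

# build_with_fallback NJOBS [MAKE ARGS...]   (hypothetical helper)
# Try a parallel build first; if it fails for any reason, retry serially.
# Only the serial retry's exit status decides whether the build failed.
build_with_fallback() {
    njobs=$1
    shift
    if run_make "-j${njobs}" "$@"; then
        return 0
    fi
    echo "parallel build failed, retrying with -j1" >&2
    run_make -j1 "$@"
}
```

The retry is cheap when the first pass got most of the way: make only rebuilds what is missing or out of date, which matches the observation earlier in this bug that a bare 'make' in the failed build directory just finished off the libraries. The obvious cost is that a genuine compile error gets reported twice.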
Fix spelling error in summary (sci-lphysics --> sci-physics).
> before -lCore gets built. Replacement ebuild uses the rather ugly construct > emake ... || emake -j1 ... || die ... I don't see any of this in the patch attached. Did you attach the right one? I have no problem with -j2 on my single-cpu amd64. Let me know if adding something like the following, before the first emake command in the original ebuild, works for you: emake root emake -j1 lib/libCore.so
Created attachment 111019 [details] Corrected parallel make ebuild No, I did not attach the correct file. This one (I am sure) is. I will verify later, but I am pretty sure your suggestion will not work because libCore.so is needed to build many of root's pieces. And parallel make does work for me in some instances. For example, it works fine (usually) on amd64(SMP) with MAKEOPTS='-j2', but never with MAKEOPTS='-j3'. On SB1000(Asymmetric MP), parallel make never works for me. And so on.
Sorry, I misread your suggestion and when describing the new attachment, of course I could not go back to look at it. I'll play with what you are suggesting later.
Actually, I mistyped as well. Try this bit instead:

emake OPTFLAGS="${CXXFLAGS}" rootcint compiledata
emake OPTFLAGS="${CXXFLAGS}" -j1 lib/libCore.so
emake OPTFLAGS="${CXXFLAGS}"

I will try -j3 too.
(In reply to comment #28)
> Actually, I mistyped as well. Try this bit instead:
>
> emake OPTFLAGS="${CXXFLAGS}" rootcint compiledata
> emake OPTFLAGS="${CXXFLAGS}" -j1 lib/libCore.so
> emake OPTFLAGS="${CXXFLAGS}"
>

With these, build fails on sparc and on amd64, thus:

cint/main/cint_tmp -K -w1 -zipc -ncint/lib/G__c_ipc.c -D__MAKECINT__ -DG__MAKECINT \
    -c-2 -Z0 cint/lib/ipc/ipcif.h
Error: Symbol __BEGIN_DECLS#include is not defined in current scope /usr/include/sys/types.h:35:
Error: Symbol bits is not defined in current scope /usr/include/sys/types.h:35:
Error: Symbol types is not defined in current scope /usr/include/sys/types.h:35:
Error: Failed to evaluate types.h
Error: operator '/' divided by zero /usr/include/sys/types.h:35:
Error: Symbol #ifdef__USE_BSD#ifndef__u_char_definedtypedef__u_charu_char is not defined in current scope /usr/include/sys/types.h:35:

on sparc

Or

cint/main/cint_tmp -K -w1 -zipc -ncint/lib/G__c_ipc.c -D__MAKECINT__ -DG__MAKECINT \
    -c-2 -Z0 cint/lib/ipc/ipcif.h
Warning: Unknown type key_t in function argument cint/lib/ipc/ipcif.h:140:
Error: Symbol ushortsem_num is not defined in current scope cint/lib/ipc/ipcif.h:172:
!!!Removing cint/lib/G__c_ipc.c cint/lib/G__c_ipc.h !!!
make: *** [cint/lib/G__c_ipc.c] Error 1
make: *** Waiting for unfinished jobs....

on amd64

With the original ebuild, this looks like (on both amd64 and sparc):

cint/main/cint_tmp -K -w1 -zipc -ncint/lib/G__c_ipc.c -D__MAKECINT__ -DG__MAKECINT \
    -c-2 -Z0 cint/lib/ipc/ipcif.h
Note: Link requested for undefined class ipc_parm (ignore this message) :0:
Note: Link requested for undefined class ipc_perm (ignore this message) :0:
Note: Link requested for undefined class semid_ds (ignore this message) :0:
Note: Link requested for undefined class msqid_ds (ignore this message) :0:

So I don't think that alternative is an option. The emake OPT... || emake -j1 OPT... || die construction still seems to work, though.
Problem is that on all my systems, if MAKEOPTS specifies too much parallelism, the "emake OPT..." fails. > I will try -j3 too. >
No luck with -j3 on my box either.

The only target list that worked is the following:

emake OPTFLAGS="${CXXFLAGS}" rootcint compiledata
emake OPTFLAGS="${CXXFLAGS}" -j1 rootlibs
emake OPTFLAGS="${CXXFLAGS}"

I put a slightly updated root with the sparc patch on the gentooscience overlay. You can try it with layman -S science. Let me know if it works.
(In reply to comment #30) > No luck with -j3 as well on my box. > > The only target list that successfully worked is the following: > > emake OPTFLAGS="${CXXFLAGS}" rootcint compiledata > emake OPTFLAGS="${CXXFLAGS}" -j1 rootlibs > emake OPTFLAGS="${CXXFLAGS}" > > I put a slightly updated root with the sparc patch on gentooscience overlay. > you can try it with layman -S science. Let me know if it works. > Testing now, but two observations: 1) The patch file needs to be called sparc-root-5.14.00c.patch 2) The './configure "${EXTRA_CONF}" \' is wrong. This causes it to try to configure for architecture = "", which fails. Part of the patch adds sparc/linux to the known architectures, and it is autodetected correctly (as linux). So, the ${EXTRA_CONF} should stay at the end of the ./configure call as in the original. (I.e., the ./configure part of the attachment is correct.)
With the changes mentioned in the "two observations" of Comment #31, root-5.14.00c seems good. I have a couple more systems to check it on, and will have a complete answer sometime the 26th. By the way, you would never want ./configure "${EXTRA_CONF}" \ Suppose we have export EXTRA_CONF="linux --enable-pythia --with-pythia-libdir=/usr/lib" to force build for linux, and to enable pythia (as in the setup example). Then, with ./configure "${EXTRA_CONF}", the configure script would assign ARCH="linux --enable-pythia --with-pythia-libdir=/usr/lib" and the build would instantly fail. I have verified that if you leave it ./configure ${EXTRA_CONF} \ without the quotes, you probably get what we want (at least, configure no longer complains when EXTRA_CONF is not set to anything).
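The quoting point is easy to check in any POSIX shell. The snippet below is only a demonstration of word splitting: count_args is a made-up helper that reports how many arguments it received, standing in for ./configure, and the default IFS is assumed.

```shell
# count_args prints how many arguments it was called with.
count_args() {
    echo "$#"
}

EXTRA_CONF="linux --enable-pythia --with-pythia-libdir=/usr/lib"
count_args "${EXTRA_CONF}"   # prints 1: the whole string arrives as one argument
count_args ${EXTRA_CONF}     # prints 3: split on whitespace into separate options

EXTRA_CONF=""
count_args "${EXTRA_CONF}"   # prints 1: a single empty argument
count_args ${EXTRA_CONF}     # prints 0: the empty expansion vanishes entirely
```

The last two lines are the case that bit the overlay ebuild: quoted and empty, configure still receives one (empty) first argument and takes it as the architecture; unquoted and empty, it receives nothing and falls through to autodetection.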
Locally, tests good with both pkgcore and portage if you fix the name of the patch file to sparc-root-5.14.00c.patch and make it ./configure ${EXTRA_CONF} \ without the quote marks. I need to verify later on Monday on SB1000, but I am pretty sure that this version (with corrections) can have its ~sparc. :) Thanks for the support.
Hi, I committed the root-5.14.00c changes suggested here. Was the sparc test successful? We should send our fixes upstream. Has anyone sent anything yet? Sebastien
(In reply to comment #34) > Hi > > I commited the root-5.14.00c changes suggested here. Was the sparc test > successful? We should send our fixes upstream. Has anyone sent anything yet? > > Sebastien > Sparc seems fine with .00c. Thanks.
(In reply to comment #35) > (In reply to comment #34) > > Hi > > > > I commited the root-5.14.00c changes suggested here. Was the sparc test > > successful? We should send our fixes upstream. Has anyone sent anything yet? > > > > Sebastien > > > Sparc seems fine with .00c. Thanks. > Oh, missed a question. I haven't sent anything upstream; I don't know to whom to send it. That little patch file should be all they need. And so far as I am concerned, this bug is fixed. But I'll leave it to you to make that determination (e.g., does upstream need to apply the three line patch before we can close this?).
Sent upstream both parallel building problem and sparc patch. Closing this bug for now. Thanks.