Bug#: 101326
Alias:
Product:
Component:
Status: RESOLVED
Resolution: FIXED
Assigned To: Gentoo Linux High-Performance Clustering Team <hp-cluster@gentoo.org>
Hardware:
OS:
Version:
Priority:
Severity:
Reporter: Ferris McCormick <fmccor@gentoo.org>
URL:
Summary:
Status Whiteboard:
Keywords:

Attachments:
  torque-1.2.0_p5-build.log (text/plain, 114.81 KB)
    Complete ebuild log from failing install for torque-1.2.0_p5
    Ferris McCormick, 2005-08-12 21:12 0000
  torque-1.2.0_p5-r1.ebuild.patch (patch, 886 bytes)
    Patch to torque-1.2.0_p5-r1 to correct the tclIndex files to refer to files where installed on live system
    Ferris McCormick, 2005-08-13 14:40 0000
  torque-1.2.0_p5-destdir-fixes.patch (patch, 1.29 KB)
    Version of torque-1.2.0_p5-destdir-fixes.patch which works to get tclIndex files correct for Gentoo
    Ferris McCormick, 2005-08-13 17:23 0000

Bug 101326 blocks: 98067


Description:   Opened: 2005-08-04 06:40 0000
1. torque-1.2.0_p1-r3 build cannot get started, because the ebuild tries to
download the patchfile torque-1.2.0_p1-respect-destdir.patch.gz, which results
in: No such file `torque-1.2.0_p1-respect-destdir.patch.gz'.  Probably what it
wants is ${FILESDIR}/1.2.0_p1-respect-destdir.patch, which is present.

2.  torque-1.2.0_p1-r2 builds OK, but the install-to-image fails.  No doubt
the failure is related to this nonsense:
==============
polylepis etc # pwd
/var/tmp/portage/torque-1.2.0_p1-r2/image/var/tmp/portage/torque-1.2.0_p1-r2/image/etc
==============
but it's trying to install into /var/tmp/portage/torque-1.2.0_p1-r2/image/etc
(which seems correct, although it doesn't exist).  
/var/tmp/portage/torque-1.2.0_p1-r2/image/bin however is fine.

------- Comment #1 From Robin Johnson 2005-08-04 23:20:34 0000 -------
#2 is caused by #1.

as for #1, it should be looking in $DISTDIR after downloading that file from 
the mirrors. could you please sync and try again?

------- Comment #2 From Ferris McCormick 2005-08-05 05:10:11 0000 -------
OK, now, it sits forever:
waiting for lock on
/usr/portage/distfiles/.locks/torque-1.2.0_p1-respect-destdir.patch.gz.portage_lockfile

------- Comment #3 From Ferris McCormick 2005-08-05 05:36:19 0000 -------
Cancel that.  I misread what file it was waiting for.
(In reply to comment #2)
> OK, now, it sits forever:
> waiting for lock on
> /usr/portage/distfiles/.locks/torque-1.2.0_p1-respect-destdir.patch.gz.portage_lockfile


------- Comment #4 From Ferris McCormick 2005-08-05 06:12:21 0000 -------
(In reply to comment #1)
> #2 is caused by #1.
> 
> as for #1, it should be looking in $DISTDIR after downloading that file from 
> the mirrors. could you please sync and try again?

From either a fresh sync or from current cvs version of torque, I get:
!!! Couldn't download torque-1.2.0_p1-respect-destdir.patch.gz. Aborting.

Problem is that the file is not on any of the mirrors.

------- Comment #5 From Ferris McCormick 2005-08-05 07:40:07 0000 -------
(In reply to comment #4)
> (In reply to comment #1)
> > #2 is caused by #1.
> > 
> > as for #1, it should be looking in $DISTDIR after downloading that file from 
> > the mirrors. could you please sync and try again?
> 
> From either a fresh sync or from current cvs version of torque, I get:
> !!! Couldn't download torque-1.2.0_p1-respect-destdir.patch.gz. Aborting.
> 
> Problem is that the file is not on any of the mirrors.

Indeed, looking at a mirror site shows that there are exactly 2
torque-related files on the mirrors/distfiles:
torque-1.0.1p6.tar.gz 
torque-1.2.0p1.tar.gz

------- Comment #6 From Ferris McCormick 2005-08-06 06:39:04 0000 -------
And, for what it's worth, the other patch file ---
distfiles/torque-1.2.0_p1-respect-ldflags.patch.gz --- is also absent from all
mirrors.  So, it seems impossible for torque-1.2.0_p1-r3.ebuild to get beyond:
No such file `torque-1.2.0_p1-respect-destdir.patch.gz'.
(And, if you force that file to exist for verification, same thing for
...-ldflags.patch.gz.)

------- Comment #7 From Ferris McCormick 2005-08-12 15:02:24 0000 -------
This one is starting to seriously annoy me.  There has been a version bump of
torque, and the ChangeLog assures me that the patch files are on the mirrors.

But they are not:  E.g.,
==================================
 emerge (1 of 2) sys-cluster/torque-1.2.0_p5 to /

.........................

Connecting to ftp.gtlib.cc.gatech.edu[130.207.108.136]:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD /pub/gentoo/distfiles ... done.
==> PASV ... done.    ==> RETR torque-1.2.0_p1-respect-destdir.patch.gz ... 
No such file `torque-1.2.0_p1-respect-destdir.patch.gz'.
==================================

Could someone please try to build this on a system on which the patch files are
not already present, and if it works, indicate just where the patch files load
from?  Because portage certainly can't find them when run on any of my systems.

(This is from the ebuild you get from 'cvs up' run on 12.viii.05, 2200GMT.)

------- Comment #8 From Robin Johnson 2005-08-12 16:22:15 0000 -------
My logs show that I've uploaded the patches.
So where exactly they are I'm not sure.

I've uploaded them again for good measure.

------- Comment #9 From Ferris McCormick 2005-08-12 21:12:26 0000 -------
Created an attachment (id=65810)
Complete ebuild log from failing install for torque-1.2.0_p5

Now, the patches are present, and we are back to my second point in the
original description.  For testing, I took the ebuilds for torque-1.2.0_p5 and
for openpbs-common-1.1.0 from current cvs, keyworded them as 'sparc' locally,
and to keep things very simple, just did:
MAKEOPTS='-j1' FEATURES='keepwork -distcc' emerge torque >& /tmp/TORQUE
Everything compiles perfectly, but the install phase fails as demonstrated in
the attached log --- it creates this nonsense:
/cache/tmp/portage/torque-1.2.0_p5/image/cache/tmp/portage/torque-1.2.0_p5/image/etc

but then tries to do a sane install into directories that do not exist for the
files it wants to put into /etc.
(On this system, PORTAGE_TMPDIR=/cache/tmp, which is why you see 'image/cache'
instead of 'image/var').

I suppose this could be as simple as a missing 'autoconf' or something like
that, but I am using the ebuilds exactly as they are currently constructed,
except for adding the 'sparc' keyword for testing in preparation for keywording
them for ~sparc.  Since I see this failure on all systems I have tried to build
torque on, it should be pretty simple to duplicate:  To repeat, 'emerge
torque' is all I need to cause the failure.

------- Comment #10 From Robin Johnson 2005-08-12 22:36:18 0000 -------
could you please post your 'emerge info' output?

------- Comment #11 From Robin Johnson 2005-08-12 22:53:48 0000 -------
fmccor: I don't see the double ${D}${D}/etc that you do. 

------- Comment #12 From Ferris McCormick 2005-08-12 23:13:30 0000 -------
It turns out that the weird directory in image/ is a red herring; the failure I
see comes from the line
make DESTDIR=${D} install || die
If I change that to just 'make DESTDIR=${D} install', then the build does run to
completion and gives this from genlop:
 Sat Aug 13 05:41:06 2005 >>> sys-cluster/torque-1.2.0_p5
The make fails because of creating a directory more than once, chmod on a
not-yet-created directory, or some such.  If you spend enough time looking at
the emerge log, you can see such error messages.  Later in the ebuild there is
code to try to fix up what should end up in /etc, and after the emerge
completes, somehow /etc does get populated with /etc/init.d/pbs, and so on,
/usr/spool/PBS is there, etc.
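
To illustrate why dropping '|| die' lets the emerge run to completion, here is
a minimal shell sketch.  It assumes nothing about torque itself: 'die' below is
only a stand-in for Portage's helper, and 'fake_make_install' is a hypothetical
command that returns an error status the way the real 'make install' apparently
did.

```shell
# Stand-in for Portage's 'die' helper (assumption: the real one aborts the phase).
die() { echo "die: $*" >&2; exit 1; }

# Hypothetical stand-in for a 'make install' that returns a non-zero status.
fake_make_install() { echo "installing..."; return 2; }

# With '|| die': the failure status triggers die, which ends the subshell.
( fake_make_install || die "make install failed" ) && with_die=0 || with_die=$?

# Without it: the status is ignored and the following commands still run.
( fake_make_install; echo "continuing anyway" ) && without_die=0 || without_die=$?

echo "with '|| die': $with_die; without: $without_die"
```

In other words, removing '|| die' only hides the error status; whatever made
'make install' fail is still there.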

(I note in passing that just blindly entering the 'xpbs' command without having
a clue what it should give me and getting back 'xpbs: initialization failed!
output: invalid command name "Pref_Init"' is sort of brutal, but that looks a
lot like it is related to USE=tcltk.)  And with that lead-in, here is 'emerge
--info':
==============================================================
fer-de-lance torque # emerge --info
Portage 2.0.51.22-r2 (default-linux/sparc/sparc64/2005.0, gcc-3.3.5-20050130,
glibc-2.3.3.20040420-r2, 2.6.13-rc4-vanilla sparc64)
=================================================================
System uname: 2.6.13-rc4-vanilla sparc64 sun4u
Gentoo Base System version 1.6.13
distcc 2.18.3 sparc-unknown-linux-gnu (protocols 1 and 2) (default port 3632)
[enabled]
ccache version 2.3 [disabled]
dev-lang/python:     2.3.5
sys-apps/sandbox:    1.2.11
sys-devel/autoconf:  2.13, 2.59-r6
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.5
sys-devel/binutils:  2.15.92.0.2-r10
sys-devel/libtool:   1.5.18-r1
virtual/os-headers:  2.4.23
ACCEPT_KEYWORDS="sparc"
AUTOCLEAN="yes"
CBUILD="sparc-unknown-linux-gnu"
CFLAGS="-mcpu=ultrasparc -O2 -mtune=ultrasparc -pipe"
CHOST="sparc-unknown-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3.2/share/config
/usr/kde/3/share/config /usr/lib/X11/xkb /usr/lib/mozilla/defaults/pref
/usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/texmf/web2c /etc/env.d"
CXXFLAGS="-mcpu=ultrasparc -O2 -mtune=ultrasparc -pipe -Wno-deprecated"
DISTDIR="/cache/portage/distfiles"
FEATURES="autoconfig cvs distcc distlocks sandbox sfperms strict"
GENTOO_MIRRORS="http://mirror.datapipe.net/gentoo
ftp://130.207.108.136/pub/gentoo http://mirror.phy.olemiss.edu/mirror/gentoo
http://194.117.143.72 http://194.117.143.69"
MAKEOPTS="-j4"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/cache/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="sparc X Xaw3d avi berkdb bitmap-fonts cdr crypt curl dlloader eds encode
fam fbcon flac foomaticdb fortran gcc64 gdbm gif gpm graphviz gstreamer gtk gtk2
imagemagick imlib java jpeg junit kerberos libwww mad mikmod motif mozilla mpeg
mpi mysql ncurses nls ogg opengl pam pdflib perl png python qt readline ruby
ruby18 sdl slang spell sqlite ssl tcltk tcpd tetex tiff truetype truetype-fonts
type1-fonts vorbis xml xml2 xmms xv zlib userland_GNU kernel_linux elibc_glibc"
Unset:  ASFLAGS, CTARGET, LANG, LC_ALL, LDFLAGS, LINGUAS
===================================================

------- Comment #13 From Robin Johnson 2005-08-12 23:36:50 0000 -------
ok, I just hacked at the package for a while.
I found a spot where DESTDIR should have been but was missing, that should fix 
your issue here. Please try 1.2.0_p5-r1.

You're actually totally off with your guess, it was a line in the Makefile that 
had error output suppressed with a leading @.

For the xpbs stuff, I can't comment too much, as I don't use it (my cluster 
doesn't have a graphical head).
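
As a hedged illustration of this class of bug (not the actual Makefile line,
which is not quoted in this report), the sketch below shows a staged install in
which one step forgets the DESTDIR prefix.  The directory names are
hypothetical; the failing 'chmod' mirrors the "chmod on a not-yet-created
directory" symptom mentioned in comment #12.

```shell
# Assumed names for illustration only; not taken from torque's Makefile.
work=$(mktemp -d)
DESTDIR="$work/image"            # staging root, like Portage's ${D}
confdir="/etc/pbs-demo-$$"       # hypothetical final location on the live system

# Correct: every path the install step touches carries the DESTDIR prefix.
mkdir -p "${DESTDIR}${confdir}" && chmod 755 "${DESTDIR}${confdir}"
staged_ok=$?

# Buggy: the prefix is forgotten, so chmod targets the live path,
# which does not exist yet, and the command fails.
chmod 755 "${confdir}" 2>/dev/null && unprefixed=0 || unprefixed=$?

echo "staged: $staged_ok, unprefixed: $unprefixed"
rm -rf "$work"
```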

------- Comment #14 From Ferris McCormick 2005-08-13 08:36:03 0000 -------
It now builds and installs correctly.  However, with USE=tcltk, xpbs fails
thus:
xpbs: initialization failed! output: couldn't read file
"/cache/tmp/portage/torque-1.2.0_p5-r1/image/usr/lib/pbs/xpbs/preferences.tcl":
no such file or directory  (So, I'm reopening the bug, but retitling it.)

The problem comes from things like this in the /usr/lib/pbs/xpbs directory:
set auto_index(acctname) [list source [file join $dir / cache tmp portage
torque-1.2.0_p5-r1 image usr lib pbs xpbs acctname.tk]]

Since I use tcl and am somewhat familiar with this, I'll look at it myself.
This should not be a big problem: it is building in an absolute path back to
the TEMP directory.  One example (of many):
set auto_index(prefDoIt) [list source [file join $dir / cache tmp portage
torque-1.2.0_p5-r1 image usr lib pbs xpbs preferences.tcl]]

(Indeed, if you repopulate the TEMPDIR/image directory after the build, then
xpbs seems to be fine.)

If I find the solution, I can either just fix the problem or post the solution
here; your choice.
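
One way to check an installed system for this symptom is sketched below.  The
function name 'find_staged_refs' is made up for this example; it assumes the
index entries list the staging path one element per word, as in the entries
quoted above, and simply reports which tclIndex files still mention it.

```shell
# Report tclIndex files whose entries still point into the staging image.
# $1: staging directory (what the ebuild calls ${D}); remaining args: tclIndex files.
find_staged_refs() {
    staged=$1; shift
    # The quoted entries spell the path as separate words, so turn
    # "/cache/tmp/x/image" into "cache tmp x image" before matching.
    needle=$(printf '%s' "${staged#/}" | tr '/' ' ')
    grep -l -F -e "$needle" "$@"
}

# Possible use (paths as reported in this bug):
#   find_staged_refs /cache/tmp/portage/torque-1.2.0_p5-r1/image \
#       /usr/lib/pbs/xpbs/tclIndex /usr/lib/pbs/xpbsmon/tclIndex
```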

------- Comment #15 From Robin Johnson 2005-08-13 11:17:24 0000 -------
fmccor: just post up a patch to fix it please.

------- Comment #16 From Ferris McCormick 2005-08-13 14:40:13 0000 -------
Created an attachment (id=65878)
Patch to torque-1.2.0_p5-r1 to correct the tclIndex files to refer to files
where installed on live system

The attached patch applies to torque-1.2.0_p5-r1.ebuild.

Currently, if torque is built with the tcltk gui, it installs the
torque-specific library into ${ROOT}/usr/lib/pbs/{xpbs,xpbsmon}.  However, it
builds the index itself during the portage src_install phase, and it builds it
in place under ${DESTDIR}.  This means that on a live system, torque's
tcl-dependent tools will always fail, because the TMPDIR .../image/... tree is,
of course, not present.

On USE=tcltk, this patch forces the install phase to correct the tclIndex to
refer to its location on the live system, which is just $dir (like for every
other tcl application), not "$dir / cache tmp portage torque-1.2.0_p5-r1 image
..." (which is what the 'make install' set it to).

With the patch applied to the ebuild, both xpbs and xpbsmon can start up
normally.

The actual patch is much shorter than this description of what it's for.
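
The attached patch itself is not reproduced in this report.  As a rough sketch
of the kind of fix-up described, and not the actual ebuild code, the
hypothetical helper below rewrites index entries of the form quoted in comment
#14 so that only the script name remains after $dir.

```shell
# Hypothetical helper: rewrite tclIndex entries that embed a staging path
# so that they load scripts relative to $dir instead.
# Reads the named files, or standard input if none are given.
fix_tclindex() {
    # Keep only the last path element (the script name) after "$dir /".
    sed -e 's|\[file join \$dir / .* \([^] ]*\)]]$|[file join $dir \1]]|' "$@"
}

# Example with one of the entries quoted in this report:
printf '%s\n' 'set auto_index(prefDoIt) [list source [file join $dir / cache tmp portage torque-1.2.0_p5-r1 image usr lib pbs xpbs preferences.tcl]]' | fix_tclindex
# prints: set auto_index(prefDoIt) [list source [file join $dir preferences.tcl]]
```

Lines that do not match the pattern pass through unchanged, so entries that are
already relative are left alone.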

------- Comment #17 From Robin Johnson 2005-08-13 14:55:04 0000 -------
could you come up with a patch that we can apply directly in src_unpack, so we 
can send it upstream properly?

my other patches are pending with upstream already.

------- Comment #18 From Ferris McCormick 2005-08-13 16:16:29 0000 -------
(In reply to comment #17)
> could you come up with a patch that we can apply directly in src_unpack, so we 
> can send it upstream properly?
> 
> my other patches are pending with upstream already.

Not easily; I'll have to think about it.  I am not sure whether changes need to
go into the Makefiles, or into the local buildindex script, or what.  It gets
built the way it does, because torque is apparently forcing absolute paths for
the tcl auto_mkindex setup.  That is, it is assuming that the libraries are
installed where they are supposed to end up.  I don't see offhand how to change
it to be correct for Gentoo and still work the way upstream wants it to work. 
(I think what they are doing is incorrect, but I have to believe that it must
work when ${DESTDIR}=<empty>. But I am not sure how.)

------- Comment #19 From Ferris McCormick 2005-08-13 17:23:46 0000 -------
Created an attachment (id=65885)
Version of torque-1.2.0_p5-destdir-fixes.patch which works to get tclIndex
files correct for Gentoo

This version of torque-1.2.0_p5-destdir-fixes.patch also changes
gui/Makefile.in and tools/xpbsmon/Makefile.in so that they set the corresponding
tclIndex files correctly for Gentoo.  They just change the absolute library
path to '.', and rely on the xpbs[mon] scripts & tcl autoload mechanism to
resolve the library directories correctly.  With this version of the patch file
and the ebuild as distributed (with the addition of ~sparc KEYWORD),
torque-1.2.0_p5-r1 builds and installs, and xpbs[mon] both seem to work.

I suspect that this change is not good for torque in general, but I can't
really say for sure one way or the other.

------- Comment #20 From Ferris McCormick 2005-08-14 07:40:51 0000 -------
To summarize what's going on with Comment 16 (attachment 65878) and with
Comment 19 (attachment 65885):  The directories xpbs, xpbsmon may be thought of
as dynamic tcl libraries for torque, and the corresponding tclIndex files are
the libraries' tables of contents.  The xpbs[mon] scripts add these directories
to tcl's search path.

When tcl is asked to execute a command it does not know about, it looks in its
search path for tclIndex files, and looks in them for the missing command.  If
it finds the command, the tclIndex file gives tcl the name of the script to
load for the new command's implementation.  If the tclIndex gives an absolute
path (/a/b/.../z.tcl), then that is used.  If the tclIndex gives a relative
path (a/b/.../z.tcl), then the path is relative to the search path (or perhaps
to the directory in which tclIndex is found, but in practice they work out to
the same thing).
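
The absolute-versus-relative behavior can be seen with a small shell emulation
of Tcl's 'file join', whose documented rule is that an absolute element
discards everything joined before it.  The function 'tcl_file_join' is a
simplified stand-in written for this example (Unix-style paths only), not part
of tcl or torque.

```shell
# Simplified emulation of Tcl's 'file join' for Unix-style paths:
# an absolute element resets the result, a relative one is appended.
tcl_file_join() {
    result=""
    for part in "$@"; do
        case "$part" in
            /*) result="$part" ;;
            *)  if [ -z "$result" ]; then
                    result="$part"
                else
                    result="${result%/}/$part"
                fi ;;
        esac
    done
    printf '%s\n' "$result"
}

# Relative entry: resolved against the library directory ($dir).
tcl_file_join /usr/lib/pbs/xpbs preferences.tcl
# prints: /usr/lib/pbs/xpbs/preferences.tcl

# Entry as built in the staging area: the lone "/" makes the path absolute,
# so $dir is ignored and the staging path wins.
tcl_file_join /usr/lib/pbs/xpbs / cache tmp portage torque-1.2.0_p5-r1 image usr lib pbs xpbs preferences.tcl
# prints: /cache/tmp/portage/torque-1.2.0_p5-r1/image/usr/lib/pbs/xpbs/preferences.tcl
```

The second result is the same path that xpbs complains about in comment #14.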

Torque as distributed forces its tclIndex files to use absolute paths based on
${DESTDIR}${prefix}/lib/pbs/xpbs[mon] --- this works based on the assumption
that ${DESTDIR} is really going to be where the scripts (*.tcl,*.tk) are really
going to end up.  The tclIndex files are built in such a way that the user can
move them someplace else if for some reason that is desirable.

Gentoo doesn't do that, because DESTDIR, of course, points to some staging area
like /var/tmp/portage/torque-1.2.0_p5-r1/image, and not to the final home base
for torque.  So, as built by torque, the tclIndex files are necessarily
incorrect for Gentoo.

So, what are the patches?  The first may be considered as roughly analogous to
something like opengl-update or to similar scripts:  it fixes up the "links" in
the tclIndex files to refer to the scripts as being based relative to where
they are going to end up (instead of absolute to .../image).  The second patch
changes the torque build semantics so that the tclIndex files always refer to
the implementation files as relative to tcl's search path.

The two approaches are equivalent for Gentoo, but not for torque.  It is pretty
clear from the torque source and associated comments that torque's current
behavior is intended, and it does work when $(DESTDIR) is the true install
destination.  I do not know why a user would want to separate a tcl library
from
its index, but torque allows for it.  The first approach has the disadvantage
of having to stay around in torque ebuilds, but it does not disturb the rules
the torque people have established, whatever their reasons.

I don't see any approach that torque could adopt and that would satisfy Gentoo
without introducing a lot more complexity into the build/install process.  You
could introduce something like $(FINAL_DESTDIR) beyond $(DESTDIR), but then you
would also have to change torque's buildindex script (for creating the tclIndex
files), because currently it has to be run in the target directory for the
installation (it is run post-actual-install in the 'make install' target).
This seems like excessive effort and ugliness to avoid about 6 lines of
ebuild script.

------- Comment #21 From Robin Johnson 2005-08-14 14:14:25 0000 -------
argh. I had no idea the problem was this complex. I think I'll stick to your 
ebuild patch for now, and not patch the Makefile.in for relative lookups, as 
that is more in keeping with the style of torque.

------- Comment #22 From Michael Imhof 2005-09-05 06:29:19 0000 -------
Robbat: is this one fixed, or will it get fixed by you?

------- Comment #23 From Ian Stakenvicius 2006-01-19 07:36:36 0000 -------
Has it been noted that this bug is in 1.2.0_p5-r2 as well?  The .ebuild patch
seems to fix it.

------- Comment #24 From Ferris McCormick 2006-01-19 08:26:33 0000 -------
(In reply to comment #23)
> Has it been noted that this bug is in 1.2.0_p5-r2 as well?  The .ebuild patch
> seems to fix it..
> 

The ebuild patch is not version sensitive, unless upstream addresses the
problem themselves.  (And as noted elsewhere on the bug, from their point of
view there is no problem because of their own assumptions on how the 'make
install' process works for torque.)

------- Comment #25 From Martin Mokrejs 2006-03-07 17:04:09 0000 -------
Please commit at least the first patch, if not both.

------- Comment #26 From Donnie Berkholz 2006-08-14 13:34:38 0000 -------
*** Bug 111410 has been marked as a duplicate of this bug. ***

------- Comment #27 From Donnie Berkholz 2006-09-24 22:47:37 0000 -------
Could you confirm this is still an issue with torque 2.1.2?

------- Comment #28 From Donnie Berkholz 2006-09-25 21:40:11 0000 -------
This looks fixed in torque 2.1.2 by upstream, and I've committed a fix to
openpbs 2.3.16-r4.

------- Comment #29 From Ferris McCormick 2007-04-23 17:17:14 0000 -------
Yes, it now seems to be good with -2.1.6.  I've added ~sparc back for torque
and openpbs-common.  Thanks.
