Bug#: 136071
Alias:
Product:
Component:
Status: RESOLVED
Resolution: FIXED
Assigned To: Gentoo Linux High-Performance Clustering Team <hp-cluster@gentoo.org>
Hardware:
OS:
Version:
Priority:
Severity:
Reporter: dawe <daweonline@gmail.com>
CC:
URL:
Summary:
Status Whiteboard:
Keywords:

Attachment: lam-mpi-7.1.1-ssi-boot-rsh-inetexec.patch
  Description: Proposal of patch to add the space before the bracket
  Type: patch
  Creator: dawe
  Created: 2006-06-08 07:24 0000
  Size: 561 bytes



Description:   Opened: 2006-06-08 07:23 0000
Hi there. I installed lam-mpi-7.1.1 with USE=-crypt so that it uses rsh
instead of ssh for communication, and I get errors when launching it:

$ lamboot 

LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University

ERROR: LAM/MPI unexpectedly received the following on stderr:
localshell: line 0: [: missing `]'
-----------------------------------------------------------------------------
LAM attempted to execute a process on the remote node
"node1.sge.ifom-ieo-campus.it",
but received some output on the standard error.  This heuristic
assumes that any output on the standard error indicates a fatal error,
and therefore aborts.  You can disable this behavior (i.e., have LAM
ignore output on standard error) in the rsh boot module by setting the
SSI parameter boot_rsh_ignore_stderr to 1.

LAM tried to use the remote agent command "rsh" 
to invoke "hboot" on the remote node.

*** PLEASE READ THIS ENTIRE MESSAGE, FOLLOW ITS SUGGESTIONS, AND
*** CONSULT THE "BOOTING LAM" SECTION OF THE LAM/MPI FAQ
*** (http://www.lam-mpi.org/faq/) BEFORE POSTING TO THE LAM/MPI USER'S
*** MAILING LIST.

This can indicate an authentication error with the remote agent, or
can indicate an error in your $HOME/.cshrc, $HOME/.login, or
$HOME/.profile files.  The following is a (non-inclusive) list of items
that you should check on the remote node:

        - You have an account and can login to the remote machine
        - Incorrect permissions on your home directory (should
          probably be 0755) 
        - Incorrect permissions on your $HOME/.rhosts file (if you are
          using rsh -- they should probably be 0644) 
        - You have an entry in the remote $HOME/.rhosts file (if you
          are using rsh) for the machine and username that you are
          running from
        - Your .cshrc/.profile must not print anything out to the 
          standard error
        - Your .cshrc/.profile should set a correct TERM type
        - Your .cshrc/.profile should set the SHELL environment
          variable to your default shell

Try invoking the following command at the unix command line:

        rsh node1.sge.ifom-ieo-campus.it -n '( ! [ -e ./.profile] || .
./.profile;' hboot -t -c lam-conf.lamd -s -I '"-H 85.239.175.36 -P 32809 -n 1
-o 0"' )

You will need to configure your local setup such that you will *not*
be prompted for a password to invoke this command on the remote node.
No output should be printed from the remote node before the output of
the command is displayed.

When you can get this command to execute successfully by hand, LAM
will probably be able to function properly.
-----------------------------------------------------------------------------
ERROR: LAM/MPI unexpectedly received the following on stderr:
localshell: line 0: [: missing `]'
-----------------------------------------------------------------------------
LAM attempted to execute a process on the remote node
"node1.sge.ifom-ieo-campus.it",
but received some output on the standard error.  This heuristic
assumes that any output on the standard error indicates a fatal error,
and therefore aborts.  You can disable this behavior (i.e., have LAM
ignore output on standard error) in the rsh boot module by setting the
SSI parameter boot_rsh_ignore_stderr to 1.

LAM tried to use the remote agent command "rsh" 
to invoke "tkill" on the remote node.

*** PLEASE READ THIS ENTIRE MESSAGE, FOLLOW ITS SUGGESTIONS, AND
*** CONSULT THE "BOOTING LAM" SECTION OF THE LAM/MPI FAQ
*** (http://www.lam-mpi.org/faq/) BEFORE POSTING TO THE LAM/MPI USER'S
*** MAILING LIST.

This can indicate an authentication error with the remote agent, or
can indicate an error in your $HOME/.cshrc, $HOME/.login, or
$HOME/.profile files.  The following is a (non-inclusive) list of items
that you should check on the remote node:

        - You have an account and can login to the remote machine
        - Incorrect permissions on your home directory (should
          probably be 0755) 
        - Incorrect permissions on your $HOME/.rhosts file (if you are
          using rsh -- they should probably be 0644) 
        - You have an entry in the remote $HOME/.rhosts file (if you
          are using rsh) for the machine and username that you are
          running from
        - Your .cshrc/.profile must not print anything out to the 
          standard error
        - Your .cshrc/.profile should set the SHELL environment
          variable to your default shell

Try invoking the following command at the unix command line:

        rsh node1.sge.ifom-ieo-campus.it -n '( ! [ -e ./.profile] || .
./.profile;' tkill )

You will need to configure your local setup such that you will *not*
be prompted for a password to invoke this command on the remote node.
No output should be printed from the remote node before the output of
the command is displayed.

When you can get this command to execute successfully by hand, LAM
will probably be able to function properly.
-----------------------------------------------------------------------------


I think this happens because the generated command runs the following test:

[ -e ./.profile] 

where the space before the closing bracket is missing.
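
The difference can be checked outside LAM with a minimal sketch, assuming a
POSIX shell. The function and variable names below are hypothetical; only the
two test expressions come from this report. Without the space, "./.profile]"
is a single word, so "[" never receives its closing "]" argument and writes a
complaint to stderr, which LAM then treats as a fatal error.

```shell
#!/bin/sh
# Hypothetical helper names, for illustration only.

# Broken form, as in the generated command: "./.profile]" is one word,
# so "[" is missing its final "]" argument and reports that on stderr.
check_broken() {
    [ -e ./.profile] 
}

# Corrected form: the space makes "]" a separate argument.
check_fixed() {
    [ -e ./.profile ]
}

# Capture only stderr from each form; stdout is discarded.
broken_err=$(check_broken 2>&1 >/dev/null)
fixed_err=$(check_fixed 2>&1 >/dev/null)

echo "broken form stderr: ${broken_err:-<none>}"
echo "fixed form stderr:  ${fixed_err:-<none>}"
```

The corrected form stays silent on stderr whether or not ./.profile exists;
only its exit status changes.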

$ emerge --info

Portage 2.1_rc3-r1 (default-linux/amd64/2005.1, gcc-3.4.3, glibc-2.3.5-r0,
2.6.15-gentoo-r7-smp x86_64)
=================================================================
System uname: 2.6.15-gentoo-r7-smp x86_64 AMD Opteron(tm) Processor 252
Gentoo Base System version 1.6.12
dev-lang/python:     2.3.5
dev-python/pycrypto: 2.0.1-r5
dev-util/ccache:     [Not Present]
dev-util/confcache:  [Not Present]
sys-apps/sandbox:    1.2.17
sys-devel/autoconf:  2.13, 2.59-r6
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.5
sys-devel/binutils:  2.16.1-r2
sys-devel/libtool:   1.5.18-r1
virtual/os-headers:  2.6.11-r2
ACCEPT_KEYWORDS="amd64 ~amd64"
AUTOCLEAN="yes"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-mtune=k8 -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib/X11/xkb"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/texmf/web2c /etc/env.d"
CXXFLAGS="-O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs autoconfig ccache distcc distlocks metadata-transfer
sandbox sfperms strict"
GENTOO_MIRRORS="http://mirror.switch.ch/ftp/mirror/gentoo
http://distfiles.gentoo.org
http://www.ibiblio.org/pub/Linux/distributions/gentoo"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress
--force --whole-file --delete --delete-after --stats --timeout=180
--exclude='/distfiles' --exclude='/local' --exclude='/packages'"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="amd64 avi berkdb bitmap-fonts cli crypt cups dri eds emboss encode
foomaticdb fortran gif gnome gpm gstreamer gtk gtk2 imlib ipv6 isdnlog jpeg kde
lzw lzw-tiff mp3 mpeg ncurses nls opengl pam pcre pdflib perl png pppd python
qt quicktime readline reflection sdl session spell spl ssl tcpd threads tiff
truetype-fonts type1-fonts usb xorg xpm xv zlib elibc_glibc kernel_linux
userland_GNU"
Unset:  CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS,
LINGUAS, PORTAGE_RSYNC_EXTRA_OPTS

------- Comment #1 From dawe 2006-06-08 07:24:56 0000 -------
Created an attachment (id=88687)
Proposal of patch to add the space before the bracket

------- Comment #2 From Donnie Berkholz 2006-07-19 23:18:43 0000 -------
Fixed in 7.1.2.
