Bug 259420 - app-dicts/aspell-es-0.50.2: can not extract with either "tar" or "pax" - hence, emerge fails
Status: RESOLVED TEST-REQUEST
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: x86 Linux
Importance: High normal
Assignee: Spell checking utilities and dictionaries -- related bugs (OBSOLETE)
Reported: 2009-02-17 22:13 UTC by Ashu Tiwary
Modified: 2012-10-09 18:16 UTC

Description Ashu Tiwary 2009-02-17 22:13:18 UTC
When trying to "emerge app-dicts/aspell-es", the unpack step fails: the package tarball (tar/bz2) contains a file whose name includes an accented character, and this causes the untar to fail.

Reproducible: Always

Steps to Reproduce:
1. emerge app-dicts/aspell-es

Actual Results:  
>>> Unpacking source...
>>> Unpacking aspell-es-0.50-2.tar.bz2 to /mnt/portage/tmp/portage/app-dicts/aspell-es-0.50.2/work
tar: aspell-es-0.50-2/espa\361ol.alias: Cannot open: Invalid argument
tar: Exiting with failure status due to previous errors
 *
 * ERROR: app-dicts/aspell-es-0.50.2 failed.
 * Call stack:
 *               ebuild.sh, line   49:  Called src_unpack
 *             environment, line  121:  Called _eapi0_src_unpack
 *               ebuild.sh, line  602:  Called unpack 'aspell-es-0.50-2.tar.bz2'
 *               ebuild.sh, line  380:  Called die
 * The specific snippet of code:
 *                                      assert "$myfail"
 *  The die message:
 *   failure unpacking aspell-es-0.50-2.tar.bz2
 *
 * If you need support, post the topmost build error, and the call stack if relevant.
 * A complete build log is located at '/mnt/portage/logs/app-dicts:aspell-es-0.50.2:20090217-203042.log'.
 * The ebuild environment file is located at '/mnt/portage/tmp/portage/app-dicts/aspell-es-0.50.2/temp/environment'.
 *

Expected Results:  
Successful emerge

I've tried both tar and pax, and I've tried changing my locale (LC_ALL) to "es_ES", "C", etc. Nothing gets past the accented character in the filename.
Comment 1 Ashu Tiwary 2009-02-17 22:17:51 UTC
Portage 2.1.6.7 (default/linux/x86/2008.0, gcc-4.3.3, glibc-2.9_p20081201-r1, 2.6.28-gentoo-r1 i686)
=================================================================
System uname: Linux-2.6.28-gentoo-r1-i686-Intel-R-_Core-TM-2_Duo_CPU_T7700_@_2.40GHz-with-glibc2.0
Timestamp of tree: Tue, 17 Feb 2009 00:45:02 +0000
app-shells/bash:     3.2_p48-r1
dev-java/java-config: 2.1.7
dev-lang/python:     2.5.4-r2
sys-apps/baselayout: 2.0.0
sys-apps/openrc:     0.4.3-r1
sys-apps/sandbox:    1.3.7
sys-devel/autoconf:  2.13, 2.63
sys-devel/automake:  1.5, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.2
sys-devel/binutils:  2.19.1
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   2.2.6a
virtual/os-headers:  2.6.28-r1
ACCEPT_KEYWORDS="x86 ~x86"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=native -mmmx -msse -msse2 -msse3 -mssse3 -fomit-frame-pointer -mfpmath=sse"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /var/spool/torque"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/eselect/postgresql /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c /etc/udev/rules.d"
CXXFLAGS="-O2 -pipe -march=native -mmmx -msse -msse2 -msse3 -mssse3 -fomit-frame-pointer -mfpmath=sse"
DISTDIR="/mnt/portage/distfiles"
FEATURES="distlocks fixpackages metadata-transfer parallel-fetch protect-owned sandbox sfperms strict unmerge-orphans userfetch userpriv usersandbox"
GENTOO_MIRRORS="ftp://ftp.gtlib.gatech.edu/pub/gentoo ftp://mirror.iawnet.sandia.gov/pub/gentoo/ ftp://gentoo.mirrors.pair.com/ ftp://gentoo.mirrors.tds.net/gentoo ftp://gentoo.netnitco.net/pub/mirrors/gentoo/source/ ftp://mirror.datapipe.net/gentoo ftp://mirror.mcs.anl.gov/pub/gentoo/ ftp://gentoo.cites.uiuc.edu/pub/gentoo/ ftp://distro.ibiblio.org/pub/linux/distributions/gentoo/ "
LANG="C"
LC_ALL="C"
LDFLAGS="-Wl,--as-needed"
LINGUAS="en en_US en_GB en_IN hi hi_IN de de_DE es es_ES es_MX fr fr_FR it it_IT ja ja_JP ko ko_KR ru ru_RU zh zh_CN zh_HK zh_TW"
MAKEOPTS="-j3"
PKGDIR="/mnt/portage/packages"
PORTAGE_RSYNC_EXTRA_OPTS="--timeout=500"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/mnt/portage/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/mnt/portage/local /usr/local/portage/layman/jokey /usr/local/portage/layman/java-overlay /usr/local/portage/layman/vmware"
SYNC="rsync://rsync21.us.gentoo.org/gentoo-portage"
USE="X acl berkdb bzip2 cli cracklib crypt cups dbus doc dri examples fortran gdbm gpm hal htmlhandbook iconv ipv6 isdnlog midi mmx mudflap ncurses nls nptl nptlonly openmp pam pcre perl pppd python readline reflection sample session source spl sse sse2 sse3 ssl ssse3 sysfs tcpd test unicode x86 xorg zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias asis auth_basic auth_digest authn_alias authn_anon authn_dbd authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cern_meta charset_lite dav dav_fs dav_lock dbd deflate dir disk_cache dumpio env expires ext_filter file_cache filter headers ident imagemap include info log_config log_forensic logio mem_cache mime mime_magic negotiation proxy proxy_ajp proxy_balancer proxy_connect proxy_ftp proxy_http rewrite setenvif speling status substitute unique_id userdir usertrack version vhost_alias" APACHE2_MPMS="worker" CAMERAS="adc65 agfa-cl20 aox barbie canon casio clicksmart310 digigr8 digita dimera directory enigma13 fuji gsmart300 hp215 iclick jamcam jd11 kodak konica largan lg_gsm mars minolta mustek panasonic pccam300 pccam600 polaroid ptp2 ricoh samsung sierra sipix smal sonix sonydscf1 sonydscf55 soundvision spca50x sq905 stv0674 stv0680 sx330z template toshiba agfa_cl20 casio_qv dimagev dimera3500 kodak_dc120 kodak_dc210 kodak_dc240 kodak_dc3200 kodak_ez200 konica_qm150 panasonic_coolshot panasonic_dc1000 panasonic_dc1580 panasonic_l859 polaroid_pdc320 polaroid_pdc640 polaroid_pdc700 ricoh_g3 sipix_blink sipix_blink2 sipix_web2 sony_dscf1 sony_dscf55 toshiba_pdrm11 jl2005a topfield" ELIBC="glibc" INPUT_DEVICES="evdev joystick keyboard mouse synaptics vmmouse void" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="en en_US en_GB en_IN hi hi_IN de de_DE es es_ES es_MX fr fr_FR it it_IT ja ja_JP ko ko_KR ru ru_RU zh zh_CN zh_HK zh_TW" LIRC_DEVICES="all" NETBEANS_MODULES="apisupport harness ide java nb cnd groovy gsf identity j2ee mobility php profiler soa visualweb webcommon websvccommon xml" USERLAND="GNU" VIDEO_CARDS="nvidia nv vmware fbdev v4l vesa vga"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS
Comment 2 Wormo (RETIRED) gentoo-dev 2009-02-18 22:21:46 UTC
What filesystem are you using for /mnt/portage/tmp/portage/ ? I suspect the filesystem is refusing this filename since both tar and pax failed.
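
One way to test that suspicion directly is to ask the filesystem to create a file whose name contains the raw byte 0xF1 (the \361 in the tar error), without going through tar at all. The following is a minimal sketch in Python, not something that was run for this report. `try_create` is a hypothetical helper, and the directory to probe is an assumption: the demo uses a temporary directory, whereas on the affected system it would be a directory under /mnt/portage/tmp.

```python
import errno
import os
import tempfile


def try_create(directory, name):
    """Try to create an empty file called `name` (bytes) in `directory`.

    Returns None on success, or the symbolic errno name (for example
    'EINVAL') if the filesystem refuses the name.
    """
    path = os.path.join(os.fsencode(directory), name)
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    os.unlink(path)  # clean up the probe file again
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as probe_dir:
        candidates = (
            b"espa\xc3\xb1ol.alias",  # name encoded as UTF-8
            b"espa\xf1ol.alias",      # name with the single raw byte 0xF1
        )
        for name in candidates:
            print(name, "->", try_create(probe_dir, name))
```

If the second name is refused while the first is accepted, the filesystem (with its current mount options) is rejecting the byte sequence itself, independently of tar or pax.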
Comment 3 Ashu Tiwary 2009-02-19 05:08:00 UTC
I'm using jfs w/ mount options of "iocharset=utf8,noatime" - I've had no trouble with creation of files with accents in the names:
=====
shooze tst # cd /mnt/portage/tmp
shooze tmp # df -m .
Filesystem           1M-blocks      Used Available Use% Mounted on
/dev/mapper/rootvg-portage
                        102365     38223     64143  38% /mnt/portage
shooze tmp # mount | grep portage
/dev/mapper/rootvg-portage on /mnt/portage type jfs (rw,noatime,iocharset=utf8)
shooze tmp # touch héáøøó
shooze tmp # ls -l
total 64
drwxrwxr-x 2 portage portage     1 2009-02-18 19:25 binpkgs
-rw-r--r-- 1 root    root        0 2009-02-19 04:18 héáøøó
drwxrwxr-x 4 portage portage 57344 2009-02-19 04:12 portage

shooze tmp # tar tjvf /mnt/portage/distfiles/aspell-es-0.50-2.tar.bz2
drwxr-x--- kevina/kevina     0 2002-08-28 07:54 aspell-es-0.50-2/
drwx------ kevina/kevina     0 2001-06-13 04:12 aspell-es-0.50-2/doc/
-rw------- kevina/kevina   258 2001-06-13 04:12 aspell-es-0.50-2/doc/README
-rw------- kevina/kevina   355 2002-08-26 03:02 aspell-es-0.50-2/info
-rw------- kevina/kevina  2013 2002-08-28 07:54 aspell-es-0.50-2/README
-rwx------ kevina/kevina  2424 2002-08-26 04:55 aspell-es-0.50-2/configure
-rw------- kevina/kevina   486 2001-06-13 03:57 aspell-es-0.50-2/Copyright
-rw------- kevina/kevina    70 2002-08-28 07:54 aspell-es-0.50-2/es.multi
-rw------- kevina/kevina   104 2002-08-28 07:54 aspell-es-0.50-2/es.dat
-rw------- kevina/kevina 1678454 2001-06-13 04:09 aspell-es-0.50-2/es.cwl
-rw------- kevina/kevina    1340 2002-08-28 07:54 aspell-es-0.50-2/Makefile.pre
-rw-r----- kevina/kevina      72 2002-08-28 07:54 aspell-es-0.50-2/espa\361ol.alias
-rw-r----- kevina/kevina      72 2002-08-28 07:54 aspell-es-0.50-2/spanish.alias
-rw-r----- kevina/kevina      72 2002-08-28 07:54 aspell-es-0.50-2/esponol.alias
-rw------- kevina/kevina   18009 2002-08-28 07:54 aspell-es-0.50-2/COPYING

shooze tmp # tar xjf /mnt/portage/distfiles/aspell-es-0.50-2.tar.bz2
tar: aspell-es-0.50-2/espa\361ol.alias: Cannot open: Invalid argument
tar: Exiting with failure status due to previous errors

shooze tmp # touch español
shooze tmp # ls -l
total 68
drwxr-x--- 3     501     501    96 2002-08-28 07:54 aspell-es-0.50-2
drwxrwxr-x 2 portage portage     1 2009-02-18 19:25 binpkgs
-rw-r--r-- 1 root    root        0 2009-02-19 04:24 español
-rw-r--r-- 1 root    root        0 2009-02-19 04:18 héáøøó
drwxrwxr-x 4 portage portage 57344 2009-02-19 04:12 portage

shooze tmp # echo $LC_ALL $LANG
en_US.UTF-8 en_US.UTF-8
=====

As shown above, I have no problem creating files with accented characters in their names. Judging from the "tar" output, the problem is with the file "aspell-es-0.50-2/espa\361ol.alias": \361 = 0xF1 = 241, which is "±" in codepage 850/437; in these codepages, the correct value for the glyph "ñ" would be 0xA4 (= \244 = 164).

The Unicode code point for "ñ" is U+00F1; its UTF-8 encoding is the two bytes 0xC3 0xB1.
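
The byte values discussed above can be checked mechanically. This is a small illustrative sketch in Python, added for reference and not part of the original report; it also includes ISO-8859-1, the encoding that comment 6 later identifies for the archive. The variable names are arbitrary.

```python
import unicodedata

raw = bytes([0o361])  # the octal escape \361 from the tar output
n_tilde = "\u00f1"    # LATIN SMALL LETTER N WITH TILDE

# Same byte written in hex and decimal.
print(raw.hex(), raw[0])

# What that single byte means under each single-byte encoding.
for codec in ("cp437", "cp850", "iso-8859-1"):
    print(codec, unicodedata.name(raw.decode(codec)))

# How the letter itself is encoded.
print("cp437:", n_tilde.encode("cp437").hex())
print("utf-8:", n_tilde.encode("utf-8").hex())

# A lone 0xF1 byte is not a complete UTF-8 sequence.
try:
    raw.decode("utf-8")
    lone_byte_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_byte_is_valid_utf8 = False
print("0xF1 alone is valid UTF-8:", lone_byte_is_valid_utf8)
```

The byte 0xF1 decodes to PLUS-MINUS SIGN under cp437 and cp850, but to LATIN SMALL LETTER N WITH TILDE under ISO-8859-1, and on its own it is not valid UTF-8.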

Interestingly - I've been able to successfully extract using 7z - it works perfectly fine:

=====
shooze tmp # 7z l /mnt/portage/distfiles/aspell-es-0.50-2.tar.bz2

7-Zip  4.58 beta  Copyright (c) 1999-2008 Igor Pavlov  2008-05-05
p7zip Version 4.58 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,2 CPUs)

Listing archive: /mnt/portage/distfiles/aspell-es-0.50-2.tar.bz2

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2009-02-16 20:33:35                          157809  aspell-es-0.50-2.tar
------------------- ----- ------------ ------------  ------------------------
                                             157809  1 files, 0 folders
shooze tmp # 7z x /mnt/portage/distfiles/aspell-es-0.50-2.tar.bz2

7-Zip  4.58 beta  Copyright (c) 1999-2008 Igor Pavlov  2008-05-05
p7zip Version 4.58 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,2 CPUs)

Processing archive: /mnt/portage/distfiles/aspell-es-0.50-2.tar.bz2

Extracting  aspell-es-0.50-2.tar

Everything is Ok

Size:       1720320
Compressed: 157809
shooze tmp # ls -l
total 1744
-rw-r--r-- 1 root    root    1720320 2009-02-16 20:33 aspell-es-0.50-2.tar
drwxrwxr-x 2 portage portage       1 2009-02-18 19:25 binpkgs
-rw-r--r-- 1 root    root          0 2009-02-19 04:24 español
-rw-r--r-- 1 root    root          0 2009-02-19 04:18 héáøøó
drwxrwxr-x 4 portage portage   57344 2009-02-19 04:12 portage
shooze tmp # 7z l aspell-es-0.50-2.tar

7-Zip  4.58 beta  Copyright (c) 1999-2008 Igor Pavlov  2008-05-05
p7zip Version 4.58 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,2 CPUs)

Listing archive: aspell-es-0.50-2.tar

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2002-08-28 07:54:51                  0            0  aspell-es-0.50-2
2001-06-13 04:12:14                  0            0  aspell-es-0.50-2/doc
2001-06-13 04:12:14                258          258  aspell-es-0.50-2/doc/README
2002-08-26 03:02:47                355          355  aspell-es-0.50-2/info
2002-08-28 07:54:51               2013         2013  aspell-es-0.50-2/README
2002-08-26 04:55:53               2424         2424  aspell-es-0.50-2/configure
2001-06-13 03:57:44                486          486  aspell-es-0.50-2/Copyright
2002-08-28 07:54:51                 70           70  aspell-es-0.50-2/es.multi
2002-08-28 07:54:51                104          104  aspell-es-0.50-2/es.dat
2001-06-13 04:09:18            1678454      1678454  aspell-es-0.50-2/es.cwl
2002-08-28 07:54:51               1340         1340  aspell-es-0.50-2/Makefile.pre
2002-08-28 07:54:51                 72           72  aspell-es-0.50-2/español.alias
2002-08-28 07:54:51                 72           72  aspell-es-0.50-2/spanish.alias
2002-08-28 07:54:51                 72           72  aspell-es-0.50-2/esponol.alias
2002-08-28 07:54:51              18009        18009  aspell-es-0.50-2/COPYING
------------------- ----- ------------ ------------  ------------------------
                               1703729      1703729  13 files, 2 folders
shooze tmp # 7z x aspell-es-0.50-2.tar

7-Zip  4.58 beta  Copyright (c) 1999-2008 Igor Pavlov  2008-05-05
p7zip Version 4.58 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,2 CPUs)

Processing archive: aspell-es-0.50-2.tar

Extracting  aspell-es-0.50-2
Extracting  aspell-es-0.50-2/doc
Extracting  aspell-es-0.50-2/doc/README
Extracting  aspell-es-0.50-2/info
Extracting  aspell-es-0.50-2/README
Extracting  aspell-es-0.50-2/configure
Extracting  aspell-es-0.50-2/Copyright
Extracting  aspell-es-0.50-2/es.multi
Extracting  aspell-es-0.50-2/es.dat
Extracting  aspell-es-0.50-2/es.cwl
Extracting  aspell-es-0.50-2/Makefile.pre
Extracting  aspell-es-0.50-2/español.alias
Extracting  aspell-es-0.50-2/spanish.alias
Extracting  aspell-es-0.50-2/esponol.alias
Extracting  aspell-es-0.50-2/COPYING

Everything is Ok

Folders: 2
Files: 13
Size:       1703729
Compressed: 1720320
shooze tmp # ls -l aspell-es-0.50-2/
total 1700
-rw-r--r-- 1 root root    2424 2002-08-26 04:55 configure
-rw-r--r-- 1 root root   18009 2002-08-28 07:54 COPYING
-rw-r--r-- 1 root root     486 2001-06-13 03:57 Copyright
drwx------ 2 root root       8 2009-02-19 05:05 doc
-rw-r--r-- 1 root root 1678454 2001-06-13 04:09 es.cwl
-rw-r--r-- 1 root root     104 2002-08-28 07:54 es.dat
-rw-r--r-- 1 root root      70 2002-08-28 07:54 es.multi
-rw-r--r-- 1 root root      72 2002-08-28 07:54 español.alias
-rw-r--r-- 1 root root      72 2002-08-28 07:54 esponol.alias
-rw-r--r-- 1 root root     355 2002-08-26 03:02 info
-rw-r--r-- 1 root root    1340 2002-08-28 07:54 Makefile.pre
-rw-r--r-- 1 root root    2013 2002-08-28 07:54 README
-rw-r--r-- 1 root root      72 2002-08-28 07:54 spanish.alias
=====

So, it appears that 7z is correctly respecting Unicode (UTF-8), while tar / pax are not (and, indeed, are attempting to use codepage 437/850).
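
Because each tool displays the name differently, it can help to look at the bytes the archive actually stores, independent of any locale. The following is a minimal sketch using Python's standard `tarfile` module; `raw_member_names` and `make_sample_archive` are hypothetical helpers, and the demo builds a tiny stand-in archive instead of reading the real aspell-es tarball.

```python
import io
import os
import tarfile
import tempfile


def raw_member_names(archive_path):
    """Return each member name as the bytes stored in the tar header."""
    # surrogateescape preserves bytes that are not valid UTF-8, so encoding
    # the decoded name again yields the original bytes unchanged.
    with tarfile.open(archive_path, encoding="utf-8",
                      errors="surrogateescape") as tar:
        return [member.name.encode("utf-8", "surrogateescape")
                for member in tar.getmembers()]


def make_sample_archive(archive_path):
    """Write a stand-in archive whose member name holds the raw byte 0xF1."""
    data = b"sample\n"
    info = tarfile.TarInfo("sample/espa\u00f1ol.alias")
    info.size = len(data)
    with tarfile.open(archive_path, "w", format=tarfile.GNU_FORMAT,
                      encoding="iso-8859-1") as tar:
        tar.addfile(info, io.BytesIO(data))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "sample.tar")
        make_sample_archive(archive)
        print(raw_member_names(archive))
```

Pointing `raw_member_names` at the real distfile would show whether the name is stored as the single byte 0xF1 or as the UTF-8 pair 0xC3 0xB1.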
Comment 4 Peter Volkov (RETIRED) gentoo-dev 2009-03-03 10:58:57 UTC
What tar version do you have? If you have tar-1.21, then bug 252680 could be related...
Comment 5 Ashu Tiwary 2009-03-03 23:45:21 UTC
I am running 1.21-r1 (patched tar) - this includes the patch for the bug you've spoken of:
=====
ashu@liberte ~ $ ls -d /var/db/pkg/app-arch/tar*
/var/db/pkg/app-arch/tar-1.21-r1
=====

(same machine - I've just changed the hostname)

bunzip2 works fine - no error code:

=====
ashu@liberte /tmp/aspell $ cp /mnt/portage/distfiles/aspell-es-0.50-2.tar.bz2  .
ashu@liberte /tmp/aspell $ ls -l
total 156
-rw-r--r-- 1 ashu ashu 157809 2009-03-04 00:44 aspell-es-0.50-2.tar.bz2
ashu@liberte /tmp/aspell $ bunzip2 -c aspell-es-0.50-2.tar.bz2 >aspell-es-0.50-2.tar
ashu@liberte /tmp/aspell $ echo $?
0
ashu@liberte /tmp/aspell $ ls -l
total 1836
-rw-r--r-- 1 ashu ashu 1720320 2009-03-04 00:44 aspell-es-0.50-2.tar
-rw-r--r-- 1 ashu ashu  157809 2009-03-04 00:44 aspell-es-0.50-2.tar.bz2
ashu@liberte /tmp/aspell $ tar xf aspell-es-0.50-2.tar
tar: aspell-es-0.50-2/espa\361ol.alias: Cannot open: Invalid argument
tar: Exiting with failure status due to previous errors
ashu@liberte /tmp/aspell $ echo $?
2
=====

Comment 6 Ashu Tiwary 2009-08-08 18:58:12 UTC
I had my JFS filesystems, both "/mnt/portage" (my portage base directory) and "/usr", mounted with "iocharset=utf8". This seems to have been causing the problem: some filenames in the tar archive are stored in ISO-8859-1. I was able to work around the issue by remounting both filesystems with "iocharset=none".

I'm not sure where, or to whom, to direct this "bug". Is it simply an incompatibility between how JFS handles filename encoding when mounted with "iocharset=utf8" and how tar stores and restores filenames?
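
An alternative to remounting, sketched here under assumptions and not tested against the reporter's setup, is to tell the extracting tool that the names in the archive are ISO-8859-1, so that they are written out in the encoding the filesystem expects. The example uses Python's standard `tarfile` module; `extract_latin1_names` and `make_sample_archive` are hypothetical helpers, and the demo extracts a tiny stand-in archive, not the real distfile.

```python
import io
import os
import tarfile
import tempfile


def extract_latin1_names(archive_path, dest_dir):
    """Extract an archive, decoding member names as ISO-8859-1.

    Returns the member names as decoded text. Files are created using the
    platform's filesystem encoding (normally UTF-8 on current Linux systems).
    """
    # `encoding` controls how tarfile decodes names stored in non-pax headers.
    with tarfile.open(archive_path, encoding="iso-8859-1") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return names


def make_sample_archive(archive_path):
    """Write a stand-in archive whose member name holds the raw byte 0xF1."""
    data = b"sample\n"
    info = tarfile.TarInfo("sample/espa\u00f1ol.alias")
    info.size = len(data)
    with tarfile.open(archive_path, "w", format=tarfile.GNU_FORMAT,
                      encoding="iso-8859-1") as tar:
        tar.addfile(info, io.BytesIO(data))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "sample.tar")
        make_sample_archive(archive)
        out_dir = os.path.join(tmp, "out")
        os.mkdir(out_dir)
        for name in extract_latin1_names(archive, out_dir):
            print(ascii(name))
```

This only addresses manual extraction; Portage's own unpack step calls tar directly, so the remount described above remains the workaround that was actually confirmed in this report.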
Comment 7 Fabio Correa 2009-11-04 15:04:52 UTC
Do you use >=aspell-0.60? Then you will not be able to use this version of the dictionary. You might need the latest version of aspell-es; please see related Bug #291863 .
Comment 8 Pacho Ramos gentoo-dev 2012-10-09 18:16:32 UTC
(In reply to comment #7)
> Do you use >=aspell-0.60? Then you will not be able to use this version of
> the dictionary. You might need the latest version of aspell-es; please see
> related Bug #291863 .