Bug 305503 - sys-apps/usbutils-0.86-r1 - www.linux-usb.org/usb.ids is returned as a gzip stream
Summary: sys-apps/usbutils-0.86-r1 - www.linux-usb.org/usb.ids is returned as a gzip s...
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: Gentoo's Team for Core System packages
URL: http://www.linux-usb.org/usb.ids
Whiteboard:
Keywords:
Duplicates: 307955
Depends on:
Blocks:
 
Reported: 2010-02-17 10:04 UTC by SCox
Modified: 2010-03-08 03:48 UTC (History)
2 users (show)

See Also:
Package list:
Runtime testing required: ---


Description SCox 2010-02-17 10:04:35 UTC
/usr/sbin/update-usbids (sys-apps/usbutils) downloads usb.ids from http://www.linux-usb.org/usb.ids, but the file is being returned as a gzip stream.

Subsequently the grep fails to find the class entries ("^C ") in the compressed file, so the script reports a truncated file.

Reproducible: Always

Actual Results:  
sudo update-usbids
--2010-02-17 22:56:07--  http://www.linux-usb.org/usb.ids
Resolving www.linux-usb.org... 216.34.181.97
Connecting to www.linux-usb.org|216.34.181.97|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 150232 (147K) [text/plain]
Saving to: `/usr/share/misc/usb.ids.new'

100%[===========================================================>] 150,232     --.-K/s   in 0.03s

2010-02-17 22:56:07 (4.59 MB/s) - `/usr/share/misc/usb.ids.new' saved [150232/150232]

update-usbids: missing class info, probably truncated file


Expected Results:  
/usr/share/misc/usb.ids should be replaced by the new version.

I have confirmed that the HTTP response header contains "Content-Encoding: gzip",
and that the following sequence of commands makes the file viewable:

mv usb.ids usb.ids.gz
gunzip usb.ids.gz
less usb.ids

The following is an updated script that handles the gzip format:
#!/bin/sh

# see also update-pciids.sh (fancier)

[ "$1" = "-q" ] && quiet="true" || quiet="false"

set -e
SRC="http://www.linux-usb.org/usb.ids"
DEST=/usr/share/misc/usb.ids

# if usb.ids is read-only (because the filesystem is read-only),
# then just skip this whole process.
if ! touch ${DEST} >/dev/null 2>&1 ; then
        ${quiet} || echo "${DEST} is read-only, exiting."
        exit 0
fi

if which wget >/dev/null 2>&1 ; then
        DL="wget -O $DEST.new.gz $SRC"
        ${quiet} && DL="$DL -q"
elif which lynx >/dev/null 2>&1 ; then
        DL="eval lynx -source $SRC >$DEST.new.gz"
else
        echo >&2 "update-usbids: cannot find wget nor lynx"
        exit 1
fi

if ! $DL ; then
        echo >&2 "update-usbids: download failed"
        rm -f $DEST.new.gz
        exit 1
fi

gunzip $DEST.new.gz

if ! grep >/dev/null "^C " $DEST.new ; then
        echo >&2 "update-usbids: missing class info, probably truncated file"
        exit 1
fi

if [ -f $DEST ] ; then
        mv $DEST $DEST.old
        # --reference is supported only by chmod from GNU coreutils, so let's ignore any errors
        chmod -f --reference=$DEST.old $DEST.new 2>/dev/null || true
fi
mv $DEST.new $DEST

${quiet} || echo "Done."
Comment 1 Jeroen Roovers (RETIRED) gentoo-dev 2010-02-18 02:33:40 UTC
Works for me:

>---<
astrid ~ # update-usbids
--2010-02-18 03:26:55--  http://www.linux-usb.org/usb.ids
Resolving www.linux-usb.org (www.linux-usb.org)... 216.34.181.97
Connecting to www.linux-usb.org (www.linux-usb.org)|216.34.181.97|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 389702 (381K) [text/plain]
Saving to: `/usr/share/misc/usb.ids.new'

100%[=============================================>] 389,702      211K/s   in 1.8s    

2010-02-18 03:27:01 (211 KB/s) - `/usr/share/misc/usb.ids.new' saved [389702/389702]

Done.
>---<

Do you have a special wgetrc or is lynx used instead of wget?
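
To help answer that question, here is a minimal sketch that reports which downloader the script would select. It mirrors the wget-then-lynx order of update-usbids; the function name pick_downloader is made up for illustration, and it uses `command -v` where the script uses `which`.

```shell
#!/bin/sh
# Hypothetical helper: report the downloader update-usbids would choose,
# following the same order as the script (wget first, then lynx).
pick_downloader() {
    if command -v wget >/dev/null 2>&1 ; then
        echo wget
    elif command -v lynx >/dev/null 2>&1 ; then
        echo lynx
    else
        echo none
    fi
}

echo "update-usbids would use: $(pick_downloader)"
```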
Comment 2 SpanKY gentoo-dev 2010-02-18 09:18:32 UTC
and it works fine for me

no `emerge --info`, no idea what's wrong with your local utils, no info ...
Comment 3 Samuli Suominen (RETIRED) gentoo-dev 2010-03-06 09:01:58 UTC
*** Bug 307955 has been marked as a duplicate of this bug. ***
Comment 4 Rebecca Menessec 2010-03-08 03:35:32 UTC
*sighs* Yes, I have a system-wide wgetrc that was specifically crafted to mimic Firefox as closely as possible for anti-anti-"robot" purposes.

(I don't do bulk spidering, but occasionally I need a file from a site that has heavy-handed "anti-leech" protection, and I want to retrieve the file with wget.)

Anyhow, the relevant problem is:

header = Accept-Encoding: gzip,deflate
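
For anyone else hitting this, a rough sketch of how to look for such a line in a wgetrc. The function name find_accept_encoding is hypothetical, and the two paths in the usage line are only the usual locations; the system-wide file may live elsewhere depending on how wget was built.

```shell
#!/bin/sh
# Hypothetical helper: print every "header = Accept-Encoding: ..." line found
# in the given wgetrc files. Returns 0 if at least one match was found.
find_accept_encoding() {
    found=1
    for rc in "$@" ; do
        # Skip files that do not exist or cannot be read.
        [ -r "$rc" ] || continue
        # Match case-insensitively and allow optional whitespace;
        # commented-out lines (starting with #) do not match.
        if grep -i -E '^[[:space:]]*header[[:space:]]*=[[:space:]]*Accept-Encoding[[:space:]]*:' "$rc" ; then
            found=0
        fi
    done
    return "$found"
}

# Assumed locations; adjust for the local installation.
find_accept_encoding /etc/wgetrc "$HOME/.wgetrc" || echo "no Accept-Encoding header set in the checked files"
```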

Comment 5 SpanKY gentoo-dev 2010-03-08 03:48:39 UTC
i'm not sure there's a flag to wget to force disable that ...

wonder if we could tweak the code to do `file` on the file it downloaded and use that to determine whether we need to gunzip it manually ...

also, it might be useful to run wget with the option --header='Accept-Encoding: gzip,deflate' as this seems to cut the d/l size in half ...
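
A rough sketch of that detection idea, not a tested patch for update-usbids. Instead of calling `file`, it checks the two gzip magic bytes (0x1f 0x8b) with `od`, which avoids an extra dependency. The function names is_gzip and maybe_gunzip are made up for illustration. Unlike an unconditional gunzip, this leaves an uncompressed download untouched, so it should work whether or not the server compresses the response.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the file starts with the gzip magic bytes 1f 8b.
is_gzip() {
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Hypothetical helper: decompress $1 in place only when it is gzip data.
# Plain-text files are left as they are.
maybe_gunzip() {
    if is_gzip "$1" ; then
        # Read via stdin so gzip does not care about the file suffix.
        if gzip -dc < "$1" > "$1.tmp" ; then
            mv "$1.tmp" "$1"
        else
            rm -f "$1.tmp"
            return 1
        fi
    fi
}

# Assumed call site in the script, between the download and the grep check:
#   maybe_gunzip "$DEST.new"
```

With this in place the download could keep the $DEST.new name, and the existing grep for "^C " would still serve as the truncation check.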