Bug 119813 - ebookmerge 0.9 vs. Vodafone's transparent proxy
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Development
Hardware: All Linux
Importance: High minor
Assignee: José Alberto Suárez López (RETIRED)
Reported: 2006-01-21 06:52 UTC by Mathias Hasselmann
Modified: 2006-07-13 03:52 UTC
Description Mathias Hasselmann 2006-01-21 06:52:57 UTC
ebookmerge.sh fails to process the fetched ebook list when the list is retrieved through Vodafone's transparent proxy. Vodafone offers its UMTS customers the "service" of compressing all downloaded HTML documents. As a result, the book list from http://lidn.sourceforge.net/books_download.php, which ebookmerge expects to consist of multiple lines, is folded into one single long line (due to whitespace compression).
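
To illustrate the failure, here is a minimal sketch. The sample markup is an assumption inferred from the gawk/sed pipeline in the script (the real output of books_download.php may look different), the example.org URLs are placeholders, and plain awk stands in for gawk:

```shell
#!/bin/sh
# Same extraction steps as in ebookmerge.sh, with awk standing in for gawk.
extract_urls() {
    awk '/Location/{print $2}' | sed -e 's/<\/b>//' -e 's/<br>//'
}

# ASSUMPTION: hypothetical list format, one "Location" entry per line.
multi_line='<b>Location: http://example.org/a.tar.gz</b><br>
<b>Location: http://example.org/b.tar.gz</b><br>'

# Simulate the proxy's whitespace compression: newlines become spaces.
folded=$(printf '%s\n' "$multi_line" | tr '\n' ' ')

echo "unfolded list:"
printf '%s\n' "$multi_line" | extract_urls
echo "folded list:"
printf '%s\n' "$folded" | extract_urls
```

With the folded input, awk sees a single record, so only the first URL is printed and the rest of the list is silently lost.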

Possible workarounds:
a) Fix ebookmerge.sh to use a real HTML parser
b) Make sure that ebookmerge.sh fetches the pure, uncompressed list

Variant a) is far too complex to implement, so I'd suggest following variant b): tell ebookmerge.sh how to fetch the plain list. This is actually quite simple, because the script uses wget, which can be told to send additional HTTP request headers, "Cache-Control: no-cache" in our case.

So I'd like to ask you to patch ebookmerge.sh to run wget with the argument "--header 'Cache-Control: no-cache'" when fetching the list. As of release 0.9, line 136 of the script would have to be changed:

  then
      einfo "Dowloading list from http://lidn.sf.net..."
      cd ${EBDIR}
-     wget -q http://lidn.sourceforge.net/books_download.php
+     wget --header 'Cache-Control: no-cache' \
+          -q http://lidn.sourceforge.net/books_download.php
      cat books_download.php | gawk '/Location/{print $2}' | sed -e 's/<\/b>//' -e 's/<br>//' > ${EBDIR}/.urls.ebook
      rm books_download.php
      einfo "Dowloaded. Use ${BOLD}-l${NORMAL} for a list."
Comment 1 Jakub Moc (RETIRED) gentoo-dev 2006-01-21 06:57:23 UTC
(In reply to comment #0)
> Possible workarounds:
> a) Fix ebookmerge.sh to use a real HTML parser
> b) Make sure that ebookmerge.sh fetches the pure, uncompressed list

c/ Ask Vodafone to fix their crappy proxies or find another ISP. Please, don't file blocker bugs about problems that are definitely not caused by software in question.
Comment 2 Mathias Hasselmann 2006-03-16 04:38:44 UTC
> c/ Ask Vodafone to fix their crappy proxies or find another ISP. Please, don't
> file blocker bugs about problems that are definitely not caused by software in
> question.

They won't fix this, because it is a feature they provide for their mobile customers, and because their Windows software supports some undocumented trick to disable the proxy.

Choosing another ISP is not an option, as Vodafone is the only provider in this region (at the very border of Germany's capital city, Berlin) offering affordable Internet access. The only alternatives would be paying per minute via ISDN or leasing an even more expensive T1 line.

As the fix is easy (today wget even supports a --no-cache command line switch), I kindly ask you again to fix this issue in ebookmerge.sh. The list in question is generated by PHP without any caching support anyway, so adding --no-cache really doesn't waste any bandwidth (if you care about this).
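
A minimal sketch of how the fetch could look with that switch. The function name fetch_list and its arguments are hypothetical and not taken from ebookmerge.sh; only the wget options -q, -O and --no-cache are real:

```shell
#!/bin/sh
# Hypothetical helper, not part of ebookmerge.sh.
# $1: URL of the list, $2: file to store the download in.
fetch_list() {
    # --no-cache asks wget to send no-cache directives with the request,
    # so no explicit --header argument is needed.
    wget -q --no-cache -O "$2" "$1"
}
```

In the script this might be called as fetch_list http://lidn.sourceforge.net/books_download.php "${EBDIR}/books_download.php", leaving the rest of the gawk/sed processing unchanged.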
Comment 3 José Alberto Suárez López (RETIRED) gentoo-dev 2006-07-13 03:52:15 UTC
Fixed in 0.9.2; try -n.