mirrorselect crashes, sometimes:

````
pi4 ~ # mirrorselect -s 5 -S -R 'North America'
 * Using url: https://api.gentoo.org/mirrors/distfiles.xml
 * Limiting test to "region=North America" hosts.
 * Limiting test to https hosts.
 * Downloading a list of mirrors... Got 251 mirrors.
 * Using netselect to choose the top 5 mirrors...Done.
Traceback (most recent call last):
  File "/usr/lib/python-exec/python3.10/mirrorselect", line 55, in <module>
    MirrorSelect().main(sys.argv)
  File "/usr/lib/python3.10/site-packages/mirrorselect/main.py", line 469, in main
    self.change_config(
  File "/usr/lib/python3.10/site-packages/mirrorselect/main.py", line 107, in change_config
    hosts[i] = hosts[i].decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc5 in position 29: invalid continuation byte
````

Reproducible: Sometimes

Steps to Reproduce:
1. mirrorselect -s 5 -S -R 'North America'

Note this is a Raspberry Pi 3B+, whose name is pi4 because it is the 4th Pi I bought.

````
pi4 ~ # emerge --info app-portage/mirrorselect
Portage 3.0.61 (python 3.10.13-final-0, default/linux/arm/17.0, gcc-13, glibc-2.38-r9, 5.10.11-v7 armv7l)
=================================================================
System Settings
=================================================================
System uname: Linux-5.10.11-v7-armv7l-ARMv7_Processor_rev_4_-v7l-with-glibc2.38
KiB Mem: 995688 total, 66108 free
KiB Swap: 2097148 total, 1991932 free
Timestamp of repository gentoo: Sat, 13 Jan 2024 19:05:12 +0000
sh bash 5.1_p16-r6
ld GNU ld (Gentoo 2.40 p7) 2.40.0
app-misc/pax-utils: 1.3.7::gentoo
app-shells/bash: 5.1_p16-r6::gentoo
dev-build/make: 4.4.1-r1::gentoo
dev-lang/perl: 5.38.2-r1::gentoo
dev-lang/python: 3.10.13::gentoo, 3.11.7::gentoo, 3.12.1::gentoo
dev-lang/rust-bin: 1.71.1::gentoo
dev-util/cmake: 3.27.9::gentoo
dev-util/meson: 1.3.0-r2::gentoo
sys-apps/baselayout: 2.14-r1::gentoo
sys-apps/openrc: 0.48::gentoo
sys-apps/sandbox: 2.38::gentoo
sys-devel/autoconf: 2.71-r6::gentoo
sys-devel/automake: 1.16.5-r1::gentoo
sys-devel/binutils: 2.40-r9::gentoo
sys-devel/binutils-config: 5.5::gentoo
sys-devel/gcc: 12.3.1_p20230526::gentoo, 13.2.1_p20230826::gentoo
sys-devel/gcc-config: 2.11::gentoo
sys-devel/libtool: 2.4.7-r1::gentoo
sys-kernel/linux-headers: 6.1::gentoo (virtual/os-headers)
sys-libs/glibc: 2.38-r9::gentoo
Repositories:

gentoo
    location: /var/db/repos/gentoo
    sync-type: rsync
    sync-uri: rsync://rsync.namerica.gentoo.org/gentoo-portage
    priority: -1000
    volatile: False
    sync-rsync-extra-opts: --exclude ChangeLog* --delete-excluded
    sync-rsync-verify-metamanifest: yes
    sync-rsync-verify-jobs: 1
    sync-rsync-verify-max-age: 3

ACCEPT_KEYWORDS="arm"
ACCEPT_LICENSE="*"
CBUILD="armv7a-unknown-linux-gnueabihf"
CFLAGS="-march=armv8-a+crc -mtune=cortex-a53 -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -O2 -pipe"
CHOST="armv7a-unknown-linux-gnueabihf"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=armv8-a+crc -mtune=cortex-a53 -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -O2 -pipe"
DISTDIR="/var/cache/distfiles"
EMERGE_DEFAULT_OPTS="--keep-going --with-bdeps=y --backtrack=20"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GDK_PIXBUF_MODULE_FILE GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR XDG_STATE_HOME"
FCFLAGS="-O2"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs buildpkg-live config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2"
GENTOO_MIRRORS="http://gentoo.gossamerhost.com rsync://gentoo.gossamerhost.com/gentoo-distfiles/ http://gentoo.mirrors.tds.net/gentoo ftp://gentoo.netnitco.net/pub/mirrors/gentoo/source/ rsync://mirror.leaseweb.com/gentoo/"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LEX="flex"
LINGUAS="en en_US"
MAKEOPTS="-j2 -l3"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_EXTRA_OPTS="--exclude ChangeLog* --delete-excluded"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
PYTHONPATH="/usr/local/lib/python3.10/site-packages/"
SHELL="/bin/bash"
USE="acl adns aio apm arm bash-completion blake2 bzip2 caps cli cpudetection crypt dane dri fontconfig fortran gdbm gold gzip harfbuzz hddtemp http2 iconv ipv6 jpeg kmod ldns lm_sensors lz4 lzma ncurses neon nfs nfsv4 nfsv41 nginx nls openmp pam pcre png readline seccomp smp split-usr sqlite ssh ssl test-rust threads tools truetype tty-helpers udev unicode urandom usb vim-syntax wifi xattr zlib zstd" ADA_TARGET="gnat_2021" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_anon authn_dbm authn_file authz_dbm authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir env expires ext_filter file_cache filter headers include info log_config logio mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 ntrip navcom oceanserver oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 tsip tripmate tnt ublox" INPUT_DEVICES="libinput" KERNEL="linux" L10N="en en-US" LCD_DEVICES="bayrad cfontz glk hd44780 lb216 lcdm001 mtxorb text" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" NGINX_MODULES_HTTP="access auth_basic autoindex browser charset empty_gif fastcgi geo gzip limit_conn limit_req map memcached proxy referer rewrite scgi spdy split_clients ssi upstream_hash upstream_ip_hash upstream_keepalive upstream_least_conn upstream_zone userid uwsgi" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php8-1" POSTGRES_TARGETS="postgres15" PYTHON_SINGLE_TARGET="python3_10" PYTHON_TARGETS="python3_10 python3_9" RUBY_TARGETS="ruby31" VIDEO_CARDS="exynos fbdev omap dummy" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipp2p iface geoip fuzzy condition tarpit sysrq proto logmark ipmark dhcpmac delude chaos account"
Unset: ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LFLAGS, LIBTOOL, MAKE, MAKEFLAGS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS

=================================================================
Package Settings
=================================================================

app-portage/mirrorselect-2.4.0::gentoo was built with the following:
USE="ipv6 -test" PYTHON_TARGETS="python3_10 -python3_11"
FEATURES="usersync pkgdir-index-trusted merge-sync ipc-sandbox multilib-strict binpkg-logs xattr pid-sandbox ebuild-locks unknown-features-warn binpkg-dostrip buildpkg-live unmerge-orphans userfetch assume-digests parallel-fetch unmerge-logs fixlafiles distlocks news preserve-libs qa-unresolved-soname-deps usersandbox sandbox binpkg-docompress sfperms config-protect-if-modified network-sandbox userpriv protect-owned"
````
Seems to fail frequently. Here is -d9:

````
pi4 ~ # mirrorselect -s 5 -S -R 'North America' -d 9
main(); config_path = /etc/portage/make.conf
get_filesystem_mirrors(): config_path = /etc/portage/make.conf
get_filesystem_mirrors(): mirrorlist = ['https://mirror.reenigne.net/gentoo/', '\\', 'https://172.83.105.10/gentoo/', '\\', 'https://mirror.clarkson.edu/gentoo/', '\\', 'https://mirrors.mit.edu/gentoo-distfiles/', '\\', 'https://128.153.145.19/gentoo/']
get_filesystem_mirrors(): ignoring non-accessible mirror = \
get_filesystem_mirrors(): ignoring non-accessible mirror = \
get_filesystem_mirrors(): ignoring non-accessible mirror = \
get_filesystem_mirrors(): ignoring non-accessible mirror = \
get_filesystem_mirrors(): fsmirrors = []
using url: https://api.gentoo.org/mirrors/distfiles.xml
 * Using url: https://api.gentoo.org/mirrors/distfiles.xml
 * Limiting test to "region=North America" hosts.
 * Limiting test to https hosts.
getlist(): fetching https://api.gentoo.org/mirrors/distfiles.xml
 * Downloading a list of mirrors...
Enabled ssl certificate verification: True, for: https://api.gentoo.org/mirrors/distfiles.xml
Connector.connect_url(); headers = {'Accept-Charset': 'utf-8', 'User-Agent': 'Mirrorselect-2.4.0'}
Connector.connect_url(); connecting to opener
Connector.connect_url() HEADERS = {'Date': 'Sat, 13 Jan 2024 20:02:00 GMT', 'Content-Type': 'text/xml', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Vary': 'Accept-Encoding', 'Last-Modified': 'Sat, 06 Jan 2024 06:55:02 GMT', 'ETag': 'W/"6598f946-969b"', 'Expires': 'Sat, 06 Jan 2024 08:40:44 GMT', 'Cache-Control': 'max-age=3600', 'Access-Control-Allow-Origin': '*', 'X-77-NZT': 'EgwB1GYuBwHXSQMAAAwBj/QzEwH3mwMAAA', 'X-77-NZT-Ray': '74b3202c04a3896d38eca265d72fac33', 'X-Accel-Expires': '@1705178879', 'X-Accel-Date': '1705175279', 'X-77-Cache': 'HIT', 'X-77-Age': '1764', 'Content-Encoding': 'gzip', 'Server': 'CDN77-Turbo', 'X-Cache-LB': 'HIT', 'X-Age-LB': '841', 'X-77-POP': 'seattleUSWA'}
Connector.connect_url() Status_code = 200
New content downloaded for: https://api.gentoo.org/mirrors/distfiles.xml
 Got 251 mirrors.
Extractor(): fetched mirrors, 7 hosts after filtering
 * Using netselect to choose the top 5 mirrors...
netselect(): running "netselect -s5 https://mirror.csclub.uwaterloo.ca/gentoo-distfiles/ https://mirror.reenigne.net/gentoo/ https://gentoo.osuosl.org/ https://mirrors.mit.edu/gentoo-distfiles/ https://mirrors.rit.edu/gentoo/ https://mirror.clarkson.edu/gentoo/ https://mirror.servaxnet.com/gentoo/"
Done.
netselect(): returning [b'https://mirror.reenigne.net/gentoo/', b'https://172.83.105.10/gentoo/\x04', b'https://mirror.clarkson.edu/gentoo/', b'https://mirrors.mit.edu/gentoo-distfiles/', b'https://128.153.145.19/gentoo/\xf6v\xf8\xfa\xf6v\x19'] and {b'172': b'https://mirror.reenigne.net/gentoo/', b'207': b'https://172.83.105.10/gentoo/\x04', b'261': b'https://mirror.clarkson.edu/gentoo/', b'282': b'https://mirrors.mit.edu/gentoo-distfiles/', b'312': b'https://128.153.145.19/gentoo/\xf6v\xf8\xfa\xf6v\x19'}
Traceback (most recent call last):
  File "/usr/lib/python-exec/python3.10/mirrorselect", line 55, in <module>
    MirrorSelect().main(sys.argv)
  File "/usr/lib/python3.10/site-packages/mirrorselect/main.py", line 469, in main
    self.change_config(
  File "/usr/lib/python3.10/site-packages/mirrorselect/main.py", line 107, in change_config
    hosts[i] = hosts[i].decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 30: invalid start byte
pi4 ~ #
````

One problem appears to be this URL: b'https://128.153.145.19/gentoo/\xf6v\xf8\xfa\xf6v\x19'
https://128.153.145.19/gentoo//xf6v/xf8/xfa/xf6v/x19 https with an IPv4 address?? Bad certificate?? 404?? Why is this in the mirrorlist at all??
Another mirror with a bad cert: https://172.83.105.10/gentoo/
Oddly, mirrorselect was happy with that one and put it in my GENTOO_MIRRORS. Should I file another bug for mirrorselect allowing bad certs?

Another bad UTF-8: b'https://128.153.145.19/gentoo/\xf7v'
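For anyone who wants to poke at this without waiting for netselect to misbehave, here is a minimal sketch that feeds the byte strings from the -d 9 output through the same `.decode("utf-8")` call that the traceback points at. The helper name `try_decode` is made up for illustration and is not part of mirrorselect.

```python
# Byte strings copied from the "netselect(): returning [...]" line in the
# -d 9 output. try_decode() is a hypothetical helper, not mirrorselect code.
hosts = [
    b"https://mirror.reenigne.net/gentoo/",
    b"https://172.83.105.10/gentoo/\x04",
    b"https://128.153.145.19/gentoo/\xf6v\xf8\xfa\xf6v\x19",
]


def try_decode(raw):
    """Return (text, error); exactly one of the two is None."""
    try:
        return raw.decode("utf-8"), None
    except UnicodeDecodeError as exc:
        return None, exc


results = [try_decode(h) for h in hosts]

for raw, (text, err) in zip(hosts, results):
    if err is None:
        # repr() makes stray control characters such as \x04 visible.
        print("decoded:", repr(text))
    else:
        print("failed: ", raw, "->", err.reason, "at byte", err.start)
```

The second entry decodes without complaint because 0x04 is a valid (if useless) ASCII control character, which would explain why the 172.83.105.10 URL made it into GENTOO_MIRRORS while the third entry raised the exception.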
This is at least netselect, and possibly also mirrorselect's parsing, doing something weird.

The underlying mirror data doesn't contain *ANY* IPs:

$ curl https://api.gentoo.org/mirrors/distfiles.xml -sq |grep '<uri'
(read the output, I'm not going to repeat it here)

netselect in your command output is the key. Given a list of URLs, it should return those same URLs; it should NOT be returning the underlying IPs.

However, I reproduced this part:
```
$ netselect -s5 -t 1 https://mirror.reenigne.net/gentoo/ https://gentoo.osuosl.org/ https://mirror.clarkson.edu/gentoo/
  192 https://172.83.105.10/gentoo/
  197 https://mirror.reenigne.net/gentoo/
  220 https://128.153.145.19/gentoo/
```
172.83.105.10 is mirror.reenigne.net
128.153.145.19 is mirror.clarkson.edu

What's not clear is whether this is an intentional change in the behavior of netselect, or a bug introduced at some point in the past.

That leads us to the extra output on the end: \xf6v\xf8\xfa\xf6v\x19

I couldn't reproduce this when I called:
```
PYTHONPATH=. ./bin/mirrorselect -s 5 -S -R 'North America' -d 9 -o
...
 * Using netselect to choose the top 5 mirrors...
netselect(): running "netselect -s5 https://mirror.csclub.uwaterloo.ca/gentoo-distfiles/ https://mirror.reenigne.net/gentoo/ https://gentoo.osuosl.org/ https://mirrors.mit.edu/gentoo-distfiles/ https://mirrors.rit.edu/gentoo/ https://mirror.clarkson.edu/gentoo/ https://mirror.servaxnet.com/gentoo/"
Done.
netselect(): returning [b'https://mirror.reenigne.net/gentoo/', b'https://172.83.105.10/gentoo/', b'https://mirrors.mit.edu/gentoo-distfiles/', b'https://128.153.145.19/gentoo/', b'https://mirror.clarkson.edu/gentoo/'] and {b'134': b'https://mirror.reenigne.net/gentoo/', b'146': b'https://172.83.105.10/gentoo/', b'255': b'https://mirrors.mit.edu/gentoo-distfiles/', b'367': b'https://128.153.145.19/gentoo/', b'381': b'https://mirror.clarkson.edu/gentoo/'}
GENTOO_MIRRORS="https://mirror.reenigne.net/gentoo/ \
    https://172.83.105.10/gentoo/ \
    https://mirrors.mit.edu/gentoo-distfiles/ \
    https://128.153.145.19/gentoo/ \
    https://mirror.clarkson.edu/gentoo/"
```
So on that front I don't know, but I suspect it's also netselect being weird. netselect itself hasn't changed at the base upstream in *14* years. There are a few patches, but I'm wondering if it makes some bad assumptions about libc behavior that are no longer true.
Bad news: I reproduced the weird Unicode, but it's definitely a SOMETIMES bug, pointing to weirdness in netselect:

netselect(): running "netselect -s50 mirror.leaseweb.com:_044ce454 mirror.kumi.systems:_e98ecbd1 ftp.belnet.be:_24a832b8 mirror.telepoint.bg:_85363425 mirrors.daticum.com:_d9a76195 mirror.init7.net:_c7e45805 mirror.dkm.cz:_860f5c01 mirror.it4i.cz:_46687747 mirrors.dotsrc.org:_d341c0ab mirrors.ircam.fr:_ba12e285 mirrors.soeasyto.com:_70d3fe86 linux.rz.ruhr-uni-bochum.de:_7807a76b ftp.fau.de:_d20af173 ftp.agdsn.de:_7294e1d9 ftp-stud.hs-esslingen.de:_4a06b7ae mirror.eu.oneandone.net:_cdcf10b5 mirror.netcologne.de:_47784534 ftp.halifax.rwth-aachen.de:_17c7163c ftp.gwdg.de:_36afd488 ftp.tu-ilmenau.de:_9eeb5e2b ftp.uni-hannover.de:_cbc0e1cb packages.hs-regensburg.de:_d07183a1 ftp.uni-stuttgart.de:_c023fdd3 ftp.spline.inf.fu-berlin.de:_102a1354 mirror.netzwerge.de:_7c4e6a46 mirror.dogado.de:_d4171a12 quantum-mirror.hu:_f8ec96db gentoo.jss.hu:_727a28db ftp.heanet.ie:_6150fe6c gentoo.mirror.garr.it:_1e633d93 ftp.snt.utwente.nl:_a75c4d1b mirrors.evoluso.com:_4e1313a7 ftp.rnl.tecnico.ulisboa.pt:_cf66a5ba mirrors.ptisp.pt:_2277fcdc mirror1.sox.rs:_3184caa2 ftp.lysator.liu.se:_e7682a56 mirrors.tnonline.net:_6a96f98c mirror.wheel.sk:_08a87aaf repo.ifca.es:_b983eeb5 ftp.linux.org.tr:_6132f956 mirror.bytemark.co.uk:_739d0c3f mirrors.gethosted.online:_e7c68df1 www.mirrorservice.org:_48d9cf82"

Raw output b' 101 mirror.leaseweb.com:_044ce454\n 306 mirrors.ircam.fr:_ba12e285\n 322 129.102.1.37:_ba12e285\n 336 193.190.198.27:_24a832b8\n 340 137.226.34.46:_17c7163c\n 359 ftp.fau.de:_d20af173\n 379 mirror.bytemark.co.uk:_739d0c3f\n 386 131.188.12.211:_d20af173\n 391 mirrors.ptisp.pt:_2277fcdc\n 393 mirrors.dotsrc.org:_d341c0ab\n 396 [2001:41c8:20:5e6::150]:_739d0c3f\n 403 ftp-stud.hs-esslingen.de:_4a06b7ae\n 405 mirror1.sox.rs:_3184caa2\n 408 212.110.163.13:_739d0c3f\n 423 130.225.254.116:_d341c0ab\n 450 [2001:6b0:17:f0a0::fd]:_e7682a56*\x01\x04\xf9\n 452 129.143.116.10:_4a06b7ae@\x87\x80a\x13\x7f\n 452 ftp.lysator.liu.se:_e7682a56\n 456 80.68.83.150:_739d0c3f\n 463 141.30.235.39:_7294e1d9\n 470 130.236.254.253:_e7682a56\n 473 gentoo.jss.hu:_727a28db\n 473 130.185.80.122:_2277fcdc\n 481 130.236.254.251:_e7682a56\n 581 ftp.agdsn.de:_7294e1d9\n 618 88.218.137.65:_3184caa2\n 652 194.8.197.22:_47784534\n 1120 155.4.110.241:_6a96f98c\n 1464 mirror.netcologne.de:_47784534\n 1716 mirrors.evoluso.com:_4e1313a7\n'
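The `host:_xxxxxxxx` tags in that command line suggest one way to handle both symptoms at once: look only at the score and the tag on each output line, then map the tag back to the hostname that was passed in. The following is a rough sketch of that idea, not the code from the draft branch. The names `parse_tagged_output`, `ARGS`, `RAW` and `LINE_RE` are invented here, the sample data is a small excerpt of the log above, and it assumes a lower netselect score is better.

```python
import re

# Subset of the arguments passed to netselect in the log above.
ARGS = [
    "mirror.leaseweb.com:_044ce454",
    "ftp-stud.hs-esslingen.de:_4a06b7ae",
    "ftp.lysator.liu.se:_e7682a56",
]

# Excerpt of the raw output above, including the trailing junk bytes.
RAW = (
    b" 101 mirror.leaseweb.com:_044ce454\n"
    b" 450 [2001:6b0:17:f0a0::fd]:_e7682a56*\x01\x04\xf9\n"
    b" 452 129.143.116.10:_4a06b7ae@\x87\x80a\x13\x7f\n"
    b" 452 ftp.lysator.liu.se:_e7682a56\n"
)

# Score, then anything up to ":_" followed by 8 hex digits. The pattern is
# not anchored at the end, so bytes after the tag are simply ignored.
LINE_RE = re.compile(rb"^\s*(\d+)\s+\S*?:(_[0-9a-f]{8})")


def parse_tagged_output(raw, args):
    """Return [(host, best_score), ...] sorted by score, lowest first."""
    tag_to_host = {}
    for arg in args:
        host, _, tag = arg.rpartition(":")
        tag_to_host[tag] = host

    best = {}
    for line in raw.split(b"\n"):
        match = LINE_RE.match(line)
        if match is None:
            continue  # blank or unparseable line
        score = int(match.group(1))
        host = tag_to_host.get(match.group(2).decode("ascii"))
        if host is None:
            continue  # tag we never sent; skip rather than guess
        # Several addresses can share one tag; keep the lowest score.
        if host not in best or score < best[host]:
            best[host] = score
    return sorted(best.items(), key=lambda item: item[1])


ranked = parse_tagged_output(RAW, ARGS)
for host, score in ranked:
    print(score, host)
```

Because only the digits and the ASCII tag are ever decoded, the stray bytes never reach a UTF-8 decoder, and results reported as bare IPs fold back into the hostname they were resolved from.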
Good news: the underlying host/IP problem has a draft fix at: https://gitweb.gentoo.org/proj/mirrorselect.git/commit/?h=robbat2/netselect-tags
It doesn't fix the UTF-8 output, so sometimes it will work, and other times it will fail with UnicodeDecodeError.
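Until the stray bytes are dealt with at the source, one possible stopgap is to cut each returned entry at the first byte that is not printable ASCII before decoding it. This is only a sketch of that workaround under my own assumptions, not something taken from mirrorselect or the draft branch; `clean_host` and `PRINTABLE_PREFIX_RE` are hypothetical names.

```python
import re

# Longest leading run of printable, non-space ASCII bytes (0x21 to 0x7e).
PRINTABLE_PREFIX_RE = re.compile(rb"[\x21-\x7e]+")


def clean_host(raw):
    """Keep the printable-ASCII prefix of a netselect result entry.

    Limitation: junk that happens to begin with printable bytes (for
    example the '*' and '@' seen in the raw output above) survives this
    filter, so this reduces the crashes without guaranteeing clean URLs.
    """
    match = PRINTABLE_PREFIX_RE.match(raw)
    return match.group(0).decode("ascii") if match else ""


# Entries copied from the debug output earlier in this report.
samples = [
    b"https://128.153.145.19/gentoo/\xf6v\xf8\xfa\xf6v\x19",
    b"https://172.83.105.10/gentoo/\x04",
    b"https://128.153.145.19/gentoo/\xf7v",
    b"https://mirror.clarkson.edu/gentoo/",
]

cleaned = [clean_host(s) for s in samples]
for raw, text in zip(samples, cleaned):
    print(raw, "->", text)
```

Decoding as ASCII after the regex match cannot raise, since every byte that gets through is in the printable range, so the UnicodeDecodeError goes away even when the junk does not.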