Summary: | dev-python/twisted consistently stomps on nfs's UDP port | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Mark Glines <mark-gentoo> |
Component: | New packages | Assignee: | Python Gentoo Team <python> |
Status: | RESOLVED UPSTREAM | ||
Severity: | normal | CC: | djc, lordvan |
Priority: | High | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- |
Description
Mark Glines
2007-04-24 01:50:35 UTC
Portage 2.1.2.2 (!../usr/local/portage/kuro-2006-08-26/profiles/kurobox/, gcc-4.1.1, glibc-2.3.6-r5, 2.6.20-kuroboxHG ppc)
=================================================================
System uname: 2.6.20-kuroboxHG ppc 82xx
Gentoo Base System release 1.12.9
Timestamp of tree: Fri, 30 Mar 2007 14:30:01 +0000
ccache version 2.4 [enabled]
dev-lang/python: 2.4.3-r4
dev-python/pycrypto: 2.0.1-r5
dev-util/ccache: 2.4-r6
sys-apps/sandbox: 1.2.17
sys-devel/autoconf: 2.13, 2.61
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10
sys-devel/binutils: 2.17
sys-devel/gcc-config: 1.3.14
sys-devel/libtool: 1.5.22
virtual/os-headers: 2.6.17-r2
ACCEPT_KEYWORDS="ppc"
AUTOCLEAN="yes"
CBUILD="powerpc-unknown-linux-gnu"
CFLAGS="-O2 -mcpu=603e -fno-strict-aliasing -pipe -fsigned-char"
CHOST="powerpc-unknown-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf /etc/revdep-rebuild /etc/terminfo"
CXXFLAGS="-O2 -mcpu=603e -fno-strict-aliasing -pipe -fsigned-char"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig ccache distlocks fixpackages metadata-transfer parallel-fetch sandbox sfperms strict"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
LINGUAS="en"
MAKEOPTS="-j1"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_EXTRA_OPTS="--exclude-from=/etc/portage/rsync_excludes"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --filter=H_**/files/digest-*"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage /usr/local/portage-overlay/zugaina /usr/local/portage-overlay/gentoo-de"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="apache2 berkdb bitmap-fonts bzip2 cjk cli cracklib crypt cups curl dri eds encode ffmpeg gif gpm gstreamer hpn iconv isdnlog jpeg kuro libwww mailwrapper mbox midi milter mkv mp3 mpeg ncurses nls ogg oggvorbis pam pcre perl png ppc ppcsha1 ppds pppd python qt3 qt4 readline reflection samba sasl session slp spell spl ssl tcpd tiff truetype truetype-fonts type1-fonts unicode usb vorbis xml2 xorg xv zlib" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol" ELIBC="glibc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="en" USERLAND="GNU" VIDEO_CARDS="dummy fbdev v4l"
Unset: CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS

I tried to put that in the above message, and got: "Comment too long. If you need to post contents of files or logs, use the attachment feature instead.". It seems a bit misguided to have a size limit so low... I think "Include any detail that seems relevant — you are in very little danger of making your report too long by including too much information." (from http://www.debian.org/Bugs/Reporting) is a better philosophy here.

Hmm, odd. the klive tac file does not request a specific port, it just passes 0 (internet.UDPServer(0, ...) at the bottom of /usr/share/klive/klive.tac), which is passed on to bind(2), which means you should get a port from net.ipv4.ip_local_port_range according to udp(7), which is what happens here.

On my system /proc/sys/net/ipv4/ip_local_port_range says "32768 61000", and the twistd ends up on 32854/32855. What is that range set to on your system? Are you modifying it somehow, like through /etc/sysctl.conf?

(one thing though: I'm starting klive waaay after booting the box, so if something other than /etc/sysctl.conf messes with that range and you start klive earlier you may get different results.)
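
The port-0 behaviour described above can be checked without Twisted. What follows is a minimal sketch using only the Python standard library, assuming a Linux host that exposes /proc/sys/net/ipv4/ip_local_port_range. The helper names (parse_port_range, read_port_range, bind_ephemeral_udp) are made up for this illustration and are not part of klive or Twisted.

```python
import socket
from typing import Optional, Tuple

# Path named in the thread; assumed to exist only on Linux.
PORT_RANGE_PATH = "/proc/sys/net/ipv4/ip_local_port_range"


def parse_port_range(text: str) -> Tuple[int, int]:
    """Parse the two whitespace-separated integers the sysctl file contains."""
    low, high = text.split()
    return int(low), int(high)


def read_port_range(path: str = PORT_RANGE_PATH) -> Optional[Tuple[int, int]]:
    """Return (low, high) from the sysctl file, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return parse_port_range(handle.read())
    except OSError:
        return None


def bind_ephemeral_udp() -> int:
    """Bind a UDP socket to port 0 and return the port the kernel picked."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    print("ip_local_port_range:", read_port_range())
    print("kernel-assigned UDP port:", bind_ephemeral_udp())
```

Running it on both machines should show whether the assigned port falls inside the range each kernel reports, which is the comparison being asked for here.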
(In reply to comment #2)
> On my system /proc/sys/net/ipv4/ip_local_port_range says "32768 61000", and the
> twistd ends up on 32854/32855. What is that range set to on your system? Are
> you modifying it somehow, like through /etc/sysctl.conf?

That's a good thought. On my box, it says "2048 4999". It is not being modified by anything in /etc/ that I can see, though I think *something* must be modifying it at boot time. A recursive grep for 4999 in /etc returns nothing. My /etc/sysctl.conf only has things related to rp_filter.

> (one thing though: I'm starting klive waaay after booting the box, so if
> something other than /etc/sysctl.conf messes with that range and you start
> klive earlier you may get different results.)

Well, klive and nfs are both started at boot time here. Klive seems to be started before nfs, therefore it gets the port. But that isn't really klive's fault, or twisted's. :)

Ok, I've tracked it down. This is interesting. The kernel adjusts the default values for these *on the fly*, depending on the system's available memory. The machine I'm having this problem on has 128M of memory; my other system (which also runs klive and nfs, but doesn't have this problem) has 1.5G of memory.

The default values are "1024 4999". Then some kernel code (tcp_init() in linux/net/ipv4/tcp.c) adjusts them according to the system's memory size:

    /* Try to be a bit smarter and adjust defaults depending
     * on available memory.
     */
    for (order = 0; ((1 << order) << PAGE_SHIFT) <
            (tcp_hashinfo.bhash_size * sizeof(struct inet_bind_hashbucket));
            order++)
        ;
    if (order >= 4) {
        sysctl_local_port_range[0] = 32768;
        sysctl_local_port_range[1] = 61000;
        tcp_death_row.sysctl_max_tw_buckets = 180000;
        sysctl_tcp_max_orphans = 4096 << (order - 4);
        sysctl_max_syn_backlog = 1024;
    } else if (order < 3) {
        sysctl_local_port_range[0] = 1024 * (3 - order);
        tcp_death_row.sysctl_max_tw_buckets >>= (3 - order);
        sysctl_tcp_max_orphans >>= (3 - order);
        sysctl_max_syn_backlog = 128;
    }

"smarter?"
Maybe, maybe not. "unexpected" is a better word. I'll override these defaults by putting some sane values into my /etc/sysctl.conf, which should solve my klive/nfs problem. Maybe Gentoo should consider having these entries in /etc/sysctl.conf by default?

Thanks again for your help. With the exception of insane sysctl defaults, it sounds like this bug is invalid (not klive's fault, not twisted's fault).

Mark

Hmm. I think the underlying issue is that while the kernel respects the first 1024 ports as "privileged" as described in ip(7) that's not actually enough ports for common services (see /etc/services, there's a fair number of ports "reserved" there that are a lot higher up). Perhaps it would make sense to move that range a bit higher up in the default sysctl.conf, but I don't know if that's a bad idea for performance or memory usage reasons on machines with little ram. I'll ask around a bit or try to figure out what that kernel code adjusting the range is based on.

Not closing yet, since it should at least be possible to pick a less popular starting point for the range (nfs is not exactly uncommon, moving the start of the range a few dozen ports up might prevent problems).

(In reply to comment #4)
> Perhaps it would make sense to move
> that range a bit higher up in the default sysctl.conf, but I don't know if
> that's a bad idea for performance or memory usage reasons on machines with
> little ram. I'll ask around a bit or try to figure out what that kernel code
> adjusting the range is based on.

I think moving the range up shouldn't affect memory usage, because the kernel will be passing around unsigned 16-bit integers for ports, no matter where in the 16-bit integer space the range falls. The kernel code which chooses the default seems to be more concerned with how wide the range is, rather than exactly where it falls. So with that in mind, such a sysctl setting would have been very useful for me, as it would have prevented this problem.
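
To make the effect of the tcp_init() snippet quoted earlier in this thread easier to see, here is a rough Python model of just its port-range branch, plus a small check for service ports that land inside the resulting range. This is a sketch under stated assumptions, not kernel code: the function names are invented for illustration, the "1024 4999" starting defaults are taken from the thread, SAMPLE_SERVICES is a made-up excerpt in /etc/services format, and 2049 is the well-known NFS port.

```python
from typing import List, Tuple


def default_local_port_range(order: int) -> Tuple[int, int]:
    """Model the port-range part of the quoted tcp_init() branch."""
    low, high = 1024, 4999  # compiled-in defaults, per the thread
    if order >= 4:
        low, high = 32768, 61000
    elif order < 3:
        low = 1024 * (3 - order)
    return low, high


def services_in_range(services_text: str, low: int, high: int) -> List[Tuple[str, int, str]]:
    """List (name, port, proto) entries whose port lies within [low, high]."""
    hits = []
    for line in services_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if len(fields) < 2 or "/" not in fields[1]:
            continue
        port_text, proto = fields[1].split("/", 1)
        if port_text.isdigit() and low <= int(port_text) <= high:
            hits.append((fields[0], int(port_text), proto))
    return hits


# Hypothetical excerpt in /etc/services format, for illustration only.
SAMPLE_SERVICES = """\
ssh   22/tcp
nfs   2049/tcp
nfs   2049/udp   # the port this report is about
"""

if __name__ == "__main__":
    for order in range(5):
        low, high = default_local_port_range(order)
        print(order, (low, high), services_in_range(SAMPLE_SERVICES, low, high))
```

In this model the "2048 4999" range reported on the 128M machine corresponds to order == 1, and the NFS port sits one above the bottom of that range, which fits the collision described in this report.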
FYI, I just submitted a patch to the kernel guys about this, to start all port ranges at 32768 and only modify the end of the range, not the start. We'll see if they pick up the patch. http://lkml.org/lkml/2007/5/11/380

Thanks for doing this, can you mention in here if that gets accepted/rejected? netdev and lkml look a bit too high-volume for me to track :)

(In reply to comment #6)
> Thanks for doing this, can you mention in here if that gets accepted/rejected?
> netdev and lkml look a bit too high-volume for me to track :)

Looks like it's gonna get ignored. I'll update this ticket if it does get accepted.

Mark

(In reply to comment #6)
> Thanks for doing this, can you mention in here if that gets accepted/rejected?
> netdev and lkml look a bit too high-volume for me to track :)

Looks like they thought about it for a while and eventually accepted it - http://marc.info/?l=linux-netdev&m=118065144412426&w=2

Took me a while to notice, sorry :)

Mark

hmm is it just me or is the summary for this just about wrong ?

(In reply to comment #9)
> hmm is it just me or is the summary for this just about wrong ?

Can you be more specific?

Since it looks like this has been solved upstream, can we close this bug? (And do you know in which kernel release your fix ended up going?)

Yes, the patch has been merged upstream and this bug is closeable. Thanks for the bump. The patch is in mainline kernel 2.6.22 and above.