Subject: [ANNOUNCE] haproxy-1.3.15.2 and 1.3.14.6 (major bug)
Date: Sat, 21 Jun 2008 23:11:52 +0200
Sender: Willy Tarreau

Hi all!

Alexander Staubo has run a benchmark on haproxy+mongrel during which he noticed an anomaly in the response time distribution when running with maxconn=1:

http://affectioncode.wordpress.com/2008/06/11/comparing-nginx-and-haproxy-for-web-applications/

My first analysis was that the problem was caused by "direct" requests (those with a server cookie) always being considered before the load-balanced ones. But while fixing this design idiocy, I discovered a real problem: it was perfectly possible for a fresh new request to be served immediately without passing through the queue, causing requests in the queue to be delayed for at least as long as the queue timeout, until they might eventually expire. Now *that* explains the horrible peaks on Alexander's graphs.

My problem was that this was a real misdesign, which could not be fixed by a three-line patch. So I spent the whole week reworking the queue management logic in a saner manner and running regression tests. I have backported the fix to both 1.3.15 and 1.3.14, carefully testing both of them. Since the logic is cleaner and clearer now, and given the time I have spent on this, I am quite confident that there is no regression. But I will not lie to you, it is a big patch so you have to apply it with care. Especially distro maintainers should wait at least 1 or 2 weeks before upgrading, "just in case", but they should upgrade eventually because their users are affected.

The good news is that not only does this fix a number of 503 errors and long response times when running with a low maxconn, but as an added bonus, the "redispatch" option is now naturally honoured when a server's maxqueue is reached, so it is no longer necessary to trade off between large queues and the risk of returning 503 errors.

I believe the people most affected are those running Ruby on Rails, because they often set maxconn to 1 on the servers, which increases the risk of the problem occurring. Those people should observe a notable improvement. Note that the measured response time among valid responses may increase, because all requests are now actually served; if so, it means the previous figure was lower only because some requests never reached the server and therefore took no time there. On the other hand, the new code requires less CPU power and fewer task wakeups than before, so users of high-traffic sites may notice slightly lower CPU usage.

Last, my friend Benoit has set up a reverse proxy cache on our dedicated server, so I have updated the DNS record for haproxy.1wt.eu to point to flx01.formilux.org. It should be faster to get updates now :-) Obviously, if you notice anything strange, please tell me. The cache is configured to keep objects for 24 hours by default, but you can force a reload if in doubt.

Please find updates here:

http://haproxy.1wt.eu/download/1.3/src/
http://haproxy.1wt.eu/download/1.3/bin/

Regards,
Willy

Reproducible: Always

Steps to Reproduce:
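For illustration, a minimal configuration of the affected kind (a sketch only; the proxy name, addresses, ports, and timeout values are placeholders, not taken from the report): each Mongrel backend is limited to one concurrent request with "maxconn 1", so excess requests wait in haproxy's queue, and "maxqueue" together with "option redispatch" bounds how long they can sit on a given server's queue.

    listen rails_app 0.0.0.0:8000
        mode http
        balance roundrobin
        # With the fixed versions, a queued request is re-dispatched to
        # another server instead of getting a 503 once this server's
        # maxqueue limit is reached (per the announcement above).
        option redispatch
        retries 3
        contimeout 5000
        clitimeout 30000
        srvtimeout 30000
        # One in-flight request per Mongrel (the Rails case described
        # above); up to 10 further requests may wait per server.
        server mongrel1 127.0.0.1:3001 maxconn 1 maxqueue 10 check
        server mongrel2 127.0.0.1:3002 maxconn 1 maxqueue 10 check

With the pre-fix queue logic, a fresh request arriving at a setup like this could be served immediately while already-queued requests kept waiting, which is exactly the latency-spike pattern visible in Alexander's benchmark.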
(In reply to comment #0)
> But I will not lie to you, it is a big patch so you have to
> apply it with care. Especially distro maintainers should wait at least
> 1 or 2 weeks before upgrading, "just in case", but they should upgrade
> eventually because their users are affected.

I will follow his advice and wait for 2 weeks.
Fixed in CVS.