Working on Bugzilla is very tiresome. Each page request takes up to several seconds.

The system is now:
-----------------------------------------------
CPU      4x 2.4GHz Opteron 2216
RAM      16GB
Storage  2x 250GB SATA2 RAID1(md)
https://wiki.gentoo.org/wiki/Project:Infrastructure/Servers/Gannet
-----------------------------------------------

I have no idea whether newer hardware, an optimized configuration, or both are required. But it is the slowest bug tracker I have ever seen, and a faster response would be important for everyone who is willing to assign or clean up bugs.

$ httpstat 'https://bugs.gentoo.org/bots.html'
Connected to 204.187.15.4:443 from x.x.x.x

HTTP/1.1 200 OK
Date: Wed, 03 Jun 2020 21:06:58 GMT
Server: Apache
Last-Modified: Sun, 26 Apr 2020 06:51:50 GMT
ETag: "1e65e-1366-5a42c086c5035"
Accept-Ranges: bytes
Content-Length: 4966
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8

  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[   1286ms   |      174ms     |     367ms     |       175ms       |        0ms       ]
             |                |               |                   |                  |
    namelookup:1286ms         |               |                   |                  |
                        connect:1460ms        |                   |                  |
                                    pretransfer:1827ms            |                  |
                                                      starttransfer:2002ms           |
                                                                                 total:2002ms

Thank you very much.
jstein: can you look at why your DNS resolver took 1.2 seconds first? The records have a 12 hour TTL, which should be long enough to stay in your cache most of the time. Repeated runs of httpstat SHOULD show it significantly reduced if the cache is working properly.

Past that, can you also report the latency to the bugs.g.o host from your system? It should be the same as the TCP connection time if everything is good.

The ABSOLUTELY best case you should get is something like:
| 0-1ms | (1x LATENCY) | (2x LATENCY) | (1x LATENCY) | 0-1ms |

My own system is:
DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[ 1ms | 5ms | 31ms | 7ms | 0ms ]

Latency is ~5.4ms
$ ping -w 3 -q -c 3 bugs.gentoo.org
PING bugs.gentoo.org(2607:fcc0:4:ffff::4 (2607:fcc0:4:ffff::4)) 56 data bytes

--- bugs.gentoo.org ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 5.035/5.321/5.487/0.203 ms

Lastly, to rule out anything SPECIFIC to the bugs system, can you also show httpstat to https://wiki.gentoo.org/ and https://forums.gentoo.org; they are hosted at the same facility.

For wiki, I get:
[ 0ms | 5ms | 32ms | 5ms | 1ms ]

For forums, I get:
[ 0ms | 5ms | 31ms | 513ms | 5ms ]
(you can see how phpBB takes ~500ms to build a response)
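The best-case rule of thumb above can be turned into a quick check. This is a minimal Python sketch, assuming the five phases that httpstat prints and a round-trip latency taken from ping; the names `best_case_phases` and `excess_over_best_case` are illustrative and not part of httpstat. The "0-1ms" phases are modelled as 0 for simplicity.

```python
# Best-case phase budget from the rule of thumb above:
# DNS ~0-1 ms (cached), TCP connect 1x RTT, TLS handshake 2x RTT,
# server processing 1x RTT, content transfer ~0-1 ms for a small page.
PHASES = ("dns", "tcp", "tls", "server", "transfer")
RTT_MULTIPLIER = {"dns": 0, "tcp": 1, "tls": 2, "server": 1, "transfer": 0}


def best_case_phases(rtt_ms):
    """Return the best-case duration (ms) of each phase for a given RTT."""
    return {phase: RTT_MULTIPLIER[phase] * rtt_ms for phase in PHASES}


def excess_over_best_case(measured_ms, rtt_ms):
    """Return measured minus best-case time (ms) for each phase."""
    best = best_case_phases(rtt_ms)
    return {phase: measured_ms[phase] - best[phase] for phase in PHASES}


if __name__ == "__main__":
    # Phase timings from the first httpstat run in this report. The RTT is
    # approximated by the TCP connect time (174 ms), which is an assumption.
    measured = {"dns": 1286, "tcp": 174, "tls": 367, "server": 175, "transfer": 0}
    for phase, extra in excess_over_best_case(measured, rtt_ms=174).items():
        print(f"{phase:>8}: {extra:+d} ms over best case")
```

A phase that sits far above its budget (here the DNS lookup) is the one worth investigating first; phases close to their budget are bounded by network latency rather than by the server.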
ping -w 3 -q -c 3 bugs.gentoo.org is
rtt min/avg/max/mdev = 164.077/164.331/164.638/0.231 ms

httpstat 'https://wiki.gentoo.org/'
Connected to 204.187.15.5:443 from 192.168.1.12:52486

HTTP/2 302
server: nginx
date: Fri, 07 Aug 2020 20:07:32 GMT
content-type: text/html
content-length: 154
location: https://wiki.gentoo.org/wiki/Main_Page
strict-transport-security: max-age=31536000

Body stored in: /tmp/tmp2oskyf38

  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    235ms   |      165ms     |     351ms     |       164ms       |        0ms       ]
             |                |               |                   |                  |
    namelookup:235ms          |               |                   |                  |
                        connect:400ms         |                   |                  |
                                    pretransfer:751ms             |                  |
                                                      starttransfer:915ms            |
                                                                                 total:915ms

httpstat 'https://forums.gentoo.org/'
Connected to 204.187.15.12:443 from 192.168.1.12:38594

HTTP/1.1 200 OK
Server: nginx
Date: Fri, 07 Aug 2020 20:08:24 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: close
Set-Cookie: phpbb2mysql_sid_s=0c9d8394ace70f9cfb3e5b87136a4fce; path=/; domain=forums.gentoo.org; secure
Set-Cookie: phpbb2mysql_data_s=a%3A2%3A%7Bs%3A11%3A%22autologinid%22%3Bs%3A0%3A%22%22%3Bs%3A6%3A%22userid%22%3Bi%3A-1%3B%7D; expires=Sat, 07-Aug-2021 20:08:23 GMT; Max-Age=31536000; path=/; domain=forums.gentoo.org; secure
Cache-Control: private, pre-check=0, post-check=0, max-age=0
Expires: 0
Pragma: no-cache
X-Clacks-Overhead: GNU Terry Pratchett, Noirin Trouble Pluinceid

Body stored in: /tmp/tmpd0d_lc_7

  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    396ms   |      165ms     |     352ms     |       848ms       |       168ms      ]
             |                |               |                   |                  |
    namelookup:396ms          |               |                   |                  |
                        connect:561ms         |                   |                  |
                                    pretransfer:913ms             |                  |
                                                      starttransfer:1761ms           |
                                                                                 total:1929ms
A few more things!
 - It matters what URL you hit, as some URLs are static and don't invoke the interpreter.
 - We were using cgi, so 1 perl interpreter per request. This was deemed 'slow'. E.g. main page load took 1200ms. This was fairly routine (e.g. accessing bugs from localhost still took 1200ms when busy.)
 - We tested a mod-perl implementation on bugs-test and it performed better (300ms vs 1200ms.)

Some changes:
 - Previously we did not collect latency data in our access logs. I switched to combinedts, so we can do analysis on median / tail latency.
 - Previously, we did not have any IP protection, so folks would wander by and submit lots of requests. Since there is a limit to the # of concurrent workers and requests take 1200ms, this makes it easy to consume all the workers, even with a low QPS. We have IP limiting now.
 - Bugzilla prod now runs on mod-perl, where the median request time is more like 300ms (as opposed to 1200ms.)

Future Work:
 - Bugzilla typically runs with 2 mysql databases (in a weird master-master setup that's complicated.) One of the databases died, and we are working to move to a new set of replicas. This may cause excess load on the remaining database; however, I don't believe this is a major factor in the latency experienced.
 - Move away from mod-perl to PSGI and nginx.
 - Move to new Web frontend machines.
 - Move to new databases.
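Two of the points above lend themselves to quick arithmetic: how few requests per second it takes to occupy every worker when each request holds one for 1200ms, and how median and tail latency can be summarised once per-request durations are logged. A rough Python sketch follows. The worker limit of 50 and the sample durations are hypothetical placeholders, not values from the real bugs.gentoo.org setup, and parsing the combinedts log format itself is left out.

```python
def saturating_qps(workers, request_seconds):
    """Request rate at which every worker is busy all the time."""
    return workers / request_seconds


def percentile(durations_ms, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    ordered = sorted(durations_ms)
    rank = (pct * len(ordered) + 99) // 100  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    workers = 50  # HYPOTHETICAL worker limit, not the real server setting
    for label, seconds in (("cgi", 1.2), ("mod-perl", 0.3)):
        qps = saturating_qps(workers, seconds)
        print(f"{label}: all workers busy at ~{qps:.0f} requests/s")

    # HYPOTHETICAL per-request durations in ms, standing in for values
    # parsed from the access log; these are not real measurements.
    sample = [300, 280, 310, 1200, 290, 305, 295, 2500, 300, 315]
    print("median:", percentile(sample, 50), "ms")
    print("p90:", percentile(sample, 90), "ms")
```

For a fixed number of workers, cutting the request time from 1200ms to 300ms raises the saturation point fourfold, which is why the mod-perl switch and the IP limiting both help with the same problem.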
$ httpstat https://bugs.gentoo.org/bots.html
Connected to 2607:fcc0:4:ffff::4:443 from 2001:985:55c2:1:ae1f:6bff:fef6:d891:59686

HTTP/1.1 200 OK
Date: Sun, 20 Sep 2020 06:28:58 GMT
Server: Apache
Last-Modified: Sun, 26 Apr 2020 06:51:50 GMT
ETag: "1e65e-1366-5a42c086c5035"
Accept-Ranges: bytes
Content-Length: 4966
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8

Body stored in: /tmp/tmpxaob84lz

  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[     4ms    |      167ms     |     352ms     |       167ms       |        0ms       ]
             |                |               |                   |                  |
    namelookup:4ms            |               |                   |                  |
                        connect:171ms         |                   |                  |
                                    pretransfer:523ms             |                  |
                                                      starttransfer:690ms            |
                                                                                 total:690ms

$ httpstat https://bugs.gentoo.org/727020
Connected to 2607:fcc0:4:ffff::4:443 from 2001:985:55c2:1:ae1f:6bff:fef6:d891:59692

HTTP/1.1 302 Found
Date: Sun, 20 Sep 2020 06:29:39 GMT
Server: Apache
Location: https://bugs.gentoo.org/show_bug.cgi?id=727020
Content-Length: 296
Content-Type: text/html; charset=iso-8859-1

Body stored in: /tmp/tmpnewup6k0

  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[     5ms    |      151ms     |     323ms     |       151ms       |        0ms       ]
             |                |               |                   |                  |
    namelookup:5ms            |               |                   |                  |
                        connect:156ms         |                   |                  |
                                    pretransfer:479ms             |                  |
                                                      starttransfer:630ms            |
                                                                                 total:630ms

$
Connected to 2607:fcc0:4:ffff::4:443 from 2001:985:55c2:1:ae1f:6bff:fef6:d891:59750

HTTP/1.1 200 OK
Date: Sun, 20 Sep 2020 06:30:30 GMT
Server: Apache
Content-disposition: inline; filename="bugs-2020-09-20.html"
Content-security-policy: frame-ancestors 'self'
Strict-transport-security: max-age=15768000; includeSubDomains
X-content-type-options: nosniff
X-frame-options: SAMEORIGIN
X-xss-protection: 1; mode=block
Set-Cookie: LASTORDER=bug_status%2Cpriority%2Cassigned_to%2Cbug_id; path=/; expires=Fri, 01-Jan-2038 00:00:00 GMT
Set-Cookie: BUGLIST=727020; path=/; expires=Fri, 01-Jan-2038 00:00:00 GMT
Set-Cookie: Bugzilla_login_request_cookie=53qQb4UaPn; path=/; secure; HttpOnly
Vary: Accept-Encoding,User-Agent
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8

Body stored in: /tmp/tmpgna2_drx

  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[     5ms    |      162ms     |     345ms     |       448ms       |       162ms      ]
             |                |               |                   |                  |
    namelookup:5ms            |               |                   |                  |
                        connect:167ms         |                   |                  |
                                    pretransfer:512ms             |                  |
                                                      starttransfer:960ms            |
                                                                                 total:1122ms
(In reply to Alec Warner from comment #3)
> A few more things!

Looking at the data I just posted, my suggestion would be to:
 - use HTTP/2 (which may not help with a single request, as shown with httpstat, but will help with subsequent requests)
 - use TLS 1.3 (which should be able to shave off a round-trip in the TLS phase)
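To put rough numbers on these two suggestions, here is a small Python sketch using the bots.html timings posted earlier in this thread (DNS 4ms, TCP 167ms, TLS 352ms, server 167ms, transfer 0ms). It assumes the measured handshake costs about two round trips and that a full TLS 1.3 handshake would cost about one, and it assumes a follow-up request on an already-open connection skips the DNS, TCP and TLS phases entirely. The function names are illustrative only, and the results are estimates, not measurements.

```python
def with_tls13(phases_ms, rtt_ms):
    """Copy of the phase timings with one round trip removed from TLS."""
    estimate = dict(phases_ms)
    estimate["tls"] = max(phases_ms["tls"] - rtt_ms, 0)
    return estimate


def reused_connection_total(phases_ms):
    """Estimated time of a follow-up request on an already-open connection."""
    return phases_ms["server"] + phases_ms["transfer"]


if __name__ == "__main__":
    # Timings from the bots.html run; RTT approximated by TCP connect time.
    phases = {"dns": 4, "tcp": 167, "tls": 352, "server": 167, "transfer": 0}
    rtt = phases["tcp"]

    print("measured total:       ", sum(phases.values()), "ms")
    print("TLS 1.3 estimate:     ", sum(with_tls13(phases, rtt).values()), "ms")
    print("reused connection est:", reused_connection_total(phases), "ms")
```

With a round trip of roughly 165ms, connection setup dominates the total for a static page, so both changes mainly benefit users who are far from the server.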