Bug 727020 - bugs.gentoo.org is extremely slow
Summary: bugs.gentoo.org is extremely slow
Status: IN_PROGRESS
Alias: None
Product: Gentoo Infrastructure
Classification: Unclassified
Component: Bugzilla
Hardware: All Linux
Importance: Normal normal
Assignee: Bugzilla Admins
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-06-03 21:09 UTC by Jonas Stein
Modified: 2020-09-20 06:43 UTC
Description Jonas Stein gentoo-dev 2020-06-03 21:09:06 UTC
Working on bugzilla is very tiresome. 
Each page request takes up to several seconds.

The system is now:
-----------------------------------------------
CPU 	4x 2.4GHz Opteron 2216
RAM 	16GB
Storage 	2x 250GB SATA2 RAID1(md) 
https://wiki.gentoo.org/wiki/Project:Infrastructure/Servers/Gannet
-----------------------------------------------

I do not know whether newer hardware, an optimized configuration, or both are required. But it is the slowest bug tracker I have ever seen, and a faster response would matter to everyone who is willing to assign or clean up bugs.

$ httpstat 'https://bugs.gentoo.org/bots.html'

Connected to 204.187.15.4:443 from x.x.x.x

HTTP/1.1 200 OK
Date: Wed, 03 Jun 2020 21:06:58 GMT
Server: Apache
Last-Modified: Sun, 26 Apr 2020 06:51:50 GMT
ETag: "1e65e-1366-5a42c086c5035"
Accept-Ranges: bytes
Content-Length: 4966
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8


  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[   1286ms   |      174ms     |     367ms     |       175ms       |        0ms       ]
             |                |               |                   |                  |
    namelookup:1286ms         |               |                   |                  |
                        connect:1460ms        |                   |                  |
                                    pretransfer:1827ms            |                  |
                                                      starttransfer:2002ms           |
                                                                                 total:2002ms 


Thank you very much.
Comment 1 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2020-06-03 21:34:46 UTC
jstein: can you look at why your DNS resolver took 1.2 seconds first? The records have a 12 hour TTL, which should be long enough to stay in your cache most of the time.

Repeated runs of httpstat SHOULD show it significantly reduced if the cache is working properly.

Past that, can you also report the latency to the bugs.g.o host from your system? It should be the same as the TCP connection time if everything is good.

The ABSOLUTELY best case you should get is something like:

| 0-1ms | (1x LATENCY) | (2x LATENCY) | (1x LATENCY) | 0-1ms |
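
A minimal sketch of this rule of thumb, for comparing a measured httpstat row against the latency floor. The names `Phases`, `best_case` and `excess` are hypothetical, and the 164 ms round-trip time in the usage note is an assumed value for a distant client, not a measurement taken alongside the row it is compared with.

```python
from dataclasses import dataclass


@dataclass
class Phases:
    """Per-phase timings in milliseconds, in httpstat's column order."""
    dns: float
    tcp: float
    tls: float
    server: float
    transfer: float


def best_case(rtt_ms: float) -> Phases:
    """Latency floor: 1x RTT for TCP, 2x RTT for TLS, 1x RTT for the request.

    DNS and content transfer are taken as 0 ms (warm cache, tiny body).
    """
    return Phases(dns=0.0, tcp=rtt_ms, tls=2 * rtt_ms, server=rtt_ms, transfer=0.0)


def excess(measured: Phases, rtt_ms: float) -> Phases:
    """Time spent in each phase beyond what pure network latency explains."""
    floor = best_case(rtt_ms)
    return Phases(
        dns=measured.dns - floor.dns,
        tcp=measured.tcp - floor.tcp,
        tls=measured.tls - floor.tls,
        server=measured.server - floor.server,
        transfer=measured.transfer - floor.transfer,
    )


# Row from the description, compared against an assumed 164 ms RTT.
reported = Phases(dns=1286, tcp=174, tls=367, server=175, transfer=0)
overhead = excess(reported, rtt_ms=164)
```

Under that assumed RTT, almost all of the excess lands in the DNS column, while the TCP, TLS and server columns sit close to the floor.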

My own system is:
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[     1ms    |       5ms      |     31ms      |        7ms        |        0ms       ]

Latency is ~5.4ms
$ ping -w 3 -q -c 3 bugs.gentoo.org
PING bugs.gentoo.org(2607:fcc0:4:ffff::4 (2607:fcc0:4:ffff::4)) 56 data bytes

--- bugs.gentoo.org ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 5.035/5.321/5.487/0.203 ms

Lastly, to rule out anything SPECIFIC to the bugs system, can you also show httpstat to https://wiki.gentoo.org/ and https://forums.gentoo.org; they are hosted at the same facility.

For wiki, I get:
[     0ms    |       5ms      |     32ms      |        5ms        |        1ms       ]

For forums, I get:
[     0ms    |       5ms      |     31ms      |       513ms       |        5ms       ]

(you can see how phpBB takes ~500ms to build a response)
Comment 2 Jonas Stein gentoo-dev 2020-08-07 20:10:30 UTC
ping -w 3 -q -c 3 bugs.gentoo.org gives rtt min/avg/max/mdev = 164.077/164.331/164.638/0.231 ms


httpstat 'https://wiki.gentoo.org/'
Connected to 204.187.15.5:443 from 192.168.1.12:52486

HTTP/2 302 
server: nginx
date: Fri, 07 Aug 2020 20:07:32 GMT
content-type: text/html
content-length: 154
location: https://wiki.gentoo.org/wiki/Main_Page
strict-transport-security: max-age=31536000

Body stored in: /tmp/tmp2oskyf38

  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    235ms   |      165ms     |     351ms     |       164ms       |        0ms       ]
             |                |               |                   |                  |
    namelookup:235ms          |               |                   |                  |
                        connect:400ms         |                   |                  |
                                    pretransfer:751ms             |                  |
                                                      starttransfer:915ms            |
                                                                                 total:915ms 

httpstat 'https://forums.gentoo.org/'
Connected to 204.187.15.12:443 from 192.168.1.12:38594

HTTP/1.1 200 OK
Server: nginx
Date: Fri, 07 Aug 2020 20:08:24 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: close
Set-Cookie: phpbb2mysql_sid_s=0c9d8394ace70f9cfb3e5b87136a4fce; path=/; domain=forums.gentoo.org; secure
Set-Cookie: phpbb2mysql_data_s=a%3A2%3A%7Bs%3A11%3A%22autologinid%22%3Bs%3A0%3A%22%22%3Bs%3A6%3A%22userid%22%3Bi%3A-1%3B%7D; expires=Sat, 07-Aug-2021 20:08:23 GMT; Max-Age=31536000; path=/; domain=forums.gentoo.org; secure
Cache-Control: private, pre-check=0, post-check=0, max-age=0
Expires: 0
Pragma: no-cache
X-Clacks-Overhead: GNU Terry Pratchett, Noirin Trouble Pluinceid

Body stored in: /tmp/tmpd0d_lc_7

  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    396ms   |      165ms     |     352ms     |       848ms       |       168ms      ]
             |                |               |                   |                  |
    namelookup:396ms          |               |                   |                  |
                        connect:561ms         |                   |                  |
                                    pretransfer:913ms             |                  |
                                                      starttransfer:1761ms           |
                                                                                 total:1929ms
Comment 3 Alec Warner (RETIRED) archtester gentoo-dev Security 2020-08-13 00:37:14 UTC
A few more things!

 - It matters what URL you hit, as some URLs are static and don't invoke the interpreter.
 - We were using cgi, so 1 perl interpreter per request. This was deemed 'slow'. E.g. main page load took 1200ms. This was fairly routine (e.g. accessing bugs from localhost still took 1200ms when busy.)
 - We tested a mod-perl implementation on bugs-test and it performed better (300ms vs 1200ms.)

Some changes:
 - Previously we did not collect latency data in our access logs. I switched to combinedts, so we can do analysis on median / tail latency.
 - Previously, we did not have any IP protection, so folks would wander by and submit lots of requests. Since there is a limit to the # of concurrent workers and requests take 1200ms, it was easy to consume all the workers, even at a low QPS. We have IP limiting now.
 - Bugzilla prod now runs on mod-perl, where the median request time is more like 300ms (as opposed to 1200ms.)
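
A rough sketch of the two calculations behind these changes: summarising request durations as median and tail latency, and estimating how many requests per second a fixed worker pool can sustain. The function names and the worker count of 60 are hypothetical, and the durations are assumed to be already extracted from the access log as milliseconds; no log parsing is shown.

```python
import math
import statistics


def latency_summary(durations_ms):
    """Return (median, p95) for a list of request durations in milliseconds.

    p95 uses the nearest-rank method on the sorted samples.
    """
    ordered = sorted(durations_ms)
    median = statistics.median(ordered)
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return median, ordered[rank - 1]


def max_qps(workers, request_ms):
    """Upper bound on sustained requests/second when each request holds one
    worker for request_ms milliseconds."""
    return workers * 1000 / request_ms


# Hypothetical pool of 60 workers, with the request times quoted above.
cgi_limit = max_qps(60, 1200)       # one perl interpreter per request
mod_perl_limit = max_qps(60, 300)   # median after the switch
```

With these assumed numbers the same pool saturates at 50 requests per second under CGI and at 200 under mod-perl, which is why a modest amount of unthrottled traffic was enough to occupy every worker.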

Future Work:
 - Bugzilla typically runs with 2 mysql databases (in a weird master-master setup that's complicated). One of the databases died, and we are working to move to a new set of replicas. This may cause excess load on the remaining database; however, I don't believe this is a major factor in the latency experienced.
 - Move away from mod-perl to PSGI and nginx.
 - Move to new Web frontend machines.
 - Move to new databases.
Comment 4 Hans de Graaff gentoo-dev Security 2020-09-20 06:30:44 UTC
$ httpstat https://bugs.gentoo.org/bots.html
Connected to 2607:fcc0:4:ffff::4:443 from 2001:985:55c2:1:ae1f:6bff:fef6:d891:59686

HTTP/1.1 200 OK
Date: Sun, 20 Sep 2020 06:28:58 GMT
Server: Apache
Last-Modified: Sun, 26 Apr 2020 06:51:50 GMT
ETag: "1e65e-1366-5a42c086c5035"
Accept-Ranges: bytes
Content-Length: 4966
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8

Body stored in: /tmp/tmpxaob84lz

  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[     4ms    |      167ms     |     352ms     |       167ms       |        0ms       ]
             |                |               |                   |                  |
    namelookup:4ms            |               |                   |                  |
                        connect:171ms         |                   |                  |
                                    pretransfer:523ms             |                  |
                                                      starttransfer:690ms            |
                                                                                 total:690ms  

$ httpstat https://bugs.gentoo.org/727020
Connected to 2607:fcc0:4:ffff::4:443 from 2001:985:55c2:1:ae1f:6bff:fef6:d891:59692

HTTP/1.1 302 Found
Date: Sun, 20 Sep 2020 06:29:39 GMT
Server: Apache
Location: https://bugs.gentoo.org/show_bug.cgi?id=727020
Content-Length: 296
Content-Type: text/html; charset=iso-8859-1

Body stored in: /tmp/tmpnewup6k0

  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[     5ms    |      151ms     |     323ms     |       151ms       |        0ms       ]
             |                |               |                   |                  |
    namelookup:5ms            |               |                   |                  |
                        connect:156ms         |                   |                  |
                                    pretransfer:479ms             |                  |
                                                      starttransfer:630ms            |
                                                                                 total:630ms  


$ Connected to 2607:fcc0:4:ffff::4:443 from 2001:985:55c2:1:ae1f:6bff:fef6:d891:59750

HTTP/1.1 200 OK
Date: Sun, 20 Sep 2020 06:30:30 GMT
Server: Apache
Content-disposition: inline; filename="bugs-2020-09-20.html"
Content-security-policy: frame-ancestors 'self'
Strict-transport-security: max-age=15768000; includeSubDomains
X-content-type-options: nosniff
X-frame-options: SAMEORIGIN
X-xss-protection: 1; mode=block
Set-Cookie: LASTORDER=bug_status%2Cpriority%2Cassigned_to%2Cbug_id; path=/; expires=Fri, 01-Jan-2038 00:00:00 GMT
Set-Cookie: BUGLIST=727020; path=/; expires=Fri, 01-Jan-2038 00:00:00 GMT
Set-Cookie: Bugzilla_login_request_cookie=53qQb4UaPn; path=/; secure; HttpOnly
Vary: Accept-Encoding,User-Agent
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8

Body stored in: /tmp/tmpgna2_drx

  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[     5ms    |      162ms     |     345ms     |       448ms       |       162ms      ]
             |                |               |                   |                  |
    namelookup:5ms            |               |                   |                  |
                        connect:167ms         |                   |                  |
                                    pretransfer:512ms             |                  |
                                                      starttransfer:960ms            |
                                                                                 total:1122ms
Comment 5 Hans de Graaff gentoo-dev Security 2020-09-20 06:43:36 UTC
(In reply to Alec Warner from comment #3)
> A few more things!

Looking at the data I just posted, my suggestion would be to:

- use HTTP/2 (which may not help with a single request as shown with httpstat but will help with subsequent requests)
- use TLS 1.3 (which should be able to shave off a round-trip in the TLS phase)
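
A back-of-the-envelope sketch of what these two suggestions could save, assuming a full TLS 1.2 handshake costs two round trips, a TLS 1.3 handshake costs one, and a follow-up request on an already open connection skips TCP and TLS setup. The function names are hypothetical, and the 286 ms server think time is an assumption derived from the 448 ms "Server Processing" figure minus one 162 ms round trip.

```python
def tls_handshake_ms(rtt_ms, tls13):
    """Handshake cost in ms: 1 round trip for TLS 1.3, 2 for TLS 1.2."""
    round_trips = 1 if tls13 else 2
    return round_trips * rtt_ms


def time_to_first_byte_ms(rtt_ms, think_ms, tls13, reuse_connection=False):
    """Estimated time until the first response byte, DNS excluded.

    A reused connection pays only the request round trip plus server think
    time; a fresh one also pays the TCP and TLS handshakes.
    """
    request = rtt_ms + think_ms
    if reuse_connection:
        return request
    return rtt_ms + tls_handshake_ms(rtt_ms, tls13) + request


RTT_MS = 162     # TCP connection time from the last measurement
THINK_MS = 286   # assumed: 448 ms server processing minus one RTT

fresh_tls12 = time_to_first_byte_ms(RTT_MS, THINK_MS, tls13=False)
fresh_tls13 = time_to_first_byte_ms(RTT_MS, THINK_MS, tls13=True)
reused = time_to_first_byte_ms(RTT_MS, THINK_MS, tls13=True, reuse_connection=True)
```

Under these assumptions a fresh connection drops from about 934 ms to about 772 ms with TLS 1.3, and a follow-up request on a reused connection takes about 448 ms.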