Summary: | Proposal: make bugs.gentoo.org more search-engine friendly, but w/o email addresses | ||
---|---|---|---|
Product: | Gentoo Infrastructure | Reporter: | Daniel Santos <daniel.santos> |
Component: | Bugzilla | Assignee: | Bugzilla Admins <bugzilla> |
Status: | RESOLVED INVALID | ||
Severity: | enhancement | ||
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- |
Description
Daniel Santos
2012-12-07 22:43:47 UTC
Try not to file a bug asking for more than one concrete thing.

1) No one is going to watch the logs constantly to ban bots that ignore robots.txt. We all have better things to do.

2) Email addresses are only visible to logged-in users. Many bots don't log in because it adds state to the crawler, and when you crawl billions of sites, that adds a bunch of extra complexity. Email addresses in raw text (comments, etc.) are of course available.

3) We already generate cached datasets for bots. This is listed in our bot policy: https://bugs.gentoo.org/bots.html

-A

Sorry for my late response. I guess I'll open a new bug for the problem, which appears to be that https://bugs.gentoo.org/bots.html links to static cached files that robots.txt disallows with this rule:

Disallow: /data/cached/

I need to re-read the robots.txt spec first, though. Thanks for the response.

Daniel
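The conflict described above, where the bot policy page points crawlers at files under a path that robots.txt disallows, can be checked with a minimal sketch using Python's standard `urllib.robotparser`. Only the `Disallow: /data/cached/` rule comes from the thread. The `User-agent: *` line, the `ExampleBot` agent name, the `is_allowed` helper, and the `example.html` file name are assumptions made for illustration and may not match the real site.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt excerpt. Only the Disallow rule is quoted in the thread;
# the "User-agent: *" line is an assumption so the rule applies to every bot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /data/cached/
"""


def is_allowed(url, user_agent="ExampleBot"):
    """Return True if the assumed rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical file name under the cached directory named in the rule.
cached_url = "https://bugs.gentoo.org/data/cached/example.html"
policy_url = "https://bugs.gentoo.org/bots.html"

print("cached file allowed:", is_allowed(cached_url))
print("policy page allowed:", is_allowed(policy_url))
```

Under these assumptions, a crawler that honors robots.txt could read the policy page but would skip the cached files it links to, which is the mismatch reported above.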