Bug 626032 - attribute values of HTML attribute "id" must be unique - https://devmanual.gentoo.org/ebuild-writing/eapi/
Status: RESOLVED FIXED
Alias: None
Product: Documentation
Classification: Unclassified
Component: Devmanual
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Devmanual Team
URL: https://devmanual.gentoo.org/ebuild-w...
Whiteboard:
Keywords: PATCH
Depends on:
Blocks:
 
Reported: 2017-07-24 08:23 UTC by charles17
Modified: 2020-01-07 08:42 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
combined-parent-anchors.patch (2.43 KB, patch), 2019-04-27 13:01 UTC, Michael Orlitzky
0001-devbook.xsl-disambiguate-auto-generated-header-ident.patch (5.72 KB, patch), 2019-04-28 17:43 UTC, Michael Orlitzky
0002-devbook.xsl-strip-leading-trailing-whitespace-from-h.patch (1.27 KB, patch), 2019-04-28 17:43 UTC, Michael Orlitzky
0003-Makefile-add-new-app-text-tidy-html5-sanity-check.patch (2.69 KB, patch), 2019-04-28 17:44 UTC, Michael Orlitzky
0004-keywording-change-ALLARCHES-subsection-to-a-section.patch (1.02 KB, patch), 2019-04-28 17:44 UTC, Michael Orlitzky
0005-appendices-contributing-devbook-guide-fix-missing-ur.patch (2.42 KB, patch), 2019-04-28 17:44 UTC, Michael Orlitzky
0001-devbook.xsl-disambiguate-auto-generated-header-ident.patch (5.72 KB, patch), 2019-12-26 16:11 UTC, Michael Orlitzky
0002-devbook.xsl-strip-leading-trailing-whitespace-from-h.patch (1.31 KB, patch), 2019-12-26 16:11 UTC, Michael Orlitzky
0003-Makefile-add-new-app-text-tidy-html5-sanity-check.patch (1.98 KB, patch), 2019-12-26 16:11 UTC, Michael Orlitzky
0001-devbook.xsl-disambiguate-auto-generated-header-ident.patch (4.55 KB, patch), 2019-12-26 17:57 UTC, Michael Orlitzky
0002-devbook.xsl-strip-leading-trailing-whitespace-from-h.patch (1.25 KB, patch), 2019-12-26 17:58 UTC, Michael Orlitzky
0003-Makefile-add-new-app-text-tidy-html5-sanity-check.patch (1.98 KB, patch), 2019-12-26 17:59 UTC, Michael Orlitzky

Description charles17 2017-07-24 08:23:52 UTC
Some id attribute values are not unique on https://devmanual.gentoo.org/ebuild-writing/eapi/

4 x id="helpers"
4 x id="phases"
3 x id="metadata"
2 x id="variables"

(Sending a pull request does not seem possible, since the page version on GitHub does not have id attributes at all; see https://github.com/gentoo/devmanual.gentoo.org/blob/master/ebuild-writing/eapi/text.xml)
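
A count like the one above can be reproduced with a short script. The following is only a minimal sketch using Python's standard html.parser module; the sample markup is made up for illustration and is not taken from the real page.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Count every id attribute value seen in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name == "id" and value is not None:
                self.ids[value] += 1


def duplicate_ids(html_text):
    """Return {id: count} for every id that occurs more than once."""
    collector = IdCollector()
    collector.feed(html_text)
    return {key: n for key, n in collector.ids.items() if n > 1}


# Hypothetical markup, loosely modelled on the reported page.
sample = """
<h2 id="eapi-2">EAPI 2</h2><h3 id="helpers">Helpers</h3>
<h2 id="eapi-4">EAPI 4</h2><h3 id="helpers">Helpers</h3>
"""
print(duplicate_ids(sample))  # {'helpers': 2}
```

Feeding it the generated HTML of a chapter instead of the sample string should list each repeated id with its count.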
Comment 1 charles17 2017-07-24 08:46:14 UTC
The reason is that the templates at https://gitweb.gentoo.org/proj/devmanual.git/tree/devbook.xsl#n27 derive each id from the title alone, so they break down when section or subsection titles are not unique.
Comment 2 Michael Orlitzky gentoo-dev 2019-04-27 13:01:12 UTC
Created attachment 574416 [details, diff]
combined-parent-anchors.patch

Here's a proof-of-concept patch to devbook.xsl that makes the IDs/anchors unique in more cases by combining the parent section's name into each subsection name. Instead of five subsections with id="helpers", we now have

  * id="eapi=2-helpers"
  * id="eapi=4-helpers"
  * id="eapi=5-helpers"
  * id="eapi=6-helpers"
  * id="eapi=7-helpers"

I haven't tested it on any other page, and it probably breaks in some cases, and definitely breaks existing links to some subsections. But, it shows that maybe this can be fixed.
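
The idea of prefixing each subsection id with its parent section can be sketched outside of XSLT. The Python below is an illustration only: the slug() normalisation (trim, lowercase, spaces to hyphens) and the section titles are assumptions chosen to reproduce ids like the ones listed above, not the actual devbook.xsl rules.

```python
def slug(title):
    # Hypothetical normalisation: trim, lowercase, whitespace runs -> "-".
    return "-".join(title.strip().lower().split())


def subsection_id(section_title, subsection_title):
    # Combine the parent section's slug with the subsection's slug.
    return f"{slug(section_title)}-{slug(subsection_title)}"


# Assumed titles; the real page has more sections and subsections.
sections = {
    "EAPI=2": ["Helpers", "Phases"],
    "EAPI=4": ["Helpers", "Phases"],
}

ids = [
    subsection_id(section, sub)
    for section, subs in sections.items()
    for sub in subs
]
print(ids)
# ['eapi=2-helpers', 'eapi=2-phases', 'eapi=4-helpers', 'eapi=4-phases']
```

The combined ids stay unique only as long as the section titles themselves are unique within the page.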
Comment 3 Michael Orlitzky gentoo-dev 2019-04-28 17:43:38 UTC
Created attachment 574552 [details, diff]
0001-devbook.xsl-disambiguate-auto-generated-header-ident.patch

Procrastinating real hard today. Here's a much better patch that uses the full path into the hierarchy as the identifier. It also adds a tidy-html5 check to the Makefile to ensure that this doesn't get screwed up again. Finally, I fix some other random issues that tidy-html5 pointed out.
Comment 4 Michael Orlitzky gentoo-dev 2019-04-28 17:43:58 UTC
Created attachment 574554 [details, diff]
0002-devbook.xsl-strip-leading-trailing-whitespace-from-h.patch
Comment 5 Michael Orlitzky gentoo-dev 2019-04-28 17:44:13 UTC
Created attachment 574556 [details, diff]
0003-Makefile-add-new-app-text-tidy-html5-sanity-check.patch
Comment 6 Michael Orlitzky gentoo-dev 2019-04-28 17:44:36 UTC
Created attachment 574558 [details, diff]
0004-keywording-change-ALLARCHES-subsection-to-a-section.patch
Comment 7 Michael Orlitzky gentoo-dev 2019-04-28 17:44:53 UTC
Created attachment 574560 [details, diff]
0005-appendices-contributing-devbook-guide-fix-missing-ur.patch
Comment 8 Michael Orlitzky gentoo-dev 2019-04-28 17:48:18 UTC
This has still only been lightly tested, but the fact that tidy-html5 now runs clean is reassuring. And three of the five commits are independently correct:

  * Trimming leading/trailing whitespace from anchors
  * Fixing the ALLARCHES header
  * Fixing the GuideXML <uri> elements

The addition to the Makefile is generally harmless, but maybe there's a way to improve the ugly way I'm running tidy-html5.

The only thing that needs real scrutiny is the overall effect of the new identifier-generation rules.
Comment 9 Ulrich Müller gentoo-dev 2019-12-06 16:13:51 UTC
(In reply to Michael Orlitzky from comment #3)
> Created attachment 574552 [details, diff]
> 0001-devbook.xsl-disambiguate-auto-generated-header-ident.patch
> 
> Procrastinating real hard today. Here's a much better patch that uses the
> full path into the hierarchy as the identifier. It also adds a tidy-html5
> check to the Makefile to ensure that this doesn't get screwed up again.
> Finally, I fix some other random issues that tidy-html5 pointed out.

Why would we want to do this? The path into the hierarchy is already present in the URL path, so it doesn't have to be repeated in the fragment identifier. Unless you like really long and redundant URLs like this:
https://devmanual.gentoo.org/archs/amd64/index.html#arch-specific-notes----amd64/em64t--multilib-on-amd64--the-multilib-strict-feature

In other words, in order to have unique ids, it is enough to start at the section level.


(In reply to Michael Orlitzky from comment #7)
> Created attachment 574560 [details, diff]
> 0005-appendices-contributing-devbook-guide-fix-missing-ur.patch

This should really be fixed in devbook.xsl. IMHO the example from the devbook-guide "<uri>https://forums.gentoo.org</uri> is my favorite web site." is perfectly valid and used to work in GuideXML.
Comment 10 Michael Orlitzky gentoo-dev 2019-12-06 17:41:32 UTC
(In reply to Ulrich Müller from comment #9)
> 
> Why would we want to do this? The path into the hierarchy is already present
> in the URL path, so it doesn't have to be repeated in the fragment
> identifier. Unless you like really long and redundant URLs like this:
> https://devmanual.gentoo.org/archs/amd64/index.html#arch-specific-notes----amd64/em64t--multilib-on-amd64--the-multilib-strict-feature
> 
> In other words, in order to have unique ids, it is enough to start at the
> section level.

Agreed that the URLs are horrendous. I can try that once your XML pull request is merged, and once the URI thing is sorted out. I was mainly convincing myself that this issue could actually be solved, here. Now it's just a matter of solving it in the best way possible.


> 
> (In reply to Michael Orlitzky from comment #7)
> > Created attachment 574560 [details, diff]
> > 0005-appendices-contributing-devbook-guide-fix-missing-ur.patch
> 
> This should really be fixed in devbook.xsl. IMHO the example from the
> devbook-guide "<uri>https://forums.gentoo.org</uri> is my favorite web
> site." is perfectly valid and used to work in GuideXML.

Ok, done on bug 702180, but only lightly tested.
Comment 11 Ulrich Müller gentoo-dev 2019-12-26 12:38:04 UTC
(In reply to Michael Orlitzky from comment #10)

ping
Comment 12 Michael Orlitzky gentoo-dev 2019-12-26 12:45:24 UTC
(In reply to Ulrich Müller from comment #11)
> (In reply to Michael Orlitzky from comment #10)
> 
> ping

It's still on my radar; after the XML validation I was waiting on the JavaScript search stuff. I won't delete the Bugzilla email from my inbox until I've updated the patches.
Comment 13 Michael Orlitzky gentoo-dev 2019-12-26 16:11:05 UTC
Created attachment 600638 [details, diff]
0001-devbook.xsl-disambiguate-auto-generated-header-ident.patch
Comment 14 Michael Orlitzky gentoo-dev 2019-12-26 16:11:28 UTC
Created attachment 600640 [details, diff]
0002-devbook.xsl-strip-leading-trailing-whitespace-from-h.patch
Comment 15 Michael Orlitzky gentoo-dev 2019-12-26 16:11:56 UTC
Created attachment 600642 [details, diff]
0003-Makefile-add-new-app-text-tidy-html5-sanity-check.patch
Comment 16 Michael Orlitzky gentoo-dev 2019-12-26 16:15:55 UTC
(In reply to Ulrich Müller from comment #9)
> 
> Why would we want to do this? The path into the hierarchy is already present
> in the URL path, so it doesn't have to be repeated in the fragment
> identifier. Unless you like really long and redundant URLs like this:
> https://devmanual.gentoo.org/archs/amd64/index.html#arch-specific-notes----amd64/em64t--multilib-on-amd64--the-multilib-strict-feature
> 
> In other words, in order to have unique ids, it is enough to start at the
> section level.
> 

The new DTD says that multiple chapters are allowed on a single page, which I think makes it necessary to include the chapter information in the anchor as well. For example,

  <guide>
    <chapter>
      <title>Chapter 1</title>
      <section>
        <title>Introduction</title>
      </section>
    </chapter>

    <chapter>
      <title>Chapter 2</title>
      <section>
        <title>Introduction</title>
      </section>
    </chapter>
  </guide>

would otherwise produce identical anchors for the two sections.
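
The collision can be demonstrated on that same document. This is a rough Python sketch, not the XSLT from the patch: slug() and the "--" separator are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

GUIDE = """
<guide>
  <chapter>
    <title>Chapter 1</title>
    <section><title>Introduction</title></section>
  </chapter>
  <chapter>
    <title>Chapter 2</title>
    <section><title>Introduction</title></section>
  </chapter>
</guide>
"""


def slug(title):
    # Hypothetical normalisation: trim, lowercase, whitespace runs -> "-".
    return "-".join(title.strip().lower().split())


def section_ids(root, include_chapter):
    """Build one id per section, optionally prefixed by its chapter."""
    ids = []
    for chapter in root.iter("chapter"):
        chapter_title = chapter.findtext("title")
        for section in chapter.findall("section"):
            parts = [section.findtext("title")]
            if include_chapter:
                parts.insert(0, chapter_title)
            ids.append("--".join(slug(part) for part in parts))
    return ids


root = ET.fromstring(GUIDE)
print(section_ids(root, include_chapter=False))
# ['introduction', 'introduction']
print(section_ids(root, include_chapter=True))
# ['chapter-1--introduction', 'chapter-2--introduction']
```

Starting at the section level is therefore only safe if each page holds exactly one chapter.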
Comment 17 Ulrich Müller gentoo-dev 2019-12-26 16:42:08 UTC
(In reply to Michael Orlitzky from comment #16)
> The new DTD says that multiple chapters are allowed on a single page, which
> I think makes it necessary to include the chapter information in the anchor
> as well.

It is enough to start at the section level, because all chapters are in separate files. If the DTD says anything different then it should be fixed.
Comment 18 Ulrich Müller gentoo-dev 2019-12-26 17:04:11 UTC
(In reply to Ulrich Müller from comment #17)
> If the DTD says anything different then it should be fixed.

Done: https://gitweb.gentoo.org/proj/devmanual.git/commit/?id=aa26b473992dd1df1417153b0b03cae6cfe28f8a
Comment 19 Michael Orlitzky gentoo-dev 2019-12-26 17:57:49 UTC
Created attachment 600654 [details, diff]
0001-devbook.xsl-disambiguate-auto-generated-header-ident.patch
Comment 20 Michael Orlitzky gentoo-dev 2019-12-26 17:58:12 UTC
Created attachment 600658 [details, diff]
0002-devbook.xsl-strip-leading-trailing-whitespace-from-h.patch
Comment 21 Michael Orlitzky gentoo-dev 2019-12-26 17:59:01 UTC
Created attachment 600662 [details, diff]
0003-Makefile-add-new-app-text-tidy-html5-sanity-check.patch

Ok, starting from the section now.
Comment 22 Ulrich Müller gentoo-dev 2019-12-26 18:10:55 UTC
(In reply to Michael Orlitzky from comment #19)
> Created attachment 600654 [details, diff]
> 0001-devbook.xsl-disambiguate-auto-generated-header-ident.patch

Note to self: Convert tabs to spaces before merging.

(I've unified indentation style to 2 spaces in https://gitweb.gentoo.org/proj/devmanual.git/commit/?id=e9cfcb2d945a8379624467ef6f85fb7968db47fe, so we really shouldn't reintroduce tabs.)
Comment 23 Ulrich Müller gentoo-dev 2020-01-02 11:00:06 UTC
Any idea how to automatically update all internal cross references (and rewrap to 80 chars if necessary)?
Comment 24 Larry the Git Cow gentoo-dev 2020-01-02 12:54:58 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/proj/devmanual.git/commit/?id=2330779776b8ed53ec91629db2d2d56b65e64eb7

commit 2330779776b8ed53ec91629db2d2d56b65e64eb7
Author:     Michael Orlitzky <mjo@gentoo.org>
AuthorDate: 2019-04-28 16:24:51 +0000
Commit:     Ulrich Müller <ulm@gentoo.org>
CommitDate: 2020-01-02 12:43:21 +0000

    Makefile: add new app-text/tidy-html5 sanity check.
    
    This new PHONY "make tidy" target runs the tidy-html5 program, using a
    new .tidyrc file, to ensure that the HTML we have generated is free
    from certain problems. In particular, it should complain if bug 626032
    ever resurfaces and there are duplicate identifiers in some document.
    
    Closes: https://bugs.gentoo.org/626032
    Signed-off-by: Michael Orlitzky <mjo@gentoo.org>
    [Command line options instead of .tidyrc file. Don't fail on first error.]
    Signed-off-by: Ulrich Müller <ulm@gentoo.org>

 Makefile | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

Additionally, it has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/devmanual.git/commit/?id=b5bfc69fab686a49e6fbf54aceae2bb12885dc5a

commit b5bfc69fab686a49e6fbf54aceae2bb12885dc5a
Author:     Michael Orlitzky <mjo@gentoo.org>
AuthorDate: 2019-04-28 16:51:57 +0000
Commit:     Ulrich Müller <ulm@gentoo.org>
CommitDate: 2020-01-02 12:41:01 +0000

    devbook.xsl: strip leading/trailing whitespace from header identifiers.
    
    We were already replacing spaces *within* these identifiers, so we
    don't have to worry about that.
    
    Bug: https://bugs.gentoo.org/626032
    Signed-off-by: Michael Orlitzky <mjo@gentoo.org>
    Signed-off-by: Ulrich Müller <ulm@gentoo.org>

 devbook.xsl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 25 Ulrich Müller gentoo-dev 2020-01-02 12:58:51 UTC
Patches 2/3 and 3/3 merged.
Reopening, because patch 3/3 doesn't resolve this (and I missed the "Closes" tag).
Comment 26 Michael Orlitzky gentoo-dev 2020-01-02 14:19:50 UTC
(In reply to Ulrich Müller from comment #23)
> Any idea how to automatically update all internal cross references (and
> rewrapping to 80 chars if necessary)?

I have an idea, but you're not going to like it.

The W3C's link-checker (https://github.com/w3c/link-checker) can tell if an anchor link points to a dead-end. You'd have to install it locally (it's not packaged for Gentoo), hack it to run on file:// URLs, and kill the one-second minimum delay... but then you can grep out all of the broken anchors pretty quickly.

There are other broken links, though. It probably makes more sense to go through and fix them all afterwards than it does to selectively ignore a bunch of stuff that is actually broken. The online checker (https://validator.w3.org/checklink) could be used for that.
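
For same-page anchors only, a much cruder check than the W3C link-checker is possible with the Python standard library. The sketch below is a stand-in for illustration, not the link-checker itself: it ignores links to other pages and percent-encoded fragments, and the sample markup is invented.

```python
from html.parser import HTMLParser


class AnchorAudit(HTMLParser):
    """Collect ids and same-page fragment links from one HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.fragments = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if attributes.get("id"):
            self.ids.add(attributes["id"])
        href = attributes.get("href")
        # Only consider links of the form href="#something".
        if tag == "a" and href and href.startswith("#") and len(href) > 1:
            self.fragments.append(href[1:])


def broken_fragments(html_text):
    """Return fragment links that do not match any id in the document."""
    audit = AnchorAudit()
    audit.feed(html_text)
    return [frag for frag in audit.fragments if frag not in audit.ids]


# Hypothetical page: one stale link, one link matching the renamed id.
page = (
    '<a href="#helpers">old link</a>'
    '<a href="#eapi-2-helpers">new link</a>'
    '<h3 id="eapi-2-helpers">EAPI 2 Helpers</h3>'
)
print(broken_fragments(page))  # ['helpers']
```

Links into other chapters would need the target file to be parsed as well, which is where a real link checker earns its keep.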
Comment 27 Ulrich Müller gentoo-dev 2020-01-04 14:54:18 UTC
I've started looking at the internal links. IIUC, for a link to the "Suitable Download Hosts" subsection in the mirrors chapter, one would have to use the following as a link attribute now: "::general-concepts/mirrors#Automatic Mirroring--Suitable Download Hosts"

I can't say that I find this syntax intuitive.

Maybe we should just make section titles unique within each chapter. It affects only ebuild-writing/functions/src_compile (where arguably the examples could be in a list instead of subsections, or EAPI 0 could just be dropped) and ebuild-writing/eapi (where we could simply say "EAPI 2 Helpers" etc.).
Comment 28 Larry the Git Cow gentoo-dev 2020-01-07 08:42:27 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/proj/devmanual.git/commit/?id=1343712b8137aa73e45df0a249f8bda303ed328a

commit 1343712b8137aa73e45df0a249f8bda303ed328a
Author:     Ulrich Müller <ulm@gentoo.org>
AuthorDate: 2020-01-04 18:21:32 +0000
Commit:     Ulrich Müller <ulm@gentoo.org>
CommitDate: 2020-01-04 18:21:32 +0000

    Make (sub-)section titles unique within each chapter.
    
    Section, subsection, etc. titles are used to construct ID attributes,
    which must be unique within each chapter.
    
    One approach to disambiguate these identifiers would be to use the
    complete section hierarchy for constructing them (see bug 626032).
    However, that would break external references (and for example, links
    from bugzilla or from mailing list archives cannot be updated).
    
    Use a less intrusive approach instead and make the titles of the few
    ambiguous subsections unique, or convert them to a list.
    
    Closes: https://bugs.gentoo.org/626032
    Signed-off-by: Ulrich Müller <ulm@gentoo.org>

 ebuild-writing/eapi/text.xml                  | 42 +++++++++++++--------------
 ebuild-writing/functions/src_compile/text.xml | 36 ++++++++++++-----------
 2 files changed, 40 insertions(+), 38 deletions(-)