As requested by jkt... rst2guidexml for GLEPs and devmanual will require the following extensions to guidexml. If these aren't possible, we'll have to do straight HTML rendering instead.

* Definition lists. These are used heavily in devmanual and GLEPs, and they're a pretty basic structural element. Definition terms must be allowed to contain multiple paragraphs.

* Multiple paragraph list items. Not sure whether these are legal currently?

* Multiple paragraph admonitions.

* Nested lists. Again, I'm not sure whether these are legal?

* Ability to specify "numeric" or "alphabetical" or "roman" for ordered lists. This one's more of a maybe than a requirement, especially since my parser doesn't get these right yet :)

* Inline markup for superscript and subscript.

* Inline markup for strong emphasis (rendered as bold in visual user agents).

* A few more admonitions. We currently use todo, epigraph and terminology. We also use danger and caution, but these could probably be rendered as warnings. Note that epigraph is a two part thing. The rst looks like this:

      .. epigraph::

          A policy is a restrictive document to prevent a recurrence of a
          single incident, in which that incident is never mentioned.

          -- Kaufman's Law

  The docbook equivalent looks like this:

      <epigraph>
        <attribution>Kaufman's Law</attribution>
        <para>A policy is a restrictive document to prevent a recurrence of a
        single incident, in which that incident is never mentioned.</para>
      </epigraph>

* Tables: We need captions. We could make good use of column spanning. Being able to specify the scope of a heading would be nice.

* pre code blocks: Having some way of setting a red background and a big "don't do this" would be good. Syntax highlighting would be nice too, but we know that that's not going to happen :)

A few of the GLEPs use nasty stuff like blockquotes (this is almost always accidental) and nested structural elements (eg GLEP 2, which is demonstrating how rst supports nested structural elements, and GLEP 36, which can easily be changed). Personally I don't care about these... Easier to tinker with the GLEP formatting slightly... Plus rst2guidexml will reject nested markup anyway for sanity reasons.

Anyway, if these aren't feasible, we could render direct to HTML.
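For reference, the two-part epigraph mapping described above can be sketched roughly like this. This is an illustration in Python only, not code from rst2guidexml; the function name `epigraph_to_docbook` is made up for the example, and it assumes the attribution is the last line of the block and starts with "--".

```python
import xml.etree.ElementTree as ET


def epigraph_to_docbook(block):
    """Turn the body of an rst epigraph into a docbook <epigraph> string.

    Assumption for this sketch: the attribution, if any, is the last
    line of the block and starts with "--".
    """
    lines = [line.strip() for line in block.strip().splitlines()]

    attribution = None
    if lines and lines[-1].startswith("--"):
        attribution = lines.pop()[2:].strip()

    epigraph = ET.Element("epigraph")
    if attribution:
        ET.SubElement(epigraph, "attribution").text = attribution

    # Blank lines separate paragraphs; each one becomes a <para>.
    para = []
    for line in lines + [""]:
        if line:
            para.append(line)
        elif para:
            ET.SubElement(epigraph, "para").text = " ".join(para)
            para = []

    return ET.tostring(epigraph, encoding="unicode")


if __name__ == "__main__":
    print(epigraph_to_docbook("""
        A policy is a restrictive document to prevent a recurrence of a
        single incident, in which that incident is never mentioned.

        -- Kaufman's Law
    """))
```

The point is only that the attribution has to be pulled out of the body and emitted as its own element, which is why an epigraph can't be treated as just another single-part admonition.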
Can we CC the www-redesign folks, as well? (I'm hoping that Swift knows if there's a bugs alias set up for that.) My understanding is that they're working on a substantial xsl rewrite as it is, so perhaps we can drop this bug on them as well.
From a quick look, I'd say that:

* These we already have:

> * Nested lists. Again, I'm not sure whether these are legal?

Should be valid, from what I read in the DTD. Lists' content models contain lists as well. Don't know about the XSLT, but unless it does something really weird it should work.

> * Inline markup for strong emphasis (rendered as bold in visual user agents).

But we already have one more level of emphasis elements than HTML: e, b, and brite vs. HTML em and strong?

* These'd be quite trivial to add if so decided:

> * Definition lists. These are used heavily in devmanual and GLEPs, and they're a
> pretty basic structural element. Definition terms must be allowed to contain
> multiple paragraphs.
>
> * Multiple paragraph list items. Not sure whether these are legal currently?
>
> * Multiple paragraph admonitions.
>
> * Ability to specify "numeric" or "alphabetical" or "roman" for ordered lists.
> This one's more of a maybe than a requirement, especially since my parser
> doesn't get these right yet :)
>
> * Inline markup for superscript and subscript.

These could be done by adding simple elements or attributes to the DTD and an almost direct transformation in the XSLT. For multiple paragraphs inside block-level elements we could either bump paragraph into the inline group or add a new group for stuff that can be nested inside some other block elements.

* And these might require some thinking:

> * A few more admonitions. We currently use todo, epigraph and terminology. We
> also use danger and caution, but these could probably be rendered as warnings.
> Note that epigraph is a two part thing. The rst looks like this:

Todo and terminology are probably trivial additions as well, as they are exactly equal to pre-existing elements (I do hope that we won't style them as yet another coloured block though, the current documents are rainbowy enough as it is ;-) For epigraph I'd suggest using HTML's blockquote (p*,cite) model, but that still might require some further thinking?

> * Tables: We need captions. We could make good use of column spanning. Being
> able to specify the scope of a heading would be nice.

Captions are trivial, and I'd easily welcome them as obligatory for tables, since in HTML the caption is a really useful accessibility element etc. (Of course in GuideXML it probably need not be a separate element but an attribute.) Complex table models, however, are always a major PITA when dealing with markup languages. It is possible to copycat HTML's version of the table model somehow, but should it ever be needed to do something other than a simple guideXML->HTML transform, it's gonna be a mess.

> * pre code blocks: Having some way of setting a red background and a big "don't
> do this" would be good.

Would this essentially be a code block inside of a warn block? That's something I think should be doable with relatively few changes everywhere.

> Syntax highlighting would be nice too, but we know that
> that's not going to happen :)

Well, adding support for syntax highlighting in code blocks wouldn't be very hard, just one new inline element for code blocks that has one attribute for syntactic classes or colors or whatever. The only problem is that if that is used in a code block, the resulting markup will become totally unreadable.

Just my 0,02 from a technical pov, not considering much other issues :-)
(In reply to comment #2)
> But we already have one more level of emphases in elements than HTML: e, b, and
> brite vs. HTML em and strong?

Oh, we do? I'm using http://www.gentoo.org/doc/en/xml-guide.xml as my reference. It mentions <e>, but not <b> or <brite>.

> Todo and terminology are probably trivial additions as well, as they are exactly
> equal to pre-existing elements (I do hope that we won't style them as yet
> another coloured block though, the current documents are rainbowy enough as it
> is ;-)

I'm using a light grey for todo stuff in the devmanual currently. Terminology is colourless but rendered in a box thingie.

> Complex table models however, are always a major PITA dealing with markup
> languages, it is possible to copycat HTML's version of table model somehow, but
> should it ever be needed to do something else than simple transform of
> guideXML->HTML, it's gonna be a mess.

*cough*xmlsucks*cough*

The HTML-style rowspan and colspan attributes are workable. Docbook uses a similar model, and my own RST parser basically stores things as spans internally too.

> > * pre code blocks: Having some way of setting a red background and a big "don't
> > do this" would be good.
>
> Would this essentially be a code block inside of warn block? That's something I
> think should be doable in relatively few changes everywhere.

Hrm. Not really the effect I was after. The way it's used is:

You should not make the value of SRC_URI conditional:

    SRC_URI="http://example.com/${P}.tar.bz2"
    if use doc ; then
        SRC_URI="${SRC_URI} http://example.com/${P}-doc.tar.bz2"
    fi

Instead, use the usual USE conditional syntax within the variable:

    SRC_URI="http://example.com/${P}.tar.bz2
        doc? ( http://example.com/${P}-doc.tar.bz2 )"

The first code block needs to be clearly marked as "this is wrong!" in case anyone skim-reading accidentally picks it up and copies it.

> > Syntax highlighting would be nice too, but we know that
> > that's not going to happen :)
>
> Well, adding the support for syntax highlighting in code blocks wouldn't be very
> hard, just one new inline element for code blocks that has one attribute for
> syntactic classes or colors or whatever. Only problem is that if that is used in
> a code block, the resulting markup will become totally unreadable.
>
> Just my 0,02 from technical pov, not considering much other issues :-)

XML is already unreadable. It's not like we'll ever be writing it by hand. *shrug* This one really isn't a biggie.
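On the table point above, here is a rough sketch of what "storing things as spans" and expanding them into a plain grid could look like. It is in Python purely for illustration and is not taken from any existing parser; the name `expand_spans` and the `(text, rowspan, colspan)` tuple layout are assumptions made for the example.

```python
def expand_spans(rows):
    """Expand HTML-style rowspan/colspan cells into a rectangular grid.

    rows is a list of rows; each row is a list of
    (text, rowspan, colspan) tuples in document order. Every grid
    position covered by a spanning cell receives that cell's text.
    """
    grid = {}
    for r, row in enumerate(rows):
        c = 0
        for text, rowspan, colspan in row:
            # Skip positions already claimed by a rowspan from above.
            while (r, c) in grid:
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan

    if not grid:
        return []
    height = max(r for r, _ in grid) + 1
    width = max(c for _, c in grid) + 1
    return [[grid.get((r, c), "") for c in range(width)]
            for r in range(height)]


if __name__ == "__main__":
    table = [
        [("A", 2, 1), ("B", 1, 2)],
        [("C", 1, 1), ("D", 1, 1)],
    ]
    for line in expand_spans(table):
        print(line)
```

A target format without spans would need some expansion like this (or repeated cells), which is where the "it's gonna be a mess" concern comes from.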
My personal feeling has always been to not extend GuideXML too much - it is meant to be simple (but not feature-full). It is already hacked up beyond what it was meant to do with the Handbook.

I wouldn't mind updating GuideXML to support a hellofalot more, but...

... there are a few document formats around that are already quite impressive feature-wise. Perhaps we should think about supporting one of the more powerful languages?

If not, I'd rather pick one of the following two options:

- leave GuideXML as-is
- improve GuideXML, might break compatibility

With the second option, we should be able to tackle the current shortcomings of GuideXML (like including shared content regardless of size (part, chapter or sections), inter-page links, ...) if XSLT allows it.
Well, there's always docbook, which does world + dog and is a perfect example of why XML is so frickin' nasty...
I have made several comments and suggestions about the requested improvements and implemented some at http://gentoo.neysx.org/mystuff/bug-106017.xml I'll add support for <dl>, <dt> & <dd> if really required.
(In reply to comment #6)
> I have made several comments and suggestions about the requested improvements
> and implemented some at http://gentoo.neysx.org/mystuff/bug-106017.xml
> I'll add support for <dl>, <dt> & <dd> if really required.

Although I might well agree with many of the arguments in neysx's RFC page if the issue were purely the goal of keeping guidexml simple, that page seems to miss the point (or at least I missed it there) that the reason for requesting these additions is to make it easier to convert from rst (which is what GLEPs and the devmanual use) directly to guidexml. (Incidentally, GLEP 11 makes a good case for having definition lists.)

Ciaranm, perhaps you could identify (based on the previous postings on this bug) which bits of rst are still going to be hard to render in guidexml without some guidexml additions?
Ok, based upon Xavier's examples...

We need definition lists. GLEPs 11, 16, 22 use them. The devmanual's full of them (/general-concepts/features/, /general-concepts/tree/ are some examples -- the changelog is only a definition list because of a bug in an old docutils version). The alternative is zillions of heading levels.

I guess we could use two <br />s for multi paragraph lists. Kinda icky, but workable.

sub and sup aren't used in the GLEPs explicitly. I think there're a couple of them in the devmanual that can be removed. I *was* going to use them for footnotes (rst supports these explicitly), but I can render them as [1], [2] etc instead if necessary. Still, this one looks like an easy change...

We could use some nasty markup hacks for admonitions and epigraphs too. Again, icky.

The code block marking using comments should work.

Emphasis probably isn't necessary.

The colour coding... The way it's done currently for devmanual is via Vim. Vim gives us names, not colours. Currently we're using Number, Special, Identifier, Type, PreProc, String, Constant, lnr, Comment, Statement, DiffAdd, Title, Underlined. There are a half dozen or so more things it *could* generate that just happen to not be used yet. The colours it uses are the same ones that quite a few Gentoo developers use when editing ebuilds -- mostly it's because I'm too lazy to select a set designed for display rather than editing. Anyway, this one can wait.

Table spans aren't needed for GLEPs. They're used in the devmanual in a few places. It may be possible to get around that.

So the biggie is definition lists. They're used in GLEPs (which are a higher priority than devmanual imo) and there isn't a nice alternative. Superscript would be good too if it's easy. The rest can wait.
Looks like some guys still need to make up their mind. I have shown how <dt> could be implemented. Gosh, 7 minutes of my life wasted!

From GLEP-11:

    <dl>
    <dt>The main issues are:</dt>
    <dd><ul class="first last simple">
    <li>transition of existing configuration files to the /etc/webapps/${PF}/ directory.</li>
    <li>modification/reconfiguration of applications so that they are aware of the location of configuration files.</li>
    <li>creating the VHost Config toolset to enable installation and configuration of web applications irrespective of web server.</li>
    </ul>
    </dd>
    </dl>

That shows how <dl> can be abused more than it shows that <dl> is so badly required just to let a couple of devs write a handful of tiny GLEPs that could easily have been written in plain GuideXML in the first place.
No, that shows how terrible the code generated by rst2html is.
(In reply to comment #10)
> No, that shows how terrible the code generated by rst2html is.

I don't care how the html code looks. The definition list shown has a single term, "The main issues are:", defined as a 3-item list.
<irony>Now I see why <dl/> is so much required</irony>
No no no. That's only a definition list because rst2html tries to equate anything that gets indented to either a blockquote or a definition list. See, if you give rst2html code like this:

    Stuff:
        * Moo
        * Oink

It will handle it as a definition list with a bullet list inside the definition body. What a sane processor does is moan at the user to stick in a blank line like the RST spec requires, and then generate a paragraph and a bullet list as expected.

There are at least three GLEPs that use definition lists legitimately. The processor we'll be using to generate guidexml handles these sanely, and won't spit out the kind of nonsense that rst2html does.
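To make the "moan at the user" behaviour concrete, here is a minimal sketch of such a check. It is illustrative Python, not the actual processor; the function name and the bullet regex are assumptions for the example, and real rst has more bullet and enumerator forms than this handles.

```python
import re

# A bullet item: optional indentation, one of * + -, whitespace, then text.
BULLET = re.compile(r"^\s*[*+-]\s+\S")


def missing_blank_before_list(text):
    """Return 1-based line numbers where a bullet list starts directly
    after a text line, with no blank line in between."""
    hits = []
    in_list = False
    prev_blank = True  # start of input counts as a blank line
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            prev_blank = True
            continue
        if BULLET.match(line):
            if not in_list and not prev_blank:
                hits.append(number)
            in_list = True
        elif not line[0].isspace():
            # Unindented, non-bullet text ends any list we were in.
            in_list = False
        prev_blank = False
    return hits


if __name__ == "__main__":
    bad = "Stuff:\n    * Moo\n    * Oink\n"
    for lineno in missing_blank_before_list(bad):
        print(f"line {lineno}: put a blank line before this list")
```

With the blank line added after "Stuff:", the same input produces no complaints, and a processor can then emit a paragraph followed by a bullet list.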
(In reply to comment #11)
> (In reply to comment #10)
> > No, that shows how terrible the code generated by rst2html is.
>
> I don't care how the html code looks.
> Shown definition list has a single term "The main issues are:" defined as a
> 3-item list.
> <irony>Now I see why <dl/> is so much required</irony>

Actually, it's buggy rst. Take a look at http://www.gentoo.org/proj/en/glep/glep-0011.txt, and I think you'll see that it wasn't supposed to be a definition list. On the other hand, the top of that file shows entries which clearly _are_ definitions. It is for these elements that we'd like to have definition lists.

Mr. Neys, I understand that this request probably seems stupid to you. Why shouldn't GLEPs just be written in guidexml in the first place? The answer is that many people find rst to be much simpler to write (and read, when it comes to raw text) than even our admittedly quite nice guidexml. If you don't believe me (that other people may find rst easier, even if you personally do not), then notice that although we have always allowed GLEP authors to use either rst or guidexml (http://www.gentoo.org/proj/en/glep/glep-0001.html#glep-formating-and-template), only one GLEP to date has been written in guidexml.
(In reply to comment #13)
> Mr. Neys, I understand that this request probably seems stupid to you. Why
> shouldn't GLEPs just be written in guidexml in the first place? The answer is
> that many people find rst to be much simpler to write (and read, when it comes
> to raw text) than even our admittedly quite nice guidexml. If you don't believe
> me (that other people may find rst easier, even if you personally do not), then
> notice that although we have always allowed GLEP authors to use either rst or
> guidexml
> (http://www.gentoo.org/proj/en/glep/glep-0001.html#glep-formating-and-template),
> only one GLEP to date has been written in guidexml.

What about employing GDP (or any other entity, maybe GLEP editors) for doing RST->GuideXML transformations?
Are you up for doing manual conversions on 50k words of developer documentation that will probably be changed several times per week?
(In reply to comment #15)
> Are you up for doing manual conversions on 50k words of developer documentation
> that will probably be changed several times per week?

Sure. If it is under CC-BY-SA, if it can be placed under /doc/en/ and if you'll submit bugs with patches for updates, I'll convert it and happily commit your updates.
(In reply to comment #14)
> What about employing GDP (or any other entity, maybe GLEP editors) for doing
> RST->GuideXML transformations?

Um, I am the GLEP editor. I appreciate guidexml, but I don't really enjoy writing it. Thus, part of the desire for an rst2guidexml program....
(In reply to comment #7)
> Although I might well agree with many of the arguments in neysx's RFC page if
> the issue were purely the goal of keeping guidexml simple, that page seems to
> miss the point (or at least I missed it there) that the reason for requesting
> these additions is to make it easier to convert from rst (which is what GLEPs
> and the devmanual use) directly to guidexml.

Not quite. The point of this RFC is to make rst2guidexml possible without having to code day & night for the next 3 months. Keeping guidexml simple enough is not a goal, it's a requirement.

(In reply to comment #12)
> No no no. That's only a definition list because rst2html tries to equate
> anything that gets indented to either a blockquote or a definition list.
[snip]
> It will handle it as a definition list with a bullet list inside the definition
> body. What a sane processor does is moan at the user to stick in a blank line
> like the RST spec requires, and then generates a paragraph and a bullet list as
> expected.

I stand corrected. Thanks.
I did mean it when I wrote "I have no real objection to <dl>, <dt> and <dd>." Unless swift strongly objects to it, I've already shown how easy it is to implement.

(In reply to comment #13)
> Mr. Neys,

No need to be so formal, but I appreciate the thought :)

> I understand that this request probably seems stupid to you.

No, it doesn't, but it does not mean that I would allow stupid things to creep into GuideXML either.

> Why shouldn't GLEPs just be written in guidexml in the first place?

To make them look like a genuine Gentoo document, which is why we have this RFC.

> The answer is
> that many people find rst to be much simpler to write (and read, when it comes
> to raw text) than even our admittedly quite nice guidexml. If you don't believe
> me (that other people may find rst easier, even if you personally do not), then
> notice that although we have always allowed GLEP authors to use either rst or
> guidexml

I do understand that, but it is still not a reason to tweak GuideXML beyond reason just to try to fit rst into it or to make rst2guidexml easier to write. What's next, some dev has written that really nice doc with textile markup and wants all its bells & whistles imported?

I'd like to turn those gleps into nice GuideXML, I'd like to make that devmanual a nice handbook and I believe our users would really appreciate it, but I do not want to turn GuideXML into a huge mess or damage it in the process.

To sum it up:

<dl>: requested & done
Paragraphs inside paragraphs: even html does not allow that, I've already shown how to add line breaks where you want line breaks.
Nested lists: already in GuideXML
Numbering type in <ol>: requested & done
Inline markup for superscript & subscript: requested, done, not required after all
Inline markup for strong emphasis: already in GuideXML
Admonitions: we already have <note>, <impo> & <warn>, I agreed on one more. Make up your mind.
Epigraphs: requested & done in a very simple & straightforward way
Table titles with rowspan & colspan: requested, done, maybe not really required after all
Colour coding: requested & done even though I do not want to add a dozen tags because a lexical parser puts labels on everything. <special>This must be some fscking special code</special> is meaningless. Let's start with 5 meaningful tags (not counting <comment> & <i> that already exist). Just give me 5 names & 5 colours (#rrggbb) and our <pre> will already look much better.

I'd like to wait on swift's comments before I implement anything. Not that I'll do as he says anyway, but I'd like his input :)
I also do not mind implementing (& documenting) one feature at a time, but I'm not going to add, remove, redo, undo...

I believe I have answered every item on this RFC in a reasonable way that would both make rst2guidexml possible and enhance GuideXML while preserving its key features.
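For what it's worth, the three numbering types requested for ordered lists are cheap to generate on the converter side. The following is only a sketch in Python; the function name `list_label` and the style names (taken from the wording of the original request) are assumptions for the example, not the attribute values actually used in GuideXML.

```python
ROMAN = [
    (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
    (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
    (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
]


def list_label(index, style="numeric"):
    """Return the label for the index-th item (1-based) of an ordered list."""
    if index < 1:
        raise ValueError("index must be 1 or greater")
    if style == "numeric":
        return str(index)
    if style == "alphabetical":
        # a..z, then aa, ab, ... (bijective base 26)
        label = ""
        while index:
            index, rem = divmod(index - 1, 26)
            label = chr(ord("a") + rem) + label
        return label
    if style == "roman":
        parts = []
        for value, numeral in ROMAN:
            count, index = divmod(index, value)
            parts.append(numeral * count)
        return "".join(parts)
    raise ValueError(f"unknown style: {style}")


if __name__ == "__main__":
    for style in ("numeric", "alphabetical", "roman"):
        print(style, [list_label(i, style) for i in range(1, 6)])
```

In an HTML rendering the browser would normally do this itself from the list type; a helper like this only matters if labels have to be written out literally.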
So we're all waiting for my input huh? Okay then :)

1/ I won't mind definition lists being added. They can very well be used by our documentation as well, although I really don't see why we can't use something like:

    <ul>
      <li><b>Term</b>: Definition</li>
    </ul>

2/ Using different types of listings is imo useless... more cosmetic than content.

3/ Adding <sub> and <sup>... fine by me - it is indeed interesting to use for footnotes.

4/ Regarding admonitions: I don't think we really want this. After all, you can express anything in a friendly way with plain English text - no need for additional bells and whistles imo.

5/ Epigraphs: okay, I can see why we want it. But do we really want to add anything to GuideXML to change

    <p>
    Blabla<br/><br/>-- Foobar
    </p>

to

    <p by="Foobar">
    Blabla
    </p>

?

6/ Colspan: yes, agree with adding support for it.

7/ Adding <b> in <pre>, sure

8/ Adding colors in <pre>: well, yes - but I really don't want any GDP members to use them exuberantly - it'll make the code harder to read/edit/audit.

Just my opinion.
(In reply to comment #19)

I suppose you all have noticed the latest GDP status (Swift: thanks for posting it).

> 5/ Epigraphs: okay, I can see why we want it. But do we really want to add
> anything to GuideXML to change
> <p>
> Blabla<br/><br/>-- Foobar
> </p>

That does not look like an epigraph, neither in the xml nor in the generated html.

    <p>
    <e>Blabla</e><br/><br/>-- Foobar
    </p>

makes the html look a bit better, and the xml worse.

    <p by="Foobar">
    Blabla
    </p>

is easy to remember, easy to write, easy to search for (//p[@by]) and easy to transform.

> 7/ Adding <b> in <pre>, sure

Done as well.

> 8/ Adding colors in <pre>: well, yes - but I really don't want any GDP members
> to use them exuberantly - it'll make the code harder to read/edit/audit.
> Just my opinion.

Agreed, that's why I'd like to keep it to 4/5 meaningful tags.
Looking at http://gentoo.neysx.org/mystuff/bug-106017.xml#doc_chap2_pre3 I do find that the coloured version looks better. If we could agree on 4 or 5 tags/colours, I'd be happy to implement it.
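To illustrate the "easy to search for" point, here is a small sketch that finds every paragraph carrying a by attribute. It uses Python's standard xml.etree.ElementTree, whose limited XPath support accepts the [@by] predicate; the function name and the <body> wrapper in the sample are assumptions for the example, not part of the GuideXML DTD as discussed here.

```python
import xml.etree.ElementTree as ET


def find_epigraphs(guidexml):
    """Return (text, attribution) pairs for every <p by="..."> below the root."""
    root = ET.fromstring(guidexml)
    found = []
    # ElementTree spells the //p[@by] search as ".//p[@by]".
    for p in root.findall(".//p[@by]"):
        text = " ".join("".join(p.itertext()).split())
        found.append((text, p.get("by")))
    return found


SAMPLE = """<body>
<p>Ordinary paragraph.</p>
<p by="Foobar">
Blabla
</p>
</body>"""

if __name__ == "__main__":
    for text, who in find_epigraphs(SAMPLE):
        print(f"{text} -- {who}")
```

The <br/><br/>-- variant offers no comparable hook: the attribution is just trailing text inside the paragraph and would have to be recovered by string matching.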
As asked by Mark, I have added <var> to the suggested tags. <keyword>, <ident>, <const>, <stmt> and <var> will be added very soon as shown on http://gentoo.neysx.org/mystuff/bug-106017.xml#doc_chap2_sect10 unless anyone starts shouting.
Colour coding is now available. Enjoy!