I've found a patch that is quite useful for tidy. It converts HTML to DocBook/XML and DocBook/SGML. I've updated the ebuild to work with it.
Created attachment 5524 [details] htmltidy-2.7.18.ebuild
Created attachment 5525 [details, diff] dbpatch - patch which does the magic of converting HTML -> (XML|SGML)
This is a pretty impressive patch. Would it be better to submit it to the upstream maintainer and then roll our own sources till [s]he implements it?
Any updates? Have you contacted upstream?
I had no response from ........ Maybe if you try....
I'll just commit the changes. :) Thanks for your work!