My locale is ru_RU.UTF-8, but the messages printed by the "man" utility are not in UTF-8. For example, if I run $ man without parameters, it prints a message that translates to English as "What manual page do you want?", but I see only garbage. If I run $ man 2>&1 | iconv -f koi8-r -t utf-8 instead, everything is fine. So I changed the man ebuild: if the unicode flag is in USE, all Russian messages are converted to UTF-8. Reproducible: Always
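The workaround from the report can be sketched as a small shell function. This is only an illustration of the iconv step described above; the function name koi8r_to_utf8 is hypothetical, and the demo feeds it hand-written KOI8-R bytes instead of running man.

```shell
# Hypothetical helper: re-encode a KOI8-R byte stream on stdin as UTF-8 on stdout.
koi8r_to_utf8() {
    iconv -f KOI8-R -t UTF-8
}

# Demo: the three bytes below are the KOI8-R encoding of the Cyrillic letters "ман".
printf '\315\301\316\n' | koi8r_to_utf8

# Intended use against the real program (not run here):
#   man 2>&1 | koi8r_to_utf8
```

On a UTF-8 terminal the raw bytes show up as garbage, while the piped version is readable, which is the behaviour described in the report.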
Created attachment 59618 [details] fixed ebuild
I think it's an upstream problem. Can you file a bug in the upstream bugzilla?
>I think it's upstream problem.
>Can you create bug on upstream bugzilla ?
You mean send the bug description to the developer of the "man" utility (Andries Brouwer)?
Yes. I think it's a bug of the Russian translator of "man". All translations MUST be in UTF-8 encoding.
>All translates MUST be in utf-8 encoding.
Why do you think so? If you look at the subdirectory "msgs" in the "man" source tree, you can see this situation:
$ cat *.codeset
$ codeset=cp1251
$ codeset=iso-8859-2
$ codeset=iso-8859-1
$ codeset=iso-8859-1
$ codeset=iso-8859-7
$ codeset=iso-8859-1
$ codeset=iso-8859-1
$ codeset=iso-8859-1
$ codeset=iso-8859-1
$ codeset=iso-8859-2
$ codeset=iso-8859-1
$ codeset=euc-jp
$ codeset=euc-kr
$ codeset=iso-8859-1
$ codeset=iso-8859-2
$ codeset=iso-8859-1
$ codeset=iso-8859-2
$ codeset=koi8-r
$ codeset=iso-8859-2
Not a single message catalog is in UTF-8.
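To see which catalog uses which encoding, the listing above can be paired with the file names. The sketch below assumes the layout visible elsewhere in this bug (files named mess.<lang>.codeset, each holding a line "$ codeset=<name>"); the function name list_codesets is hypothetical.

```shell
# Hypothetical helper: print "<catalog> <codeset>" for each message catalog in
# the directory given as $1. Assumes mess.<lang>.codeset files that contain a
# line of the form "$ codeset=<name>".
list_codesets() {
    for cs in "$1"/mess.*.codeset; do
        [ -e "$cs" ] || continue                 # glob matched nothing
        name=$(basename "$cs" .codeset)          # e.g. mess.ru
        printf '%s %s\n' "$name" "$(sed -n 's/^\$ codeset=//p' "$cs")"
    done
}

# Usage inside an unpacked man source tree (not run here):
#   list_codesets msgs
```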
(In reply to comment #5)
> >All translates MUST be in utf-8 encoding.
>
> why do you think so?
>
> if you look at subidrectory "msgs" in "man" source tree,
> you can see this situation:
>
> $ cat *.codeset
> $ codeset=cp1251
> $ codeset=iso-8859-2
> $ codeset=iso-8859-1
> $ codeset=iso-8859-1
> $ codeset=iso-8859-7
> $ codeset=iso-8859-1
> $ codeset=iso-8859-1
> $ codeset=iso-8859-1
> $ codeset=iso-8859-1
> $ codeset=iso-8859-2
> $ codeset=iso-8859-1
> $ codeset=euc-jp
> $ codeset=euc-kr
> $ codeset=iso-8859-1
> $ codeset=iso-8859-2
> $ codeset=iso-8859-1
> $ codeset=iso-8859-2
> $ codeset=koi8-r
> $ codeset=iso-8859-2
>
> no one message in utf-8.
>

Yes, that is how it is now. BUT in the future all translations will be in UTF-8.
Created attachment 61867 [details] man-1.5p-r1.ebuild
I received an answer from the current maintainer of the man utility. He said that it will be fixed. I changed the ebuild script to fix this problem for all locales if unicode is in the USE flags.
Why such a long delay? Is it really that hard to add this trivial fix to the ebuild? 1.6 is now stable and has the same problem.
Created attachment 67503 [details] patched ebuild for man-1.6-r1 I tried to solve this problem my own way, but found a solution here and used it (after fixing a bug in it). I also fixed the incorrect detection of supported languages (it did not detect the Russian LINGUAS value). This ebuild has been checked to work correctly with English and Russian, and I think it will work correctly with other languages too.
(In reply to comment #10)
> Created an attachment (id=67503) [edit]
> patched ebuild for man-1.6-r1
>
> I try to solve this problem with my way, but found solution here and use it
> (after bug fixing). Also I resolve problem with incorrect supported language
> detection (it did not detect russian LINGUA).
> This ebuild is checked for correct working with English and Russian. But I
> think that it will work correctly with other languages.

I repeat: this is an UPSTREAM problem! File a bug in the upstream bugzilla!
>I repeat This is UPSTREAM problem!
Yes, it is. But I sent a bug report several months ago, the maintainer confirmed that the bug exists, and then nothing... Why not add this as a temporary solution and remove these five lines from the ebuild once it is fixed upstream? Why say this is an upstream problem and wait several years for the maintainer of "man" to fix it?
> >I repeat This is UPSTREAM problem!

I have been notified by Evgeniy, and this will be fixed with the minor release addressing UTF-8 support for non 8859-1 languages, which will likely come in late October.

> Why not add this as temporary solution,
> and when it was fixed remove this five lines from ebuild?

Evgeniy is right. The established community process is to notify the maintainer (which he did) for the final, solid fix (which takes longer), and introduce a distro-fix until he releases the final one.
Indeed, this is not exactly a "utf-8" problem; it happens for every locale whose language has more than one encoding (maybe the bug should be reassigned?). So I created a patch that converts all messages to UTF-8 (the ebuild patch) and converts them from UTF-8 to the current encoding (the man patch). Here is the ebuild patch; the patch for man is in the attachment.

--- /usr/portage/sys-apps/man/man-1.6b-r2.ebuild 2005-12-25 18:36:02.000000000 +0300
+++ man-1.6b-r3.ebuild 2005-12-30 23:53:56.630521250 +0300
@@ -53,10 +53,25 @@
 	epatch "${FILESDIR}"/man-1.5p-man2html.patch
 	epatch "${FILESDIR}"/man-1.5p-mandirlist.patch
 
+	#fix messages encoding
+	epatch "${FILESDIR}"/man-1.6a-messages.patch
+
 	# use non-lazy binds for man
 	append-ldflags $(bindnow-flags)
 
 	strip-linguas $(eval $(grep ^LANGUAGES= configure) ; echo ${LANGUAGES//, / })
+
+	cd msgs
+
+	for mess in `ls mess.* | grep -v codeset`; do
+		if [ -e ${mess}.codeset ]; then
+			codeset=`sed s/\$\ codeset=//g ${mess}.codeset`
+			iconv -f $codeset -t utf8 $mess > ${mess}.utf8
+			mv $mess.utf8 $mess
+			echo "$ codeset=utf8">${mess}.codeset
+		fi
+	done
+	cd ..
 }
 
 src_compile() {
Created attachment 75817 [details, diff] fix garbage in man's messages
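The conversion step of the ebuild patch can also be tried outside Portage. The following is a minimal standalone sketch of the same idea, not the attached patch itself: the function name convert_msgs_to_utf8 is hypothetical, the mess.<lang> / mess.<lang>.codeset layout is taken from the patch above, and quoting and error handling were added for illustration.

```shell
# Hypothetical helper: convert every catgets source file in directory $1 to
# UTF-8, using the encoding declared in the matching .codeset file
# (a line of the form "$ codeset=<name>").
convert_msgs_to_utf8() {
    dir=$1
    for mess in "$dir"/mess.*; do
        case $mess in
            *.codeset) continue ;;               # skip the codeset markers themselves
        esac
        [ -e "$mess.codeset" ] || continue       # no declared encoding: leave untouched
        codeset=$(sed -n 's/^\$ codeset=//p' "$mess.codeset")
        [ -n "$codeset" ] || continue            # marker file without a codeset line
        iconv -f "$codeset" -t UTF-8 "$mess" > "$mess.utf8" || return 1
        mv "$mess.utf8" "$mess"
        # Record the new encoding in the same format the patch writes.
        printf '%s\n' '$ codeset=utf8' > "$mess.codeset"
    done
}

# Usage inside an unpacked man source tree (not run here):
#   convert_msgs_to_utf8 msgs
```

Stopping on the first iconv failure, rather than continuing as the loop in the patch does, is a deliberate choice in this sketch so that a catalog is never replaced by a partially converted file.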
sys-apps/man-1.6d has the same problem.
Is there a reason to leave man-text-messages broken for more than a year without committing a working fix?
Created attachment 102427 [details, diff] use iconv for converting catgets texts to correct charset Patch based on "fix garbage in man's messages"-patch + makefile-changes to convert catgets-files to utf8.
(In reply to comment #17)
> Is there a reason to leave man-text-messages broken for more than a year
> without commiting a working fix?
>

As I understand the position of the Gentoo utf8 team, this is an upstream problem, not a Gentoo one. But I sent a patch a year or so ago; the maintainer said that he would look at it, and that's all, there has been no further reaction.
(In reply to comment #18)
> Created an attachment (id=102427) [edit]
> use iconv for converting catgets texts to correct charset
>
> Patch based on "fix garbage in man's messages"-patch
> + makefile-changes to convert catgets-files to utf8.
>

Works for me.
(In reply to comment #20)
> (In reply to comment #18)
> > Created an attachment (id=102427) [edit]
> > use iconv for converting catgets texts to correct charset
> >
> > Patch based on "fix garbage in man's messages"-patch
> > + makefile-changes to convert catgets-files to utf8.
> >
>
> Works for me.
>

And for me. Opened: 2005-05-23 04:08 PST. Has upstream died?
(In reply to comment #21)
> (In reply to comment #20)
> > (In reply to comment #18)
> > > Created an attachment (id=102427) [edit]
> > > use iconv for converting catgets texts to correct charset
> > >
> > > Patch based on "fix garbage in man's messages"-patch
> > > + makefile-changes to convert catgets-files to utf8.
> > >
> >
> > Works for me.
> >
>
> and for me.
>
> Opened: 2005-05-23 04:08 PST
>
> upstream has died ?
>

There is a comment from the current developer of "man" in this discussion: ----- Comment #13 From Federico Lucifredi 2005-09-19. Maybe he just forgot about this issue; maybe it is possible to add him to the "Cc" list and resend all these comments to him.
*** This bug has been marked as a duplicate of bug 126361 ***
>*** This bug has been marked as a duplicate of bug 126361 ***
>please add utf8 support to groff
This bug has nothing to do with groff; it is about "man" itself, for example messages like "Man page not found" and so on.
*** Bug 211547 has been marked as a duplicate of this bug. ***
Created attachment 144777 [details] sys-apps/man-1.6f ebuild for modified patch As insisted in Bug #211547 Comment #20, I will repeat everything here (but I will not give up, as someone may be hoping). I have contacted Federico Lucifredi, the current maintainer of man, and he says that they will think about a solution. But I insist that a temporary solution is needed for this issue (I don't know how many days/months/years will pass before upstream merges it, if it ever does). This ebuild converts the catgets files to UTF-8 using the ebuild's built-in scripting, without patching the Makefile, and uses a modified version of the patch from Comment #18 with the Makefile patching removed (if you don't like this way, you can just add the patch from c#18 to the already existing ebuild and it will work).
Created attachment 144781 [details, diff] Modified patch from comment #18 (removed Makefile patching) In answer to Bug #211547 Comment #19 / #18: Yes, I have my answer. And, as you see, I don't like it, because it reflects the opinion and interests of one person, without any arguments for his position. It was my mistake to comment on the duplicate bug, but I assumed there was no need to duplicate information, sorry. But don't try to hide behind bureaucracy: I _will_ continue "wasting" your time until I get a reasoned position on WHY this bug has not been resolved in the 3 years that have passed, while you had everything you needed to resolve it.
I have mailed upstream and he replied saying that he is already working on this and, hopefully, the next release will fix it :-)
I doubt that there will actually be a release that works (if one doesn't consider Man-DB). So let's just disable broken functionality by default: i.e., man should never produce any translated messages. This is already done with the "-nls" USE flag. And, until bug #259176 is fixed, it is a good idea to provide a USE flag to disable support for translated manual pages completely, by applying the last hunk from the patch from http://www.mail-archive.com/lfs-dev@linuxfromscratch.org/msg12112.html
5+ years and still no solution?
One possible solution is to use sys-apps/man-db instead of sys-apps/man. It doesn't have such problems with character encodings...
(In reply to comment #34)
> 5+ years and still no solution?
>

There is a solution, there are patches, but it looks like nobody among the Gentoo maintainers has a whole ten minutes to push them into the portage tree.
Created attachment 240751 [details] merge message encoding fix to man-1.6f-r4.ebuild
*** Bug 235305 has been marked as a duplicate of this bug. ***
I don't think iconv would be a good solution, due to things like, e.g., uclibc.
(In reply to comment #39)
> I don't think iconv would be a good solution, due to things
> like i.e. uclibc.
>

Man pages on an embedded system? Anyway, with or without this patch you can build "man" with nls and with uclibc, because it uses catopen, catclose and catgets, which are not implemented in uclibc. So I don't think we should worry about uclibc when solving this problem.
(In reply to comment #40) > Any way with or without this patch you can build "man" with nls and with > uclibc, s/can/can not/g
Guys, try man-db and a recent man-pages-ru. For me it fixes the issues out of the box.
Looks like an old issue with a comment on 2010-12-22 that it is fixed; could the bug be closed?
Ben Sagal, this bug is not fixed. man-db is not stable (it is currently masked) and not standard.
*** Bug 399255 has been marked as a duplicate of this bug. ***
Assigning to the man maintainers, as the utf8 herd has been dead for ages.
The subject change was wrong: it is not the man pages that look like garbage, it is the messages /usr/bin/man prints to the console (like: No manual entry for ...) that look like garbage.
Yes, although to an extent it's probably both (with varying levels of ease of reproduction). Just switch to man-db already? :-)
(In reply to comment #48)
> Yes, although to an extent it's probably both (with varying levels of ease of
> reproduction).

Not really, that's just a question of using the correct configuration in /etc/man.conf:

NROFF /usr/bin/enconv -L ru -x KOI8-R -C iconv | /usr/bin/nroff -mandoc -Tlatin1 -c | /usr/bin/enconv -L ru -x UTF8

> Just switch to man-db already? :-)

Actually I'm using Vim (with the viewdoc plugin) to view man pages in the console (internally it uses /usr/bin/man); this makes syntax highlighting, search and navigation between man pages much more comfortable. As for switching to man-db, I hope it is compatible enough with man not to break the viewdoc plugin. Currently all man-db versions in portage are ~x86. And I don't actually see any reason to switch: I don't see any real reason to start using berkdb instead of plain files here. So why should I even think about switching? :)
(In reply to comment #49)
> (In reply to comment #48)
> > Yes, although to an extent it's probably both (with varying levels of ease of
> > reproduction).
>
> Not really, that's just a question of using correct configuration in
> /etc/man.conf:
>
> NROFF /usr/bin/enconv -L ru -x KOI8-R -C iconv | /usr/bin/nroff
> -mandoc -Tlatin1 -c | /usr/bin/enconv -L ru -x UTF8
>

This does not fix man's own output (the help text, for example).
(In reply to comment #49)
> Not really, that's just a question of using correct configuration in
> /etc/man.conf:
>
> NROFF /usr/bin/enconv -L ru -x KOI8-R -C iconv | /usr/bin/nroff
> -mandoc -Tlatin1 -c | /usr/bin/enconv -L ru -x UTF8

It's pretty bizarre in this day and age that people should have to configure this manually. man-db will generally just figure it out by itself, for all languages I've seen manual pages written in (it's easy to add new encoding support, and a lot of pages are just in UTF-8 these days anyway which will work by default), without configuration. It has supported Russian KOI8-R pages with no configuration since 2003, and automatic detection of KOI8-R vs. UTF-8 since 2007.

> Actually I'm using Vim (with viewdoc plugin) to view man pages in console (but
> internally it uses /usr/bin/man) - this make syntax highlight, search and
> navigation between man pages much more comfortable.
>
> As for switch to man-db - I hope it compatible enough with man to not break
> viewdoc plugin.

I often use man.vim myself which comes with vim; but I've just tested viewdoc and it basically works fine. The only problem is completion, because man-db's /usr/bin/man didn't support running 'man --path' to print the manpath; of course this was a trivial fix and I've just committed compatibility code to support this. Generally, I'd expect compatibility problems to be rare. This is the first one I recall seeing in a couple of years.

> Currently all man-db versions in portage are ~x86.

Not so. Since July, CVS has had:

KEYWORDS="~alpha ~amd64 ~arm ~hppa ~ia64 ~m68k ~mips ~ppc ~ppc64 ~s390 ~sh ~sparc ~x86"

> And I don't see any reason to switch, actually - I don't see any real
> reasons to start using berkdb instead of plain files here. So, why I
> should even think about switching? :)

There seems to be an idea that just won't die that man-db is only about adding a database to man (incidentally, since 2008 I've recommended configuring man-db to use GDBM, not Berkeley DB).
Perhaps this is my fault since I haven't made much effort to emphasise real benefits in the documentation. These days, the database is the least important of the differences between man and man-db. I don't want to get into a giant advocacy discussion, but a few reasons to use man-db:

* Correct encoding support out of the box that doesn't require primitive hardcoding in configuration files, supporting the use of a variety of languages and encodings without reconfiguration.
* man uses catgets for message translations, which nearly everyone else stopped using in the 1990s, and which is the fundamental cause of this Gentoo bug. One of the first things I did when I took over man-db in 2001 was to convert it to gettext, which is more correct and robust.
* man has lots of code like 'command = my_xsprintf("%s%s '%S' | %s%s", ...)' which should fail any competent security review; consider the case where you're using man in a CGI script, for instance. man-db is designed from top to bottom to have safe and correct command execution (this is the point of libpipeline).
* man-db is actually maintained. I mean no disrespect to Federico - we're even co-workers these days! - but man has only had one release since 2007 and that only really had a few minor changes; it doesn't look as though he has time to maintain it. man-db has had ten full releases since then, and I follow bug reports from several distributions.

I work on distributions too; I know that there's a strong urge to fix the software you're currently using rather than to switch to a replacement. However, I honestly think at this point man is several years behind man-db as far as i18n is concerned - both this bug and the harder problem of dealing with manual page encodings properly - and it shows no signs of catching up. When I took over man-db it was in a state much like man is now, and it took me a few years of upstream development before I was really satisfied with how all the locale handling worked.
So, when I advise just switching to man-db, that isn't just "hey, I maintain it, it must be better", but the result of a lot of bitter experience fixing just this kind of bug.
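The concern about building a command line by pasting a file name into a string can be illustrated with a small shell sketch. This is not code from man or man-db; the function names unsafe_show and safe_show are hypothetical, and the tr pipeline merely stands in for the formatter commands that man assembles.

```shell
# Unsafe pattern (illustrative only): the file name is pasted between single
# quotes in a command string that a shell then parses again, so a name
# containing a quote can break out and run arbitrary commands.
unsafe_show() {
    sh -c "cat '$1' | tr a-z A-Z"
}

# Safer pattern: the file name is passed as data (here as a redirection
# target) and is never re-parsed as shell syntax.
safe_show() {
    tr a-z A-Z < "$1"
}

# Example of a hostile name (not run here):
#   unsafe_show "page'; echo INJECTED; 'x"
```

With the hostile name, unsafe_show ends up running the embedded echo command, while safe_show only tries to open a file with that literal name.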
I released man-db 2.6.1 last week, which provides that viewdoc plugin compatibility I mentioned.
*** Bug 339307 has been marked as a duplicate of this bug. ***
*** Bug 491564 has been marked as a duplicate of this bug. ***
Use man-db for automatic charset conversion. No plans on making man work.