Bug 468428 - sys-apps/man shows mojibake when viewing localized UTF-8 man pages
Summary: sys-apps/man shows mojibake when viewing localized UTF-8 man pages
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords: PATCH
Depends on:
Blocks:
 
Reported: 2013-05-03 13:16 UTC by Zoltán Halassy
Modified: 2020-03-07 21:59 UTC
CC List: 1 user

See Also:
Package list:
Runtime testing required: ---


Attachments
strace of man (man_strace_log.txt, 17.23 KB, text/plain)
2013-05-03 13:16 UTC, Zoltán Halassy
strace -f of man (man_strace_log.txt, 261.40 KB, text/plain)
2013-05-03 13:21 UTC, Zoltán Halassy

Description Zoltán Halassy 2013-05-03 13:16:20 UTC
Created attachment 347266 [details]
strace of man

I use hu_HU.UTF-8 as my current locale, and LINGUAS="hu en" is also set.

Let's take net-analyzer/nmap-6.25. A Hungarian (hu) man page is available and is installed as /usr/share/man/hu/man1/nmap.1.bz2 (the same goes for chsh, gpasswd, groups, hunspell, login, mc, mplayer, newgrp, passwd and su). The files are natively encoded in UTF-8.

For example, the title of the nmap man page reads as follows (directly in the file, on the 31st line, after the .SH "NAME" directive), correctly encoded in UTF-8:

"Hálózat feltérképező és biztonsági/kapu letapogató eszköz"

However, 

$ man -P cat nmap | fgrep -A1 NAME

gives this:

NAME
       nmap - Hálózat feltérképezŠés biztonsági/kapu letapogató

I have no idea what is wrong. /etc/locale.gen contains a single line:

hu_HU.UTF-8 UTF-8

Everything else works fine in the console. Bash prints Hungarian error messages with proper accents. Readline handles Hungarian accented letters properly. Midnight Commander displays its Hungarian interface with accented letters properly. less (the default pager) shows UTF-8 files properly. Only man is broken: it apparently thinks it has to convert the man pages to some other encoding, although they are already in the proper form.

Actually, I have never seen man work properly with Unicode man pages.

Added strace of man.
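
A plausible explanation for the output above, offered as an assumption rather than something confirmed in this report: groff treats its input as Latin-1 unless a preprocessor such as preconv converts it first, so each byte of a multi-byte UTF-8 character ends up rendered as a separate character. The sketch below reproduces that kind of misreading with iconv. The function name misread_as_latin1 is made up for illustration, and iconv is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper: treat the bytes on stdin as ISO-8859-1 (Latin-1)
# and re-encode them as UTF-8, i.e. deliberately misread UTF-8 input.
misread_as_latin1() {
    iconv -f ISO-8859-1 -t UTF-8
}

# "á" is the two bytes 0xC3 0xA1 in UTF-8. Read as Latin-1, those bytes
# are the two separate characters "Ã" and "¡".
printf '\303\241' | misread_as_latin1   # prints "Ã¡"
```

Characters outside Latin-1, such as "ő", fare worse, because one of their UTF-8 bytes maps to a control character when read this way.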
Comment 1 Zoltán Halassy 2013-05-03 13:21:56 UTC
Created attachment 347268 [details]
strace -f of man
Comment 2 Zoltán Halassy 2013-05-03 14:11:40 UTC
Actually, I was able to fix it:

The default in /etc/man.conf is the following:

NROFF /usr/bin/nroff -mandoc

If I change it to this:

NROFF /usr/bin/groff -mandoc -Tutf8 -k

This fixes the accent problems and also makes the manual colored.

I wonder whether it would be possible to make this work for all users, for example with a wrapper around groff that detects the terminal encoding from the locale settings. Simply changing the default to the line above could break man pages for other users.
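
The wrapper idea could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested replacement for /usr/bin/nroff: the function names groff_device_for_charmap and nroff_wrapper are made up, and the mapping covers only a few charmaps. It relies on `locale charmap` to report the current encoding and only passes -k when the locale is UTF-8.

```shell
#!/bin/sh
# Hypothetical helper: map a locale charmap name to a groff output device.
# Anything unrecognized falls back to plain ASCII output.
groff_device_for_charmap() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8)        echo utf8 ;;
        ISO-8859-1|ISO_8859-1|latin1) echo latin1 ;;
        *)                            echo ascii ;;
    esac
}

# Hypothetical wrapper: pick the device from the current locale and let
# groff run preconv (-k) only when the locale is UTF-8.
nroff_wrapper() {
    charmap=$(locale charmap 2>/dev/null) || charmap=
    device=$(groff_device_for_charmap "$charmap")
    if [ "$device" = utf8 ]; then
        groff -mandoc -T"$device" -k "$@"
    else
        groff -mandoc -T"$device" "$@"
    fi
}
```

With something like this installed as a script, the NROFF line in /etc/man.conf could point at the script instead of hard-coding -Tutf8, so users in non-UTF-8 locales would keep their current behavior.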
Comment 3 Zoltán Halassy 2013-05-03 14:26:19 UTC
Actually, /usr/bin/nroff itself is already a wrapper script, so the following change would suffice (line 136, simply add the -k option):

-PATH="$GROFF_RUNTIME$PATH" groff -mtty-char $T $opts ${1+"$@"}
+PATH="$GROFF_RUNTIME$PATH" groff -k -mtty-char $T $opts ${1+"$@"}

Could we add such a change to sys-apps/groff?
Comment 4 SpanKY gentoo-dev 2013-05-06 15:41:19 UTC
does man-db work ?  `emerge -C man && emerge man-db`
Comment 5 Zoltán Halassy 2013-05-06 16:01:35 UTC
Yes, it does, thanks. I don't mind migrating to man-db. (However, man-db calls preconv too, just directly, as can be seen in strace. That is the same thing the -k option makes groff do.) I used the old man only because it was the default, for no other particular reason.
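
To illustrate the equivalence mentioned above, here is a rough sketch of running preconv by hand in front of groff instead of passing -k. It is not how man-db is actually implemented. The function names decompressor_for and render_utf8_page are hypothetical, and the page is simply assumed to be UTF-8.

```shell
#!/bin/sh
# Hypothetical helper: choose a decompressor from the man page's suffix.
decompressor_for() {
    case "$1" in
        *.bz2) echo bzcat ;;
        *.gz)  echo zcat ;;
        *.xz)  echo xzcat ;;
        *)     echo cat ;;
    esac
}

# Hypothetical helper: decompress the page, convert it explicitly with
# preconv (assuming UTF-8 input), then format it for a UTF-8 terminal.
render_utf8_page() {
    page=$1
    "$(decompressor_for "$page")" "$page" \
        | preconv -e UTF-8 \
        | groff -mandoc -Tutf8
}
```

For example, `render_utf8_page /usr/share/man/hu/man1/nmap.1.bz2 | less` should show the page from the original report with its accents intact, provided groff and preconv are installed.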
Comment 6 SpanKY gentoo-dev 2013-05-06 16:06:30 UTC
(In reply to comment #5)

i'm debating how much work i want to do with sys-apps/man if man-db does everything for me ;)

it would be easy to update files/man-1.6f-unicode.patch to include -k ... but i think man-db does it a bit more selectively than just always running preconv.  it looks for certain markers in the start of the file iirc.
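
One kind of marker that could be checked is an Emacs-style coding tag in the first two lines of the page, which preconv itself also honours. The sketch below is a simplified illustration of such a check and is not taken from the man-db source. The function name coding_tag_of is made up.

```shell
#!/bin/sh
# Hypothetical helper: read a man page on stdin and print the value of an
# Emacs-style "-*- coding: NAME -*-" tag if one appears in the first two
# lines. Prints nothing when no such tag is found.
coding_tag_of() {
    head -n 2 \
        | sed -n 's/.*-\*-.*coding:[[:space:]]*\([A-Za-z0-9_.-]*\).*-\*-.*/\1/p' \
        | head -n 1
}
```

A caller could then run the conversion only when a tag is present, for example `bzcat page.1.bz2 | coding_tag_of`, and leave untagged pages untouched.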
Comment 7 Markus Oehme 2014-10-17 09:08:10 UTC
I just hit this in bug #523440. Adding an unconditional conversion from UTF-8 as the default in /etc/man.conf, as suggested in comment 2, seems to be the most reasonable thing to do. This fixes most cases without breaking working ASCII files. Fixing the underlying problem of broken internationalization in sys-apps/man, however, would be much more work.
Comment 8 Ulrich Fieseler 2014-11-30 01:01:13 UTC
None of the proposed options looks like a reasonable choice: messing with the nroff script and turning it into something else (comment 3); circumventing it completely and thereby bypassing its use of the locale (comment 2); or even switching to a replacement package for man, which seems to add another layer of complication by storing man pages in its own database rather than as plain files in the file system tree. As pointed out in bug #523440, setting the environment variable GROFF_ENCODING might help. The only problem is: where does man (or the nroff or groff it calls) get its environment from? Setting GROFF_ENCODING before calling man does not help!
Comment 9 Larry the Git Cow gentoo-dev 2020-03-07 21:59:16 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=ce370f012e25ad2eb756cbcaf768bf053161d067

commit ce370f012e25ad2eb756cbcaf768bf053161d067
Author:     Mike Gilbert <floppym@gentoo.org>
AuthorDate: 2020-03-07 21:54:44 +0000
Commit:     Mike Gilbert <floppym@gentoo.org>
CommitDate: 2020-03-07 21:59:07 +0000

    sys-apps/man: remove package
    
    Closes: https://bugs.gentoo.org/468428
    Closes: https://bugs.gentoo.org/515534
    Closes: https://bugs.gentoo.org/524588
    Closes: https://bugs.gentoo.org/589738
    Closes: https://bugs.gentoo.org/605352
    Closes: https://bugs.gentoo.org/651038
    Closes: https://bugs.gentoo.org/683494
    Signed-off-by: Mike Gilbert <floppym@gentoo.org>

 profiles/package.mask                              |   6 -
 sys-apps/man/Manifest                              |   1 -
 sys-apps/man/files/makewhatis.cron                 |   5 -
 sys-apps/man/files/man-1.5m2-apropos.patch         |  16 ---
 sys-apps/man/files/man-1.6-cross-compile.patch     |  61 ----------
 .../files/man-1.6c-cut-duplicate-manpaths.patch    |  83 -------------
 sys-apps/man/files/man-1.6e-headers.patch          |  13 --
 .../man-1.6f-makewhatis-compression-cleanup.patch  |  69 -----------
 .../files/man-1.6f-man2html-compression-2.patch    |  61 ----------
 sys-apps/man/files/man-1.6f-parallel-build.patch   |  78 ------------
 sys-apps/man/files/man-1.6f-so-search-2.patch      |  34 ------
 sys-apps/man/files/man-1.6f-unicode.patch          |  28 -----
 sys-apps/man/files/man-1.6g-compress.patch         |  17 ---
 sys-apps/man/files/man-1.6g-echo-escape.patch      |  15 ---
 sys-apps/man/files/man-1.6g-fbsd.patch             |  15 ---
 sys-apps/man/files/man-1.6g-xz.patch               |  53 ---------
 sys-apps/man/man-1.6g-r1.ebuild                    | 131 ---------------------
 sys-apps/man/metadata.xml                          |   8 --
 18 files changed, 694 deletions(-)