Bug 417791 - app-i18n/man-pages-ja - upstream changed its encoding
Status: RESOLVED INVALID
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal enhancement
Assignee: CJK Team
URL:
Whiteboard:
Keywords: EBUILD, PATCH
Depends on:
Blocks:
 
Reported: 2012-05-27 15:49 UTC by OKUMURA N. Shin-ya
Modified: 2012-07-18 06:45 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
ebuild for both of encoding, traditional eucJP and modern UTF-8 (man-pages-ja-20120515.ebuild,2.59 KB, text/plain)
2012-05-27 15:54 UTC, OKUMURA N. Shin-ya

Description OKUMURA N. Shin-ya 2012-05-27 15:49:43 UTC
Upstream (linuxjm.sourceforge.jp) changed the encoding of its man pages from eucJP to UTF-8 as of 2012-04-10.

Reproducible: Always

Steps to Reproduce:
There is currently no problem, because the newest app-i18n/man-pages-ja ebuild is version 20111020. The problem will appear at the next version bump.



Ideally, app-i18n/man-pages-ja, sys-apps/man and sys-apps/groff should all *FULLY* support multibyte UTF-8, but currently they do not. I think it is better to provide both encodings, eucJP and UTF-8, for the moment.
Comment 1 OKUMURA N. Shin-ya 2012-05-27 15:54:38 UTC
Created attachment 313287
ebuild for both of encoding, traditional eucJP and modern UTF-8

This ebuild installs manpages in two encodings: traditional eucJP (/usr/share/man/ja_JP.eucJP) and modern UTF-8 (/usr/share/man/ja_JP.UTF-8).
Comment 2 Jeroen Roovers (RETIRED) gentoo-dev 2012-05-28 11:29:37 UTC
Comment on attachment 313287
ebuild for both of encoding, traditional eucJP and modern UTF-8

--- man-pages-ja-20111020.ebuild        2011-11-24 03:08:07.000000000 +0100
+++ -   2012-05-28 13:29:22.390906363 +0200
@@ -56,7 +56,20 @@
                einfo "install $pkg"
 
                for y in $(ls -d manual/$pkg/man* 2>/dev/null); do
-                       doman -i18n=ja $y/*
+                       doman -i18n=ja_JP.UTF-8 $y/*
+               done
+
+               for y in $(ls -d manual/$pkg/man* 2>/dev/null); do
+                       local f
+                       for f in $y/*.[0-9a-z] $y/*.[0-9][a-z]; do
+                               if [ -f $f ]; then
+                                       local t
+                                       t=`mktemp`
+                                       iconv -cs -f utf-8 -t eucJP $f >$t
+                                       mv $t $f
+                               fi
+                       done
+                       doman -i18n=ja_JP.eucJP $y/*
                done
 
                pkg=
@@ -71,7 +84,20 @@
                        einfo "install $x"
 
                        for z in $(for y in $x/*.[1-9]; do echo ${y##*.}; done | sort | uniq); do
-                               doman -i18n=ja $x/*.$z
+                               doman -i18n=ja_JP.UTF-8 $x/*.$z
+                       done
+
+                       for z in $(for y in $x/*.[1-9]; do echo ${y##*.}; done | sort | uniq); do
+                               local f
+                               for f in $z/*.[0-9a-z] $z/*.[0-9][a-z]; do
+                                       if [ -f $f ]; then
+                                               local t
+                                               t=`mktemp`
+                                               iconv -cs -f utf-8 -t eucJP $f >$t
+                                               mv $t $f
+                                       fi
+                               done
+                               doman -i18n=ja_JP.eucJP $x/*.$z
                        done
                fi
        done
Comment 3 Naohiro Aota gentoo-dev 2012-07-08 17:06:23 UTC
What is the purpose of having two encodings? I think the encoding change is just a matter of changing the JNROFF entry in /etc/man.conf.
Comment 4 Tomoh K. 2012-07-09 07:23:55 UTC
(In reply to comment #3)
> What is the purpose of having two encodings? I think the encoding change is
> just a matter of changing the JNROFF entry in /etc/man.conf.

+1.
sys-apps/groff-1.21 + sys-apps/man can handle UTF-8 encoded manpages in the ja_JP.UTF-8 locale with
JNROFF /usr/bin/groff -Dutf8 -Tutf8 -mandoc -mja
in /etc/man.conf.
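
A minimal sketch for checking whether a man.conf already has such a JNROFF entry before editing it. The helper names jnroff_entry and jnroff_is_utf8 are hypothetical and not part of sys-apps/man; only the JNROFF line and the -Dutf8/-Tutf8 options come from the comment above.

```shell
#!/bin/sh
# Hypothetical helper (not part of sys-apps/man): print the JNROFF entry
# of a man.conf-style file. The path defaults to /etc/man.conf.
jnroff_entry() {
    grep -E '^JNROFF[[:space:]]' "${1:-/etc/man.conf}"
}

# Hypothetical helper: succeed only if the JNROFF entry asks groff for
# UTF-8 input (-Dutf8) and the UTF-8 output device (-Tutf8).
jnroff_is_utf8() {
    entry=$(jnroff_entry "$1") || return 1
    case "$entry" in *-Dutf8*) ;; *) return 1 ;; esac
    case "$entry" in *-Tutf8*) ;; *) return 1 ;; esac
    return 0
}
```

For example, `jnroff_is_utf8 /etc/man.conf && echo "JNROFF is already set up for UTF-8"` reports whether the entry still needs to be changed.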

Also, IMO, installing manpages into a directory whose name contains a charmap name is basically a bad idea.

sys-apps/man searches for manpages in $MANPATH/$LANG but does not normalize the charmap name. Therefore, if a user sets $LANG to ja_JP.EUC-JP, /usr/bin/man searches for manpages in $MANPATH/ja_JP.EUC-JP instead of $MANPATH/ja_JP.eucJP.
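
The mismatch can be illustrated with a small sketch. Both functions are hypothetical and written only for this example: man_dir_for_lang mimics a lookup that uses $LANG verbatim, and normalize_charmap shows one possible folding rule (lowercase, punctuation dropped) that such a lookup would need; neither reproduces the actual sys-apps/man code, and locale modifiers such as @euro are not handled.

```shell
#!/bin/sh
# Hypothetical: directory a non-normalizing lookup would search.
# $1: a MANPATH entry, $2: the value of LANG
man_dir_for_lang() {
    printf '%s/%s\n' "$1" "$2"
}

# Hypothetical: fold the charmap part of a locale name so that
# different spellings of the same charmap compare equal.
normalize_charmap() {
    case "$1" in
        *.*)
            lang=${1%%.*}       # part before the first dot, e.g. ja_JP
            charmap=${1#*.}     # part after the first dot, e.g. EUC-JP
            charmap=$(printf '%s' "$charmap" \
                | tr -cd '[:alnum:]' | tr '[:upper:]' '[:lower:]')
            printf '%s.%s\n' "$lang" "$charmap"
            ;;
        *)
            printf '%s\n' "$1"  # no charmap part, nothing to fold
            ;;
    esac
}

man_dir_for_lang /usr/share/man ja_JP.EUC-JP   # /usr/share/man/ja_JP.EUC-JP
man_dir_for_lang /usr/share/man ja_JP.eucJP    # /usr/share/man/ja_JP.eucJP
normalize_charmap ja_JP.EUC-JP                 # ja_JP.eucjp
normalize_charmap ja_JP.eucJP                  # ja_JP.eucjp
```

Without the folding step the two spellings point at different directories, so pages installed under ja_JP.eucJP are not found when $LANG is ja_JP.EUC-JP.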

Also, with sys-apps/man-db (not yet stabilized, but listed in virtual/man), man-db tries to load the language-specific troff macro that it infers from the name of the directory where the manpage was found. So if the directory name contains a charset name, man-db fails to load ja.tmac and the manpages will render poorly.
Comment 5 OKUMURA N. Shin-ya 2012-07-16 05:32:10 UTC
OK, I agree. I will mark it 'invalid'.
Comment 6 OKUMURA N. Shin-ya 2012-07-16 05:37:08 UTC
Hm? Sorry, I cannot change the status to INVALID via Bugzilla. Can I mark it RESOLVED, or will someone chuck it out?