Bug 27700 - utf-8 updates for dialog, libiterm, netkit-telnetd
Bug#: 27700 Product:  Gentoo Linux Version: unspecified Platform: All
OS/Version: Linux Status: RESOLVED Severity: enhancement Priority: P2
Resolution: FIXED Assigned To: liquidx@gentoo.org Reported By: lowzl@hotmail.com
Component: Ebuilds
URL: 
Summary: utf-8 updates for dialog, libiterm, netkit-telnetd
Keywords:  EBUILD
Status Whiteboard: 
Opened: 2003-09-01 05:47 0000
Description:
This is my biggest enhancement request yet.

Basically:

USE="utf8" builds ncursesw. (Compatibility symlinks are installed, no worries)
USE="utf8" enables ncursesw support in dialog.

Slang is installed with mandatory UTF-8 support - it might be necessary to
rebuild apps because of slight ABI changes.

libiterm and fbiterm give the console the ability to display halfwidth and
fullwidth Unicode characters - with more than 512 glyphs.

unifont is a Unicode bitmap font for X11, a dependency of libiterm.

netkit-telnetd is hardcoded to use ncurses (or was it curses?) - fixed.

There might be other apps that are hardcoded to use the non-w version of ncurses
- please provide an ldscript to fix - I don't grasp how they work.

The ncursesw component can be considered a dup of bug 25992

I think Gentoo should also form a project to enable Unicode/UCS support across
the board.



Reproducible: Always
Steps to Reproduce:

------- Comment #1 From Zhen Lin 2003-09-01 05:57:13 0000 -------
Created an attachment (id=16902) [details]
the ebuilds

------- Comment #2 From Zhen Lin 2003-09-01 23:16:33 0000 -------
Created an attachment (id=16945) [details]
ncurses-5.3-r1.ebuild

Updated ebuild that generates libxxx.so to libxxxw.so ldscripts for more libs
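
As a rough illustration of the ldscript idea (this is not the code from the attached ebuild): a GNU ld linker script is a small text file that sits where the linker expects libxxx.so and redirects it to another library with an INPUT() command. The helper name gen_w_ldscript and the list of libraries below are assumptions made for the sketch.

```shell
#!/bin/sh
# Sketch: write a GNU ld linker script lib<name>.so that redirects
# "-l<name>" to the wide-character library lib<name>w.
# gen_w_ldscript is a hypothetical helper name, not a Portage function.
gen_w_ldscript() {
    libdir=$1
    name=$2
    printf '/* GNU ld script */\nINPUT(-l%sw)\n' "$name" > "$libdir/lib$name.so"
}

# Demo in a scratch directory rather than a real library path.
demo_dir=$(mktemp -d)
for lib in ncurses form menu panel; do
    gen_w_ldscript "$demo_dir" "$lib"
done
cat "$demo_dir/libncurses.so"
```

A script like this only affects programs linked afterwards; binaries that are already built keep loading whatever soname they were linked against.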

------- Comment #3 From Alastair Tse (RETIRED) 2003-09-11 05:56:14 0000 -------
the forming of a group/herd to manage utf8 issues seems like a good idea. it
would also be related to cjk stuff i believe. 

i'll adopt this bug for the moment, but if anyone has any ideas or would like
to implement this before me, i would also appreciate it.

------- Comment #4 From Mamoru KOMACHI (RETIRED) 2003-09-11 14:09:46 0000 -------
That sounds interesting. I couldn't make fbiterm display Unicode characters
(although its X counterpart, xiterm, works fine), so I'd be pleased if these
ebuilds contribute to UTF-8 support for Gentoo. Nevertheless, I stopped working
on an fbiterm ebuild because jfbterm (app-i18n/jfbterm) now supports UTF-8 along
with SJIS and other stuff (jfbterm is much faster than fbiterm, which is very slow).

Alastair, I'll take libiterm, fbiterm and unifont part for you.

------- Comment #5 From Mamoru KOMACHI (RETIRED) 2003-09-11 15:22:18 0000 -------
I emerged your libiterm, fbiterm and unifont but no multibyte characters
were displayed.  It opens unifont (it is hardcoded to the full path of unifont)
and loads it into memory when I run fbiterm, but I see only ASCII
characters.  (This behaviour is just the same as when I use my version of
fbiterm.) Are you able to display multibyte characters with this ebuild?

------- Comment #6 From Zhen Lin 2003-09-12 00:40:10 0000 -------
Did you remember to run unicode_start? And what methodology did you use to
test; cat UTF-8-file? (That seems to work best)

It works for me.

------- Comment #7 From Mamoru KOMACHI (RETIRED) 2003-09-12 01:29:09 0000 -------
Thanks. I didn't know about unicode_start. Anyhow, unicode_start
doesn't solve the problem with fbiterm. After I run unicode_start
I can see UTF-8 text with console (but I'm using jconsole patches, which
adds EUC-JP and partial UTF-8 support for native framebuffer), but
once I ran fbiterm, I was not able to see any multibyte characters.
I tested it with cat UTF-8.txt and w3m-m17n with UTF-8 display code.

------- Comment #8 From Zhen Lin 2003-09-12 02:08:26 0000 -------
Well, it works for me; out of the box UTF-8 support.

Just for information:

LANG=en_GB.UTF-8
LANGUAGE=en_GB.UTF-8
LC_ALL=en_GB.UTF-8

unicode_start before fbiterm
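
For reference, a minimal sketch of how these variables interact under the usual POSIX rules: LC_ALL, when set, overrides LC_CTYPE, which in turn overrides LANG. The function names are hypothetical helpers, not part of any package discussed here, and the fallback to C when nothing is set is an assumption about glibc.

```shell
#!/bin/sh
# effective_ctype LC_ALL LC_CTYPE LANG
# Print the locale name that decides character handling:
# the first non-empty value of LC_ALL, LC_CTYPE, LANG.
effective_ctype() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"
    else
        # Assumption: with nothing set, glibc falls back to the C locale.
        printf 'C\n'
    fi
}

# is_utf8_name NAME: succeed if NAME ends in a UTF-8 codeset suffix.
is_utf8_name() {
    case $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') in
        *.utf-8|*.utf8) return 0 ;;
        *) return 1 ;;
    esac
}

ctype=$(effective_ctype "${LC_ALL:-}" "${LC_CTYPE:-}" "${LANG:-}")
if is_utf8_name "$ctype"; then
    echo "UTF-8 locale in effect: $ctype"
else
    echo "not a UTF-8 locale: $ctype"
fi
```

The check looks only at the name; it does not prove that the locale has actually been generated on the system.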

------- Comment #9 From Mamoru KOMACHI (RETIRED) 2003-09-12 02:19:47 0000 -------
Created an attachment (id=17554) [details]
sample text for UTF-8

------- Comment #10 From Mamoru KOMACHI (RETIRED) 2003-09-12 02:27:34 0000 -------
It doesn't work for me either (I tested both ja_JP.UTF-8 and en_GB.UTF-8).
Can you see Japanese text in attached file with fbiterm? I can see only
' UTF-8' in the second line with fbiterm. (I wrote 2 lines and each line
contains Japanese characters)

Just FYI:

rico% emerge info
Portage 2.0.49-r3 (default-x86-1.4, gcc-3.2.3, glibc-2.3.2-r1, 2.4.22)
=================================================================
System uname: 2.4.22 i686 Pentium III (Coppermine)
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-O -mcpu=pentium3 -march=i586 -funroll-loops -fomit-frame-pointer -pipe"
CHOST="i686-gentoo-linux"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /var/qmail/control /usr/share/config
/usr/kde/2/share/config /usr/kde/3/share/config"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
CXXFLAGS="-O -mcpu=pentium3 -march=i586 -funroll-loops -fomit-frame-pointer -pipe"
DISTDIR="/home/distfiles"
FEATURES="sandbox buildpkg ccache digest cvs -autoaddcvs"
GENTOO_MIRRORS="ftp://sb.itc.u-tokyo.ac.jp/GENTOO http://gentoo.oregonstate.edu"
MAKEOPTS="-j2"
PKGDIR="/home/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/home/gentoo-x86"
SYNC="rsync://sb.itc.u-tokyo.ac.jp/gentoo-portage"
USE="x86 oss apm avi crypt cups encode foomaticdb gif jpeg libg++ mad mikmod mpeg ncurses nls pdflib png quicktime spell truetype xml2 xmms xv zlib gdbm berkdb slang readline arts svga tcltk java ruby sdl gpm tcpd pam libwww perl python esd imlib oggvorbis qt kde opengl cdr X gtk gtk2 -gnome -alsa cjk maildir usagi ipv6 -motif canna -freewnn ssl mmx sse emacs tetex"

rico% ldd /usr/bin/fbiterm 
	libm.so.6 => /lib/libm.so.6 (0x412a4000)
	libXfont.so.1 => /usr/X11R6/lib/libXfont.so.1 (0x41145000)
	libiterm.so.1 => /usr/lib/libiterm.so.1 (0x40012000)
	libz.so.1 => /usr/lib/libz.so.1 (0x4133c000)
	libc.so.6 => /lib/libc.so.6 (0x41016000)
	libfribidi.so.0 => /usr/lib/libfribidi.so.0 (0x40026000)
	/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x41000000)


------- Comment #11 From Zhen Lin 2003-09-12 04:32:28 0000 -------
Created an attachment (id=17561) [details]
photo

Did you actually create the en_GB.UTF-8 locale? Check.

For some reason, it works for me. (In fact, most things do - I wonder why.)

Photographic evidence.

Could you at least try to explain the (mis)output?
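
One way to run the locale check asked for above is to look the name up in the output of `locale -a`. This is a sketch only: the helper names are made up, and it assumes glibc may list en_GB.UTF-8 under the spelling en_GB.utf8, so both sides are normalised before comparing.

```shell
#!/bin/sh
# normalise_locale NAME: lower-case NAME and drop hyphens, so that
# "en_GB.UTF-8" and the "en_GB.utf8" spelling compare equal.
normalise_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '-'
}

# has_locale NAME: succeed if NAME is listed by `locale -a`.
has_locale() {
    want=$(normalise_locale "$1")
    locale -a 2>/dev/null | tr '[:upper:]' '[:lower:]' | tr -d '-' |
        grep -qxF "$want"
}

for name in en_GB.UTF-8 ja_JP.UTF-8; do
    if has_locale "$name"; then
        echo "$name: available"
    else
        echo "$name: missing (create it with localedef first)"
    fi
done
```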

------- Comment #12 From Mamoru KOMACHI (RETIRED) 2003-09-12 19:35:34 0000 -------
Finally I succeeded! Thanks a lot. I did create ja_JP.UTF-8 locale
by 

# localedef -v -c -i jp_JP -f UTF-8 ja_JP.UTF-8

but got the output described in Comment #10. It looks like

% cat utf-8                             # prompt
                                        # blank line
 UTF-8                                  # only see ' UTF-8'
%                                       # prompt again

I tried it again today. The steps follow:

1. create en_GB.UTF-8

  # localedef -v -c -i en_GB -f UTF-8 en_GB.UTF-8

2. set env

  % LC_CTYPE=en_GB.UTF-8
  % LC_ALL=en_GB.UTF-8
  % LANG=en_GB.UTF-8
  % LANGUAGE=en_GB.UTF-8
  % export LC_CTYPE LC_ALL LANG LANGUAGE

3. unicode_start

  % unicode_start

4. fbiterm

  % fbiterm

5. cat ;-)

  % cat utf-8.txt
  ほげほげ
  これは UTF-8 のテストです。
 
After I became able to see UTF-8 with en_GB.UTF-8, I could also do the
same thing with ja_JP.UTF-8 ... strange, but it works. I don't know
why things didn't work yesterday (I did the same thing, except that I
created the en_GB.UTF-8 locale today).

I'll commit libiterm and fbiterm shortly.

------- Comment #14 From Zhen Lin 2003-09-12 20:30:34 0000 -------
Three (libiterm, fbiterm, unifont) down, three (ncursesw, slang, dialog) to go.

Related bugs: bug #17282 for TeX. bug #18735 as a meta-bug. bug #20006 for ncursesw (Duped here) bug #20854 Multibyte encoding for ghostscript (no fix).




------- Comment #15 From Zhen Lin 2003-09-12 20:31:24 0000 -------
Oops, bug #18375 not #18735

------- Comment #16 From Mamoru KOMACHI (RETIRED) 2003-09-12 20:39:03 0000 -------
I looked into your unifont ebuild. I think there are several things we
should solve before I commit it into the Portage tree.

First, I think it is better to set the HOMEPAGE (perhaps
http://czyborra.com/ ?  http://dvdeug.dhis.org/unifont.html isn't available
atm). Sometimes it is not clear what is the main homepage of the software,
but we should try to find one.

Second, you must choose at least one license from /usr/portage/licenses and
set LICENSE (I suggest "freedist" in this case).  It must match a filename
in that directory (see man 5 ebuild).

Third, you must set DEPEND to ensure everyone can build the software.  If
you look at the Makefile that Debian's patch creates, you will find that it
uses perl to convert the hex file into bdf (requires dev-lang/perl) and
bdftopcf to convert bdf into pcf (requires virtual/x11).

Fourth, it is not a good idea to install the font to
/usr/X11R6/lib/X11/fonts/misc because Gentoo policy follows the FHS.
The FHS requires that all files under /usr/X11R6 belong only to the
XFree86 distribution. So we chose /usr/share/fonts for bitmap fonts and
/usr/share/fonts/ttf for TrueType fonts respectively (not all fonts in
Portage follow the FHS, but that doesn't mean we can ignore it).

Lastly, I think you forgot to add mkfontdir in this ebuild (Debian does it
in their postrm sh script). You need to add
`mkfontdir ${D}/usr/share/fonts/${PN}` or whatever in your ebuild to get the
right fonts.dir. It's not necessary to have fonts.dir for fbiterm to use 
unifont, but I think it is fair enough to create one since not only fbiterm
will use unifont (for example, I used it for Opera 5.x at that time).

------- Comment #17 From Mamoru KOMACHI (RETIRED) 2003-09-12 20:55:33 0000 -------
I examined libiterm.

In these ebuilds, you are correct in setting LICENSE to CPL, but as I wrote
in the previous comment, you must choose one from /usr/portage/licenses.
It should be either "CPL-0.5" or "CPL-1.0", "CPL" is not allowed (in this
case, CPL-1.0 is the right one).
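
The rule that LICENSE must match a filename in the licenses directory can be checked mechanically. The following is only a sketch: license_exists is a hypothetical helper, and the demo uses a scratch directory with made-up contents so that it runs without a Portage tree.

```shell
#!/bin/sh
# license_exists NAME [DIR]: succeed if DIR contains a file called NAME.
# DIR defaults to /usr/portage/licenses, the path named in this thread.
license_exists() {
    [ -f "${2:-/usr/portage/licenses}/$1" ]
}

# Demo against a scratch directory standing in for the licenses directory.
lic_dir=$(mktemp -d)
: > "$lic_dir/CPL-0.5"
: > "$lic_dir/CPL-1.0"

for lic in CPL CPL-1.0; do
    if license_exists "$lic" "$lic_dir"; then
        echo "$lic: ok"
    else
        echo "$lic: no such license file"
    fi
done
```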

I noticed that you applied a patch from Debian, but as I looked through the
whole patch I don't think we need it for Gentoo. The patch is for Debian 
(including Debian GNU/NetBSD ;-p) to compile iterm correctly. If you find
something significant in this patch to improve the software for Gentoo,
please correct me about it.

Also, you correctly set dev-libs/fribidi as a dependency for libiterm,
but you cleared it in RDEPEND. If you run ldd on libiterm, you will get
% ldd /usr/lib/libiterm.so
        libfribidi.so.0 => /usr/lib/libfribidi.so.0 (0x40025000)
        libc.so.6 => /lib/libc.so.6 (0x41016000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x80000000)

And this is the reason you have to have dev-libs/fribidi in RDEPEND list.
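
To turn ldd output like the above into a list of libraries to consider for RDEPEND, something along these lines may help. It is a sketch under assumptions: list_sonames is a made-up helper, it relies on the "name => path (address)" layout shown above, and mapping each library back to a package remains a manual step.

```shell
#!/bin/sh
# list_sonames: read `ldd` output on stdin and print the name on the
# left-hand side of each "name => path (address)" line.
list_sonames() {
    awk '$2 == "=>" { print $1 }'
}

# Sample input copied from the ldd output quoted above.
sample='        libfribidi.so.0 => /usr/lib/libfribidi.so.0 (0x40025000)
        libc.so.6 => /lib/libc.so.6 (0x41016000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x80000000)'

printf '%s\n' "$sample" | list_sonames
```

On a real system the same filter would be fed directly, as in `ldd /usr/lib/libiterm.so | list_sonames`.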

As for fribidi, iterm supports either pls (which comes with the iterm
distribution) or fribidi. We don't need to force people to use fribidi because
pls is enabled by default.  Rather, I think we will use fribidi if the bidi USE
flag is set. Currently, fribidi remains a local USE flag for fvwm, but we might
ask gentoo-dev to make it a global USE flag.

------- Comment #18 From Mamoru KOMACHI (RETIRED) 2003-09-12 21:03:33 0000 -------
Well, let's turn to fbiterm.

Almost everything looks fine to me. DEPEND should be
>=app-i18n/libiterm-${PV} rather than >=app-i18n/libiterm-${PV}*
See http://dev.gentoo.org/~liquidx/ebuildmistakes.html for common mistakes ;-)

In addition, I think it's better to have >=sys-apps/sed-4 as a
dependency because you used the sed -i (in-place) option in the src_unpack()
section. Not all sed implementations support -i (for example, FreeBSD's sed
didn't support -i), and considering that Portage will extend to other platforms
we had better have >=sys-apps/sed-4 as a dependency (it's not so hard to rewrite
the ebuild without -i, though. It's up to you).
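
A rewrite without -i usually means writing to a temporary file and copying the result back. The following is a minimal sketch, not the code from the ebuild: sed_inplace is a made-up helper, it assumes mktemp is available, and the substitution in the demo is only an illustration.

```shell
#!/bin/sh
# sed_inplace EXPR FILE: apply sed expression EXPR to FILE without -i.
# Assumes mktemp is available; it is common but not required by POSIX.
sed_inplace() {
    tmp=$(mktemp) || return 1
    if sed -e "$1" "$2" > "$tmp"; then
        # Copy the result back rather than moving it, so FILE keeps
        # its original permissions.
        cat "$tmp" > "$2"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Demo on a scratch file; the path substitution is only an illustration.
demo=$(mktemp)
printf 'font=/usr/X11R6/lib/X11/fonts/misc/unifont.pcf.gz\n' > "$demo"
sed_inplace 's:/usr/X11R6/lib/X11/fonts/misc:/usr/share/fonts/unifont:' "$demo"
cat "$demo"
```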

------- Comment #19 From Zhen Lin 2003-09-12 21:09:49 0000 -------
Sorry, I've become a little lax when it comes to writing ebuilds.

RDEPEND="" should be RDEPEND="${DEPEND}" or not there at all.

fbiterm needs libXfont - I have left out X as a dependency.

Also, it might be useful to have a USE flag for utempter for utmp access.

The debian patch seems to only apply to configure - I came across this through debian, may as well give it the benefit of the doubt.

My >=app-i18n/libiterm-${PV}* dependency is incorrect - use =app-i18n/libiterm-${PV}* instead - the two should be in lockstep, like gnustep-{make,base}.

Rewrite the sed modification for sed 3, if what you say is true. 

------- Comment #20 From Mamoru KOMACHI (RETIRED) 2003-10-02 15:10:19 0000 -------
Committed media-fonts/unifont.
I'll commit libiterm/fbiterm later.

------- Comment #21 From Artem Baguinski 2003-10-24 05:15:14 0000 -------
has the working group on gentoo UTF support been formed? i hadn't realized
that poor UTF support was gentoo's fault and not mine until i read this
and other bugs. the reaction of the Latin-speaking community in many related
bugs is disappointing, but i guess it's our, non-latin's, job to make gentoo as
internationalized as RedHat and friends. i wonder why the ebuilds proposed
here do not appear in the portage tree? is there some central UTF-related
place out there?

BTW, i have utf support in xterm and had no problem with displaying a little
japanese fragment (well, at least i think it was japanese and i think i had
no problem ;). i can see all examples in quickbrownfox.txt and in UTF-8-demo.txt
(don't remember where i d/l'ed them from). of course i cannot judge whether
they are displayed correctly, but they look good.


------- Comment #22 From Artem Baguinski 2003-10-24 05:17:26 0000 -------
Created an attachment (id=19731) [details]
utf-8 text in many different languages + special symbols.

use -misc-fixed-medium-r-*-*-18-*-*-*-*-*-iso10646-* to display this text
in e.g. xterm

------- Comment #23 From Artem Baguinski 2003-10-24 05:18:25 0000 -------
Created an attachment (id=19732) [details]
another sample utf-8 text

------- Comment #24 From Alastair Tse (RETIRED) 2003-10-24 07:29:57 0000 -------
i'm afraid we've been falling behind in UTF-8 support. 

luckily we have an i18n subproject being formed that definitely has
this as one of its todo's. i too would really like some decent UTF-8 support,
and i've already got some patches ready for utf-8 locale generation for glibc.

so rest assured, we are trying to sort this UTF-8 business out as soon as
we can.

------- Comment #25 From Zhen Lin 2003-10-24 07:37:05 0000 -------
Thanks, that's great to know. 

http://www.columbia.edu/kermit/utf8.html
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/
http://www.macchiato.com/unicode/Unicode_transcriptions.html

Some "nice" example pages. From the UTF-8 FAQ. An 'I feel lucky' search should
find it.

------- Comment #26 From Mamoru KOMACHI (RETIRED) 2003-10-24 09:57:35 0000 -------
Yeah, I'd like to commit libiterm and fbiterm as soon as I get a response
from seemant about his intention to put libiterm-mbt in x11-libs. I sent him
a mail about it three weeks ago and I'm waiting for him to reply... Also, I
want to verify the utf8 patch for ncurses since recent nvi seems to need it in
order to display UTF-8 text (but I haven't tested it).

------- Comment #27 From Alastair Tse (RETIRED) 2004-02-04 10:58:50 0000 -------
ok, sorry this has been sitting around for so long, i think after 2004.0 we
need to target this.

the unicode USE flag has been approved, so i think we can use it to signify
utf-8 support. i would much prefer these ebuilds be submitted as separate bug
reports and linked to this one rather than having one massive report.

also shortening the bug title since unifont and fbiterm are in portage now.

------- Comment #28 From Mamoru KOMACHI (RETIRED) 2004-05-21 07:18:37 0000 -------
*** Bug 51634 has been marked as a duplicate of this bug. ***

------- Comment #29 From Mamoru KOMACHI (RETIRED) 2004-05-21 07:26:54 0000 -------
Does anyone in the cjk herd want to add the unicode USE flag for ncurses and
slang? If not (and there is no objection), I'll add them this weekend.

------- Comment #30 From Zhen Lin 2004-05-21 07:28:33 0000 -------
I believe that nano has some bugs when used with ncursesw. I've since switched
to slang.

------- Comment #31 From Heinrich Wendel (RETIRED) 2004-08-19 08:13:41 0000 -------
ncurses, slang and dialog are now in portage with utf-8 support, the useflag
for that is unicode. iterm seems not directly related, it's just an extra ebuild
so i think you should file a new bug report for this.