Bug 121502 - man pages with unicode give unexpected behavior with dashes
Summary: man pages with unicode give unexpected behavior with dashes
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords:
Duplicates: 173165 176363 191488
Depends on:
Blocks: 146315
Reported: 2006-02-03 21:38 UTC by ta2002
Modified: 2008-02-25 03:04 UTC
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Attachments
groff-man-UTF-8.diff-modified (groff-man-UTF-8.diff,777 bytes, patch)
2006-09-05 01:55 UTC, Matthias Schwarzott
groff-man-UTF-8.diff-second-try (groff-man-UTF-8.diff,778 bytes, patch)
2006-09-05 02:01 UTC, Matthias Schwarzott

Description ta2002 2006-02-03 21:38:15 UTC
One of the strangest things I have ever seen.

ONLY a problem on amd64 (not my x86 machines).

ONLY a problem with LANG set to ANY utf-8 locale (for example, en_US.UTF-8,
but NOT simply en_US or POSIX).

(So far) ONLY a problem for ANY man page in the openssh package
(about a dozen files). Other packages (newly emerged) do not
have this problem, and I just re-emerged openssh because of the
security update. No change.

The problem: using either less or man to view any of these man pages,
the search function ("/") will not find the dash character ("-") in the
file (even with many of them obviously visible).

Hope somebody can duplicate this, but if not, happy to do whatever
testing I can.

$ emerge -p info
Portage 2.0.54 (default-linux/amd64/2005.1, gcc-3.4.4, glibc-2.3.5-r2, 2.6.15-gentoo-r1 x86_64)
=================================================================
System uname: 2.6.15-gentoo-r1 x86_64 AMD Athlon(tm) 64 Processor 3200+
Gentoo Base System version 1.6.14
dev-lang/python:     2.4.2
sys-apps/sandbox:    1.2.12
sys-devel/autoconf:  2.13, 2.59-r6
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils:  2.16.1
sys-devel/libtool:   1.5.22
virtual/os-headers:  2.6.11-r2
ACCEPT_KEYWORDS="amd64"
AUTOCLEAN="yes"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=k8 -O3 -pipe -msse2 -mfpmath=sse"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3.4/env /usr/kde/3.4/share/config /usr/kde/3.4/shutdown /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=k8 -O3 -pipe -msse2 -mfpmath=sse"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks sandbox sfperms strict"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
LANG="en_NZ.UTF-8"
LINGUAS="en ru"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="amd64 X aac aalib acpi alsa apache2 arts audiofile avi berkdb bitmap-fonts bzip2 caps cdparanoia cdr cjk crypt css cups dga directfb divx4linux dvd dvdr emboss encode exif expat faad fam fbcon ffmpeg flac freetype gd ggi gif gmp gphoto2 gpm gstreamer gtk2 idea idn imagemagick imap imlib ipv6 javascript jikes joystick jpeg kde lcms libcaca libwww live lm_sensors lzw lzw-tiff mad matroska mbox memlimit mng motif mp3 mpeg mpi mysql nas ncurses network nls nptl nptlonly ogg opengl pcre pdflib perl png ppds qt quicktime readline real rtc samba scanner sdl silc speex spell ssl tcpd theora tiff truetype truetype-fonts type1-fonts udev unicode usb userlocales utf8 vcd vorbis wifi xinerama xml2 xmms xpm xv xvid zlib linguas_en linguas_ru userland_GNU kernel_linux elibc_glibc"
Unset:  ASFLAGS, CTARGET, LC_ALL, LDFLAGS, MAKEOPTS
Comment 1 Harald van Dijk (RETIRED) gentoo-dev 2006-02-22 11:54:43 UTC
Could you please verify that what you're seeing is the ASCII minus sign, rather than a non-ASCII Unicode symbol which looks exactly the same? One way to find out is by viewing one of these manpages, copying the character with the mouse, and typing

echo - | cat -v

in a shell, except that instead of typing -, you paste it. I'm guessing you'll see "M-bM-^HM-^R" instead of "-". If this is the case, could you please check that your /etc/man.conf is the same on all your machines and, if it is not, whether making it the same reproduces the problem on the other systems?
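
The check described above can also be done without copying and pasting. The following is a minimal sketch, not part of the original report: dash_kind is a hypothetical helper name, and it only assumes that the look-alike character is U+2212 MINUS SIGN, whose UTF-8 bytes (e2 88 92) are what cat -v prints as "M-bM-^HM-^R".

```shell
# Hypothetical helper (not from the bug report): classify a single
# character by its raw bytes, dumped as hex with od.
dash_kind() {
    hex=$(printf '%s' "$1" | od -An -tx1 | tr -d ' \n')
    case "$hex" in
        2d)     echo "ascii hyphen-minus" ;;   # U+002D, what "/-" in less matches
        e28892) echo "unicode minus sign" ;;   # U+2212, shown by cat -v as M-bM-^HM-^R
        *)      echo "other: $hex" ;;
    esac
}

dash_kind -                              # the character typed on the keyboard
dash_kind "$(printf '\342\210\222')"     # U+2212 built from its octal bytes
```

Passing a character pasted from the rendered man page as the argument shows which of the two it is.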
Comment 2 ta2002 2006-02-23 12:20:50 UTC
Oops. :(

It seems like you have it right:

$ echo − | cat -v
M-bM-^HM-^R

And yes, I do have a difference in man.conf (the -Tascii option).

I guess that solves the problem.

Now I need to figure out if I even want to use utf-8 for man pages
(searching on what looks like an ascii "-" seems obvious to me, and
I do it all the time to find the description of an option).

Why on earth would the openssh people make those non-ascii characters
(in the middle of pure ascii text) when a far more obvious (at least
to me) alternative exists?
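
One possible stopgap for searching on "-" while keeping a UTF-8 locale is to map the Unicode minus sign back to an ASCII hyphen before the text reaches the pager. This is only a sketch under assumptions: ascii_dashes is a hypothetical name, the filter is not something proposed in this bug, and it handles U+2212 only.

```shell
# Hypothetical filter (an assumption, not the fix discussed in this bug):
# replace every UTF-8 encoded U+2212 MINUS SIGN on stdin with an ASCII "-".
ascii_dashes() {
    minus=$(printf '\342\210\222')   # the three bytes e2 88 92
    sed "s/$minus/-/g"
}

# Example with literal input; for a real page something like
#   man scp | ascii_dashes | less
# could be used instead.
printf 'scp \342\210\2221 host\n' | ascii_dashes
```

The pattern is built with printf rather than typed literally so that the script itself stays pure ASCII.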
Comment 3 Harald van Dijk (RETIRED) gentoo-dev 2006-02-23 14:09:46 UTC
> Why on earth would the openssh people make those non-ascii characters
> (in the middle of pure ascii text) when a far more obvious (at least
> to me) alternative exists?

It's not their decision. The manpage contains macros that tell nroff "format '1' as an option", but they don't tell nroff how to do that. Other manpages contain "format '-1' in bold" instead, which is why it happens to work with them, but I actually think openssh is doing the right thing here. (If you want to be sure, you can check `gzip -dc /usr/share/man/man1/scp.1.gz` and look for the .Fl macros. Their meaning is described in the groff_mdoc manpage.) I do think this may be a groff bug though, since −1 isn't a valid scp option; only -1 is. I have added base-system, as responsible for groff, to CC for additional input. Does this description sound about right, and if so, should groff maybe be changed to force ASCII - for command-line options?
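
To see which lines of a page source use the flag macro, a small filter along these lines may help. It is a sketch under assumptions: find_fl is a hypothetical name, the sample input is made up for illustration and is not the real scp.1, and the pattern simply looks for "Fl" either as a request at the start of a line (.Fl) or as an argument to another macro (for example ".Op Fl").

```shell
# Hypothetical helper: print, with line numbers, the lines of an mdoc
# source on stdin that use the Fl (flag) macro.
find_fl() {
    grep -n -E '(^\.|[[:space:]])Fl([[:space:]]|$)'
}

# Made-up sample input for illustration only.
find_fl <<'EOF'
.Sh SYNOPSIS
.Nm scp
.Op Fl 1246
.It Fl 1
EOF
```

On a real page the same filter could be fed from the command mentioned above, for example `gzip -dc /usr/share/man/man1/scp.1.gz | find_fl`.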
Comment 4 Matthias Schwarzott gentoo-dev 2006-09-04 04:12:06 UTC
Does this error still exist?

Can you please tell us what versions of "man" and "groff" you have installed?
Comment 5 ta2002 2006-09-04 10:50:20 UTC
(In reply to comment #4)
> Does this error still exist?

No. My (very recent - as in two minutes ago :) ) update of man from 1.6-r1 to 1.6d appears to have fixed the problem.
Comment 6 Matthias Schwarzott gentoo-dev 2006-09-04 11:51:00 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > Does this error still exist?
> 
> No. My (very recent - as in two minutes ago :) ) update of man from 1.6-r1 to
> 1.6d appears to have fixed the problem.
> 
1. Please also give us your version of groff.
2. With which man-page did you check the error?

For me on x86 it produces the error with "man scp" with man-1.6d and all available versions of groff (1.18.1.1, 1.19.1-r2 and 1.19.2-r1).

Comment 7 Jakub Moc (RETIRED) gentoo-dev 2006-09-04 16:14:04 UTC
Still broken here (x86 and amd64): sys-apps/man-1.6d, sys-apps/groff-1.19.2-r1
Comment 8 Matthias Schwarzott gentoo-dev 2006-09-05 01:53:56 UTC
This bug can be solved by also adding the hack that currently lives in /usr/share/groff/site-tmac/man.local to /usr/share/groff/site-tmac/mdoc.local.

See attached (modified) groff-man-UTF-8.diff.
Comment 9 Matthias Schwarzott gentoo-dev 2006-09-05 01:55:34 UTC
Created attachment 96041
groff-man-UTF-8.diff-modified
Comment 10 Matthias Schwarzott gentoo-dev 2006-09-05 02:01:20 UTC
Created attachment 96043
groff-man-UTF-8.diff-second-try
Comment 11 Jakub Moc (RETIRED) gentoo-dev 2007-04-04 07:28:43 UTC
*** Bug 173165 has been marked as a duplicate of this bug. ***
Comment 12 spiritus 2007-04-04 19:19:10 UTC
I have seen that groff and man in FC and Debian Etch are patched for UTF-8 compatibility and for automatically recoding non-UTF-8 man pages (in KOI8-R, etc.) to UTF-8. The patches are inside their source packages. For example: http://mirrors.dotsrc.org/fedora/6/source/SRPMS/man-1.6d-1.1.src.rpm and http://mirrors.dotsrc.org/fedora/6/source/SRPMS/groff-1.18.1.1-11.1.src.rpm.
Comment 13 Łukasz Damentko (RETIRED) gentoo-dev 2007-04-28 15:12:30 UTC
*** Bug 176363 has been marked as a duplicate of this bug. ***
Comment 14 Jakub Moc (RETIRED) gentoo-dev 2007-09-06 15:58:45 UTC
*** Bug 191488 has been marked as a duplicate of this bug. ***
Comment 15 SpanKY gentoo-dev 2008-02-24 18:11:12 UTC

*** This bug has been marked as a duplicate of bug 126361 ***
Comment 16 SpanKY gentoo-dev 2008-02-24 18:42:46 UTC
blah, goddamn mess of dupes

this bug is about the dash issue with unicode / non-unicode

it is not about anything else
Comment 17 SpanKY gentoo-dev 2008-02-24 18:57:43 UTC
looks like this was half way fixed (man.local) but the important part (mdoc.local) was left out

groff-1.19.2-r2 includes mdoc.local as well

http://sources.gentoo.org/sys-apps/groff/files/groff-1.19.2-man-unicode-dashes.patch?rev=1.1