Bug 62216 - kdebase 3.2.3-r1: kwrite ignores utf8 encoding when opening files
Summary: kdebase 3.2.3-r1: kwrite ignores utf8 encoding when opening files
Status: RESOLVED UPSTREAM
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] KDE
Hardware: All Linux
Importance: High normal
Assignee: Gentoo KDE team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-08-30 05:00 UTC by gna
Modified: 2004-09-02 07:28 UTC
CC List: 1 user

See Also:
Package list:
Runtime testing required: ---


Attachments
test file that illustrates the problem (test_utf8.txt,105 bytes, text/plain)
2004-08-30 05:03 UTC, gna
Description gna 2004-08-30 05:00:48 UTC
When I open a file containing UTF-8 encoded Chinese characters in kwrite, it ignores the UTF-8 encoding of my locale, which is set to en_AU.UTF-8, so the Chinese characters are displayed as garbage. kedit, kate and gedit all open the file and display the Chinese correctly. This happens both when I open the file via a right click on the file in konqueror and when I start kwrite and use the file open dialog.

Reproducible: Always
Steps to Reproduce:
1. set up a UTF-8 locale
2. open a utf8 encoded file with non-ascii characters in kwrite

Actual Results:  
Non-ASCII characters were displayed as though the file had a single-byte encoding.

Expected Results:  
The multibyte characters should have been displayed correctly.

$ locale
LANG=en_AU.UTF-8
LC_CTYPE="en_AU.UTF-8"
LC_NUMERIC="en_AU.UTF-8"
LC_TIME="en_AU.UTF-8"
LC_COLLATE="en_AU.UTF-8"
LC_MONETARY="en_AU.UTF-8"
LC_MESSAGES="en_AU.UTF-8"
LC_PAPER="en_AU.UTF-8"
LC_NAME="en_AU.UTF-8"
LC_ADDRESS="en_AU.UTF-8"
LC_TELEPHONE="en_AU.UTF-8"
LC_MEASUREMENT="en_AU.UTF-8"
LC_IDENTIFICATION="en_AU.UTF-8"
LC_ALL=

$ emerge info
Portage 2.0.50-r10 (gcc34-x86-2004.2, gcc-3.3.3, glibc-2.3.4.20040808-r0,
2.6.8-gentoo-r1)
=================================================================
System uname: 2.6.8-gentoo-r1 i686 AMD Athlon(tm) XP 2500+
Gentoo Base System version 1.4.16
distcc 2.13 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
Autoconf: sys-devel/autoconf-2.59-r4
Automake: sys-devel/automake-1.8.3
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-O2 -march=athlon-xp -fomit-frame-pointer -pipe"
CHOST="i686-pc-linux-gnu"
COMPILER=""
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config
/usr/kde/3.2/share/config /usr/kde/3/share/config /usr/lib/mozilla/defaults/pref
/usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O2 -march=athlon-xp -fomit-frame-pointer -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache sandbox"
GENTOO_MIRRORS="ftp://linux.nctu.net/dists/gentoo/
http://www.zentek-international.com/mirrors/gentoo/"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://tux.localdomain/gentoo-portage"
USE="3dnow X acl acpi alsa apache2 apm arts avi berkdb bzlib cdr cjk crypt cups
doc dvd encode esd ethereal fbcon foomaticdb freetype gdbm gif glade gmp gnome
gpg gphoto2 gpm gtk gtk2 imap imlib innodb jabber java jpeg kde ldap libg++
libwww mad maildir mikmod mmx motif mozilla mpeg ncurses nls nptl oggvorbis
opengl oss pam pda pdflib perl png postgres ppds python qt quicktime radeon
readline samba scanner sdl slang speedo spell sse ssl svga tcpd tiff truetype
type1 unicode usb videos x86 xml2 xmms xv yahoo zlib"
Comment 1 gna 2004-08-30 05:03:53 UTC
Created attachment 38494
test file that illustrates the problem

Some characters will probably display as boxes if you don't have Chinese
fonts on your system, but some should display. The problem shows up as Latin-1
characters with codepoints >= 128, rather than as boxes due to missing glyphs.
Comment 2 gna 2004-08-30 21:17:27 UTC
Just discovered that kwrite has an option to set the encoding in the View menu. When I opened kwrite the encoding was set to auto. If I manually set the encoding to utf8, the Chinese displays correctly. If I set it back to auto, the problem comes back. If I set the encoding to Western European (ISO 8859-1), it looks the same as when set to auto.

Thus it seems that the auto setting should get the encoding from the locale, but it is instead assuming an 8859-1 encoding (or possibly 8859-15).
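
A minimal sketch of the symptom, assuming Python 3 and an illustrative two-character string (this is not the content of the attached test file). It shows why UTF-8 bytes read as ISO 8859-1 turn into several characters with codepoints >= 128 instead of the intended Chinese text:

```python
# Illustrative sample only; not taken from the attached test_utf8.txt.
original = "中文"

# What is actually stored on disk when the file is saved as UTF-8.
utf8_bytes = original.encode("utf-8")

# What a viewer shows if it wrongly assumes ISO 8859-1: every byte
# becomes one character, so each Chinese character turns into three.
garbled = utf8_bytes.decode("iso-8859-1")

print(len(original), len(utf8_bytes), len(garbled))
print([hex(ord(ch)) for ch in garbled])

# The bytes themselves are undamaged; decoding them as UTF-8 recovers the text.
recovered = garbled.encode("iso-8859-1").decode("utf-8")
print(recovered == original)
```

Because the mis-decoding is lossless at the byte level, switching the encoding back to utf8 restores the display without touching the file.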
Comment 3 Gregorio Guidi (RETIRED) gentoo-dev 2004-08-31 06:26:38 UTC
"auto" means the global encoding specified in .kde/share/config/kwriterc
(as opposed to the file-specific one that can be set by the drop-down menu).

If you don't have an Encoding parameter in kwriterc, then your locale is
taken into account, but from that time on the value stored in kwriterc takes precedence over the locale, and should be changed in
Settings -> Configure Editor -> Open/Save -> Encoding
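
The precedence described above can be sketched roughly as follows. This is not kwrite's code: the function names, the way the stored value is passed in, and the fallback codeset are all assumptions made for illustration.

```python
from typing import Optional


def codeset_from_locale(locale_name: str, fallback: str = "ISO-8859-1") -> str:
    """Extract the codeset from a locale name such as 'en_AU.UTF-8'.

    The fallback value is an arbitrary assumption for locales that
    carry no codeset part (for example 'C').
    """
    # Drop an optional modifier such as '@euro'.
    name = locale_name.split("@", 1)[0]
    if "." in name:
        return name.split(".", 1)[1]
    return fallback


def effective_encoding(stored: Optional[str], locale_name: str) -> str:
    """Hypothetical model of 'auto': a stored setting wins over the locale."""
    if stored:
        return stored
    return codeset_from_locale(locale_name)


# No stored value yet: the locale decides.
print(effective_encoding(None, "en_AU.UTF-8"))
# A stored value exists: the locale is ignored.
print(effective_encoding("ISO-8859-1", "en_AU.UTF-8"))
```

Under this model, a stale stored value keeps overriding the locale until it is changed in the settings dialog, which matches the behaviour reported in this bug.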
Comment 4 gna 2004-09-01 04:40:14 UTC
OK, I see the option. Changing it to utf8 means the file opens correctly, as you say. I didn't set this option. It must have been set by the KDE setup wizard when it asked me what country I was from.

Personally I think it is quite a bad design. I guess the setup wizard is assuming you want the default encoding that is associated with your language. Generally a system should use one encoding as there is no file metadata that tells someone what encoding a file is using and it is not possible to determine it just from looking at the file unless you have other information as well. Thus I think the auto should refer to a system setting rather than an application specific setting. Imagine the pain of changing your locale if every application did this. I am not sure what should be changed to make kde more utf8 friendly. 
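
A small sketch of why the bytes alone do not settle the question, assuming Python 3 and a hypothetical helper name. The same byte sequence decodes without error under more than one encoding, so a successful decode can only rule encodings out, not pick the right one:

```python
from typing import Iterable, List


def plausible_encodings(data: bytes, candidates: Iterable[str]) -> List[str]:
    """Return the candidate encodings under which data decodes without error."""
    accepted = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        accepted.append(name)
    return accepted


# Illustrative sample: two Chinese characters stored as UTF-8.
sample = "中文".encode("utf-8")

# ASCII is ruled out, but both UTF-8 and ISO 8859-1 accept the bytes,
# because ISO 8859-1 assigns a character to every possible byte value.
print(plausible_encodings(sample, ["ascii", "utf-8", "iso-8859-1"]))
# → ['utf-8', 'iso-8859-1']
```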
Comment 5 Gregorio Guidi (RETIRED) gentoo-dev 2004-09-01 16:01:27 UTC
It's not set by the setup wizard; if you did not set it before, it gets its
value from the locale variables. I agree there can be a bit of ambiguity in
the configuration arrangement, but I don't have anything to propose.
At this point you can close this bug, and if an idea comes to your mind, you
should make yourself heard at http://bugs.kde.org/
Comment 6 gna 2004-09-02 07:28:37 UTC
I don't have any brilliant ideas.