Bug 23254 - glib misbehaving with G_BROKEN_FILENAMES
Summary: glib misbehaving with G_BROKEN_FILENAMES
Status: RESOLVED WONTFIX
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High normal
Assignee: Gentoo Linux Gnome Desktop Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-06-21 23:37 UTC by Spundun Bhatt
Modified: 2004-08-19 14:55 UTC

See Also:
Package list:
Runtime testing required: ---


Description Spundun Bhatt 2003-06-21 23:37:17 UTC
I had this problem and have finally narrowed it down to the change that
introduced it. I first posted the problem on the gentoo-user mailing list
on 5 June.
The change from glib-2.2.1 to 2.2.1-r1, which introduced the new environment
variable G_BROKEN_FILENAMES, broke UTF-8 filename support on my machine. I had
some files with names in Gujarati and Hindi, and they stopped being displayed in
the open dialog.
I think the problem affects all non-ASCII UTF-8 characters (French, etc.). Try
saving a file in gedit with UTF-8 characters in the filename: it doesn't work
with 2.2.1-r1 and above, while it works with 2.2.1.
I don't know exactly why this breaks things, but hopefully there's a simple
explanation.

Reproducible: Always
Steps to Reproduce:
1. Create a document in gedit.
2. Try to save it with UTF-8 characters in the filename (entered using gkb).

Actual Results:  
The dialog box ignores non-English characters.

Expected Results:  
I should be able to save files with non-English characters in their names.

emerge info
Portage 2.0.48-r1 (default-x86-1.4, gcc-3.2.3, glibc-2.3.2-r1)
=================================================================
System uname: 2.4.20 i686 Intel(R) Pentium(R) 4 CPU 2.40GHz
GENTOO_MIRRORS="http://gentoo.oregonstate.edu
http://distro.ibiblio.org/pub/Linux/distributions/gentoo"
CONFIG_PROTECT="/etc /var/qmail/control /usr/kde/2/share/config
/usr/kde/3/share/config /usr/X11R6/lib/X11/xkb /usr/kde/3.1/share/config
/usr/share/config"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
PORTDIR="/usr/portage"
DISTDIR="/usr/portage/distfiles"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR_OVERLAY=""
USE="x86 oss 3dnow apm avi crypt cups encode gif jpeg libg++ libwww mikmod mmx
mpeg ncurses nls pdflib png quicktime spell truetype xml2 xmms xv zlib directfb
gtkhtml gdbm berkdb slang readline aalib bonobo svga tcltk guile X sdl gpm tcpd
pam ssl perl python esd imlib oggvorbis gnome gtk qt motif opengl mozilla ldap
cdr 3dfx dga dvd gb gtk2 lirc pcmcia pda radeon samba trusted vim-with-x -java
-arts -kde -alsa"
COMPILER="gcc3"
CHOST="i686-pc-linux-gnu"
CFLAGS="-march=pentium3 -O3 -pipe"
CXXFLAGS="-O2 -mcpu=i686 -pipe"
ACCEPT_KEYWORDS="x86 ~x86"
MAKEOPTS="-j2"
AUTOCLEAN="yes"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
FEATURES="sandbox ccache"
Comment 1 Alastair Tse (RETIRED) gentoo-dev 2003-07-14 08:50:55 UTC
You could remove the file /etc/env.d/50glib and then re-run env-update.
Comment 2 Stanislav Brabec 2003-07-15 01:55:17 UTC
Are you using a UTF-8 locale? Note that Gentoo does not yet install European UTF-8 locales by default.

If not, follow the previous comment.

G_BROKEN_FILENAMES forces locale charset for filenames (to be compatible with console applications).

Note that this issue should be solved more cleanly in GNOME 2.4.

For more information, please follow GNOME bugzilla entries:
http://bugzilla.gnome.org/show_bug.cgi?id=114068
http://bugzilla.gnome.org/show_bug.cgi?id=96531
Comment 3 Spundun Bhatt 2003-07-15 14:17:34 UTC
I got around the bug by patching the ebuild so that the env variable was no longer being set. Nevertheless, it looks like a bug that should be fixed somewhere. Maybe there's something wrong with glib that you guys could propagate upstream.
I am not using any locale; I am just using the gkb applet to switch keyboard maps to enter text in different scripts. The menus on all the windows and everything else are still displayed in English.
Let me know if you need any further info. I don't have my machine with me right now, but I will get it back soon.
Comment 4 Stanislav Brabec 2003-07-16 02:17:43 UTC
This is a more general problem: if you don't use any locale, you cannot read your filenames in other applications that don't use glib2.

GLib currently offers only two behaviors:

G_BROKEN_FILENAMES unset - all filenames use UTF-8. This breaks filename sharing for all users of non-UTF-8 locales: ls, mc and bash do not work properly with such names.

G_BROKEN_FILENAMES set - filenames are locale specific. This is the default behavior of all applications except glib2, so I decided to set it. In your case it forces ASCII for your filenames.

If you still want to use UTF-8 filenames, either generate the required locale (for example for European languages, see man localedef or the glibc sources) or use an existing one. Then either create /etc/profile.d/99locale (Gentoo does not support it yet) with
LANG=hi_IN.utf8
or set your locale in your GUI system (gdm, KDE control center).

If you want to stay in English, add:
LC_MESSAGES=C
to your .profile or to /etc/profile.d/99locale.

It should satisfy your requirements in both glib2 apps and console apps.

You can still observe problems in mc and some other applications, which are not UTF-8 ready.
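
The two behaviors described in this comment can be sketched in Python. This is only a model of the rule as stated here, not GLib's actual code: `filename_to_display` and its parameters are hypothetical names, and the sample filename is made up for illustration.

```python
def filename_to_display(raw: bytes, broken_filenames: bool, locale_charset: str) -> str:
    """Model of the glib-2.2 rule described above (hypothetical helper, not a GLib API)."""
    # Variable set: read the bytes in the locale's charset.
    # Variable unset: always read them as UTF-8.
    charset = locale_charset if broken_filenames else "utf-8"
    return raw.decode(charset)


name = "हिन्दी.txt"            # example name, assumed stored on disk as UTF-8
on_disk = name.encode("utf-8")

# Unset: the UTF-8 name is read back intact, whatever the locale is.
print(filename_to_display(on_disk, False, "ascii") == name)

# Set, with no locale configured (charset ASCII): the name cannot be interpreted.
try:
    filename_to_display(on_disk, True, "ascii")
except UnicodeDecodeError as exc:
    print("cannot interpret name:", exc.reason)

# Set, with a UTF-8 locale: same result as the unset case.
print(filename_to_display(on_disk, True, "utf-8") == name)
```

The last call illustrates why setting a UTF-8 locale is suggested: with such a locale, both settings of the variable agree.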
Comment 5 Markus Bertheau (RETIRED) gentoo-dev 2003-07-16 12:26:44 UTC
> G_BROKEN_FILENAMES unset - all filenames uses UTF-8. It brokes filename sharing
> for all users using non-UTF-8 locales - ls, mc, bash does not work properly with
> such names.

How are bash and ls broken in UTF-8 locales?
Comment 6 Stanislav Brabec 2003-07-16 14:28:24 UTC
They are not broken. Bash, ls and mc only does not expect, that filenames are locale specific.

For example, in Czech the Trash directory is named Kos (s is scaron). Suppose you create this directory in Nautilus without G_BROKEN_FILENAMES and then run ls in an ISO-8859-2 (LANG=cs_CZ) environment. The result is:
-rwx------    1 sb       users         122 2003-07-16 22:08 KoLA
(L is Lacute and A is Aogonek)

Conversely, if you run "mkdir Kos" (where s is scaron), you will not see it in Nautilus at all. You will only get the message:
Gtk-Message: The filename "Kos" couldn't be converted to UTF-8 (try setting the environment variable G_BROKEN_FILENAMES): Invalid byte sequence in conversion input
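
Both directions of this mismatch can be reproduced with Python's built-in codecs. The snippet is a sketch of the byte-level effect only; it does not call GLib, Nautilus or ls, and the variable names are made up for illustration.

```python
name = "Ko\u0161"  # "Kos" with s-caron, written with an escape to stay ASCII-safe

# Case 1: the name is written as UTF-8 (Nautilus without G_BROKEN_FILENAMES)
# and then displayed by a tool that assumes ISO-8859-2.
utf8_bytes = name.encode("utf-8")
seen_by_ls = utf8_bytes.decode("iso-8859-2")
print(utf8_bytes, [hex(ord(c)) for c in seen_by_ls])
# The last two code points are U+0139 (L with acute) and U+0104 (A with ogonek).

# Case 2: the name is written in ISO-8859-2 (mkdir in a cs_CZ shell),
# where s-caron is a single byte that is not valid UTF-8 on its own.
latin2_bytes = name.encode("iso-8859-2")
try:
    latin2_bytes.decode("utf-8")
except UnicodeDecodeError:
    print("not valid UTF-8:", latin2_bytes)
```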
Comment 7 Stanislav Brabec 2003-07-16 14:37:01 UTC
Sorry for the mistake. It should be:

They are not broken. Bash, ls and mc simply do expect that filenames are locale
specific.
Comment 8 Spundun Bhatt 2003-07-17 11:30:04 UTC
> G_BROKEN_FILENAMES unset - all filenames uses UTF-8. It brokes filename sharing
> for all users using non-UTF-8 locales - ls, mc, bash does not work properly with
> such names.

I suppose "It brokes filename sharing for all users using non-UTF-8 locales" and "ls, mc, bash does not work properly with such names." are two separate statements, right?
I mean, ls, mc and bash don't work even with UTF-8 filenames, right?

Well, my problem is a little more convoluted. I use more than one language, and I am sure lots and lots of people do the same, certainly in India. E.g. I have files with names using UTF-8 Hindi characters as well as files with names using UTF-8 Gujarati characters.
From what I read in your comments, I don't think I can get both of them to work at the same time without turning the BROKEN_FILE... env var off.
I hope there's some more flexible solution to this.
I think UTF-8 characters are converted into some sort of filename encoding; can't this encoding also contain an identifier for the type of encoding used? It's more of an upstream suggestion.
HTH.
Comment 9 Markus Bertheau (RETIRED) gentoo-dev 2003-07-17 11:50:35 UTC
ls and bash work with UTF-8 filenames. You have to set up your UTF-8 environment properly for that, of course.

> I dont think I can get both of them work at the same time without turning the BROKEN_FILE... env var off.

That depends on what you mean by "work".

You can run in a UTF-8 locale and have your filenames displayed correctly in every app that supports UTF-8 and the Gujarati and Hindi characters. G_BROKEN_FILENAMES only concerns filenames that are not valid UTF-8.
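
As a small illustration of why a single UTF-8 locale can cover both scripts, the sketch below checks which charsets can represent a Hindi and a Gujarati filename. The helper `fits_charset` and the sample names are hypothetical and only serve the example.

```python
def fits_charset(name: str, charset: str) -> bool:
    """Return True if every character of name can be encoded in charset."""
    try:
        name.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Sample names in Devanagari (Hindi) and Gujarati script; made up for this sketch.
names = {"hindi": "हिन्दी.txt", "gujarati": "ગુજરાતી.txt"}

for charset in ("utf-8", "ascii", "iso-8859-2"):
    print(charset, {lang: fits_charset(n, charset) for lang, n in names.items()})
```

Only the UTF-8 row reports that both names fit, which is the situation the reporter needs: one locale charset that holds Hindi and Gujarati names side by side.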
Comment 10 Stanislav Brabec 2003-07-17 16:01:33 UTC
ls and bash always work with the locale's charset.

glib-2.2 without G_BROKEN_FILENAMES uses only UTF-8 filenames for all locales.

glib-2.2 with G_BROKEN_FILENAMES uses the locale's charset (i.e. the same as bash and ls). That's why I think it's wise to turn it on for bash<->glib interoperability. It means that G_BROKEN_FILENAMES has no effect in UTF-8 locales.

Note that glib-2.0 behaved differently - only names invalid in UTF-8 were treated as locale-specific (see the bug reports mentioned above).

My opinion is that we should turn G_BROKEN_FILENAMES on by default. It improves interoperability with other applications.

If a user wants UTF-8 filenames in a non-UTF-8 locale, he/she can delete /etc/profile.d/99locale. In this case he/she will lose interoperability with bash and console applications.

I hope it will be solved better in glib-2.4 (G_FILENAME_CHARSET) and even better in the future, after a global switch to UTF-8.
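
For comparison, the glib-2.0 rule mentioned in this comment (only names that are invalid UTF-8 are treated as locale-specific) can be modeled as a decode with fallback. This is a sketch of the rule as described here, not GLib source code, and `glib20_style_decode` is a hypothetical name.

```python
def glib20_style_decode(raw: bytes, locale_charset: str) -> str:
    """Model of the glib-2.0 rule as described above (hypothetical helper)."""
    try:
        # Valid UTF-8 is taken as UTF-8, regardless of the locale.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Anything else is assumed to be in the locale's charset.
        return raw.decode(locale_charset)


name = "Ko\u0161"  # "Kos" with s-caron

# A name stored as UTF-8 and a name stored as ISO-8859-2 both read back correctly.
print(glib20_style_decode(name.encode("utf-8"), "iso-8859-2") == name)
print(glib20_style_decode(name.encode("iso-8859-2"), "iso-8859-2") == name)
```

The weakness of such a fallback is that a name whose locale-charset bytes happen to form valid UTF-8 would be read as UTF-8, so the guess is not always right.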
Comment 11 Stanislav Brabec 2003-07-17 16:06:48 UTC
Especially for people in India:

Please try:

echo >etc/profile.d/99locale LANG=hi_IN.utf8

Then restart your session.

Nearly all applications (after proper setup) will start using UTF-8 for filenames and strings. You will only see small problems in mc and some other apps. GNOME2 should work perfectly.

If you don't like Hindi locale messages, you can run:

echo -e >etc/profile.d/99locale 'LANG=hi_IN.utf8\nLC_MESSAGES=C'
Comment 12 Spundun Bhatt 2003-07-17 19:44:32 UTC
Okay,
I don't know what etc/profile.d/99locale is.
I suppose you mean /etc/env.d/99locale? Is that right? I don't have a /etc/profile.d directory.

I tried creating a new file and adding two lines
LANG=hi_IN.utf8
LC_MESSAGES=C

in it. I also set the G_BROKEN_FILENAMES variable as done in the glib2 ebuild.
The next time I logged in, I could see the filenames in the gedit open file dialog etc., BUT my gnome-terminal is unusable. I see only vertical rectangles in place of characters; numbers and symbols like @#$% still display fine.

Any guesses as to what's wrong?
Comment 13 Markus Bertheau (RETIRED) gentoo-dev 2003-07-18 00:33:34 UTC
Spundun Bhatt: Try to set a terminal font that has the characters you need. The rectangles are shown for characters that the currently used font doesn't have.
Comment 14 Spundun Bhatt 2003-07-18 11:21:04 UTC
Well, I need English characters!! :)
When I changed my locale to LANG=hi_IN.utf8,
bash stopped displaying English text. I mean, the
prompt is supposed to be spundun@hostname dirname> , and it was all rectangles except for @ and >
Also, when I ran ls in the directory with the Indic filenames, only the Hindi filename was displayed and the Gujarati one was not.
So if you still think I can fix this by changing terminal fonts, then, err, could you tell me how to do it? ("google it" is a valid answer :) )

NOW, before I waste more of your time, I should tell you that the only non-ebuild package I have installed on this Gentoo system is the indlinux package. The link is http://indlinux.org/downloads/downloads.php
The package uses a slightly weird installation method, but since it's just a page of shell script, you should easily be able to point out if there's anything broken in there, and then you or I could report it there.

Thanx
Spundun
Comment 15 Stanislav Brabec 2003-07-18 15:54:22 UTC
Yes, I meant /etc/env.d/99locale.

Gnome-terminal uses the pango rendering library, and it uses xft2.

Please look at /etc/pango:

pango.modules must not be empty.

pangox.aliases must contain aliases, including Gujarati for your font type (try sans).

xft:

/etc/fonts/* must contain usable values and must source all your fonts.
GENTOO BUG: You probably WILL NEED to edit it - Gentoo has had a bug where it does not source bitmap Chinese, Korean and other default fonts here. If it is still present, please report it.

As a last resort, if it is a bitmap font, it must also be present in X - try xfontsel, or edit /etc/X11/XF86Config or your font server configuration.

Gnome-terminal works for me.
Comment 16 Markus Bertheau (RETIRED) gentoo-dev 2003-07-18 16:47:36 UTC
Be aware that the correct value for LANG is hi_IN.UTF-8, not hi_IN.utf8; X fails to recognize the locale otherwise.
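
A minimal sketch of producing the two-line locale file discussed in this thread with the spelling recommended here. Both helpers are hypothetical and written only for this example; they just build the text and do not write to /etc/env.d or touch the system.

```python
def canonical_codeset(locale_name: str) -> str:
    """Rewrite a 'utf8'-style codeset suffix as 'UTF-8'; leave other names as they are.

    Hypothetical helper; locale modifiers such as '@euro' are not handled.
    """
    base, sep, codeset = locale_name.partition(".")
    if sep and codeset.lower().replace("-", "") == "utf8":
        return base + ".UTF-8"
    return locale_name


def render_locale_env(lang, messages=None):
    """Return the text of an env file setting LANG and, optionally, LC_MESSAGES."""
    lines = ["LANG=" + canonical_codeset(lang)]
    if messages is not None:
        lines.append("LC_MESSAGES=" + messages)
    return "\n".join(lines) + "\n"


print(render_locale_env("hi_IN.utf8", "C"), end="")
```

Whether a given spelling is accepted depends on the locale and X versions installed, so the rewrite rule above is an assumption based on this comment rather than a general guarantee.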
Comment 17 foser (RETIRED) gentoo-dev 2003-07-27 12:42:43 UTC
Reporter, what is the status here?
Comment 18 Spundun Bhatt 2003-07-28 11:29:07 UTC
I am not able to get the same behavior. That is, when I log in with the Hindi locale, gnome-terminal (or bash, I don't know) doesn't know how to print characters. I wanted to test this on an untouched Gentoo installation and was going to ask my friend, but he hasn't updated his Gentoo system in 6 months. So it's basically a matter of getting hold of him, making him run the update all night, and then seeing whether the problem is there on his machine too.

I am not able to track down this font configuration thing: who reads which configuration from where, and stuff like that. There's probably a font HOWTO somewhere.

Basically, my system works for my purposes right now with the env variable disabled. But I wish it worked out of the box for everybody.
Comment 19 Spundun Bhatt 2003-08-19 10:23:12 UTC
Okay,
My friend installed Gentoo on a new machine and we tried to log in to GNOME using the Tamil locale.
In gnome-terminal I could not see any sensible text; it was all boxes. That's all I have for right now. Isn't this something to be fixed?
I can't make heads or tails of this font thing. I just don't understand how all these pieces fit together. Is it described anywhere?
Comment 20 foser (RETIRED) gentoo-dev 2003-10-12 09:39:31 UTC
Are any of the UTF-8 gurus around? This is not my area and I don't know what
to do with it. Close it or do something, but don't leave it lingering around
forever.
Comment 21 foser (RETIRED) gentoo-dev 2004-08-19 14:43:01 UTC
Inactive and not clearly reproducible, closing.
Comment 22 Spundun Bhatt 2004-08-19 14:55:54 UTC
I am coming back to the Gentoo platform after about a year (getting a new AMD machine at work) :) I will report again if I see it.

thanx