Bug 70405 - Accented characters don't come out correctly in mutt when using utf-8
Summary: Accented characters don't come out correctly in mutt when using utf-8
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: x86 Linux
Importance: High major
Assignee: Aron Griffis (RETIRED)
URL: http://www.lustosa.net/gentoo/mutt.png
Whiteboard:
Keywords:
Duplicates: 94538
Depends on:
Blocks:
 
Reported: 2004-11-07 20:01 UTC by Bruno Lustosa
Modified: 2005-06-03 10:50 UTC (History)

See Also:
Package list:
Runtime testing required: ---


Attachments
An example mail (bad-mail.txt, 7.39 KB, text/plain)
2004-11-16 05:16 UTC, Bruno Lustosa
ebuild with integrated patch. should fix it. (mutt-1.5.8-r3.ebuild,4.40 KB, text/plain)
2005-03-17 00:13 UTC, Christopher Korn

Description Bruno Lustosa 2004-11-07 20:01:16 UTC
Well, I've tried to solve this on both the mutt users list and the Gentoo users list, and I couldn't get it solved in any way, so I'm filing this bug report.

My whole system is set to utf-8. In the mutt index view, I get "squares" instead of the accented characters. You can see the result at http://www.lustosa.net/gentoo/mutt-index.png.
When I go into message view, all accents show up as "quoted" characters (like \246, and so on). I have a screenshot at http://www.lustosa.net/gentoo/mutt.png.
This happens both with a configuration file and without.
My locale variables seem to be fine; I'll post them in the additional info section.
If I hit "Reply", vim seems to be able to show them properly.
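
A minimal sketch, not part of the original report, of one way such "quoted" characters can arise: bytes that are not valid UTF-8 get shown as octal escapes instead of letters. The sample word, the ISO-8859-1 source encoding, and the `octal` error-handler name are assumptions chosen for illustration; this is not mutt's actual code.

```python
import codecs


def _octal_escape(err):
    """Replace each undecodable byte with a \\ooo octal escape."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    return "".join("\\%03o" % b for b in bad), err.end


# "octal" is a hypothetical handler name registered only for this example.
codecs.register_error("octal", _octal_escape)

# Text encoded as ISO-8859-1 but interpreted as if it were UTF-8.
latin1_bytes = "a\u00e7\u00e3o".encode("iso-8859-1")  # b'a\xe7\xe3o'
shown = latin1_bytes.decode("utf-8", errors="octal")
print(shown)  # a\347\343o
```

The same bytes decode cleanly when the correct charset is used, which is why an editor that guesses the encoding can appear to show the text properly.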

I've tried compiling ncurses with "--enable-widec" passed to its configure script, to get wide-character support. This produced "libncursesw.so". I symlinked it to libncurses.so, and after re-emerging mutt, it linked properly.
Without this extra hack, the screen output in mutt used to come out garbled, with some columns shifted to the right because of the additional "hidden" bytes of multi-byte UTF-8 characters.
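
To illustrate the column shift described above, here is a rough sketch of how byte count and display width diverge for accented UTF-8 text. The sample subject and the `display_width` helper are hypothetical, and the width rule is an approximation rather than what ncurses actually does.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate terminal columns: wide chars count 2, combining marks 0."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks occupy no column of their own
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


subject = "Reuni\u00e3o"                  # hypothetical index entry, "Reunião"
byte_len = len(subject.encode("utf-8"))   # the accented letter takes 2 bytes
cols = display_width(subject)             # but only 1 screen column

print(byte_len, cols)  # 8 7
```

A program that pads columns by byte count rather than by character width would be off by one cell for each accented letter on the line, which matches the misaligned index lines.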

Reproducible: Always
Steps to Reproduce:
Check the 2 screenshots. I can get more if needed.
Actual Results:  
Screen garbled. Accents not displayed properly.


Output from locale (set by system, not from any .bashrc-like script):
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Some of the emails that display incorrectly have a broken content-type. However, I wasn't able to get mutt to display unknown charsets as iso8859-1.
Please contact me if any additional info is needed!
Comment 1 Alain Bench 2004-11-11 00:59:39 UTC
Could you please provide the example bad mail?
What does "locale --all-locales" output?
Why weren't you able to get mutt to display unknown charsets as iso8859-1?
Comment 2 Bruno Lustosa 2004-11-16 05:16:15 UTC
Created attachment 44047
An example mail
Comment 3 Bruno Lustosa 2004-11-16 05:17:04 UTC
I'm sending an example bad mail as an attachment. From what I see, the mail's content-type is multipart/alternative, but it seems the text part is correctly defined as 'charset=iso8859-1'.
Moreover, as I said, I had to hand-configure ncurses with '--enable-widec', and then configure mutt to link against it instead of slang. I also had to make a symlink from libncursesw.so to libncurses.so. I've been told the ABI is different, but it's working properly. Without this, the columns in the index view would come out garbled on lines with one or more accents.
Output of locale --all-locales:
aa_DJ
aa_ER
aa_ER@saaho
aa_ET
af_ZA
am_ET
an_ES
ar_AE
ar_AE.utf8
ar_BH
ar_BH.utf8
ar_DZ
ar_DZ.utf8
ar_EG
ar_EG.utf8
ar_IN
ar_IQ
ar_IQ.utf8
ar_JO
ar_JO.utf8
ar_KW
ar_KW.utf8
ar_LB
ar_LB.utf8
ar_LY
ar_LY.utf8
ar_MA
ar_MA.utf8
ar_OM
ar_OM.utf8
ar_QA
ar_QA.utf8
ar_SA
ar_SA.utf8
ar_SD
ar_SD.utf8
ar_SY
ar_SY.utf8
ar_TN
ar_TN.utf8
ar_YE
ar_YE.utf8
az_AZ.utf8
be_BY
be_BY.utf8
bg_BG
bg_BG.utf8
bn_BD
bn_IN
br_FR
br_FR@euro
bs_BA
byn_ER
C
ca_ES
ca_ES@euro
ca_ES.utf8
cs_CZ
cs_CZ.utf8
cy_GB
cy_GB.utf8
da_DK
da_DK.iso885915
da_DK.utf8
de_AT
de_AT@euro
de_AT.utf8
de_BE
de_BE@euro
de_BE.utf8
de_CH
de_CH.utf8
de_DE
de_DE@euro
de_DE.utf8
de_LU
de_LU@euro
de_LU.utf8
el_GR
el_GR.utf8
en_AU
en_AU.utf8
en_BW
en_BW.utf8
en_CA
en_CA.utf8
en_DK
en_DK.utf8
en_GB
en_GB.iso885915
en_GB.utf8
en_HK
en_HK.utf8
en_IE
en_IE@euro
en_IE.utf8
en_IN
en_NZ
en_NZ.utf8
en_PH
en_PH.utf8
en_SG
en_SG.utf8
en_US
en_US.iso885915
en_US.utf8
en_ZA
en_ZA.utf8
en_ZW
en_ZW.utf8
es_AR
es_AR.utf8
es_BO
es_BO.utf8
es_CL
es_CL.utf8
es_CO
es_CO.utf8
es_CR
es_CR.utf8
es_DO
es_DO.utf8
es_EC
es_EC.utf8
es_ES
es_ES@euro
es_ES.utf8
es_GT
es_GT.utf8
es_HN
es_HN.utf8
es_MX
es_MX.utf8
es_NI
es_NI.utf8
es_PA
es_PA.utf8
es_PE
es_PE.utf8
es_PR
es_PR.utf8
es_PY
es_PY.utf8
es_SV
es_SV.utf8
es_US
es_US.utf8
es_UY
es_UY.utf8
es_VE
es_VE.utf8
et_EE
et_EE.iso885915
et_EE.utf8
eu_ES
eu_ES@euro
eu_ES.utf8
fa_IR
fi_FI
fi_FI@euro
fi_FI.utf8
fo_FO
fo_FO.utf8
fr_BE
fr_BE@euro
fr_BE.utf8
fr_CA
fr_CA.utf8
fr_CH
fr_CH.utf8
fr_FR
fr_FR@euro
fr_FR.utf8
fr_LU
fr_LU@euro
fr_LU.utf8
ga_IE
ga_IE@euro
ga_IE.utf8
gd_GB
gez_ER
gez_ER@abegede
gez_ET
gez_ET@abegede
gl_ES
gl_ES@euro
gl_ES.utf8
gu_IN
gv_GB
gv_GB.utf8
he_IL
he_IL.utf8
hi_IN
hr_HR
hr_HR.utf8
hu_HU
hu_HU.utf8
id_ID
id_ID.utf8
is_IS
is_IS.utf8
it_CH
it_CH.utf8
it_IT
it_IT@euro
it_IT.utf8
iw_IL
iw_IL.utf8
ja_JP.eucjp
ja_JP.utf8
ka_GE
kk_KZ
kl_GL
kl_GL.utf8
kn_IN
ko_KR.euckr
ko_KR.utf8
kw_GB
kw_GB.utf8
lg_UG
lo_LA
lt_LT
lt_LT.utf8
lv_LV
lv_LV.utf8
mi_NZ
mk_MK
mk_MK.utf8
ml_IN
mn_MN
mr_IN
ms_MY
ms_MY.utf8
mt_MT
mt_MT.utf8
nb_NO
nb_NO.utf8
ne_NP
nl_BE
nl_BE@euro
nl_BE.utf8
nl_NL
nl_NL@euro
nl_NL.utf8
nn_NO
nn_NO.utf8
no_NO
no_NO.utf8
oc_FR
om_ET
om_KE
pa_IN
pl_PL
pl_PL.utf8
POSIX
pt_BR
pt_BR.utf8
pt_PT
pt_PT@euro
pt_PT.utf8
ro_RO
ro_RO.utf8
ru_RU
ru_RU.koi8r
ru_RU.utf8
ru_UA
ru_UA.utf8
se_NO
sid_ET
sk_SK
sk_SK.utf8
sl_SI
sl_SI.utf8
so_DJ
so_ET
so_KE
so_SO
sq_AL
sq_AL.utf8
st_ZA
st_ZA.utf8
sv_FI
sv_FI@euro
sv_FI.utf8
sv_SE
sv_SE.iso885915
sv_SE.utf8
ta_IN
te_IN
tg_TJ
th_TH
th_TH.utf8
ti_ER
ti_ET
tig_ER
tl_PH
tr_TR
tr_TR.utf8
tt_RU.utf8
uk_UA
uk_UA.utf8
ur_PK
uz_UZ
uz_UZ@cyrillic
vi_VN
vi_VN.tcvn
wa_BE
wa_BE@euro
wa_BE.utf8
xh_ZA
xh_ZA.utf8
yi_US
zh_CN
zh_CN.gb18030
zh_CN.gbk
zh_CN.utf8
zh_HK
zh_HK.utf8
zh_SG
zh_SG.gbk
zh_TW
zh_TW.euctw
zh_TW.utf8
zu_ZA
zu_ZA.utf8
Comment 4 Alain Bench 2004-11-16 13:16:59 UTC
The example mail is not the same one as in the screenshots. Anyway, the garbled display is confirmed. Note that vim *seems* to show accents properly, and you then *seem* to be able to compose a reply properly too, but in fact the sent reply would be broken (vim is too smart). Your locale is fine.

The bad mail contains invalid characters in its header: the sending mailer is at fault, and perhaps the Lynx config a little bit as well.

A workaround in Mutt is possible by putting the following in muttrc:

| unset strict_mime
| set assumed_charset=windows-1252
| alternative_order text/plain text/html

With this, I see the bad mail fully correctly, as text, and can reply. Additionally, you may want to configure Lynx to show HTML correctly converted, if possible, and remove the single quotes around '%s' in the ~/.mailcap entry.
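
The general idea behind an assumed charset can be sketched as follows: try strict UTF-8 first, and fall back to a configured legacy charset for bytes that carry no usable charset label. This is only an illustration of the fallback technique; the function name and sample data are hypothetical, and it is not the code of the mutt patch.

```python
def decode_with_assumed_charset(raw: bytes, assumed: str = "windows-1252") -> str:
    """Decode raw bytes as UTF-8, falling back to an assumed legacy charset."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: treat the bytes as the assumed charset instead.
        return raw.decode(assumed, errors="replace")


# A raw 8-bit header value, as a non-conformant mailer might send it.
raw_header = "Cora\u00e7\u00e3o".encode("windows-1252")
fixed = decode_with_assumed_charset(raw_header)

# Valid UTF-8 input passes through unchanged.
utf8_header = "Cora\u00e7\u00e3o".encode("utf-8")
same = decode_with_assumed_charset(utf8_header)

print(fixed == same)  # True
```

Trying UTF-8 first is a reasonable order because legacy 8-bit text with accents is rarely valid UTF-8 by accident, whereas nearly any byte sequence decodes without error as windows-1252.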
Comment 5 Bruno Lustosa 2004-11-17 19:08:43 UTC
Yes, vim does show everything correctly. If I send mail to myself, I get it fine.
The lynx configuration is fine I think. It happens both with text and html mails.
While trying to set the variables you suggested, I got the following errors:

Error in /home/lofofora/.muttrc, line 1444: strict_mime: unknown variable
Error in /home/lofofora/.muttrc, line 1446: assumed_charset: unknown variable

I searched the manual for those variables, and couldn't find them. Are you using some patch or some USE flags on your mutt?
Comment 6 Alain Bench 2004-11-18 07:19:59 UTC
Please send me privately a reply to the example mail, using your original settings, quoting the HTML part, and adding some accented words.

The body text part of the example mail is fine, and therefore displays fine in Mutt, without any special setting.

Both the $strict_mime and $assumed_charset variables come with either patch-1.5.6.tt.assumed_charset.1 or patch-1.5.6.tt.ja.1 by Takashi Takizawa, available at <URL:http://www.emaillab.org/mutt/download15.html>. I read in closed Gentoo bug 5643 that you can activate the JA patch by using the cjk USE flag and removing "--enable-default-japanese" from the ./configure options. This is not guaranteed: can someone confirm?

Anyway the assumed_charset feature is a must-have to deal with invalid raw accented headers and unlabelled body text parts (non MIME conformant mails). A must-have for anyone, and especially for UTF-8 systems.
Comment 7 Alain Bench 2004-12-03 03:23:05 UTC
Reporter Bruno acknowledged that patch-1.5.6.tt.assumed_charset.1 and the proposed settings solved his problem. I suggested turning this bug into a wish to include the patch, with flags:

| Hardware:     All
| OS:           All
| Severity:     enhancement
Comment 8 Christopher Korn 2005-03-17 00:13:30 UTC
Created attachment 53681
ebuild with integrated patch. should fix it.

This is an ebuild with the assumed_charset patch.
Comment 9 Aron Griffis (RETIRED) gentoo-dev 2005-03-17 14:57:03 UTC
ok, it's in portage as mutt-1.5.8-r2

Thanks for all the helpful responses in this bug.
Comment 10 Maurice van der Pot (RETIRED) gentoo-dev 2005-06-03 10:50:48 UTC
*** Bug 94538 has been marked as a duplicate of this bug. ***