Bug 263333 - app-editor/vim-core-7.2 and 7.2.108 aren't correctly configured to manage the fileencoding vim option
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: High normal
Assignee: Jim Ramsay (lack) (RETIRED)
Blocks: 269508
Reported: 2009-03-22 00:37 UTC by Grégoire Baron
Modified: 2009-07-23 22:10 UTC (History)

Attachments
A patch to correct the current vimrc-r3 version about the fileencoding issue (vim.etc_vim_vimrc.fileencoding.patch,658 bytes, patch)
2009-03-22 00:41 UTC, Grégoire Baron
vimrc-r4 (vimrc-r4,6.15 KB, text/plain)
2009-05-18 18:39 UTC, Jim Ramsay (lack) (RETIRED)
vimrc-r4 (attempt #2) (vimrc-r4,6.61 KB, text/plain)
2009-05-19 19:03 UTC, Jim Ramsay (lack) (RETIRED)

Description Grégoire Baron 2009-03-22 00:37:07 UTC
The current (7.2 and later) /etc/vim/vimrc configuration forces utf-8 as the "default" fileencoding in very common situations, without taking into account that your locale isn't necessarily utf-8.

Indeed, the fileencodings vim option is as follows after the /etc/vim/vimrc script has run:
 fileencodings=ucs-bom,utf-8,default
with an additional value (euc-kr, euc-jp, big5 or gb2312) inserted between ucs-bom and utf-8 when the current v:lang matches ^ko, ^ja_JP, ^zh_TW or ^zh_CN respectively.

With this fileencodings setting, when your v:lang is not one of those (^ko, ^ja_JP, ^zh_TW, ^zh_CN), vim sets fileencoding to utf-8 in the following situations, without considering that the locale (default) might be different (like iso-8859-1):
 - when vim opens an empty file (no BOM and no invalid utf-8 sequences)
 - when vim opens a file without a BOM and with only ASCII characters
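
A rough Python model of this detection order may make the two cases easier to see. It is only a sketch under stated assumptions: `detect_fenc` is a hypothetical helper, not vim's implementation; it tries each entry in turn and keeps the first one that decodes without error, handles only the UTF-8 BOM, and assumes the locale ("default") encoding is iso-8859-1.

```python
import codecs


def detect_fenc(data: bytes, fileencodings: list[str],
                default: str = "iso-8859-1") -> str:
    """Simplified model: return the first candidate that decodes cleanly.

    `default` stands in for the locale encoding (an assumption here).
    """
    for name in fileencodings:
        if name == "ucs-bom":
            # Matches only when the file starts with a byte-order mark.
            # (Simplification: only the UTF-8 BOM is checked.)
            if data.startswith(codecs.BOM_UTF8):
                return "utf-8"
            continue
        codec = default if name == "default" else name
        try:
            data.decode(codec)
        except UnicodeDecodeError:
            continue
        return codec
    return default


fencs = ["ucs-bom", "utf-8", "default"]
print(detect_fenc(b"", fencs))                               # utf-8
print(detect_fenc(b"plain ASCII\n", fencs))                  # utf-8
print(detect_fenc("caf\xe9\n".encode("iso-8859-1"), fencs))  # iso-8859-1
```

An empty file and an ASCII-only file are both valid utf-8, so utf-8 wins before "default" is ever tried; only a file that already contains non-utf-8 bytes falls through to the locale encoding.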

You're right, unicode/utf-8 is the most appropriate file encoding for i18n. But we are free to choose, or to keep, the old :-) file encodings, iso-8859-1 and the like.

What do you think?

Best Regards,
Gregoire Baron

Reproducible: Always

Steps to Reproduce:
Can be reproduced only when your v:lang isn't ^ko, ^ja_JP, ^zh_TW or ^zh_CN, and when your locale isn't utf-8:
In your shell:
1. $ touch abcd0.txt
2. $ vim abcd0.txt
In vim abcd0.txt:
3. :se fileencodings? will display fileencodings=ucs-bom,utf-8,default
4. :se fileencoding? will display fileencoding=utf-8
5. type something that isn't only ASCII characters, save, and quit
In your shell:
6. $ cat abcd0.txt will print something unreadable because your locale isn't utf-8. The aim was to write a file in your locale, wasn't it?
7. $ vim abcd1.txt where abcd1.txt contains only ASCII characters, and no BOM
In vim abcd1.txt:
8. :se fileencoding? will display fileencoding=utf-8
9. type something that isn't ASCII, save, and quit
In your shell:
10. $ cat abcd1.txt will print something unreadable because your locale isn't utf-8. In which file encoding did you want to write this file? In your locale or in utf-8?
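
The unreadable output in steps 6 and 10 can be reproduced outside vim. This short Python sketch assumes an iso-8859-1 locale and the single character "é" as the non-ASCII text; both are illustrative choices, not taken from the report.

```python
text = "é"

# What ends up on disk when the buffer is written with fileencoding=utf-8.
utf8_bytes = text.encode("utf-8")

# What a Latin-1 terminal shows when `cat` sends it those bytes unchanged.
shown = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes)  # b'\xc3\xa9'
print(shown)       # Ã©
```

The one character is stored as two bytes, and each byte is rendered as its own Latin-1 character.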
Actual Results:  
The modified files aren't in the current locale (default):
 - an empty file is assumed to be in utf-8
 - a file without a BOM and with only ASCII characters is assumed to be in utf-8

Expected Results:  
Vim should use the default file encoding:
 - an empty file should be assumed to be in the default file encoding
 - a file without a BOM and with only ASCII characters should not necessarily be assumed to be in utf-8
Comment 1 Grégoire Baron 2009-03-22 00:41:24 UTC
Created attachment 185825 [details, diff]
A patch to correct the current vimrc-r3 version about the fileencoding issue

I use this patch to fix the fileencoding issue described here in the current vimrc-r3 (/etc/vim/vimrc). It works fine.
Comment 2 Jim Ramsay (lack) (RETIRED) gentoo-dev 2009-05-18 18:39:53 UTC
Created attachment 191706 [details]
vimrc-r4

That patch throws up an error when you open a help page, since you may not set fileencoding on a read-only file.

I've come up with what I think is a better solution.  Could you please try the attached file instead of your current /etc/vim/vimrc and let me know if it does the "right thing" for you?  I've tested it as well as I can, but my default locale is utf-8, so I'd like to get your opinion on it.

If it works, I'll check this in as vimrc-r4.
Comment 3 Jim Ramsay (lack) (RETIRED) gentoo-dev 2009-05-18 18:42:38 UTC
By the way, this will have the side-effect of treating ascii-only files as the default encoding, not UTF-8.  But it should still detect UTF-8 files as UTF-8.  I think.
Comment 4 Grégoire Baron 2009-05-19 16:31:02 UTC
I have just used this vimrc-r4 file for my /etc/vim/vimrc.

After some tests, I noticed at least one bad situation. With this new configuration, which puts "default" just before "utf-8" in the fileencodings option, utf-8 files without a BOM can be taken to be in the default encoding. Indeed, if the default fileencoding is something like "latin1" or "iso-8859-1" (not "utf-8" ...), that kind of encoding accepts any file without restriction. So if the default encoding isn't utf-8, utf-8 files without a BOM aren't recognized and aren't loaded correctly.

This behavior is explained under the fileencodings option in the vim documentation (http://vimdoc.sourceforge.net/htmldoc/options.html#%27fileencodings%27), in particular by these examples:
  WRONG VALUES:         WHAT'S WRONG:
  latin1,utf-8          "latin1" will always be used
  utf-8,ucs-bom,latin1  BOM won't be recognized in an utf-8 file
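
Both rows can be illustrated with a small Python sketch. `first_match` is a hypothetical helper that models "first encoding that decodes without error"; it is not how vim is implemented, and the sample file contents are made up for illustration.

```python
import codecs


def first_match(data: bytes, candidates: list[str]):
    """Return the first codec in `candidates` that decodes `data` cleanly."""
    for codec in candidates:
        try:
            data.decode(codec)
        except UnicodeDecodeError:
            continue
        return codec
    return None


utf8_file = "café\n".encode("utf-8")  # UTF-8 content, no BOM

# latin1 maps every byte value 0x00-0xFF to a character, so it never fails.
assert len(bytes(range(256)).decode("latin1")) == 256

print(first_match(utf8_file, ["latin1", "utf-8"]))  # latin1
print(first_match(utf8_file, ["utf-8", "latin1"]))  # utf-8

# A plain utf-8 decode also accepts a BOM-prefixed file, so a BOM check
# placed after "utf-8" is never reached; the mark stays in the text.
bom_file = codecs.BOM_UTF8 + b"x\n"
print(repr(bom_file.decode("utf-8")))  # '\ufeffx\n'
```

Because an 8-bit encoding such as latin1 can never fail, it is only safe as the last entry in the list.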

That's why I tried to correct the fileencoding option dynamically in my patch.

Could you describe the bad behavior you saw with my patch on the help pages? In my situation (default is "iso-8859-15"), I don't observe any bad behavior with them or with any read-only files (in utf-8 fileencoding or not) ...

Thanks a lot.

Best Regards.
Comment 5 Jim Ramsay (lack) (RETIRED) gentoo-dev 2009-05-19 19:03:08 UTC
Created attachment 191817 [details]
vimrc-r4 (attempt #2)

Ah yes, I see what you mean.

Okay, this version is a hybrid of your earlier patch, with a little more logic to fix the "Cannot set fenc on a read-only file" issue, as well as a shortcut that skips all the other checks for users with utf8 systems.
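
The attachment itself is not reproduced in this thread, so the following Python sketch only models the decision logic as described in these comments. The function name `final_fenc`, its parameters, and the iso-8859-15 default are assumptions made for illustration; the real vimrc-r4 may differ in detail.

```python
import codecs


def final_fenc(detected: str, data: bytes, added_fenc_utf8: bool,
               modifiable: bool, default: str = "iso-8859-15") -> str:
    """Hypothetical model of the post-detection fix-up described here."""
    if not added_fenc_utf8:
        # Shortcut: on a utf-8 system, keep whatever was detected.
        return detected
    if not modifiable:
        # Never change fileencoding on read-only buffers such as help pages.
        return detected
    has_bom = data.startswith(codecs.BOM_UTF8)
    if detected == "utf-8" and not has_bom and data.isascii():
        # Empty or ASCII-only file: fall back to the locale encoding.
        return default
    return detected


print(final_fenc("utf-8", b"", True, True))                   # iso-8859-15
print(final_fenc("utf-8", "é".encode("utf-8"), True, True))   # utf-8
print(final_fenc("utf-8", b"help text", True, False))         # utf-8
print(final_fenc("utf-8", b"ascii only", False, True))        # utf-8
```

Real utf-8 content is still kept as utf-8; only files that carry no evidence either way are handed back to the locale encoding.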

Please test and let me know how this one goes ;)
Comment 6 Jim Ramsay (lack) (RETIRED) gentoo-dev 2009-05-19 19:07:34 UTC
(In reply to comment #4)
> Could you describe the bad behavior you saw with my patch on the help
> pages? In my situation (default is "iso-8859-15"), I don't observe any bad
> behavior with them or with any read-only files (in utf-8 fileencoding or
> not) ...

Certainly!

Notice how your original command will 'set fileencoding=default' ALL the time for people whose proper default encoding is actually utf-8.  Then imagine trying to open a help document which is non-modifiable... You get an ugly red error message :)
Comment 7 Jim Ramsay (lack) (RETIRED) gentoo-dev 2009-06-02 13:17:49 UTC
Have you had a chance to test this?

I've got a couple other things I'm working on and would love to roll this in with them for vim-7.2.191
Comment 8 Grégoire Baron 2009-07-21 23:16:31 UTC
Hi!

I'm so sorry for my very late answer.
I have just tested your new vimrc-r4 file, 2 months after your proposal ...

And, great news, it's exactly as I imagined. If I understand correctly, the g:added_fenc_utf8 variable is set at installation (emerge) time, if the ebuild detects that the current locale isn't utf-8, is that right?

It seems to work very well.
So, if no one objects, this patch can be included in the vim-7.2.191 ebuild or later.

Thanks a lot.

Best Regards,

Grégoire Baron
Comment 9 Jim Ramsay (lack) (RETIRED) gentoo-dev 2009-07-22 14:06:55 UTC
(In reply to comment #8)
> I'm so sorry for my very late answer.
> I have just tested your new vimrc-r4 file, 2 months after your proposal ...
> 
> And, great news, it's exactly as I imagined. If I understand correctly, the
> g:added_fenc_utf8 variable is set at installation (emerge) time, if the
> ebuild detects that the current locale isn't utf-8, is that right?

Actually, this is all done at runtime (there's no way to know if the current user has the same locale as the root user who installed the package!) like this, earlier in the vimrc-r4 file:

" Always check for UTF-8 when trying to determine encodings.
if &fileencodings !~? "utf-8"
  " If we have to add this, the default encoding is not Unicode.
  " We use this fact later to revert to the default encoding in plaintext/empty
  " files.
  let g:added_fenc_utf8 = 1
  set fileencodings+=utf-8
endif

This takes advantage of the default values I know vim will set automatically based on the user's locale (see ':he fencs')
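
For readers who want to trace this runtime check outside vim, here is a hedged Python model. `default_fileencodings` follows the defaults documented under ':help fencs' ("ucs-bom", or "ucs-bom,utf-8,default,latin1" when 'encoding' is a Unicode value); treating any name that starts with utf- or ucs- as Unicode is a simplification, and both function names are hypothetical.

```python
def default_fileencodings(encoding: str) -> list[str]:
    """Vim's documented default for 'fileencodings', as a list."""
    # Simplification: treat utf-*/ucs-* names as "a Unicode value".
    if encoding.lower().startswith(("utf-", "ucs-")):
        return ["ucs-bom", "utf-8", "default", "latin1"]
    return ["ucs-bom"]


def ensure_utf8(fileencodings: list[str]) -> tuple[list[str], bool]:
    """Model of the vimrc check: append utf-8 if missing, and say so."""
    # Case-insensitive substring test, like `!~? "utf-8"` in the vimrc.
    added_fenc_utf8 = "utf-8" not in ",".join(fileencodings).lower()
    if added_fenc_utf8:
        fileencodings = fileencodings + ["utf-8"]
    return fileencodings, added_fenc_utf8


print(ensure_utf8(default_fileencodings("iso-8859-15")))
# (['ucs-bom', 'utf-8'], True)
print(ensure_utf8(default_fileencodings("utf-8")))
# (['ucs-bom', 'utf-8', 'default', 'latin1'], False)
```

The flag therefore reflects the locale of the user running vim, not the locale of whoever installed the package.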

> It seems to work very well.
> So, if no one objects, this patch can be included in the
> vim-7.2.191 ebuild or later.

Great!  I've just added it in [g]vim[-core]-7.2.238, coming soon to a tree near you.
Comment 10 Grégoire Baron 2009-07-23 22:10:59 UTC
> " Always check for UTF-8 when trying to determine encodings.
> if &fileencodings !~? "utf-8"
>   " If we have to add this, the default encoding is not Unicode.
>   " We use this fact later to revert to the default encoding in plaintext/empty
>   " files.
>   let g:added_fenc_utf8 = 1
>   set fileencodings+=utf-8
> endif

You're right: when I replied I had only checked a diff between the original vimrc and your proposal, and hadn't looked at exactly where you added the "let g:added_fenc_utf8 = 1" line. Shame on me ;-). Indeed, since vim automatically initialises fenc with at least the user's locale, it's a very good way to interpret the situation.

Thanks again.