The current (7.2 and later) /etc/vim/vimrc configuration forces utf-8 as the "default" fileencoding in very common situations, without taking into account that your locale isn't necessarily utf-8.

After the /etc/vim/vimrc script has run, the fileencodings vim option is:

    fileencodings=ucs-bom,utf-8,default

It can have one additional value (euc-kr, euc-jp, big5 or gb2312) between ucs-bom and utf-8 if the current v:lang matches ^ko, ^ja_JP, ^zh_TW or ^zh_CN respectively.

With this fileencodings value, when your v:lang doesn't match one of those patterns, vim sets fileencoding to utf-8 in the following situations, without considering that the locale (default) may be different (like iso-8859-1):
- when vim opens an empty file (no BOM and no invalid utf-8 sequences)
- when vim opens a file without a BOM and with only ASCII characters

You're right, unicode/utf-8 is the most pertinent file encoding for i18n. But we are free to choose or to keep the old :-) file encoding, iso-8859-1 & co.

What do you think?

Best Regards,
Gregoire Baron

Reproducible: Always

Steps to Reproduce:
Can be reproduced only when your v:lang isn't ^ko, ^ja_JP, ^zh_TW or ^zh_CN, and when your locale isn't utf-8:

In your shell:
1. $ touch abcd0.txt
2. $ vim abcd0.txt

In vim abcd0.txt:
3. :se fileencodings?
   will display fileencodings=ucs-bom,utf-8,default
4. :se fileencoding?
   will display fileencoding=utf-8
5. write and save something that isn't only ASCII characters, and quit

In your shell:
6. $ cat abcd0.txt
   will print something unreadable because your locale isn't utf-8. The aim was to write a file in your locale, wasn't it?
7. $ vim abcd1.txt
   where abcd1.txt contains only ASCII characters, and no BOM

In vim abcd1.txt:
8. :se fileencoding?
   will display fileencoding=utf-8
9. write and save something that isn't only ASCII characters, and quit

In your shell:
10. $ cat abcd1.txt
    will print something unreadable because your locale isn't utf-8. In which file encoding did you want to write this file? In your locale or in utf-8?

Actual Results:
The modified files aren't in the current locale (default):
- an empty file is assumed to be in utf-8
- a file without a BOM and with only ASCII characters is assumed to be in utf-8

Expected Results:
Vim should use the default file encoding:
- an empty file should be assumed to be in the default file encoding
- a file without a BOM and with only ASCII characters should not be assumed to be necessarily in utf-8
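To see which values are in effect on a given system, the settings discussed in this report can be inspected from inside vim. This is only a minimal sketch using standard commands; the output depends on your locale and on which vimrc files were sourced.

```vim
" Show the locale as vim sees it, and vim's internal encoding.
echo v:lang
set encoding?

" Show the detection list, and which script last set it.
verbose set fileencodings?

" Show the encoding that was chosen for the current buffer.
setlocal fileencoding?
```

If the verbose output names /etc/vim/vimrc, the list comes from the system-wide configuration rather than from a user vimrc.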
Created attachment 185825
A patch to correct the current vimrc-r3 version about the fileencoding issue

I use this patch to fix the fileencoding issue described here in the current vimrc-r3 (/etc/vim/vimrc). It works fine.
Created attachment 191706
vimrc-r4

That patch throws up an error when you open a help page, since you may not set fileencoding on a read-only file. I've come up with what I think is a better solution.

Could you please try the attached file instead of your current /etc/vim/vimrc and let me know if it does the "right thing" for you? I've tested it as well as I can, but my default locale is utf-8, so I'd like to get your opinion on it.

If it works, I'll check this in as vimrc-r4.
By the way, this will have the side-effect of treating ascii-only files as the default encoding, not UTF-8. But it should still detect UTF-8 files as UTF-8. I think.
I have just used this vimrc-r4 file as my /etc/vim/vimrc. After some tests, I noticed at least one bad situation.

With this new configuration, which puts "default" just before "utf-8" in the fileencodings option, utf-8 files without a BOM can be taken to be in the default encoding. Indeed, if the default fileencoding is something like "latin1" or "iso-8859-1" (not "utf-8" ...), this kind of fileencoding accepts any file without restriction. So, if the default encoding isn't utf-8, utf-8 files without a BOM aren't recognized, and aren't loaded correctly.

This behavior is explained under the fileencodings option in the vim documentation (http://vimdoc.sourceforge.net/htmldoc/options.html#%27fileencodings%27), in particular with these examples:

    WRONG VALUES:           WHAT'S WRONG:
    latin1,utf-8            "latin1" will always be used
    utf-8,ucs-bom,latin1    BOM won't be recognized in an utf-8 file

That's why I tried to correct the fileencoding option dynamically in my patch.

Could you describe the bad behavior you saw with my patch on the help pages? In my situation (default is "iso-8859-15"), I don't observe any bad behavior with them or with any read-only files (in utf-8 fileencoding or not) ...

Thanks a lot.
Best Regards.
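The ordering problem described above can be illustrated with two settings. This is a sketch only, assuming an 8-bit locale such as iso-8859-15 so that "default" resolves to an 8-bit encoding; neither line is copied from the attached files.

```vim
" Problematic order: an 8-bit "default" accepts any byte sequence, so
" detection stops there and utf-8 is never tried.
set fileencodings=ucs-bom,default,utf-8

" Order from the original vimrc: utf-8 is tried before the 8-bit fallback,
" so valid UTF-8 files are detected, but ASCII-only and empty files are
" also reported as utf-8.
set fileencodings=ucs-bom,utf-8,default
```

Neither order alone gives the wanted result, which is why the choice for ASCII-only and empty files has to be corrected after the file has been read.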
Created attachment 191817
vimrc-r4 (attempt #2)

Ah yes, I see what you mean. Okay, this version is a hybrid of your earlier patch, with a little more logic to fix the "Cannot set fenc on a read-only file" issue, as well as a shortcut that skips all the other checks for users with utf8 systems.

Please test and let me know how this one goes ;)
(In reply to comment #4)
> Could you describe me the bad behavior you saw with my patch about the help
> pages ? Indeed, in my situation (default is "iso-8859-15"), I don't observe any
> bad behavior with them or with any readonly files (in utf-8 fileencoding or
> not) ...

Certainly! Notice how your original command will 'set fileencoding=default' ALL the time for people whose proper default encoding is actually utf-8. Then imagine trying to open a help document which is non-modifiable... You get an ugly red error message :)
Have you had a chance to test this? I've got a couple other things I'm working on and would love to roll this in with them for vim-7.2.191.
Hi!

I'm so sorry for my very late answer.
I have just tested your new vimrc-r4 file, 2 months after your proposal ...

And it's great, exactly as I imagined. If I understand correctly, the g:added_fenc_utf8 variable is set at the installation (emerge) level, if the ebuild detects that the current locale isn't utf-8, is that right?

It seems to work very well.
So, if no one objects, this patch can be included in the vim-7.2.191 ebuild or later.

Thanks a lot.
Best Regards,
Grégoire Baron
(In reply to comment #8)
> I'm so sorry for my very late answer.
> I have just tested your new vimrc-r4 file, 2 months after your proposal ...
>
> And, so great, it's exactly as I imagined. If I understand correctly, the
> g:added_fenc_utf8 variable is set at the installation (emerge) level, if the
> ebuild detects the current locale isn't utf-8, isn't it ?

Actually, this is all done at runtime (there's no way to know if the current user has the same locale as the root user who installed the package!) like this, earlier in the vimrc-r4 file:

    " Always check for UTF-8 when trying to determine encodings.
    if &fileencodings !~? "utf-8"
      " If we have to add this, the default encoding is not Unicode.
      " We use this fact later to revert to the default encoding in plaintext/empty
      " files.
      let g:added_fenc_utf8 = 1
      set fileencodings+=utf-8
    endif

This takes advantage of the default values I know vim will set automatically based on the user's locale (see ':he fencs').

> It seems to work very well.
> Also, if no one has restriction on that, this patch can be included in the
> vim-7.2.191 ebuild or more.

Great! I've just added it in [g]vim[-core]-7.2.238, coming soon to a tree near you.
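How the flag might be used later in the file can be sketched as follows. This is a hypothetical illustration, not the contents of the attached vimrc-r4: the function name s:UseDefaultFenc and the augroup name fenc_default_sketch are made up here, and the ASCII-only test via search() is an assumption about one possible way to do the check.

```vim
" Hypothetical sketch; not the actual vimrc-r4 code.
function! s:UseDefaultFenc()
  " 'fileencoding' cannot be changed when 'modifiable' is off (e.g. help
  " buffers), and only buffers detected as utf-8 need to be revisited.
  if !&modifiable || &fileencoding !=# 'utf-8'
    return
  endif
  " A non-ASCII character means the utf-8 detection was meaningful: keep it.
  if search('[^\x00-\x7F]', 'cnw') != 0
    return
  endif
  " An empty 'fileencoding' means "use the value of 'encoding'", which vim
  " takes from the locale by default. Changing 'fileencoding' makes the
  " buffer count as modified, so restore the previous state afterwards.
  let l:was_modified = &modified
  setlocal fileencoding=
  let &l:modified = l:was_modified
endfunction

" Only install the autocommand when utf-8 had to be added by the vimrc,
" i.e. when the locale encoding is not Unicode.
if exists('g:added_fenc_utf8')
  augroup fenc_default_sketch
    autocmd!
    autocmd BufReadPost * call s:UseDefaultFenc()
  augroup END
endif
```

On a utf-8 system g:added_fenc_utf8 is never defined, so nothing is installed and none of the per-buffer checks run, which matches the shortcut for utf8 systems mentioned earlier in the thread.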
> " Always check for UTF-8 when trying to determine encodings.
> if &fileencodings !~? "utf-8"
>   " If we have to add this, the default encoding is not Unicode.
>   " We use this fact later to revert to the default encoding in plaintext/empty
>   " files.
>   let g:added_fenc_utf8 = 1
>   set fileencodings+=utf-8
> endif

You're right: when I replied, I had only checked a diff between the original vimrc and your proposal, and hadn't looked at exactly where you added the "let g:added_fenc_utf8 = 1" line. Shame on me ;-).

Indeed, as vim automatically initialises fileencodings with at least the user's locale, it's a very good way to interpret the situation.

Thanks again.