Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 189627 - app-editors/emacs-22.1 hangs while saving if it lacks permission to remove backup
Summary: app-editors/emacs-22.1 hangs while saving if it lacks permission to remove ba...
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages (show other bugs)
Hardware: All Linux
: High normal (vote)
Assignee: Emacs project
URL: http://groups.google.com/group/gnu.em...
Whiteboard:
Keywords: REGRESSION
Depends on:
Blocks:
 
Reported: 2007-08-20 18:43 UTC by Martin von Gagern
Modified: 2007-08-24 07:35 UTC (History)
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments
strace fragment of 22.1 with bug happening (strace.txt,1.99 KB, text/plain)
2007-08-20 21:29 UTC, Martin von Gagern
Details
emacs-22.1-backup-buffer.patch (emacs-22.1-backup-buffer.patch,481 bytes, patch)
2007-08-21 06:29 UTC, Ulrich Müller
Details | Diff
Patch to ebuild to incorporate the fix (emacs-22.1.ebuild.patch,841 bytes, patch)
2007-08-21 12:08 UTC, Martin von Gagern
Details | Diff
emacs-22.1-backup-buffer.patch (emacs-22.1-backup-buffer.patch,622 bytes, patch)
2007-08-22 05:00 UTC, Ulrich Müller
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Martin von Gagern 2007-08-20 18:43:40 UTC
When emacs can't remove a backup file (FILE~) for a file it wants to write, it will go into a loop. C-g gets it out of the loop, but I found no way to get the modified file actually written. This is especially annoing if you are editing in a foreign directory with only very specific access permissions.

Reproducible: Always

Steps to Reproduce:
1. mkdir emacs-bug
2. cd emacs-bug
3. echo foo > foo.txt
4. echo bar > foo.txt~
5. chmod u-w . foo.txt~
6. emacs foo.txt
7. Change contents from "foo" to "baz"
8. C-x C-s to save
Actual Results:  
Displays clock cursor, will stay like that until you abort the command pressing C-g. strace reports lots of stat and unlink kalls for "foo.txt~". The file contents still reads "foo".

Expected Results:  
Modifications to file foo.txt saved, probably some warning about the outdated backup, as you will get a warning if emacs fails to create a new backup in case there was none before.
Comment 1 DrChandra the Gentoo Person 2007-08-20 19:02:54 UTC
I reproduced this.

Promote upstream. This isn't a Gentoo bug. The file permissions system is
acting normally.
Comment 2 Martin von Gagern 2007-08-20 20:02:27 UTC
(In reply to comment #1)
> Promote upstream. This isn't a Gentoo bug. The file permissions system is
> acting normally.

I've written a mail to bug-gnu-emacs@ and will set URL as soon as it appears on suitable archives.

Just for the record: I've talked on the #emacs channel on freenode, and there someone was kind enough to try to reproduce this, without success. He reports his version like this:
"ERC Version 5.2 with GNU Emacs 22.1.50.3 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars) of 2007-08-09, Ubuntu 6.06.1 LTS ("Dapper Drake")"
I couldn't find out what exactly the relationship between that version and our 22.1 sources is, but maybe there already is a fix somewhere along the way.

After running "M-x toggle-debug-on-quit" I could get a backtrace when pressing C-g. I've replaced all binary strings by "...". I guess the loop is in backup-buffer-copy.

  copy-file(".../foo.txt" ".../foo.txt~" nil t)
  byte-code("..." [from-name to-name nil (delete-file to-name) ((file-error)) copy-file t] 5)
  backup-buffer-copy(".../foo.txt" ".../foo.txt~" 420)
  byte-code("..." [file-precious-flag backup-by-copying modes real-file-name backup-by-copying-when-linked backup-by-copying-when-mismatch 0 logand 3072 file-writable-p file-name-directory file-nlinks 1 file-attributes 2 9 file-ownership-preserved-p backup-buffer-copy rename-file t backup-by-copying-when-privileged-mismatch attr backupname setmodes] 4)
  byte-code("..." [targets delete-old-versions real-file-name buffer-file-name modes buffer-backed-up t nil y-or-n-p format "Delete excess backup versions of %s? " file-modes (byte-code "..." [file-precious-flag backup-by-copying modes real-file-name backup-by-copying-when-linked backup-by-copying-when-mismatch 0 logand 3072 file-writable-p file-name-directory file-nlinks 1 file-attributes 2 9 file-ownership-preserved-p backup-buffer-copy rename-file t backup-by-copying-when-privileged-mismatch attr backupname setmodes] 4) ((file-error ...)) (byte-code "..." [targets delete-file] 2) ((file-error)) setmodes] 5)
  backup-buffer()
  basic-save-buffer-2()
  basic-save-buffer-1()
  basic-save-buffer()
  save-buffer(1)
  call-interactively(save-buffer)
Comment 3 Christian Faulhammer (RETIRED) gentoo-dev 2007-08-20 20:54:30 UTC
On the other hand I cannot reproduce it.  Neither with 22.1 nor CVS 22.1.50.
Comment 4 Martin von Gagern 2007-08-20 21:22:19 UTC
(In reply to comment #3)
> On the other hand I cannot reproduce it.  Neither with 22.1 nor CVS 22.1.50.

OK, so let's try find differences.
My USE flags for 22.1 are: X -Xaw3d alsa gif gtk -gzip-el -hesiod jpeg motif png -sound -source spell tiff -toolkit-scroll-bars xpm

I could reproduce this with an environment only containing DISPLAY and XAUTHORITY, or nothing but TERM, so environment settings like locale don't seem to play a role, and neither does X vs. tty. Passing -Q makes no difference either, so it's not the init scripts. At least if under the same conditions you still can't reproduce the file.
Comment 5 Martin von Gagern 2007-08-20 21:29:04 UTC
Created attachment 128715 [details]
strace fragment of 22.1 with bug happening

This strace fragment shows what was going on when the bug happened. I used
env -i TERM=$TERM strace -f -s 10240 -o ../strace /usr/bin/emacs-22 -Q foo.txt
to strace emacs. I stripped the path before emacs-bug, to keep lines short.

I have an ltrace as well, but that is less clean because of all those memory addresses. Also if I run ltrace with -f, at some point it tries to attach to PID 1, and is generally busy printing error messages. Somehow emacs doesn't play nice with ltrace.
Comment 6 Martin von Gagern 2007-08-20 21:50:55 UTC
I guess i puzzled out how the Lisp code in backup-buffer-copy actually works here (files.el, 3120 - 3130):

	  (while (condition-case ()
		     (progn
		       (condition-case nil
			   (delete-file to-name)
			 (file-error nil))
		       (copy-file from-name to-name nil t)
		       nil)
		   (file-already-exists t))
	    ;; The file was somehow created by someone else between
	    ;; `delete-file' and `copy-file', so let's try again.
	    nil))

condition-case is something like a try-catch, progn something like a block.
So first it does delete the file, catching any file errors. Whatever the outcome of that inner condition-case, the progn simply discards it and goes on with the copy-file, which will throw a file-already-exists. This will cause the whole argument of while to become t instead of nil as it would if copy-file had succeeded. And as long as the argument of while is t, the loop continues.

The code seems to make two wrong assumptions:
1. If I can't delete the file, it didn't exist in the first place
2. If the file exists after I tried delete it, someone else recreated it

The solution should probably be a more elaborate handler for file-error. If the error is because the file doesn't exist, that's no problem at all, and the command may continue. If, however, the file does exist and cannot be deleted, this has to break the loop and probably be handled as an error or at leas a warning.

An alternative would be to only try to delete the file if it does exist; in this case a file-error would always be a real error, not an expected situation.

I still wonder how this same code could cause the bug /not/ to occur. Or maybe your code didn't run backup-buffer-copy at all, but for some reason chose a different one? Can you do "M-X debug-on-entry backup-buffer-copy"? If you save the file, you should get a debug frame if that function is used at all.
Comment 7 Ulrich Müller gentoo-dev 2007-08-21 06:02:20 UTC
I can reproduce it with all versions 22.1 or newer, i.e. 22.1 as well as the currect CVS trunk (22.1.50), unicode-2 branch (23.0.0-r6), and multi-tty branch (23.0.51).

Interestingly, the error does not occur for 21.4. For 21.4, Emacs displays the following message:
Cannot write backup file; backing up in %backup%~

Comparing files.el, I noticed that function backup-buffer-copy is new in Emacs 22. And the corresponding condition-case for Emacs 21 (occurs twice, files.el lines 2199ff and 2219ff) contains an extra test for the case that the file exists but is not writable by us:

  (condition-case ()
      (copy-file real-file-name backupname t t)
    (file-error
     ;; If copying fails because file BACKUPNAME
     ;; is not writable, delete that file and try again.
     (if (and (file-exists-p backupname)
	      (not (file-writable-p backupname)))
	 (delete-file backupname))
     (copy-file real-file-name backupname t t)))
Comment 8 Ulrich Müller gentoo-dev 2007-08-21 06:29:28 UTC
Created attachment 128742 [details, diff]
emacs-22.1-backup-buffer.patch

Does the attached patch fix the problem for you?
Comment 9 Martin von Gagern 2007-08-21 08:59:05 UTC
(In reply to comment #8)
> Created an attachment (id=128742) [edit]
> emacs-22.1-backup-buffer.patch
> 
> Does the attached patch fix the problem for you?

Not yet. First I patched the files.el and ran byte-compile-file as root on it. Got one warning, and still the same error. Next I included the patch in the ebuild. Still the same issue. I noticed that files.elc hat a change date Jun 2, earlier than the patched files.el. Byte-compiling the file again gave me the same warning, corrected the date, but still the problem persists.

Debugging the function shows me that the bytecode doesn't actually call file-exists-p at all. I guess the old version of the function got inlined into some other bytecode file. There seem to be quite a few of these in the source tarball. Is there some way to get the ebuild to rebuild all its .elc files?
Comment 10 Martin von Gagern 2007-08-21 09:50:54 UTC
(In reply to comment #9)
Even after adding "emake -C lisp compile-always" to the src_compile, it still doesn't seem to call file-exists-p, and it still loops.
Comment 11 Ulrich Müller gentoo-dev 2007-08-21 10:55:15 UTC
(In reply to comment #9)
> Is there some way to get the ebuild to rebuild all its .elc files?

Either you bootstrap:

   emake CC="$(tc-getCC)" bootstrap

which is sort of overkill here, or use the method of emacs-21.4-r12.ebuild, i.e. add the following three lines at the end of src_compile():

    einfo "Recompiling patched lisp files..."
    (cd lisp; emake recompile) || die
    emake CC="$(tc-getCC)" || die
Comment 12 Martin von Gagern 2007-08-21 12:08:59 UTC
Created attachment 128764 [details, diff]
Patch to ebuild to incorporate the fix

(In reply to comment #11)
> use the method of emacs-21.4-r12.ebuild,

Works now. I'd never have thought of running make in the top level dir again.
I've attached a patch to the ebuild. A bit different in formulation from what you originally suggested, but essentially the same.
Comment 13 Ulrich Müller gentoo-dev 2007-08-21 12:27:58 UTC
(In reply to comment #12)
> I've attached a patch to the ebuild. A bit different in formulation from
> what you originally suggested, but essentially the same.

The reason not to use option -C was that is a GNU make extension, and the ebuild should also run under FreeBSD (I don't know if -C exists there; it is a non-POSIX extension). Better play it safe, especially if there is a trivial equivalent like (cd; make).
Comment 14 Ulrich Müller gentoo-dev 2007-08-21 12:34:30 UTC
Just wanted to post the patch to bug-gnu-emacs, but you were faster. ;)
Lets wait for upstream's opinion about it.
Comment 15 Ulrich Müller gentoo-dev 2007-08-22 05:00:26 UTC
Created attachment 128827 [details, diff]
emacs-22.1-backup-buffer.patch

For whatever reason, upstream prefers a different approach.

This is taken from the EMACS_22_BASE branch. (I've only included the first hunk since the rest are only comment changes.) Can you test if this works for you?
Comment 16 Martin von Gagern 2007-08-22 19:03:45 UTC
(In reply to comment #15)
> Can you test if this works for you?

Does work for my test case, yes.
As you are aware, there is still some discussion on bug-gnu-emacs, where I posted a comment regarding that fix as well. Might perhaps be too filesystem-specific.
Comment 17 Ulrich Müller gentoo-dev 2007-08-24 07:35:02 UTC
RMS locutus causa finita. ;-)
Fixed in 22.1-r1, using my first patch from attachment #128742 [details, diff].

Martin, thank you for reporting and for the good work.