Bug 72968 - Problems with writing to samba shares on kernel 2.6.9-r6
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 All
Importance: High major
Assignee: Daniel Drake (RETIRED)
URL:
Whiteboard: 2.6.10-2005.0
Keywords: InVCS
Duplicates: 66318 80023
Depends on:
Blocks:
 
Reported: 2004-11-30 14:21 UTC by Maciej J. Woloszyk
Modified: 2005-07-28 04:40 UTC
CC List: 9 users

See Also:
Package list:
Runtime testing required: ---


Description Maciej J. Woloszyk 2004-11-30 14:21:03 UTC
I've upgraded my kernel to 2.6.9-r6 (from 2.6.9-r1) on both my workstation and my server. At the same time I upgraded samba from 3.0.7-r1 to 3.0.8. After that I ran into problems writing to samba shares (both server->ws and ws->server). When I try to write a file, the writing program hangs for quite some time (about 15 seconds) and then either completes the operation correctly or issues some I/O warnings. At first I thought something was wrong with the new samba version, so I downgraded samba on one of the computers. Unfortunately that did not resolve the problem. When I looked into the kernel log, I found that when the write starts the kernel issues the following:
smb_trans2: invalid data, disp=0, cnt=0, tot=0, ofs=0
After that it waits and issues the standard smb timeout:
smb_add_request: request [f082ae80, mid=4] timed out!

After this I restarted one of the machines with 2.6.9-r1 and the problem disappeared. So now I've downgraded both of them and it seems to work fine.

M.
Comment 1 Michael Glauche (RETIRED) gentoo-dev 2004-12-02 00:38:51 UTC
Did you try "use sendfile = no", as advertised in the ebuild?
Comment 2 Maciej J. Woloszyk 2004-12-02 01:42:14 UTC
Sure I did. That was the issue with the previous version of samba (after the
upgrade to 3.0.7). Now it's something different, and IMO kernel related. I found
info that some security issues might cause this effect, and when I looked in the
kernel sources, the patch I found here: http://marc.theaimsgroup.com/?l=linux-kernel&m=110117804203865&w=2
somehow differs from what is in there. But of course I could be terribly
wrong.

M.
Comment 3 Michael Glauche (RETIRED) gentoo-dev 2004-12-02 06:51:29 UTC
There has been a lot of trouble with smbfs and 2.6.x kernels lately :(

As a workaround for now, you can use the cifs filesystem, which basically does
the same job (and is usually more stable).

Comment 4 Maciej J. Woloszyk 2004-12-02 06:58:18 UTC
Well... For the moment I don't really need a workaround. I can wait a few more
days with the 2.6.9-r1 kernel instead of -r6. But anyway, thanks for the hint. I'll
take a look at CIFS.
Comment 5 Vitaly Harisov 2004-12-08 05:00:07 UTC
I have the same problem with 2.6.9-r6 and 2.6.9-r9, but all works fine with 2.6.9-r4.
Comment 6 Ulrich Plate (RETIRED) gentoo-dev 2004-12-15 03:05:41 UTC
Just a quick "me, too" here. I'm getting the exact same errors and messages as Maciej with a 2.6.9-gentoo-r9, no problems (at least not this one...) with gentoo-dev-sources 2.6.9-gentoo-r5.
Comment 7 Ulrich Plate (RETIRED) gentoo-dev 2004-12-18 10:30:56 UTC
Upgraded to 2.6.9-gentoo-r10 yesterday and samba-3.0.10 today; the error persists. I'm also getting those I/O errors when I save documents from OOo to a samba share, for example, but regardless of those errors, the document is in fact 100 percent correct and accessible after saving. Go figure.

dmesg shows:

smb_trans2: invalid data, disp=0, cnt=0, tot=0, ofs=0
smb_add_request: request [cbbb7080, mid=70] timed out!

I'm reverting to 2.6.9-gentoo-r5 until someone figures out what's going on...
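
For anyone scanning dmesg output for this symptom, here is a minimal sketch in user-space C that recognises the first of the two messages quoted above and pulls out its four numbers. The struct and function names are made up for illustration, and the reading of the fields (presumably displacement, count, total and offset) is an assumption based on their abbreviations, not something confirmed in this report.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical container for the four numbers in the
 * "smb_trans2: invalid data" message; field meanings are assumed. */
struct trans2_bad_data {
    int disp; /* assumed: data displacement */
    int cnt;  /* assumed: data count */
    int tot;  /* assumed: total data count */
    int ofs;  /* assumed: data offset */
};

/* Returns true only if `line` has the shape of the message quoted in
 * this report and all four numbers could be read. */
static bool parse_trans2_bad_data(const char *line, struct trans2_bad_data *out)
{
    return sscanf(line,
                  "smb_trans2: invalid data, disp=%d, cnt=%d, tot=%d, ofs=%d",
                  &out->disp, &out->cnt, &out->tot, &out->ofs) == 4;
}

/* True when every field is zero, as in the lines reported here. */
static bool is_all_zero(const struct trans2_bad_data *d)
{
    return d->disp == 0 && d->cnt == 0 && d->tot == 0 && d->ofs == 0;
}
```

Every instance of the message quoted in this report has all four values at zero, which is the case `is_all_zero` picks out.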
Comment 8 Frank Meier 2004-12-26 09:39:43 UTC
I have the same problem here. It also remains with gentoo-dev-sources-2.6.10-r1, but with development-sources-2.6.10 everything works fine. I guess there's something wrong with the smbfs patch in gentoo-dev-sources (since 2.6.9-r6; ChangeLog: "Add smbfs security fix, bug 65877"). Perhaps the bug description should be changed to "...on kernel >=gentoo-dev-sources-2.6.9-r6".

I tested this behavior on three different machines (one with an amd64 kernel). Perhaps there's something misconfigured on the server's side. It's strange that it works perfectly with other kernels on the client side.

mfg Franky
Comment 9 Ulrich Plate (RETIRED) gentoo-dev 2005-01-03 02:18:50 UTC
Same error still there with gentoo-dev-sources 2.6.10-r2. Since we all seem to think this is a kernel issue that has nothing to do with the Samba package, could this bug be reassigned to the right people?
Comment 10 Ulrich Plate (RETIRED) gentoo-dev 2005-01-10 06:13:54 UTC
The error persists with 2.6.10-gentoo-r3 and -r4. Copying a file to samba shares works error-free, while moving takes several seconds and triggers the error messages described above.
Comment 11 Daniel Drake (RETIRED) gentoo-dev 2005-01-12 09:30:44 UTC
Can someone please try 2.6.11-rc1?
Comment 12 Daniel Drake (RETIRED) gentoo-dev 2005-01-12 10:37:09 UTC
I have done some googling and it looks like adding

use sendfile = no
large readwrite = no
max xmit = 16644

to your smb.conf will work around this kernel problem. I now really need someone to test 2.6.11-rc1 without the above modification to see if the kernel bug is fixed there. If it isn't, could someone open a bug at http://bugzilla.kernel.org so that this does not get forgotten? Thanks.
Comment 13 Ulrich Plate (RETIRED) gentoo-dev 2005-01-12 11:45:16 UTC
Sorry, I'm on dial-up, I'll test it within the hour.

Now, for the smb.conf settings: I was confusing things for some time and thought samba had something to do with it. But I believe it makes no difference to this bug whether I even have samba installed or not, except for smbmount, right? I don't run samba as such on any of the machines affected by this bug; none of them has it in any of its runlevels. I'm not talking about the server configuration at all. I'm only concerned about the smbfs clients, and I've got two of those behaving identically when they write to samba shares on a remote server. There's no way I could influence the server settings even if I wanted to, since it's a corporate machine running FreeBSD 4.6.2-RELEASE with samba 2.2.7a. Although I can read the relevant files, I don't have admin rights. Nobody has made any significant changes to the smb.conf on that thing in a year or so; the last insignificant change was in September 2004.

Downloading the patches for 2.6.10 --> 2.6.11-rc1 as we speak, I'll be back in a bit.
Comment 14 Daniel Drake (RETIRED) gentoo-dev 2005-01-12 12:15:25 UTC
I got in contact with a samba developer who informed me that samba 3.0.10 fixes this on the server side (by disabling sendfile support by default).
Samba 3.0.11 will have a working server-side sendfile implementation, and the kernel interface might be improved sometime too (but that is not vital).

So, it seems the fix for this is to edit the samba-server config and add the lines referenced above. Or update the server to the latest version (3.0.10).
Comment 15 Daniel Drake (RETIRED) gentoo-dev 2005-01-12 12:39:12 UTC
Argh, I'm fairly sure that the security "fix" from this bug is the cause..
http://bugs.gentoo.org/65877

I will look into getting it removed
Comment 16 Ulrich Plate (RETIRED) gentoo-dev 2005-01-12 13:14:16 UTC
If the absence of error messages when moving files to the same samba shares while using a freshly compiled vanilla 2.6.11-rc1 is any indication of your being right, then I guess you're right... :)

All the actions that did have problems - moving files to, or downloading to a smbmount'ed directory, or saving files from applications like OOo to a samba share  - now work without any error messages whatsoever.
Comment 17 Thierry Carrez (RETIRED) gentoo-dev 2005-01-13 04:04:29 UTC
It appears that the original smbfs patch breaks things.

Here is the Chuck Ebbert fix, used in -ac and Red Hat products:
http://lwn.net/Articles/112514/?format=printable

Here is another proposed patch from Marcus Meissner at SuSE:
===========================================================================
diff -u linux-2.6.8/fs/smbfs/request.c linux-2.6.8/fs/smbfs/request.c
--- linux-2.6.8/fs/smbfs/request.c	2004-11-10 12:27:58.000000000 +0100
+++ linux-2.6.8/fs/smbfs/request.c	2004-11-10 12:27:58.000000000 +0100
@@ -588,12 +588,5 @@
 	data_count  = WVAL(inbuf, smb_drcnt);
 
-	/* Modify offset for the split header/buffer we use */
-	if (data_offset < hdrlen)
-		goto out_bad_data;
-	if (parm_offset < hdrlen)
-		goto out_bad_parm;
-	data_offset -= hdrlen;
-	parm_offset -= hdrlen;
 
 	if (parm_count == parm_tot && data_count == data_tot) {
 		/*
@@ -603,21 +596,52 @@
 		 * case. It may be a server error to not return a
 		 * response that fits.
 		 */
+		/* _count = 0 is a special case, where data_offset is
+		 * not used.
+		 */
+		if (data_count != 0) {
+			if (data_offset < hdrlen)
+				goto out_bad_data;
+			/* Modify offset for the split header/buffer we use */
+			data_offset -= hdrlen;
+			if (data_offset + data_count > req->rq_rlen)
+				goto out_bad_data;
+			req->rq_ldata = data_count;
+			req->rq_data = req->rq_buffer + data_offset;
+		} else {
+			req->rq_data  = NULL;
+			req->rq_ldata = 0;
+		}
+
+		if (parm_count != 0) {
+			if (parm_offset < hdrlen)
+				goto out_bad_parm;
+			/* Modify offset for the split header/buffer we use */
+			parm_offset  -= hdrlen;
+			if (parm_offset + parm_count > req->rq_rlen)
+				goto out_bad_parm;
+			req->rq_lparm = parm_count;
+			req->rq_parm  = req->rq_buffer + parm_offset;
+		} else {
+			req->rq_lparm = 0;
+			req->rq_parm  = NULL;
+		}
+
 		VERBOSE("single trans2 response  "
 			"dcnt=%d, pcnt=%d, doff=%d, poff=%d\n",
 			data_count, parm_count,
 			data_offset, parm_offset);
-		req->rq_ldata = data_count;
-		req->rq_lparm = parm_count;
-		req->rq_data = req->rq_buffer + data_offset;
-		req->rq_parm = req->rq_buffer + parm_offset;
-		if (parm_offset + parm_count > req->rq_rlen)
-			goto out_bad_parm;
-		if (data_offset + data_count > req->rq_rlen)
-			goto out_bad_data;
 		return 0;
 	}
 
+	if (data_offset < hdrlen)
+		goto out_bad_data;
+	if (parm_offset < hdrlen)
+		goto out_bad_parm;
+	parm_offset -= hdrlen;
+	data_offset -= hdrlen;
+
+
 	VERBOSE("multi trans2 response  "
 		"frag=%d, dcnt=%d, pcnt=%d, doff=%d, poff=%d\n",
 		req->rq_fragment,
============================================================================

The Chuck Ebbert one is probably better for us.
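
To make the effect of the SuSE diff above easier to see, here is a simplified user-space sketch in C of the two validation strategies for a single trans2 response. It is not kernel code: the struct, the function names and the sample header length are made up for illustration, and it only models the bounds checks visible in the diff. The first function mirrors the removed lines, where the offset is compared against hdrlen unconditionally. The second mirrors the added lines, where a section with a count of zero is accepted without looking at its offset.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of one section (data or parameters)
 * of a trans2 response; not the kernel's actual structure. */
struct trans2_section {
    unsigned int offset; /* offset as sent by the server, from packet start */
    unsigned int count;  /* number of bytes in this section */
};

/* Models the removed lines of the diff: the offset is checked against
 * the header length even when the section is empty. */
static bool section_ok_unconditional(struct trans2_section s,
                                     unsigned int hdrlen,
                                     unsigned int rlen)
{
    if (s.offset < hdrlen)
        return false; /* corresponds to "goto out_bad_data" / "out_bad_parm" */
    /* offset - hdrlen + count must fit in the receive buffer */
    return s.count <= rlen && (s.offset - hdrlen) <= rlen - s.count;
}

/* Models the added lines of the diff: count == 0 is a special case
 * in which the offset is not used, so it is not validated. */
static bool section_ok_patched(struct trans2_section s,
                               unsigned int hdrlen,
                               unsigned int rlen)
{
    if (s.count == 0)
        return true; /* the patch sets the pointer to NULL and the length to 0 */
    if (s.offset < hdrlen)
        return false;
    return s.count <= rlen && (s.offset - hdrlen) <= rlen - s.count;
}
```

With a response whose data section has offset 0 and count 0, which matches the "cnt=0 ... ofs=0" values in the kernel messages reported here, the unconditional check rejects the response while the patched check accepts it. Non-empty sections are validated the same way by both functions.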
Comment 18 Daniel Drake (RETIRED) gentoo-dev 2005-01-13 10:19:12 UTC
Thanks Koon! We'll try that patch as a replacement in the next release.
Comment 19 Daniel Drake (RETIRED) gentoo-dev 2005-01-15 15:53:15 UTC
Fixed in gentoo-dev-sources-2.6.10-r5. Thanks for reporting.
Comment 20 MikeM 2005-01-30 07:32:40 UTC
*** Bug 80023 has been marked as a duplicate of this bug. ***
Comment 21 Seemant Kulleen (RETIRED) gentoo-dev 2005-07-28 04:40:42 UTC
*** Bug 66318 has been marked as a duplicate of this bug. ***