Bug 230950 - dmraid RAID1: reads go to only one disk
Summary: dmraid RAID1: reads go to only one disk
Status: RESOLVED UPSTREAM
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: AMD64 Linux
Importance: High normal
Assignee: Stefan Schweizer (RETIRED)
Reported: 2008-07-06 16:06 UTC by Timothy Miller
Modified: 2010-06-29 20:10 UTC
Description Timothy Miller 2008-07-06 16:06:15 UTC
I have a system with an Intel ICH9R and two drives configured as a RAID1 array.  I've been doing some reading on this, and my system isn't behaving as it should.  

First, I understand that this is basically just software RAID, so this is relying on the device mapper to identify the underlying devices.  The Linux kernel is responsible for scheduling all accesses to the drives.  Second, Linux RAID1 is designed to distribute reads across drives.  In particular, if the reads are entirely random, it should balance the load across the disks.

I have a program that does an incredible amount of random access reading.  While it's running, I monitor system activity with "iostat -d 2".  What I observe is that while writes go to both drives (of course), reads go ONLY to the first drive.  So something's not set up right.  The Gentoo wiki has a very nice page that explains how to set up with device-mapper and dmraid, and there was no mention of configuration options that would pertain to this.
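
The per-drive comparison described above can also be made directly from the kernel's counters. What follows is only a minimal sketch, assuming two snapshots of /proc/diskstats taken a couple of seconds apart (the fourth field of each line is the number of reads completed). The function names and the member names "sda" and "sdb" are hypothetical and not taken from this report.

```python
# Hypothetical helpers: compare two snapshots of /proc/diskstats text and
# report what fraction of the reads in between went to each device.

def reads_completed(diskstats_text, devices):
    """Return {device: reads completed} parsed from /proc/diskstats text."""
    counts = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Fields: major, minor, device name, reads completed, ...
        if len(fields) >= 4 and fields[2] in devices:
            counts[fields[2]] = int(fields[3])
    return counts


def read_share(before_text, after_text, devices):
    """Return each device's fraction of the reads between two snapshots."""
    before = reads_completed(before_text, devices)
    after = reads_completed(after_text, devices)
    # A device missing from a snapshot is counted as zero reads.
    deltas = {dev: after.get(dev, 0) - before.get(dev, 0) for dev in devices}
    total = sum(deltas.values())
    return {dev: (deltas[dev] / total if total else 0.0) for dev in devices}
```

To use it, read /proc/diskstats once, wait two seconds (the same interval as "iostat -d 2"), read it again, and pass both strings to read_share() with the assumed member names. A result close to {"sda": 1.0, "sdb": 0.0} would correspond to the behaviour reported here.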

Reproducible: Always

Steps to Reproduce:
1.  Do random access disk reads on the dmraid RAID1 array while watching "iostat -d 2"

Actual Results:  
Add reads go to one drive

Expected Results:  
Reads should be balanced across drives

I'm using 2008.0-beta and ~amd64.
Comment 1 Timothy Miller 2008-07-06 16:08:04 UTC
Typo:
I said "Add reads go to one drive"
I meant to say:  "All reads go to one drive"
Comment 2 Jeroen Roovers (RETIRED) gentoo-dev 2008-07-06 20:45:10 UTC
Just for posterity, what does /proc/mdstat tell you?

Oh, and are you sure it isn't a bug in sysstat (iostat)? Which version of app-admin/sysstat do you use? Which kernel do you use?
Comment 3 Timothy Miller 2008-07-06 21:30:15 UTC
# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
unused devices: <none>

That's probably because I'm using fakeraid, not md.

# iostat -V
sysstat version 8.0.4
(C) Sebastien Godard (sysstat <at> orange.fr)

And
app-admin/sysstat-8.0.4-r1  

I posted to LKML on this.  Here's my post:
http://www.ussg.iu.edu/hypermail/linux/kernel/0807.0/2407.html
Other responses follow.

Arjan suggested that since my program was single-threaded, that would explain why all the IO is going to one drive.  So I rewrote my program to read files in four threads.  No change.  All reads are still going to only one disk.
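
For anyone who wants to try a similar multi-threaded test, here is a minimal sketch of a four-thread random reader. It is not the program used in this report. The names random_reads and run_threads, the 4096-byte block size, and the read counts are all assumptions chosen for illustration.

```python
import os
import random
import threading


def random_reads(path, n_reads, block_size=4096, seed=None):
    """Read block_size bytes at n_reads random aligned offsets in a file."""
    rng = random.Random(seed)
    size = os.path.getsize(path)
    n_blocks = max(size // block_size, 1)
    fd = os.open(path, os.O_RDONLY)
    try:
        total = 0
        for _ in range(n_reads):
            offset = rng.randrange(n_blocks) * block_size
            total += len(os.pread(fd, block_size, offset))
        return total
    finally:
        os.close(fd)


def run_threads(path, n_threads=4, n_reads=1000):
    """Run random_reads in n_threads threads; return bytes read per thread."""
    results = [0] * n_threads

    def worker(i):
        # Each thread gets its own seed so the offset sequences differ.
        results[i] = random_reads(path, n_reads, seed=i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Note that reads served from the page cache never reach the disks, so the test file should be much larger than RAM (or the cache dropped first) for the per-drive read counts to mean anything.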
Comment 4 Timothy Miller 2008-07-07 01:38:18 UTC
I have a little more info on this.  I'm not at all experienced with kernel hacking, so I'm kinda stabbing in the dark, but here's what I found.

You would think that since dmraid and mdraid are basically the same thing (the difference is where the disk layout data is stored and that the BIOS allows booting from the array), then they would use almost entirely the same code.

But they don't.

Under "linux/drivers/md", there are two raid1 modules.  "raid1.c" appears to be the mdraid module for RAID1.  "dm-raid1.c" appears to be the dmraid module for RAID1.  Since they both make the same kinds of low-level accesses to the hardware, there doesn't seem to be much reason to have them both.  The md version has a sophisticated read load balancer, while the dm version lacks one entirely.  In fact, the dm version is basically a stripped-down, neglected version of the md driver.
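
To make the difference concrete, here is a toy simulation of three ways a mirror could choose which disk serves a read. It is a sketch for illustration only and is not a transcription of either kernel file. In particular, nearest_head is an assumed example of the kind of heuristic a read balancer might use, and all function names are hypothetical.

```python
import random


def first_mirror(sector, heads, state):
    """Always read from mirror 0 (the behaviour observed in this report)."""
    return 0


def round_robin(sector, heads, state):
    """Alternate between the mirrors on every read."""
    state["next"] = (state.get("next", -1) + 1) % len(heads)
    return state["next"]


def nearest_head(sector, heads, state):
    """Pick the mirror whose last-read position is closest to the sector."""
    return min(range(len(heads)), key=lambda i: abs(heads[i] - sector))


def simulate(policy, sectors, n_mirrors=2):
    """Return how many reads each mirror serves under the given policy."""
    heads = [0] * n_mirrors   # last sector read on each mirror
    counts = [0] * n_mirrors  # reads served by each mirror
    state = {}                # scratch space owned by the policy
    for sector in sectors:
        i = policy(sector, heads, state)
        heads[i] = sector
        counts[i] += 1
    return counts


rng = random.Random(0)
workload = [rng.randrange(1_000_000) for _ in range(1000)]
for policy in (first_mirror, round_robin, nearest_head):
    print(policy.__name__, simulate(policy, workload))
```

With a random workload, first_mirror leaves the second disk idle, round_robin splits the reads evenly, and nearest_head spreads them according to where each disk last read.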

Basically, dmraid users are disadvantaged for no particularly good reason.

I've posted to LKML a bit about this, but I have only gotten one response from Arjan, and it was inaccurate.  If the Gentoo developers (who, like myself, value efficiency) could push on this a bit, that would be fantastic.
Comment 5 Timothy Miller 2008-07-07 20:43:41 UTC
Here's my conclusion to this.

I would recommend keeping this bug open in some way as a nagging reminder to the kernel maintainers, because the fakeraid implementation is actually kinda broken.  But don't expect them to fix it any time soon.  

It's not generally documented well enough that fakeraid is a cheap compatibility hack and does not have the same features as softraid (even though it could).  So I might suggest putting some notes in your documentation on softraid and fakeraid that explain this and urge people to avoid fakeraid unless they NEED it, because its read performance is lousy.

Thanks!
Comment 6 Ian Stakenvicius (RETIRED) gentoo-dev 2009-06-04 17:10:04 UTC
This functionality isn't part of the dmraid package, it's part of the dm-mirror kernel module (or maybe the device-mapper package?).  Wouldn't it make more sense to reassign this as a kernel bug / feature request?

Comment 7 Ian Stakenvicius (RETIRED) gentoo-dev 2009-06-04 18:36:17 UTC
Also, I just checked the code for gentoo-sources-2.6.29-r5, and there is a round-robin implementation for load balancing on reads now.

This bug can be closed.
Comment 8 Ian Stakenvicius (RETIRED) gentoo-dev 2009-06-04 19:25:06 UTC
(In reply to comment #7)
> Also, I just checked the code for gentoo-sources-2.6.29-r5, and there is a
> round-robin implementation for load balancing on reads now.

Errm, never mind.  I misread rotation-on-errors as rotation-for-load-balancing.

Comment 9 Thomas Sachau gentoo-dev 2009-09-16 20:56:13 UTC
Nothing we can do about it, so I will mark it resolved/upstream after talking with Ian.
Comment 10 Timothy Miller 2009-09-17 01:40:44 UTC
There IS something you can do about this:  Mention this issue in the docs so that people don't accidentally choose dmraid when they should be using mdraid.
Comment 11 Ian Stakenvicius (RETIRED) gentoo-dev 2009-09-17 16:39:45 UTC
Makes sense...but since dm-mirror is a kernel module, it's the kernel help that would need updating to mention this.  Still deferring to upstream.

In the meantime:  http://en.gentoo-wiki.com/wiki/RAID/NVRAID_with_dmraid#About_the_Install
Comment 12 Timothy Miller 2009-09-18 00:33:37 UTC
Sweet.  You guys rock!
Comment 13 Sven E. 2010-06-29 20:06:16 UTC
Just wanted to mention:

The same is true for mdraid. I set up an md device as a mirror of two SSDs, and all reads from the md device hit only one of the disks (which one does not seem to be fixed, but for as long as the process runs, all of its reads go to the same disk).
If the device is selected by some round-robin mechanism, it seems to choose one disk and keep it for all subsequent reads (maybe on a per-process basis).
Yet a working md mirror target should by itself distribute the reads properly across the underlying mirror devices, even for a single thread.

This is true for 2.6.33.5, I did not try .34 yet though.


Comment 14 Timothy Miller 2010-06-29 20:10:33 UTC
I don't know what you're looking at, but I've tested mdraid, and it definitely does round-robin.  If you do iotop while reading, you'll see that half the reads go to one disk and half to the other.  And also, I've looked at the kernel source, and there is code in there to do it.

On the other hand, the dmraid code has specific comments stating that it doesn't do round-robin but should.