Bug 406759

Summary: sys-fs/mdadm-3.1.4: silent failure with large volumes and metadata 0.90
Product: Gentoo Linux
Reporter: Patrick Lauer <patrick>
Component: New packages
Assignee: Gentoo's Team for Core System packages <base-system>
Status: CONFIRMED ---
Severity: normal
CC: 1i5t5.duncan, alexander
Priority: Normal
Version: unspecified
Hardware: All
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---

Description Patrick Lauer gentoo-dev 2012-03-03 12:32:56 UTC

# mdadm --version
mdadm - v3.1.4 - 31st August 2010

Sorry for not providing logfiles (once I got things to work I had no time to recreate the situation, and now that hardware is in use), but here are the steps to reproduce:

* take GPT disks with partitions over 2TB
  in this case: 3TB disks x2
* try to create an md with metadata 0.90 (kernel autoassembly mandates it)

Result: No visible error, as if mdadm had completed properly, but /proc/mdstat shows no new device.

Once --metadata=0.90 is removed from the mdadm command, the new raid is created as expected:

md127 : active raid1 sda4[0] sdb4[1]
      2909189439 blocks super 1.2 [2/2] [UU]
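
Since mdadm printed nothing, the only sign of failure here was the missing entry in /proc/mdstat. A minimal sketch of that check as a shell function follows. The function name md_is_active and its optional file argument are illustrative assumptions, not part of mdadm.

```shell
# Hypothetical helper: succeed if an md device has an entry in an
# mdstat-format file (defaults to /proc/mdstat).
md_is_active() {
    name=$1
    file=${2:-/proc/mdstat}
    # Array lines start with "<name> : ", e.g. "md127 : active raid1 ..."
    grep -q "^${name} : " "$file"
}
```

Usage might look like `md_is_active md127 || echo "md127 not in /proc/mdstat" >&2`, run right after the create command.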

Comment 1 SpanKY gentoo-dev 2012-03-03 19:49:06 UTC

and did you test with the latest ~arch version ?

Comment 2 Hank Leininger 2012-04-01 20:29:28 UTC

> and did you test with the latest ~arch version ?

Not the original poster, but I've just experienced the same thing using sys-fs/mdadm-3.2.3-r1.  Also reproduced with the latest mdadm git snapshot (fd324b0), running kernel gentoo-sources 3.1.0-r1.

3TB disk, GPT partition table, attempting to make a raid1 volume (two-disk, but with one of them missing initially; preparing to replace a pile of 500GB disks in my workstation).

This runs without any output:

# mdadm --create -l1 -n2 --metadata=0.90 /dev/md14 /dev/sdj4 missing
#

...But /dev/md14 is not created nor does an entry for it exist in /dev/.mdadm/map.

The same command with --metadata=1.2 does result in the array being created.
Also if I repartition the 3TB device to have two 1.5TB partitions, I can make v0.90 arrays on each of them. 

Autodetect is what I'm after, but from the mdadm man page, autodetect on GPT+v0.90 metadata probably wouldn't work anyway:

"--auto-detect
[snip]
Arrays can be auto-detected by the kernel if all the components are in primary MS-DOS partitions with partition type FD,  and  all  use  v0.90 metadata."

Note 'MS-DOS partitions of type FD', but for such large disks we must use GPT partitions.

But I think mdadm silently exiting with no error message is an (upstream) bug, although it does at least exit with status 1 rather than 0.
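
Because the non-zero exit status is the only signal mdadm gives in this case, a script calling it can at least surface that status. The following is a minimal sketch; run_checked is a hypothetical wrapper name, and the mdadm invocation in the comment is the one from this comment.

```shell
# Hypothetical wrapper: run a command and report a non-zero exit status
# on stderr, so a silent failure does not go unnoticed.
run_checked() {
    status=0
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "command failed with exit status $status: $*" >&2
    fi
    return "$status"
}

# Usage with the command from this report (commented out: needs root and real devices):
# run_checked mdadm --create -l1 -n2 --metadata=0.90 /dev/md14 /dev/sdj4 missing
```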

Comment 3 Duncan 2012-04-30 17:48:14 UTC

(In reply to comment #2)
> 3TB disk, GPT partition table, attempting to make a raid1 volume
> (two-disk, but with one of them missing initially; preparing to replace a
> pile of 500GB disks in my workstation).
> 
> This runs without any output:
> 
> # mdadm --create -l1 -n2 --metadata=0.90 /dev/md14 /dev/sdj4 missing
> #
> 
> ...But /dev/md14 is not created nor does an entry for it exist in
> /dev/.mdadm/map.
> 
> The same command with --metadata=1.2 does result in the array being created.
> Also if I repartition the 3TB device to have two 1.5TB partitions, I can
> make v0.90 arrays on each of them. 
> 
> Autodetect is what I'm after, but from the mdadm man page, autodetect on
> GPT+v0.90 metadata probably wouldn't work anyway:

FWIW, I believe I read somewhere, probably on the btrfs list (I'm reading it these days but not using btrfs yet, since it's still missing the N-way raid1 I need, though that feature is planned, and it is otherwise still immature), that md/raid v0.90 metadata does not handle >2TB, and likely never will.  Of course mdadm should spit out an error to that effect instead of staying silent, but perhaps that will be fixed with an update.

Meanwhile, I'm using v0.90 metadata here as well, but split my mds up on partitions, so don't have anything that big (by far).  I'm on v0.90 since root is on md and I don't want to fool with an initr*, but I don't actually do autodetect, because I only want the kernel to initially assemble the md containing root from the commandline, not all of them.  So I have autodetect turned off, and use md=3,/dev/sda6,/dev/sdb6,/dev/sdc6,/dev/sdd6 root=/dev/md3p1 on the (compiled-in) kernel-commandline instead.

What I'm not sure of is whether that md= on the kernel commandline will also work with v1.x metadata, since I'm then not autodetecting but rather assembling from the given commandline option.  On the one hand, I see everything saying v0.90 is necessary for autodetect, and think that since I'm not using that, v1.x would be fine.  On the other, I see everything saying direct kernel assembly needs v0.90, and wonder whether that assumes auto-detect or applies even with devices specified.

I'd love to have someone confirm one way or the other on that.  I likely will myself whenever I reconfigure my raid, but for now, I don't know.  It may well be that supplying the devices to assemble on the commandline is all that's needed for v1.x and that it's only autodetect that's missing, but I simply don't know at this point.

Comment 4 Steven Peckins 2013-02-26 20:49:35 UTC

This is in the mdadm(8) man page for 3.1.4 (under --metadata).

"...the original 0.90 format superblock... limits arrays to 28 component devices and limits component devices of levels 1 and greater to 2 terabytes."

(But if it's going to fail, especially with a non-zero exit status, it probably should print a message, too.)
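
To check the numbers from this bug against that limit: /proc/mdstat reports sizes in 1 KiB blocks, and the array in the description has 2909189439 of them. The sketch below assumes "2 terabytes" means 2 TiB (2147483648 KiB) and that a size exactly at the limit is allowed; both are assumptions, and fits_v090 is a hypothetical helper name.

```shell
# Assumed limit: "2 terabytes" taken as 2 TiB, expressed in 1 KiB blocks.
V090_LIMIT_KIB=2147483648   # 2 * 1024 * 1024 * 1024

# Hypothetical helper: succeed if a component of the given size (in KiB)
# is within the assumed v0.90 limit.
fits_v090() {
    [ "$1" -le "$V090_LIMIT_KIB" ]
}

# Array size from the description; the component partitions are at least this large.
if fits_v090 2909189439; then
    echo "within the 0.90 limit"
else
    echo "too large for 0.90 metadata"
fi
```

Under these assumptions the 3TB partitions are over the limit, while the 1.5TB partitions from comment 2 (roughly 1464843750 KiB) are under it, which matches what was observed.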

It seems like people are using the 0.90 superblocks because they want kernel autoassembly.  Presumably this is because they want root-on-RAID.  FYI, this is possible without autoassembly and without an initramfs if you supply the right kernel parameters (and md is not compiled as a module):

E.g., kernel /linux-2.6.39 root=/dev/md127 md=127,/dev/sda,/dev/sdb,/dev/sdc raid=noautodetect

This will work with the newer 1.* superblock formats.  See /usr/src/linux/Documentation/md.txt for more details.
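
For anyone generating boot entries from a script, the md= parameter above is just the md minor number followed by a comma-separated device list. A minimal sketch that builds it is shown here; md_param is a hypothetical helper, and the device names are the ones from the example line.

```shell
# Hypothetical helper: build an "md=" kernel parameter from an md minor
# number and a list of component devices.
md_param() {
    minor=$1
    shift
    param="md=$minor"
    for dev in "$@"; do
        param="$param,$dev"
    done
    echo "$param"
}

md_param 127 /dev/sda /dev/sdb /dev/sdc
# prints: md=127,/dev/sda,/dev/sdb,/dev/sdc
```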