Gentoo Cluster File System Selection Guide

Author: Brian Jackson
Version: 1.0
Date: 2004-12-28

Clusters benefit from advanced file systems. This guide offers a quick run-down of the available file systems.

Selection of a File System for a Cluster

Introduction

For the purposes of this document, we will assume there are two types of clusters:

  • High Availability (HA) clusters
  • High Performance Computing (HPC) clusters

To select the proper file system, it is important to know both the features each file system offers and the features you actually need.

NFS

Classic network filesystem.

Advantages:

  • Very stable
  • Widely used

Disadvantages:

  • Many SPOFs (single points of failure)

OpenGFS

Multi-master read/write shared storage filesystem.

Advantages:

  • No single master

Disadvantages:

  • Still has a SPOF

Oracle Cluster File System

Shared storage filesystem.

Advantages:

  • Strong commercial backing

Disadvantages:

  • Not useful as a general purpose file system yet

Lustre

Lustre is a novel storage and file system architecture and implementation suitable for very large clusters.

Advantages:

  • Well suited for very large clusters

Disadvantages:

  • Still quite new
  • Not all redundancy features are implemented yet

Coda

Advanced network filesystem with origins in AFS2.

Advantages:

  • Client-side caching (i.e. disconnected operation)
  • Server replication

Disadvantages:

  • Poor quality in practice

InterMezzo

Very similar feature-wise to Coda.

What File System to Choose?

NFS is probably the best choice for most clusters, simply because of its pervasiveness, stability, and relative speed.

For a high availability cluster, you can use a regular single-node file system (ReiserFS, ext3, etc.) that is mounted on only one node at a time.
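
As a rough illustration of the "one node at a time" rule, a failover script might first check whether the shared device is already mounted locally before trying to mount it. The sketch below only parses text in the format of /proc/mounts; the helper names and the sample device paths are hypothetical, and it does not prevent another node from mounting the same device, which needs separate cluster-level coordination.

```python
def mounted_devices(mounts_text):
    """Map each device to its mount point from /proc/mounts-style text.

    Each line is expected to look like:
        device mountpoint fstype options dump pass
    """
    devices = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            devices[fields[0]] = fields[1]
    return devices


def is_mounted_here(device, mounts_text):
    """Return True if the device appears in the given mount table text."""
    return device in mounted_devices(mounts_text)


# Hypothetical sample data; on a real node you would read /proc/mounts.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext3 rw 0 0
/dev/sdb1 /shared reiserfs rw 0 0
"""
```

On a live system, the text could come from `open("/proc/mounts").read()`, and `is_mounted_here("/dev/sdb1", text)` would tell the script whether this node already holds the file system.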

For HPC clusters, Lustre or some mixture of OpenGFS and NFS is a good choice.

For in-between clusters (LVS, etc.), NFS is a good choice for most, but you could also use any of the above, or a mixture, depending on your requirements.
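
The recommendations above can be summarized as a small lookup. This is only a sketch of the guide's rules of thumb, not a substitute for weighing your own requirements; the `ClusterType` names and the `suggest_file_systems` function are made up for illustration.

```python
from enum import Enum


class ClusterType(Enum):
    HA = "high availability"
    HPC = "high performance computing"
    IN_BETWEEN = "in-between (LVS, etc.)"


def suggest_file_systems(cluster_type):
    """Return this guide's suggested starting points for a cluster type."""
    suggestions = {
        # One node mounts the file system; another takes over on failure.
        ClusterType.HA: ["single-node file system (ReiserFS, ext3) mounted on one node at a time"],
        ClusterType.HPC: ["Lustre", "OpenGFS combined with NFS"],
        # NFS is the default; any of the other options may also fit.
        ClusterType.IN_BETWEEN: ["NFS"],
    }
    return suggestions[cluster_type]
```

For example, `suggest_file_systems(ClusterType.HPC)` returns the two HPC options named above, in the order the guide lists them.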

Most people are aware that NFS has no redundancy built in, but there are things you can do to make a single NFS server more highly available: use redundant power supplies, a redundant RAID level, and quality hardware. Traditionally, SCSI drives have been considered better suited to round-the-clock operation.