Bradley Ward Allen <ulmo@Q.Net> wrote:
Ideas include: (So far, all I've done is suggest getting rid of the tools and putting them into the kernel; that's how I feel about it, this is a filesystem, not a toy.)
- Boot-up parameters to tell the kernel which devices are to be MD devices (no more ``mdadd'')
- Making MD transparent to ``mount''/``umount'' such that there is no ``mdrun'' and ``mdstop''
- Integrating ``ckraid'' entirely into the kernel, and letting it run as needed
- Deal with arrays that can easily survive N disks going out simultaneously or at separate moments, where N is a whole number > 0 settable by the administrator
- Handle kernel freezes, power outages, and other abrupt shutdowns better
- Don't disable a whole disk if only parts of it have failed; e.g., if sector errors affect fewer than 50% of accesses across 20 dissimilar requests, carry on and simply ignore those sectors of that particular disk.
- Bad sectors:
- A mechanism for saving which sectors are bad, someplace onto the disk.
- If there is a generalized mechanism for marking degraded bad blocks that upper filesystem levels can recognize, use that. Program it if not.
- Perhaps alternatively a mechanism for telling the upper layer that the size of the disk got smaller, even arranging for the upper layer to move stuff out of the areas being eliminated. This would help with degraded blocks as well.
- Failing the above ideas, keeping a small (admin-settable) amount of space aside for bad blocks (distributed evenly across the disk?), and using them (nearby if possible) in place of the bad blocks when failures do happen. Of course, this is inefficient. Furthermore, each time the RAID array starts, the kernel ought to log every bad sector and what is being done about it with a ``crit'' level warning, just to get the administrator to realize that his disk has a piece of dust burrowing into it (or a head with platter sickness).
- Software-switchable disks:
- ``disable this disk''
would block until kernel has completed making sure there is no data on the disk being shut down that is needed (e.g., to complete an XOR/ECC/other error correction), then release the disk from use (so it could be removed, etc.);
- ``enable this disk''
would mkraid a new disk if appropriate and then start using it for ECC/whatever operations, enlarging the RAID5 array as it goes;
- ``resize array''
would respecify the total number of disks and the number of redundant disks, which would often change the size of the array; where no data loss would result, doing this as needed would be nice, but I have a hard time figuring out how it would do that; in any case, a mode where it would block (possibly for hours; the kernel ought to log something every ten seconds if so) would be necessary;
- ``enable this disk while saving data''
which would save the data on a disk as-is and move it to the RAID5 system as needed, so that a horrific save and restore would not have to happen every time someone brings up a RAID5 system (instead, it may be simpler to only save one partition instead of two, it might fit onto the first as a gzip'd file even); finally,
- ``re-enable disk''
would be an operator's hint to the OS to try out a previously failed disk (it would simply call disable then enable, I suppose).
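The "don't disable a whole disk" idea in the list above can be sketched in a few lines. This is only an illustration of one possible reading of the rule, not anything the MD driver does: the class name DiskHealth, its methods, and the interpretation of "dissimilar requests" as "requests to distinct sectors" are all assumptions made for the example. The 20-request window and the 50% threshold come from the text.

```python
from collections import OrderedDict


class DiskHealth:
    """Hypothetical per-disk error tracker (not a real MD interface)."""

    def __init__(self, window=20, fail_ratio=0.5):
        self.window = window          # number of dissimilar requests to consider
        self.fail_ratio = fail_ratio  # disable the disk at or above this ratio
        self.recent = OrderedDict()   # sector -> last outcome (True means OK)
        self.bad_sectors = set()      # sectors to ignore while the disk stays in use

    def record(self, sector, ok):
        # Assumption: "dissimilar" means a different sector, so repeated
        # retries of one bad sector occupy a single slot in the window.
        self.recent[sector] = ok
        self.recent.move_to_end(sector)
        if len(self.recent) > self.window:
            self.recent.popitem(last=False)  # forget the oldest sector
        if not ok:
            self.bad_sectors.add(sector)

    def should_disable(self):
        # Decide only once a full window of dissimilar requests was seen.
        if len(self.recent) < self.window:
            return False
        failures = sum(1 for ok in self.recent.values() if not ok)
        return failures / self.window >= self.fail_ratio


if __name__ == "__main__":
    disk = DiskHealth()
    for sector in range(20):
        disk.record(sector, ok=(sector >= 3))  # three bad sectors out of 20
    print(disk.should_disable(), sorted(disk.bad_sectors))
```

With three failures out of 20 the disk stays in service and only the three sectors are recorded as bad; whether those sectors are then remapped, reported upward, or merely logged is left to the bad-sector items in the list.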
Other ideas off the net:
- finalrd analog to initrd, to simplify root raid.
- a read-only raid mode, to simplify the above
- Mark the RAID set as clean whenever there are no "half writes" done. -- That is, whenever there are no write transactions that were committed on one disk but still unfinished on another disk. Add a "write inactivity" timeout (to avoid frequent seeks to the RAID superblock when the RAID set is relatively busy).
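The last idea (mark the set clean when no half writes are outstanding, but only after a period of write inactivity) can be sketched as a small state machine. The names CleanTracker, write_started, write_finished and tick are hypothetical, as is the five-second default; time is passed in explicitly so the example is deterministic. A write counts as finished here only once it has been committed on every member disk.

```python
class CleanTracker:
    """Hypothetical clean/dirty bookkeeping for a RAID set (illustration only)."""

    def __init__(self, idle_timeout=5.0):
        self.idle_timeout = idle_timeout  # assumed write-inactivity timeout, seconds
        self.in_flight = 0                # writes not yet committed on all disks
        self.last_write_done = None       # time the most recent write completed
        self.superblock_clean = True
        self.superblock_writes = 0        # superblock updates (the seeks to avoid)

    def _set_clean(self, clean):
        # Touch the superblock only when the flag actually changes.
        if self.superblock_clean != clean:
            self.superblock_clean = clean
            self.superblock_writes += 1

    def write_started(self):
        self.in_flight += 1
        self._set_clean(False)            # must be marked dirty before data is written

    def write_finished(self, now):
        self.in_flight -= 1
        self.last_write_done = now

    def tick(self, now):
        # Called periodically: mark clean only when nothing is half written
        # and the set has been idle for at least idle_timeout.
        idle = (self.last_write_done is not None
                and now - self.last_write_done >= self.idle_timeout)
        if self.in_flight == 0 and idle:
            self._set_clean(True)


if __name__ == "__main__":
    t = CleanTracker()
    t.write_started()
    t.write_finished(now=1.0)
    t.tick(now=2.0)   # too soon, stays dirty
    t.tick(now=6.0)   # idle for 5 seconds, marked clean
    print(t.superblock_clean, t.superblock_writes)
```

The timeout is what keeps a busy set from rewriting the superblock after every burst of writes: the dirty flag is set once when writing starts and cleared once after the set has gone quiet.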