Always use pure vanilla kernel sources from http://www.kernel.org/ to compile an openMosix kernel! Please be kind enough to download the kernel from a mirror near you, and try to download patches against the kernel sources you already have instead of downloading the whole thing again. This is going to be much appreciated by the Linux community and will greatly increase your geeky Karma ;-) Be sure to use the right openMosix patch for your kernel version. At the moment of this writing, the latest 2.4 kernel is 2.4.20, so you should download the openMosix-2.4.20-x.gz patch, where the "x" stands for the patch revision (i.e. the greater the revision number, the more recent it is). Do not use the kernel that comes with a Linux distribution: it won't work. These kernel sources get heavily patched by the distribution makers, so applying the openMosix patch to such a kernel is bound to fail. Been there, done that: trust me ;-)
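For example, if you already have the 2.4.19 vanilla sources, you can bring them up to 2.4.20 with the official incremental patch instead of fetching the whole tarball again. A minimal sketch (use a kernel.org mirror near you for the URL):

cd /usr/src/linux-2.4.19
wget http://www.kernel.org/pub/linux/kernel/v2.4/patch-2.4.20.gz
zcat patch-2.4.20.gz | patch -p1
# the tree now contains the 2.4.20 sources; rename it to match
cd .. && mv linux-2.4.19 linux-2.4.20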
Download the latest version of the openMosix patch and move it into your kernel-source directory (e.g. /usr/src/linux-2.4.20). If your kernel-source directory is named anything other than "/usr/src/linux-[version_number]", you should at least create a symbolic link named "/usr/src/linux-[version_number]" pointing to it. Supposing you're the root user and you've downloaded the gzipped patch file to your home directory, apply the patch using (guess what?) the patch utility:
mv /root/openMosix-2.4.20-2.gz /usr/src/linux-2.4.20
cd /usr/src/linux-2.4.20
zcat openMosix-2.4.20-2.gz | patch -Np1

or, equivalently:

mv /root/openMosix-2.4.20-2.gz /usr/src/linux-2.4.20
cd /usr/src/linux-2.4.20
gunzip openMosix-2.4.20-2.gz
cat openMosix-2.4.20-2 | patch -Np1

or:

mv /root/openMosix-2.4.20-2.gz /usr/src/linux-2.4.20
cd /usr/src/linux-2.4.20
gunzip openMosix-2.4.20-2.gz
patch -Np1 < openMosix-2.4.20-2
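Whichever variant you use, it is worth checking that no hunks were rejected before you go on (this check is just a suggestion, not part of the original procedure):

find . -name '*.rej'
# any file listed here means a hunk failed to apply --
# usually a sign that the sources are not vanilla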
Enable the openMosix-related options in the kernel configuration file, e.g.:

...
CONFIG_MOSIX=y
# CONFIG_MOSIX_TOPOLOGY is not set
CONFIG_MOSIX_UDB=y
# CONFIG_MOSIX_DEBUG is not set
# CONFIG_MOSIX_CHEAT_MIGSELF is not set
CONFIG_MOSIX_WEEEEEEEEE=y
CONFIG_MOSIX_DIAG=y
CONFIG_MOSIX_SECUREPORTS=y
CONFIG_MOSIX_DISCLOSURE=3
CONFIG_QKERNEL_EXT=y
CONFIG_MOSIX_DFSA=y
CONFIG_MOSIX_FS=y
CONFIG_MOSIX_PIPE_EXCEPTIONS=y
CONFIG_QOS_JID=y
...
It is much easier, however, to set these options using one of the usual kernel configuration tools:

make config | menuconfig | xconfig
Now compile it with:
make dep bzImage modules modules_install
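Before rebooting, the freshly compiled kernel has to be installed and the boot loader updated; the exact steps depend on your setup. A minimal sketch for an x86 box using LILO (the image names are just examples, adjust them to taste):

cp arch/i386/boot/bzImage /boot/vmlinuz-2.4.20-openmosix
cp System.map /boot/System.map-2.4.20-openmosix
# add an image= section for /boot/vmlinuz-2.4.20-openmosix to /etc/lilo.conf, then:
lilo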
Reboot and your openMosix-cluster-node is up!
Before starting openMosix, there has to be an /etc/openmosix.map configuration file which must be the same on each node.
The standard location is now /etc/openmosix.map; /etc/mosix.map and /etc/hpc.map are older standards, but the CVS version of the tools is backwards-compatible and looks for /etc/openmosix.map, /etc/mosix.map and /etc/hpc.map (in that order).
The openmosix.map file contains three space-separated fields:

openMosix-Node_ID   IP-Address (or hostname)   Range-size
An example openmosix.map could look like this:

1 node1 1
2 node2 1
3 node3 1
4 node4 1

or, using IP addresses instead of hostnames:

1 192.168.1.1 1
2 192.168.1.2 1
3 192.168.1.3 1
4 192.168.1.4 1

or, since the four addresses are consecutive, with the help of the range-size option:

1 192.168.1.1 4

openMosix "counts up" the last byte of the node's IP address according to its openMosix-Node_ID.
If a node has more than one network interface, it can be configured with the ALIAS option in the range-size field (which is equivalent to setting the range-size to 0), e.g.:
1 192.168.1.1 1
2 192.168.1.2 1
3 192.168.1.3 1
4 192.168.1.4 1
4 192.168.10.10 ALIAS

Here the node with openMosix-Node_ID 4 has two network interfaces (192.168.1.4 and 192.168.10.10) but is recognized by openMosix as a single node.
Always be sure to run the same openMosix version AND configuration on each of your cluster's nodes!
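A quick way to keep the map file identical everywhere is to push it out from one machine; a sketch, assuming root ssh access and the hostnames from the example above:

for n in node2 node3 node4; do
    scp /etc/openmosix.map root@$n:/etc/openmosix.map
done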
Start openMosix with the "setpe" utility on each node:
setpe -w -f /etc/openmosix.map
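To verify what configuration a node is actually using, setpe can read it back (check the setpe man page of your userspace-tools version; -r and -off are the flags documented at the time of writing):

setpe -r     # print the map this node is currently using
setpe -off   # take the node out of the cluster again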
Alternatively, you can grab the "openmosix" script which can be found in the scripts directory of the userspace-tools, copy it to the /etc/init.d directory, chmod 0755 it, then use the following commands as root:
/etc/init.d/openmosix stop
/etc/init.d/openmosix start
/etc/init.d/openmosix restart
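To have openMosix come up at boot time, register the script with your init system; the exact command is distribution-dependent, e.g.:

chkconfig --add openmosix        # Red Hat style
update-rc.d openmosix defaults   # Debian style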
Installation is finished now: the cluster is up and running :)
First of all, the CONFIG_MOSIX_FS option in the kernel configuration has to be enabled. If the current kernel was compiled without this option, then recompilation with this option enabled is required.
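A quick way to check (assuming the source tree your running kernel was built from is still around) is to grep its configuration:

grep CONFIG_MOSIX_FS /usr/src/linux-2.4.20/.config
# should print "CONFIG_MOSIX_FS=y"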
Also, the UIDs (User IDs) and GIDs (Group IDs) on each of the cluster nodes' file-systems must be the same. You might want to accomplish this using OpenLDAP. The CONFIG_MOSIX_DFSA option in the kernel is optional, but of course required if DFSA should be used. To mount oMFS on the cluster, there has to be an additional entry in each node's /etc/fstab.
In order to have DFSA enabled:

mfs_mnt /mfs mfs dfsa=1 0 0

In order to have DFSA disabled:

mfs_mnt /mfs mfs dfsa=0 0 0

The general syntax of this fstab entry is:

[device_name] [mount_point] mfs defaults 0 0
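Once the entry is in place, create the mount point and mount it by hand the first time (afterwards it is mounted automatically at boot, like any other fstab entry):

mkdir -p /mfs
mount /mfs    # picks up the options from /etc/fstab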
With the help of some symbolic links, all cluster nodes can access the same data, e.g. /work on node1:
on node2 : ln -s /mfs/1/work /work
on node3 : ln -s /mfs/1/work /work
on node4 : ln -s /mfs/1/work /work
...
The following special files are excluded from the oMFS:
the /proc directory
special files which are not regular-files, directories or symbolic links (e.g. /dev/hda1)
Creating links like:

ln -s /mfs/1/mfs/1/usr

or

ln -s /mfs/1/mfs/3/usr

is invalid.
The following system calls are supported without sending the migrated process (which executes the call on its remote node) back to its home node:
read, readv, write, writev, readahead, lseek, llseek, open, creat, close, dup, dup2, fcntl/fcntl64, getdents, getdents64, old_readdir, fsync, fdatasync, chdir, fchdir, getcwd, stat, stat64, newstat, lstat, lstat64, newlstat, fstat, fstat64, newfstat, access, truncate, truncate64, ftruncate, ftruncate64, chmod, chown, chown16, lchown, lchown16, fchmod, fchown, fchown16, utime, utimes, symlink, readlink, mkdir, rmdir, link, unlink, rename
Here are situations in which system calls on DFSA-mounted file-systems may not work:
different mfs/dfsa configuration on the cluster-nodes
dup2 if the second file-pointer is non-DFSA
chdir/fchdir if the parent dir is non-DFSA
pathnames that leave the DFSA-filesystem
when the process which executes the system-call is being traced
if there are pending requests for the process which executes the system-call
Next to the /mfs/1/, /mfs/2/ and so on directories, you will find some other directories as well.
Table 4-1. Other Directories
/mfs/here | The current node where your process runs
/mfs/home | Your home node
/mfs/magic | The current node when used by the "creat" system call (or an "open" with the "O_CREAT" option) - otherwise, the last node on which an oMFS magic file was successfully created (this is very useful for creating temporary files, then immediately unlinking them)
/mfs/lastexec | The node on which the process last issued a successful "execve" system call
/mfs/selected | The node you selected, either by your process itself or by one of its ancestors (before forking this process), writing a number into "/proc/self/selected"
Note that these magic files are all "per process"; that is, their content depends upon which process opens them.
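As a small illustration of this "per process" behaviour, combined with the /proc/self/selected mechanism from the table above (node 3 is an arbitrary example):

echo 3 > /proc/self/selected   # select node 3 for this shell
ls /mfs/selected/etc           # for this shell, /mfs/selected now points to node 3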
A last note about oMFS: there are versions around that return faulty results when you run "df" on those file-systems. Don't be surprised if you suddenly have about 1.3 TB available on them.