I’m in the middle of reading Jef Raskin’s book, The Humane Interface. (Raskin is the man who started the Macintosh project at Apple in the late 70s.) It’s a terrific read that tells us all about what’s wrong with software as it stands today. Although Jef died in 2005, his son Aza is carrying on the work because it’s just too good to leave unfinished. Aza formed a company called Humanized (humanized.com), and their product is called Enso (Windows only for now). It sits between the user and the system and combines the speed, power and efficiency of the command line with the elegance of the GUI. The closest thing I’ve seen to it is Ubiquity for Firefox (also written by Aza Raskin), only Enso applies to the entire system. While it’s not perfectly in line with Jef Raskin’s ideals, it’s a pretty nice step toward getting there. Check out the link to the demo video above to get a peek into what could be possible. Enso is a command line for the non-technical person. Computers don’t have to be unfriendly.


Difficulty Level: Moderate

What is a Network Block Device and Why Would I Want One?

Let me start this entry out by explaining just what a block device is, in case you’re a newer Linux user and you’re not sure. I didn’t know what one was at one point, and a quick explanation would have been helpful. In short, block devices are things like hard drives, flash drives, floppy disks, CD-ROMs and even DVDs. At a lower level, they are devices that do their input/output in fixed-size blocks of data rather than one character at a time. For the sake of this discussion, we’ll just be thinking of the devices I listed above.

The Linux kernel, among its many modules (which can be thought of as drivers), has a particular module called ‘nbd’, which stands for Network Block Device. What this means is that you can take almost any block device and present it to the network. This differs from standard Windows file sharing or Unix NFS in that you’re not presenting a set of files to the network, but the raw device itself. (iSCSI is another way of accomplishing the same thing with a greater degree of reliability, and is accessible to Linux through the Linux iSCSI Target and Linux iSCSI Initiator projects. NOTE: Think of the target as the server, and the initiator as the client.) Although there are other ways of doing this, network block device support is still nothing to ignore. It may not be robust, but it’s still highly useful for non-mission-critical tasks where the expense of a SAN is unwarranted.

I originally found out about nbd when I was first starting to work with Xen virtualization. Their project documentation suggested that a convenient way of being able to store virtual machine images on a server was to use nbd support. When I understood that this was the way to a “poor man’s SAN” since Linux software RAID and LVM volumes could be exported with nbd, I then wondered, what else could I export? I decided to experiment and find out. While it’s not perfect because of some I/O control limitations, it’s still quite handy and simple to implement.

Making NBD Work for You:

The first step towards preparing your system for NBD is determining whether you need to recompile your kernel. Fortunately, most of the popular Linux distributions tend to build most of the optional support as kernel modules by default. You can think of modules as “drivers” for different functionality and hardware support. In most cases, you should have access to NBD support. The simplest way to find out is to get to a shell prompt (as root) and type: ‘modprobe nbd’. If you get no error and simply return to the prompt, then type: ‘lsmod | grep nbd’. If you see a line indicating that the nbd module is loaded, your kernel has NBD support.

However, if you get an error with the ‘modprobe’ command about the module not existing, or you don’t get anything in return for the ‘lsmod’ command, then it’s likely you’ll need to recompile your kernel.  Recompiling the kernel is beyond the scope of this blog entry, but I will link to some resources to get you started in the right direction at the end of this entry.
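If you’d like to double-check before going down the recompile road, most distributions ship a copy of the kernel’s build configuration. Exactly where it lives varies, so treat these as examples rather than guarantees:

grep CONFIG_BLK_DEV_NBD /boot/config-$(uname -r)
zgrep CONFIG_BLK_DEV_NBD /proc/config.gz

A result of ‘CONFIG_BLK_DEV_NBD=m’ means NBD was built as a module, ‘=y’ means it’s compiled into the kernel, and ‘# CONFIG_BLK_DEV_NBD is not set’ (or no output at all) means a recompile is in your future.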

Three Components to NBD:

There are three components that make up the entirety of NBD for Linux.  The first is the kernel module which allows the kernel to provide an interface to the device for export to/from the network.  The second component is the ‘nbd-server’ application which handles the exporting of the device over TCP/IP.  And finally, the third part is the ‘nbd-client’ application which imports the device on another machine and presents it as /dev/nbX where ‘X’ is a number.  Depending on the distribution you use, you may be able to find a specific package to install the applications.  If not, there is always the source code from the main NBD project site.

Once you have the kernel module loaded and the applications built, here is all you need to do to test it and see if it will work for what you need. This is an example of exporting a raw, unpartitioned IDE hard drive over TCP/IP and then importing it on a remote system:

1. On the system containing the hard drive, run the nbd-server command as follows  (Syntax: nbd-server -r <tcp port> /dev/xxx):

nbd-server -r 2000 /dev/hda

2. On the remote system where you wish to import the device, run the nbd-client command as follows (Syntax: nbd-client <ip address of system running nbd-server> <matching tcp port number> /dev/nbd0):

nbd-client 192.168.1.1 2000 /dev/nbd0

You should then be able to treat /dev/nbd0 on the system where you imported the device as if it were local. Use ‘fdisk’ to partition the device, format it with a file system, or even use it as swap. I’ve used this successfully with remote flash drives, raw hard drives, partitions on hard drives, LVM logical volumes, and even DVD drives for playing movies on devices that don’t have DVD drives.
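As a rough sketch of what that looks like on the importing system (device name, file system type, and mount point here are just examples; adjust for your setup, and make sure the export isn’t read-only if you intend to write to it, since on some nbd-server versions the ‘-r’ flag means read-only):

mkfs.ext3 /dev/nbd0
mkdir -p /mnt/remote
mount /dev/nbd0 /mnt/remote

Or, to use it as swap instead:

mkswap /dev/nbd0
swapon /dev/nbd0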

A few extra notes:

1. Depending on the kernel version, the NBD device nodes might be /dev/nbdX or it might be /dev/nbX where ‘X’ is always a number.

2. There is a one-to-one relationship between the exported device and your chosen TCP port. That is to say, if you use port 2000 for /dev/hda and want to export /dev/hdb simultaneously, you’ll need to increment to the next free port (see the example after these notes).

3. Before randomly choosing ports, it’s a good idea to take a look at the commonly used ports listed in /etc/services, as well as run a ‘netstat -an’ to see what ports your particular systems are using.

4. The performance of the DVD export over an 802.11bg link is quite good after an initial buffer period for something like Xine.

5. It’s quite possible to use an imported NBD device as part of a mirror set if you want a pseudo instant copy on a separate machine, but it’s not highly recommended.
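To illustrate note 2, exporting two drives from the same server simply means one port per device. The ports below are arbitrary examples; add whatever nbd-server flags your setup needs.

On the server:

nbd-server 2000 /dev/hda
nbd-server 2001 /dev/hdb

On the importing system:

nbd-client 192.168.1.1 2000 /dev/nbd0
nbd-client 192.168.1.1 2001 /dev/nbd1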

Recompiling the Linux kernel (which seems to be a dying art):

http://www.digitalhermit.com/linux/Kernel-Build-HOWTO.htm


What Is This All About?

Every pioneering, epic journey worth taking from point A to point B involves asking for directions, taking wrong turns, heading in completely the wrong direction, or even getting lost. After all, you’re the scout: the first person sent out with enough knowledge to try and find your way to the highly desired point B. Your chances of success are greatly improved if you have some important skills and tools to take along with you. With that introduction, my intended goal of sharing some of my knowledge and tools with you regarding the world of Linux distributions can begin. Of course, before I proceed I should make it clear that I’m no expert and won’t pretend to be one. I’m just a keen experimenter and a fan of what Linux distributions have enabled me to do. With that, I begin my “From Here to There” series.

I was originally going to write a blog post on using Network Block Devices in Linux, when I kept running into the same problem over and over. My hope was that I could write an article simple enough for anyone to grasp. However, each time I went back to the entry, I discovered that there was yet another thing that I take for granted, that many Linux users might not understand. Each time I attempted to remedy that situation with some general guidelines, I found that I had to cover exceptions for the multitude of distributions out there which would really make the blog entry far too long and far too complicated for my intended audience: Linux newbies, and the highly curious users of other OSes.

So I sat down for a while and tried putting my finger on how it is that I can move between the Linux distros fairly easily and get work done. I’m no computer expert. I’m not a programmer. I’m just a basic system admin who went from Mac (pre-OS X) to Windows (3.1 through XP) and got involved with Linux in 1997 out of necessity. While thinking, I realized that there are many things that we computer folks take for granted about what we do normally that fall into the category of “missing knowledge” for the average user. Without that knowledge, understanding how to use a computer is next to impossible. Many people before me have written about people who simply learn how to use specific brands of software, and do not acquire the application literacy needed to be able to use any software of the same category (i.e., word processors). That’s not what the “Here to There” series will be about.

Instead, it is my intention to try and share some of the more basic skills that I take for granted when I use computers, which others might be unaware of. Everyone at an intermediate level or higher with regard to computer use has skills like these. But we rarely if ever discuss them, because to us they are second nature. So the “Here to There” series will attempt to get people from point A to point B with a clear understanding as to how and why the journey is worth it. Since my experience is largely with Linux and Free/Open software, that is where some of my focus will be. But, in all honesty and with only a few exceptions, there really is little difference between OSes and applications today. As a result, much of what I share should apply to many OSes and applications. I will be attempting to post the first in the series this week. (Correction: Life, as usual, has become fairly busy, so I will be starting the series sometime soon. The first part should appear when it’s ready…)


Ingredients:
– One X server running on the machine in front of you. (Think of an X server as a virtual video monitor with thousands of input jacks and the remote applications as different video sources)
– One OpenSSH server on the system you are connecting to, configured to forward (tunnel) the X protocol

NOTE: If you are on the Windows platform, you can either purchase a commercial X server or use Cygwin’s excellent free X server. Again… try to imagine that the X server on the machine in front of you “virtualizes” your monitor and makes it “networkable”. When you run the X server, it listens on your machine for traffic coming from either the network or (on *nix) shared memory. When you run an X application, it is instructed to connect to the X server so the X server can display its output. (Just like plugging a game console into one of your TV/monitor’s composite inputs.)

Today we will talk about running X applications remotely using OpenSSH. Normally if you run X applications remotely, your X protocol traffic is going over your network connection out in the open. This is all well and good if you can trust the network that your X traffic is travelling on. But, what if you can’t? This is where ssh and X make a pretty good team. You still run your X server on the machine in front of you like usual. But instead of instructing the remote application to connect directly to it, you use OpenSSH’s X Protocol Forwarding so that all X traffic is sent through an encrypted TCP tunnel.

SSH Server Side Preparations
You will need to edit your sshd_config file, which controls how your SSH server works. You make these changes on the machine you are connecting to. First, find your sshd_config file. On most distributions it lives in /etc/ssh; if you compiled the OpenSSH suite yourself with the default options, it will be in /usr/local/etc. In other cases it could be in /etc/openssh or /usr/local/etc/openssh. To verify for your distribution, you can run the ‘find’ command:

find /etc /usr -name sshd_config

Once you’ve located it, make sure these lines are present for TCP and X forwarding (add or uncomment them as needed):

AllowTcpForwarding yes
X11Forwarding yes
X11DisplayOffset 10
X11UseLocalhost yes

You will then need to restart your ssh server so that it reads the new configuration.
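How you restart it depends on your distribution; here are a couple of common possibilities (check what your system actually uses before copying these):

/etc/init.d/sshd restart
/etc/init.d/ssh restart

Or signal the running daemon to re-read its configuration:

kill -HUP $(cat /var/run/sshd.pid)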

SSH Client Side Preparations:

A quick aside about the .ssh directory in your home directory. Not everyone is familiar with its purpose, but it’s another essential tool for simplifying your use of OpenSSH. If you don’t already have one, create a text file called ‘config’ in ~/.ssh, add something like this to it, and save it:

host work
hostname 192.168.1.10
User george

Now, if I type ‘ssh work’ it will automatically try to connect to 192.168.1.10, passing ‘george’ as the user name. Obviously, you will need to adjust this to your correct IP and username. Combine that with public key authentication for passwordless logins, and your life with OpenSSH becomes a lot easier. A quick sketch of setting that up follows, and then it’s back to the task at hand.
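Here’s a rough sketch of key-based login using the standard OpenSSH tools, assuming the ‘work’ profile above (accept the defaults or adjust to taste):

ssh-keygen -t rsa
ssh-copy-id work

The first command generates a key pair under ~/.ssh, and the second appends your public key to ~/.ssh/authorized_keys on the remote host. After that, ‘ssh work’ should log you in without prompting for your account password (or with just your key passphrase, if you set one).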

Testing from the workstation in front of you:
Now we can test and see if we can forward a simple X app over the ssh tunnel. Assume the ‘~/.ssh/config’ file and connection profile created above exist. Also assume that the remote system has the simple X application ‘xeyes’ in /usr/bin:

ssh -X work "/usr/bin/xeyes"

We should now be seeing the familiar googly eyes peeking at us. All the application execution is happening remotely, but displaying in front of us and it’s coming over an encrypted ssh tunnel to boot!

Add X Forwarding to Your ~/.ssh/config File:
Instead of having to type ‘ssh -X work [some app]’, you can instead enable X forwarding from your ~/.ssh/config profiles. For each connection profile you create, you can add:

ForwardX11 yes
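For example, the ‘work’ profile from earlier would then look something like this (purely illustrative):

host work
hostname 192.168.1.10
User george
ForwardX11 yes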

This means that all you would have to do to run a remote X app is either log into a shell using that profile and type the name of the X app you want to use, or create a script to run the app using ssh and make an icon for it on your Windows Quick Launch bar or Gnome Panel. A sample script on Linux would be:

#!/bin/bash

ssh work "/usr/local/bin/gimp"

Pretty simple, huh? In Windows, you can write a CMD file using about the same syntax:

C:\cygwin\usr\local\bin\ssh work -F "C:\Documents and Settings\george\.ssh\config" "/usr/local/bin/gimp"

You can argue that this is a form of “application publishing” to use a friendly term. But it’s really a way of exploiting the features of X in a more secure way and without needing to open anything other than port 22 for OpenSSH. Once everything is configured, it works pretty seamlessly as well.

Compression:
This X traffic can take up a good deal of bandwidth since it is quite chatty back and forth, and I personally don’t use it unless I have a fast connection (DSL 1.5M or better). In the past I used to prefer ‘vnc’ over ssh for most instances and these days I use Nomachine NX protocol (which I will discuss at a later date) for remote desktop access. However, there is something you can do which might help out a little in terms of speed with X if you really don’t have any other options. You can compress your ssh traffic. Just add these lines to each host profile in your ~/.ssh/config file:

Compression yes
CompressionLevel 9

You can set your CompressionLevel to anything between 1 and 9 with ‘1’ being the fastest but worst compression, and 9 being slow but better. There is a slight improvement in X application performance. This compression applies across the board to any ssh traffic for that connection profile though, so it’s handy to add it to your slower connections.

Final Words:
Again, I don’t pretend to know everything there is to know about ssh or X, and I am sure there are other ways this can be done better. If you know of any, please share them in the comments so other readers can benefit.


Any of us who are familiar with OpenSSH assume that everyone who wants to use it knows how to. But every day, there is some new thing about it that we learn and there are plenty of people who know little about it. I am hoping to share some of what I consider to be essential knowledge about the OpenSSH client as well as dispel some misunderstandings. It is quite a powerful tool when you really delve deeply into it.

Q1. Isn’t OpenSSH just an encrypted telnet program?
A1. Short answer, no. The more complete answer is that it’s a suite of programs that provide:
-remote shell access (i.e., like telnet)
-remote execution of programs (both text and gui)
-remote data pipes for programs that use standard in/out
-data compression
-TCP tunneling
-ftp-like file transfers
-rcp-like file copying
-public key encryption (of all data passed between client and server)/authentication (no need for passwords)
-GUI login prompt for remote execution of X applications with ‘gnome-ssh-askpass’
-More recently, VPN functionality by way of the Linux tun/tap virtual network device driver

And I’m sure there’s more… I’m kind of an intermediate user of OpenSSH.

Q2. Setting up tunnels is a pain. What’s up with this Local/Remote Forward stuff?
A2. Actually, ‘man ssh_config’ is your friend. If you become familiar with the ~/.ssh/config file, you will find yourself not needing to type much to make connections with OpenSSH. Nearly every command line option for the ssh client can be controlled in this file. For example, I’ve set up some parameters in my ~/.ssh/config file and called the profile “home”. Now I just type: ‘ssh home’ and I’m in with all the client options in place. The following is an example of some useful things to put in your ~/.ssh/config file:

-Assume that I am connecting from internally at my home (192.168.1.0)
-192.168.1.5 is my web server at home
-192.168.1.2 is my workstation at home
-We’ll pretend my workstation at work has TCP port 5022 accessible on the Internet for its OpenSSH server and that its Internet-routable address is 192.168.50.1 (which we know is in private address space. Just pretend…)

Example ‘~/.ssh/config’:


# My ‘webserver’ connection profile. All I
# need to do to ssh into the web server now
# is type ‘ssh webserver’. I am automatically
# prompted for the password for ‘george’.
host webserver
hostname 192.168.1.5
User george


# My ‘work’ connection profile with non
# standard port for ssh (5022).
#
# I’ve also included one LocalForward line to
# forward port 80 from a web server at work to
# port 4080 on my workstation in front of me.
# So… if I connect with ‘ssh work’ and log in,
# and point my browser here at home to
# 127.0.0.1:4080, I see the internal web site
# at work here at home.
#
# The RemoteForward line works in reverse. It
# sends specified TCP ports from my workstation
# in front of me at home to my workstation at work. In
# this case, OpenSSH is pushing port 5900 (vnc)
# from 192.168.1.2 (here at home) to my
# workstation at work. If I leave this connection
# up and go to work, I can run ‘vncviewer
# 127.0.0.1:0’ in my office at work and log into
# my workstation at home with vnc if needed.
host work
hostname 192.168.50.1
port 5022
User george
LocalForward 4080:privateintranet.employersnet.com:80
RemoteForward 5900:192.168.1.2:5900

Q3. Yeah… but I use Windows and I don’t have time to mess with Linux. So how does this help me?

A3. The best way to get OpenSSH (client and server) going under Windows and enjoy all the benefits is to install and use Cygwin (the installer is available from cygwin.com). It’s pretty straightforward if you aren’t the sort who is afraid to get into a little *nix command line on your Windows box. You have the option of either installing a full Cygwin environment or just installing the needed base components, OpenSSL, OpenSSH, and some admin utilities to run OpenSSH as a Windows service. There are a few sets of instructions on the Internet to get you started. At one point there was a Windows installer for OpenSSH, but it is no longer maintained and is too out of date to consider at this point. The recommended path is Cygwin. Finally, if you’re the kind of person who uses Windows and Linux and compiles stuff from source, that is also an option with Cygwin. Just make sure you have the GNU toolchain installed in Cygwin.

Q4. Why are the docs about OpenSSH on the net so hard to understand?
A4. It took me a good deal of digging to find useful information on tunneling and public keys for OpenSSH when I was first starting out. So, yes, the documentation could use some major improvements for people who are a little less technically inclined. To be honest, I think a nice GUI framework around OpenSSH would go a long way toward getting more people to use it. There is PuTTY for Windows, and it can be made to work in the context of what we’re talking about here, but it has the problem of presenting itself as a telnet-like client and immediately turning off people who don’t need “telnet”. For example: “I never use shells or the command line, why would I need an ssh telnet client?” Trying to convey that telnet and OpenSSH are not the same thing is difficult. If there were a standalone OpenSSH “tunneling” GUI app, I think interest in OpenSSH would grow on the Windows side. Still, when it comes to the basics of OpenSSH on Unix, the man pages are currently the best resource. The best places to look are:


‘man sshd_config’ – Tweak your ssh server to do exactly what you need
‘man ssh_config’ – Find out what else you can do with ~/.ssh/config to minimize your command strings
‘man scp’ – Learn how to copy files AND directories using ‘scp’
‘man sftp’ – A command line FTP-like interface for putting and getting files via OpenSSH (I used to use this all the time, but have since moved to ‘scp’.)
‘man ssh’ – Check out the less frequently used options.

Q5. Someone mentioned that I can use ‘ssh’ in combination with ‘tar’ or ‘rsync’ for remote backup. Is that true?
A5. More or less, depending on what you consider a useful backup. I’ve used ssh and tar for “imaging” Linux boxes. It works well, but has expected limitations. A quick example of using ssh, tar, and gzip to “image” BoxA to BoxB. Assume that we have set up ~/.ssh/config to include all needed info for username and hostname:

Backing up ‘/’ on BoxA to BoxB, initiated from BoxB:
ssh BoxA "tar -cf - / --exclude=/proc/* --exclude=/var/tmp/*" | gzip -c > /home/admin/images/BoxA.tar.gz

Restoring ‘/’ to BoxC from the archive on BoxB, initiated from BoxC:
ssh BoxB "gunzip -c /home/admin/images/BoxA.tar.gz" | (cd / ; tar -xvf -)

You can also use the excellent ‘rsync’ command to synchronize two directories on two different machines and with the ‘-e ssh’ option tunnel it all through OpenSSH for encryption. I use this method to backup the family photos from the file server at home to a file server at my parent’s house on a nightly basis.

Using ‘rsync’:
rsync -auvlxHS -e ssh george@remoteserver:/remotedirectory /localdirectory
Remember ‘man rsync’ is your friend…
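To run something like that nightly, as I do with the family photos, a cron entry works nicely. The paths, host, and schedule below are made up for illustration:

# In 'crontab -e': push /data/photos to the remote server every night at 2am
0 2 * * * rsync -auvlxHS -e ssh /data/photos george@remoteserver:/backup/photos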

Q6. Did you say something about VPN?

A6. Yes, with a big note saying that I’ve not tried this yet: as of version 4.3 of OpenSSH, when used in combination with the Linux kernel’s tun/tap module (or a TAP/TUN Win32 driver on Windows), OpenSSH allows for real IP tunnels between both endpoints. These tunnels are capable of relaying TCP, UDP, and ICMP traffic. This is not port forwarding as discussed above, but true VPN. I will likely post more info once I do try it out. I’ve been using the OpenSSL-based OpenVPN for my VPN needs and have been fairly happy with it. However, it might be nice to standardize on OpenSSH for VPN.
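For the curious, here is an untested sketch pieced together from the ssh and sshd_config man pages; the addresses and tun device numbers are arbitrary, and both ends need sufficient privileges to create tun devices. On the server, sshd_config needs:

PermitTunnel yes

Then, from the client:

ssh -w 0:0 root@remotehost

Once the connection is up, each end assigns an address to its side of the tunnel, for example:

ifconfig tun0 10.1.1.1 pointopoint 10.1.1.2 up     (on the client)
ifconfig tun0 10.1.1.2 pointopoint 10.1.1.1 up     (on the server)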

Well… that’s it for now. There’s more to come. I know that there are people more equipped to discuss this than I am, but I am hoping to attract the curious who haven’t had the time or energy to try. I encourage anyone who is curious to give it a try no matter what platform you’re on. In the long run it’s quite a valuable tool.


The Scenario:
You’re on a system with a much older version of *nix on it than you should be using. In the middle of doing your work, you find that certain utilities are missing, or are old enough that they lack some key features. What do you do? Tell your boss the system sucks and go buy a new one? Wipe the box, and all the important data you’re supposed to be working on, with a new installation of some free *nix? Throw up your hands and do everything manually? Give up?

The Solution:
In most cases, if you’re working on a box like this, it’s likely that you’re dealing with text output. That was what I was dealing with, and the version of egrep was too old to parse the following:

egrep '(cn:|inetUserStatus:|mailUserStatus:)'

So what did I do? I took advantage of the newer egrep on my Linux workstation using ssh:

/opt/iplanet/server5/shared/bin/ldapsearch -D "cn=MailAdmin" -w t1ck3tsplz -b "o=child.org, o=parent.org" "uid=*" | ssh george@10.0.1.25 "egrep '(cn:|inetUserStatus:|mailUserStatus:)' -" | less

The end result is that the text stream from the old *nix box is sent via ssh to my workstation, where I run the egrep against standard input. Works like a charm! And it will work with any command that outputs text that you need to process somehow. Hope this helps someone else in a similar bind.
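In generic terms, the pattern looks something like this (the report command and grep pattern here are hypothetical):

some_old_report_command | ssh george@10.0.1.25 "grep -E '(error|warning)'" > filtered.txt

Anything that writes text to standard output on the old box can be handed to a newer tool on the other end of the ssh pipe in exactly the same way.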


Enter Commercial Xen (VirtualIron):

Last year, I was planning a migration away from a mail system I wasn’t happy with to the Zimbra groupware system (which looks and works great). However, I really wanted nearly unstoppable uptime, even in the event of hardware failure. I knew that Xen’s live migration capability would offer me that. Using the freely distributable open source software, I ran into several issues during implementation. VirtualIron was the product that finally came in to solve all of those problems.

When I first set out to virtualize Zimbra, I tried installing it on a RedHat Enterprise Linux (RHEL) paravirtualized machine running on top of Gentoo Linux with a Xen kernel. As soon as I tried to install it, Zimbra complained that I needed to have NPTL (native POSIX thread library) support in the kernel. This was not possible with Xen in paravirtualized mode. The only options I had were to run RHEL on bare metal, which would not afford me the unstoppable uptime, or to run it in a Xen HVM (full virtualization) environment. For my second attempt, I chose the latter route, unaware at the time of the issues that would come with it. As I mentioned in part 1, HVM has network and disk I/O issues, but they are resolved by VirtualIron’s VSTools software.

False Starts:

I got a system that I could test with and set up a TEST Zimbra environment with CentOS 5 as Domain0 and RHEL 5 as an HVM domain. Fortunately, I ran into problems pretty quickly while testing. The first big one was that fully virtualized Xen HVM domains CANNOT be live migrated or paused. The second issue was, of course, the bottleneck in RAM utilization on Domain0. If your HVM domain’s disk and network I/O is very high for an application, you’ll likely wipe out all the RAM in Domain0, and performance will suffer as your disk and network I/O attempt to work via swapping or experience long wait states. The third point, which wasn’t really an issue so much as a concern born of experience, was that this is when I found out that Xen’s fully virtualized environment is really a specialized QEMU process. That was a bit worrying to me, given that I wasn’t too impressed with QEMU’s performance at that point in time, specifically because of the disk and network I/O issues mentioned above. But I didn’t yet know the details as to the cause.

So after seeing poor disk and network performance, I did more research and more digging around for other possible approaches. I briefly considered the OpenVZ project, which doesn’t really virtualize but is more akin to chroot. While it’s quite useful and can do many of the same things that Xen can do, it’s a completely different approach and one that I wasn’t fully comfortable with, specifically because all virtual environments run under one Linux kernel. Then I found a blog entry comparing virtualization techniques and noted a reference to VirtualIron’s Xen-based product that explained the limitations of Xen 3.x’s HVM domains and how VirtualIron worked around them. Armed with that knowledge, I recommended that we purchase VirtualIron for production. For my third attempt, we bought VirtualIron’s version of Xen, which turned out to be very nice. I was expecting “your grandfather’s virtualization techniques”, but as I would find out later, I was completely mistaken.

Learning VirtualIron:

One of VirtualIron’s big points is that they don’t use paravirtualization at all. This isn’t really a good or bad thing, it’s just their way of approaching virtualization. They have also been contributing back to the Xen project, so good on them! Instead, they chose to focus on the special version of QEMU included with Xen to bring it up to speed for their product. So they made sure it could do live migration! They also worked around the disk and net I/O issues by creating custom drivers and management software (The aforementioned VSTools) to be installed in the guest after you have the OS running. This limits your choice of OS to run in HVM domains unless you’re willing to build your own VSTools from their recently opened source. They currently support Windows guests up to Windows 2003 Server, and many of the most common “big name” Linux distros. Previously, VirtualIron was using a different proprietary virtualization technology which gave them a chance to develop their management tools into the robust system they are today. They moved over to Xen later but kept their management methodology which works quite nicely with Xen’s best features.

For our solution, we got two big servers for physically hosting our VMs: HP servers with 32 gigs of RAM each, and two dual-core 64-bit Xeon CPUs each. They also have fiber channel interfaces that connect to an HP SAN back-end where we store our HVM domain images. Originally I had assumed that I would install VirtualIron on each of these boxes on top of a normal Linux distro, just as I did with Xen kernel installations or any other typical virtualization technology. I did just that and was lost for a bit, since this was completely the wrong approach. All it seemed to do was install a DHCP server, a TFTP server, and a Java app server (Jetty, if you’re curious), and no Xen kernel. Where was the Xen kernel for the systems to boot into? Digging into their online documentation provided me with a diagram of the VirtualIron architecture, which clarified things considerably. The Java-based management interface for VirtualIron contains a “walk through” set-up document in a pane on the right-hand side of the interface. THAT is where I finally learned and understood the actual architecture and layout. My original assumptions were completely incorrect. I should have read the manual first!

To use VirtualIron Enterprise (we didn’t go with their Single Server product, which DOES work like VMware and others) you need at least one “management server” and one “managed node”. The management server can be one of a few supported Linux distros, or Windows. The fact that the manager could be a Windows box really confused me at first, because I couldn’t understand how they would get a Xen kernel installed under an already existing Windows installation. (Yes, VirtualIron’s manager can easily be installed on a Windows server to manage the raw hardware nodes.) Again, I was still doubtful about their approach, and that line of thinking turned out to be wrong. But once I understood the architecture, I was both in awe and very eager to see this thing work. So I proceeded…

The VirtualIron Way:

In my case, I have two managed nodes (those monster servers with 32 gigs each) and one manager (a Xeon dual CPU 32-bit system with 2 gigs of RAM and dual NICs). The manager is running CentOS 4.5, which is supported by VirtualIron as a host for the Enterprise manager. Once I had that installed and had the management network up (you need a separate LAN dedicated to the manager and each node that you can consider “out of band”), I set one of my managed nodes to do a PXE network boot off of the manager. That’s correct, you DON’T need to install a single thing on the managed nodes, they boot via the network from Xen images stored on the manager. It’s all diskless booting via the NICs. The TFTP server and the DHCP server give this box an IP address, and point it to a pre-configured Xen microkernel boot image. Their boot image is a Xen hypervisor with a very stripped down Suse Linux Enterprise 10 (SLES10) on it. So stripped down that the managed nodes can run headless as there is ZERO interaction on those boxes other than the power button.

Once the managed node loads its boot image from the network, it shows up in the Java management interface and you’re ready to create VMs and assign them RAM, CPU, network and storage. (NOTE: You need to check their hardware compatibility list to see if your intended server hardware is supported.) In our case, the SLES10 image has drivers for our Emulex LightPulse fiber channel HBAs, so LUNs presented by the SAN are fully accessible from within the VirtualIron manager (storage connects to the virtualization nodes, not the manager). Once VirtualIron was up, I was off and running installing RHEL 4.5 for my Zimbra installation, along with the special drivers and software that improve disk and network performance and enable live migration for HVM domains. Performance of the virtual machine was definitely very impressive. Also, keep in mind that the managed nodes that host your HVM domains don’t need to have anything installed on them at all. No OS, no kernel, no boot loader, absolutely nothing. This makes the nodes essentially hot swappable, as long as you keep your VM utilization low enough that one node can host all the VMs. Beyond that, not only do they run headless, but you don’t need ANY storage in them at all if you don’t want it and have a SAN or other remote storage like iSCSI. All VM configuration resides on the managing server, so that’s the system you want backed up reliably. But there is literally nothing on the virtualization nodes at all.

Giving it a Go:

After I got it all up and running and had a Zimbra TEST instance running with VSTools installed, it was time to try a live migration. I opened up the groupware web-based client and started some work in it to mimic a standard user. Then I opened the VirtualIron manager and located the Zimbra TEST instance in the Virtual Machines tree. I clicked on it and dragged it to the other physical host to initiate the migration. Then I went back and started working in my Zimbra session while I watched the migration proceed. At this point, the manager tells Xen on the source node to dump the contents of the memory for the domain that will be moved. That memory state is copied to the destination node (the other physical host), and then a final sync is done before the copy is brought up to a live state on the destination and the original domain on the source is extinguished.

As this was happening, my Zimbra client continued to function completely normally. An end user wouldn’t notice a thing. For extra points, I actually shut down the physical host that originally contained the migrated domain. Absolutely no effect. It was as if the HVM domain had never been shut down or moved. I’ve since made use of the live migration feature for a variety of situations where I didn’t want downtime for services but needed to make a change to hardware. After almost a year of using VirtualIron, I’m very satisfied with it from a performance perspective. VirtualIron is quite powerful and relatively inexpensive compared to other high-end solutions. It can bring the power of Xen virtualization to anyone who wants it, even if they’ve never touched Linux at all. I find that to be quite an amazing thing considering Xen’s complexity. So if you have a need to run an unmodified OS (Windows or a supported Linux distribution) on Xen, I would highly recommend the VirtualIron product. However, if, like me at home, you run Linux nearly exclusively on the server side, consider the open source version of Xen itself using paravirtualization. Paravirtualization does not have the disk and network performance issues that HVM does.