I am sure some of you are wondering how to migrate or move individual SmartOS Virtual Machines from one SmartOS Server Node to another.
This may be necessary in certain situations such as:
- Backup and disaster recovery: situations that require a standby node holding a copy of your running VMs that can be brought up quickly should a failure occur.
- SmartOS server load balancing: situations where one SmartOS node is overloaded and you need to move VMs to another server node that has more capacity available.
- Deploying clones of your SmartOS VMs to new server nodes.
- Moving a virtual machine to another geographic zone for lower latency, or to comply with legal requirements that IT infrastructure reside physically in a certain country.
Folks, before we get started it is important to mention that this multi-step manual method will probably not be required in later releases of SmartOS. I am told that members of the team at Joyent are planning on integrating this directly into "vmadm" at some point in the future. In addition, these instructions are specifically for KVM-branded virtual machines. There are tools in SmartOS, such as "zonecfg", for managing all types of zones, but as far as I can tell they mainly work with Joyent-branded zones and did not work for me with KVM virtual machines. Please let me know if anyone has either managed to use them successfully or has a more elegant way of accomplishing this.
In the meantime, the steps below worked perfectly for me and are confirmed to work even after a node reboot.
The Steps Involved
On the Source SmartOS Node:
zfs snapshot -r zones/cb18400e-0df6-40a1-b6d8-f99d6f53e9b0@migration
zfs snapshot -r zones/cb18400e-0df6-40a1-b6d8-f99d6f53e9b0-disk0@migration
zfs send -R zones/cb18400e-0df6-40a1-b6d8-f99d6f53e9b0@migration | ssh root@10.1.1.63 zfs recv zones/cb18400e-0df6-40a1-b6d8-f99d6f53e9b0
zfs send -R zones/cb18400e-0df6-40a1-b6d8-f99d6f53e9b0-disk0@migration | ssh root@10.1.1.63 zfs recv zones/cb18400e-0df6-40a1-b6d8-f99d6f53e9b0-disk0
scp /etc/zones/cb18400e-0df6-40a1-b6d8-f99d6f53e9b0.xml root@10.1.1.63:/etc/zones/
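A quick way to confirm the snapshots exist before sending them is to list them; this check is just a suggestion and not part of the original steps:

zfs list -t snapshot | grep migration    # should show the @migration snapshots for both the zone root and the disk0 zvol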
On the Target SmartOS Node:
echo 'cb18400e-0df6-40a1-b6d8-f99d6f53e9b0:installed:/zones/cb18400e-0df6-40a1-b6d8-f99d6f53e9b0:cb18400e-0df6-40a1-b6d8-f99d6f53e9b0' >> /etc/zones/index
vmadm boot cb18400e-0df6-40a1-b6d8-f99d6f53e9b0
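To sanity-check things on the target, the standard vmadm and zoneadm listings should now show the migrated machine (purely a suggested check, using the same UUID as above):

vmadm list | grep cb18400e-0df6-40a1-b6d8-f99d6f53e9b0    # the migrated VM should appear, and show "running" after the boot
zoneadm list -cv                                          # the zone should also be visible at the zones level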
Migration Completed
That's it, we are done. It would be pretty easy to script this, for example as "vm-migrate UUID TARGET-IP".
If anyone does indeed write such a script, I hope you would consider sharing it?
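For anyone who wants a starting point, below is a minimal, untested sketch of what such a vm-migrate wrapper could look like. It simply strings together the commands above, assumes a KVM VM with a single -disk0 zvol, and does no error handling, so treat it as an outline rather than a finished tool:

#!/usr/bin/bash
# Usage: vm-migrate UUID TARGET-IP
UUID="$1"
TARGET="$2"

# snapshot the zone root dataset and the disk zvol
zfs snapshot -r "zones/${UUID}@migration"
zfs snapshot -r "zones/${UUID}-disk0@migration"

# stream both datasets to the target node
zfs send -R "zones/${UUID}@migration" | ssh "root@${TARGET}" zfs recv "zones/${UUID}"
zfs send -R "zones/${UUID}-disk0@migration" | ssh "root@${TARGET}" zfs recv "zones/${UUID}-disk0"

# copy the zone configuration and register the zone on the target
scp "/etc/zones/${UUID}.xml" "root@${TARGET}:/etc/zones/"
ssh "root@${TARGET}" "echo '${UUID}:installed:/zones/${UUID}:${UUID}' >> /etc/zones/index"

echo "Now run 'vmadm boot ${UUID}' on ${TARGET}"

Example usage: vm-migrate cb18400e-0df6-40a1-b6d8-f99d6f53e9b0 10.1.1.63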
Nov 19, 2012 @ 12:36:00
Not sure if it works with KVM (my machine at home is AMD, so I can’t run KVM), but I did make this little vmadm wrapper – vmsend:
https://github.com/bixu/vmsend
Oct 01, 2013 @ 18:14:00
Hi blake, sorry for the delayed response 😉
You have probably figured this out by now, but in case you are not aware, there are AMD images floating around that support KVM properly. There is a university in Australia using them quite extensively, which gives me faith that they are production ready.
Apr 04, 2013 @ 20:48:00
There's some documentation here: https://github.com/joyent/smartos-live/blob/master/src/vm/README.migration
The low-level method is the only one that works at the moment; I really hope to see the other one working some day.
Jul 15, 2013 @ 12:58:00
Here’s a run at one such script: https://gist.github.com/jim80net/5997596
It presumes zsnapper is installed, but that can simply be commented out if it is not relevant.
Also, it presumes mbuffer is installed in /opt/local/bin/mbuffer.
It zfs sends the VM to the target, and can also perform an incremental send if supplied with the original snapshot name, which is printed to the screen.
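For anyone curious what an mbuffer-assisted incremental transfer looks like when done by hand, a rough equivalent might be the following; the snapshot names, buffer sizes, and variables here are illustrative assumptions rather than anything taken from the gist:

# take a fresh snapshot and send only the changes since the original @migration snapshot
zfs snapshot -r zones/${UUID}-disk0@migration2
zfs send -i @migration zones/${UUID}-disk0@migration2 \
  | /opt/local/bin/mbuffer -s 128k -m 1G \
  | ssh root@${TARGET} zfs recv zones/${UUID}-disk0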
Jul 18, 2013 @ 17:36:00
Jim, thanks for this and for letting me know about it. I am having a look at it now. Looks pretty substantial.
Aug 03, 2013 @ 09:38:00
I followed this method to migrate 3 KVM machines, as well as 3 zones, to a new local (same machine) zpool mirror that would replace /zones, around ~500GB of data in all. Essentially, I followed this process so that I could upgrade the 1TB /zones mirror I used for testing on the initial install to a 2TB mirror that will be used permanently on this machine. The process went very smoothly, so thanks for the step-by-step instructions, as this is only my second week away from XenServer and running SmartOS. 🙂
Also, for anyone facing a similar situation to mine. In order to complete the process after the above instructions I had to reboot the machine and edit the kernel line in the bootloader from “smartos=true” to “standalone=true,noimport=true”. To do this simply press “e” when prompted.
Then once booted I imported the 2TB pool in as /zones, and rebooted. Voila! Currently I have 7 hours of uptime with the new mirror installed, and have had zero issues. A ‘zpool scrub’ came up clean as well. Again, this extra step was just so I can do a local replacement of /zones to a larger mirror. I’d be curious to know if I could have done this simpler!
Especially without needing to reboot the box, as I couldn't cleanly or forcibly unmount /zones because the device was busy, regardless of what I tried.
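For reference, the rename-on-import step described above can also be done explicitly with zpool, assuming the old pool has already been exported or its disks detached, and that the new mirror was created under a temporary name (the name "newzones" below is purely hypothetical):

# after booting with standalone=true,noimport=true
zpool import newzones zones    # import the new mirror under the name "zones"
zpool list zones               # confirm it is online before rebooting normally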
Btw every week I love SmartOS that much more. Not to mention its community. 🙂
Cheers,
TMinus36
Oct 01, 2013 @ 18:11:00
It's an awesome community of very smart, friendly folks. Welcome 🙂
Dec 17, 2013 @ 22:37:00
If you get an error like `cannot receive: local origin for clone zones/ does not exist.`, you can run `zfs promote zones/` to make the clone self-contained.
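In the context of the steps above, that promote would typically be run against the VM's disk zvol on the source before sending, since it is a clone of the base image; a hypothetical example using the same UUID as in the post:

# make the clone independent of its origin snapshot so a recursive send
# no longer requires that origin to exist on the target
zfs promote zones/cb18400e-0df6-40a1-b6d8-f99d6f53e9b0-disk0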
Dec 18, 2013 @ 05:40:00
Hi Matt,
Thanks for the tip.
Apr 01, 2014 @ 16:00:24
You can also send an incremental stream relative to the snapshot from which the disk0 zvol originates; that way you keep the hierarchy (and avoid wasting disk space). I am still tinkering with this, but the following seems to do the trick well:
zfs send -i zones/${IMAGE}@dataset zones/${IMAGE}@${UUID}-disk0 | ssh ${REMOTE} zfs receive zones/${IMAGE}
where IMAGE is the UUID of the base image and UUID is the ID of the virtual machine you are moving. On the latest SmartOS releases it looks like the @dataset snapshot is now called @final.
PS: for this to work you need to have the base image already on your remote host.
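If the base image is not yet installed on the remote host, imgadm should cover that prerequisite (IMAGE being the same base-image UUID as above):

imgadm list | grep ${IMAGE}    # check whether the image is already installed on the remote host
imgadm import ${IMAGE}         # if not, pull it in from the configured image source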
Apr 01, 2014 @ 16:30:49
Hi Julien
Thanks for the info. Yes, incremental snapshots are great and do work. They can be a bit of a pain if the target receive location changes in any way; for example, even an "ls" of a directory there can cause the next incremental send to fail. A good thing to do in this situation is to set the zfs readonly property on the receive location to stop this from occurring.
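Concretely, that would be something like the following on the receiving node, using the datasets from the post as an example:

zfs set readonly=on zones/cb18400e-0df6-40a1-b6d8-f99d6f53e9b0          # prevent stray changes on the receive side
zfs set readonly=on zones/cb18400e-0df6-40a1-b6d8-f99d6f53e9b0-disk0    # from breaking the next incremental receive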
Apr 11, 2014 @ 04:21:21
If the *disk0 zvol has been sent and is already available on the destination, one can import the JSON payload through vmadm create, without any need to send the zones/uuid dataset for the VM, provided you add "nocreate": true to the JSON.
That said, your approach is better since it avoids creating the zone in the first place, which might be useful if there are lots of VMs being moved.
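Based on Bassu's description, a rough sketch of that flow on the destination might look like the following. The payload here is a bare-minimum illustration and the exact fields will vary per VM; in particular, placing "nocreate" on the disk entry and pointing "path" at the already-received zvol are my assumptions, so treat this as an untested outline:

# disk0 zvol has already been received on this node via zfs send/recv
cat > /tmp/vm.json <<'EOF'
{
  "uuid": "cb18400e-0df6-40a1-b6d8-f99d6f53e9b0",
  "brand": "kvm",
  "ram": 1024,
  "vcpus": 1,
  "nics": [{"nic_tag": "admin", "ip": "dhcp", "model": "virtio"}],
  "disks": [{"path": "/dev/zvol/rdsk/zones/cb18400e-0df6-40a1-b6d8-f99d6f53e9b0-disk0",
             "model": "virtio", "boot": true, "nocreate": true}]
}
EOF
vmadm create -f /tmp/vm.json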
Also, since I found no tool to systematically keep VM snapshots/replication in sync, I wrote a small Python app just to do that. It has error checking and basic reporting built in. Hope someone finds it helpful.
https://github.com/bassu/bzman
Apr 12, 2014 @ 20:41:00
Hi there Bassu,
Thanks for the JSON tip.
In addition, thanks for the link to your tool; I have bookmarked it and it looks EXTREMELY useful. Have you been using it for long, and if so, have you had any issues with the destination changing (e.g. an "ls" of the directory), or does your script set the destination to read-only to avoid this?
Apr 12, 2014 @ 23:10:31
Hi there Mark,
I have been using bzman in production at my day job for quite some time now. It was written mainly for no-hands incremental and systematic backups with error checking, as the description says. There's a smartos directory in the repo containing a wrapper script. I run essentially the same thing; the only difference is that I recently added rsync to it to sync /etc/zones/ at the end.
As for modification of the destination: zfs detects it by default and the incremental send fails, and because bzman handles errors quite well, it dispatches an email to root (changeable in the script) in addition to logging to syslog/dmesg and stdout. That sort of error checking is exactly what the script was written for. See bzman -h for full details.
Cheers.