OpenStack + KVM + HA

A lazy reading on the web seems to suggest that the grass isn’t any greener in the OpenStack camp when it comes to handling KVM host failures. See this earlier blog post for context.

According to the OpenStack Hypervisor feature support matrix, there is a feature called evacuate for dealing with failed KVM hosts. The documentation reads:

As cloud administrator, while you are managing your cloud, you may get to the point where one of the cloud compute nodes fails. For example, due to hardware malfunction. At that point you may use server evacuation in order to make managed instances available again.

Evacuation seems to be a manual operational task. A large public or private cloud deployment will have nightmares maintaining SLAs every time a KVM host fails (yes, some cloud providers do have SLAs, and their ops teams have even tighter internal ones). I suspect that commercial OpenStack distributions (Rackspace/Mirantis/Piston Cloud) and OpenStack public clouds (HP/Rackspace) carry custom patches for automatic evacuation.
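
To make the idea of automatic evacuation a bit more concrete, here is a minimal sketch of the planning step such automation might perform. It is my own illustration, not something taken from OpenStack or any vendor patch. The `Instance` record, the `plan_evacuations` helper and the host names are all hypothetical. The request body follows the server "evacuate" action of the Nova compute API as I understand it (the `onSharedStorage` field belongs to older API versions, so treat it as an assumption). The code only builds the request bodies and sends nothing.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical record of a VM and the compute host it runs on."""
    server_id: str
    host: str


def plan_evacuations(instances, down_hosts, target_host, shared_storage=True):
    """Return (server_id, request_body) pairs for instances on failed hosts.

    Each body is meant for POST /servers/{server_id}/action (assumed shape
    of the Nova "evacuate" action). Nothing is sent over the network here.
    """
    plan = []
    for inst in instances:
        if inst.host in down_hosts:
            body = {
                "evacuate": {
                    "host": target_host,
                    "onSharedStorage": shared_storage,  # assumption: older API versions
                }
            }
            plan.append((inst.server_id, body))
    return plan


# Hypothetical inventory: kvm01 has failed, kvm03 has spare capacity.
instances = [
    Instance("vm-1", "kvm01"),
    Instance("vm-2", "kvm02"),
    Instance("vm-3", "kvm01"),
]
plan = plan_evacuations(instances, down_hosts={"kvm01"}, target_host="kvm03")
for server_id, body in plan:
    print(server_id, body)
```

A real implementation would also have to decide reliably that a host is dead, and make sure it stays dead, before rebuilding its instances elsewhere. That is presumably the hard part that keeps this a manual task.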

Alternatively, I may simply not have Googled enough to find references to “stock” OpenStack handling KVM host failures automatically.

By Shanker Balan

Shanker Balan is a devops and infrastructure freelancer with over 14 years of industry experience in large scale Internet systems. He is available for both short term and long term projects on contract. Please use the Contact Form for any enquiry.
