oVirt HA mechanism

Nov 08, 2017

oVirt enables you to build a hyper-converged infrastructure (HCI), that is, to set up and manage virtual machines, virtual networks, and storage resources on top of a distributed KVM-based Linux system. To provide high availability, it offers three features, each covering a different entity:

1. Engine HA: The oVirt engine is the management component of oVirt; it records and manages all resources. oVirt natively supports the “self-hosted engine”, in which the oVirt engine is installed inside a VM (called the Engine VM) and therefore manages the very VM it runs in. If the Engine VM goes down, the hosts (hypervisor nodes) within the same oVirt cluster cooperate to select a suitable host on which to restart the Engine VM.

2. Guest HA: If the administrator has marked a guest VM as “highly available” and that VM goes down unexpectedly (that is, it was not shut down normally by an end user or administrator), the oVirt engine keeps trying to start it again, either on the previous host or on another host.

3. Host HA: If a host (hypervisor node) becomes non-responsive due to a power failure, network outage, etc., the oVirt engine fences that host by sending an IPMI stop request over a dedicated network channel to the power management unit of the host, and then starts the host again once it has powered down successfully.
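
To make the three behaviors concrete, here is a minimal Python sketch of the decision each one boils down to. It is a toy model and not oVirt code: the `Host` and `VM` classes, the `score` field, and all function names are hypothetical, and picking the highest-scoring responsive host is only an assumption about how a "proper" host might be chosen.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    responsive: bool = True
    score: int = 0  # hypothetical health score; higher is assumed better


@dataclass
class VM:
    name: str
    highly_available: bool = False
    running: bool = True
    stopped_by_user: bool = False  # True for a normal, intentional shutdown


def pick_engine_host(hosts: List[Host]) -> Optional[Host]:
    """Engine HA: choose a responsive host on which to restart the Engine VM."""
    candidates = [h for h in hosts if h.responsive]
    # Assumption: the best candidate is simply the one with the highest score.
    return max(candidates, key=lambda h: h.score, default=None)


def should_restart_guest(vm: VM) -> bool:
    """Guest HA: restart only HA-marked VMs that went down unexpectedly."""
    return vm.highly_available and not vm.running and not vm.stopped_by_user


def host_recovery_steps(host: Host) -> List[str]:
    """Host HA: fence a non-responsive host, then power it on again."""
    if host.responsive:
        return []
    # The host is started again only after it is confirmed to be down.
    return ["power_off", "wait_until_down", "power_on"]
```

For example, `should_restart_guest(VM("db", highly_available=True, running=False))` is true, while the same VM with `stopped_by_user=True` is left alone, which mirrors the distinction between an unexpected failure and a normal shutdown.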

The following is my study of the three cases above, based on a setup that uses a self-hosted engine and clustered storage (such as Ceph or GlusterFS).
