Feb 232020
 

For a while I had a oVirt server in a hyperconverged setup which means that the hypervisor was also running a GlusterFS storage server and that the oVirt management virtual machine was inside the oVirt cluster (self-hosted engine). On top of oVirt I was running a OKD cluster with a containerized GlusterFS cluster that could be used for persistent volumes by the container workload. All of this was running on a single hypervisor with a single SSD which unfortunately gave up on me after a few years in operation. Recently I stumbled upon the oVirt 4.4-alpha. Next to initial support for running it on CentOS 8 also a proper support for ignition that is used by (Fedora and Red Hat) CoreOS and therefore OpenShift/OKD 4 attracted my attention. Why not give it a try and see how far I come…? After a few hours of tinkering I succeeded the installation:

And now I’m going to describe what was necessary to do so. Not everything that I’ll mention is brand new. My past experience of setting up such a system is more or less based on the guide Up and Running with oVirt 4.0 and Gluster Storage that I was following a few years ago, so I’m also highlighting a few things that have changed since then.

If you think about repeating this installation on your hardware please let me remind you: This software is currently in alpha status. This means there are likely still many bugs and rough edges and if you succeed to install it successfully there is no guarantee for updates not breaking everything again. Please don’t try this anywhere close to production systems or data. I won’t be able to assist you in any way if things turn out badly.

I split this field report into two parts, the first one discussing the GlusterFS storage setup and the second one explaining my challenges when setting up the oVirt self-hosted-engine.

Hypervisor disk layout

I was using a minimal install of a CentOS 8 on a bare-metal server. Make sure you either have two disks or create a separate partition when installing CentOS so that the GlusterFS storage can life on its own block device. My disk layout looks something like this:

[root@loki ~]# lsblk
NAME                    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                       8:0    0 931.5G  0 disk 
├─sda1                    8:1    0   100M  0 part 
├─sda2                    8:2    0   256M  0 part /boot
├─sda3                    8:3    0    45G  0 part 
│ ├─vg_loki-slash       253:0    0    10G  0 lvm  /
│ ├─vg_loki-swap        253:1    0     4G  0 lvm  [SWAP]
│ ├─vg_loki-var         253:2    0     5G  0 lvm  /var
│ ├─vg_loki-home        253:3    0     2G  0 lvm  /home
│ └─vg_loki-log         253:4    0     2G  0 lvm  /var/log
└─sda4                    8:4    0   500G  0 part

The CentOS installation is placed on a LVM volume group (vg_loki on /dev/sda3) with some individual volumes for dedicated mount points and then there is /dev/sda4 which is an empty disk partition that will be used by GlusterFS later. If you wonder how to do such a setup with the CentOS 8 installer… I don’t know. I tried for a moment to somehow configure this setup in the installer, but eventually I gave up, manually partitioned the disk and created the volume group and then used the pre-generated setup in the installer which perfectly detected what I was doing on the shell.

Install software requirements

First you need to enable the oVirt 4.4-alpha package repository by installing the corresponding release package:

# dnf install https://resources.ovirt.org/pub/yum-repo/ovirt-release44-pre.rpm

Recently a lot of effort was invested into incorporating the oVirt setup into the Cockpit Web interface and it’s now even the recommended installation method for the downstream Red Hat Virtualization (RHV). When I setup my previous hyperconverged oVirt 4.0 this wasn’t available back then so of course I’m going to try this. To setup Cockpit and the oVirt integration the following packages need to be installed:

# dnf install cockpit cockpit-ovirt-dashboard glusterfs-server

After logging into Cockpit that runs on the hypervisor host on port 9090 there is a dedicated oVirt tab with two entries:

If you continue with the hyperconverged setup, there now even is a dedicated option to install a single node only GlusterFS “cluster”!

This was a big positive surprise to me because the previously used gdeploy tool was insisting on a three node GlusterFS cluster years ago.

Running the GlusterFS wizard

After this revelation the GlusterFS setup is supposedly straight forward. Still I ran into some issues that I could probably have avoided by carefully reading the installation instructions for oVirt 4.3. Nonetheless I’m quickly going to mention a few points here in case other people are struggling with the same and search the Web for these error messages:

  • On the first screen I had an error that the setup cannot proceed because "gluster-ansible-roles is not installed on Host":

    However, the related package including the Ansible roles from gluster-ansible was clearly there:

    # rpm -q gluster-ansible-roles
    gluster-ansible-roles-1.0.5-7.el8.noarch
    

    Eventually I found that my sudo rules for my unprivileged user account are not properly picked up by Cockpit so I restarted the setup by using the root account which then successfully detected the Ansible roles.

  • When selecting the brick setup the “Raid Type” must be changed to “JBOD” and the device name that was reserved for the GlusterFS storage must be entered:Eventually the wizard will create a dedicated LVM volume group for GlusterFS bricks and a LVM thin-pool volume if you wish so:
    # lvs gluster_vg_sda4
      LV                               VG              Attr       LSize    Pool                             Origin Data%  Meta%  Move Log Cpy%Sync Convert
      gluster_lv_data                  gluster_vg_sda4 Vwi-aot---  125.00g gluster_thinpool_gluster_vg_sda4        3.35                                   
      gluster_lv_engine                gluster_vg_sda4 -wi-ao----   75.00g                                                                                
      gluster_lv_vmstore               gluster_vg_sda4 Vwi-aot---  300.00g gluster_thinpool_gluster_vg_sda4        0.05                                   
      gluster_thinpool_gluster_vg_sda4 gluster_vg_sda4 twi-aot--- <421.00g                                         1.03   0.84
  • Before the Ansible playbook that will setup the storage is executed the generated inventory file will be displayed. Because I'm very familiar with Ansible anyway, I love this part! It also makes it easier to understand which playbook command to run when troubleshooting something on the command line where the Cockpit Web interface doesn't make sense to be involved anymore:
    The "Enable Debug Logging" is definitely worth to be enabled especially if you run this the first time on your server. It gives you much more insight in what Ansible is actually doing on your hypervisor.
  • When finally running the "Deploy" step it didn't take long to fail the playbook with the following error:
    TASK [Check if provided hostnames are valid] ***********************************
    task path: /usr/share/cockpit/ovirt-dashboard/ansible/hc_wizard.yml:29
    fatal: [loki.oasis.home]: FAILED! => {"msg": "The conditional check 'result.results[0]['stdout_lines'] > 0' failed. The error was: Unexpected templating type error occurred on ({% if result.results[0]['stdout_lines'] > 0 %} True {% else %} False {% endif %}): '>' not supported between instances of 'list' and 'int'"}
    

    The involved code is untouched since a long time. So I reported the issue at RHBZ #1806298. I'm not sure how this could ever work but the fix is also trivial. After changing the following lines this task was successfully passing:

    --- /usr/share/cockpit/ovirt-dashboard/ansible/hc_wizard.yml.orig       2020-02-18 14:48:33.678471259 +0100
    +++ /usr/share/cockpit/ovirt-dashboard/ansible/hc_wizard.yml    2020-02-18 14:48:55.810456470 +0100
    @@ -30,7 +30,7 @@
           assert:
             that:
               - "result.results[0]['rc'] == 0"
    -          - "result.results[0]['stdout_lines'] > 0"
    +          - "result.results[0]['stdout_lines'] | length > 0"
             fail_msg: "The given hostname is not valid FQDN"
           when: gluster_features_fqdn_check | default(true)

    Btw. the error message can not only be seen in the Web UI but is also written to a log file: /var/log/cockpit/ovirt-dashboard/gluster-deployment.log

  • There was another issue that cost me more effort to track down. The deployment playbook was failing to add the firewalld rules for GlusterFS:
    TASK [gluster.infra/roles/firewall_config : Add/Delete services to firewalld rules] ***
    task path: /etc/ansible/roles/gluster.infra/roles/firewall_config/tasks/main.yml:24
    failed: [loki.oasis.home] (item=glusterfs) => {"ansible_loop_var": "item", "changed": false, "item": "glusterfs", "msg": "ERROR: Exception caught: org.fedoraproject.FirewallD1.Exception: INVALID_SERVICE: 'glusterfs' not among existing services Permanent and Non-Permanent(immediate) operation, Services are defined by port/tcp relationship and named as they are in /etc/services (on most systems)"}

    Indeed firewalld doesn't know about a 'glusterfs' service:

    # rpm -q firewalld
    firewalld-0.7.0-5.el8.noarch
    # firewall-cmd --get-services | grep glusterfs
    #
    

    Is this an old version? From where do I get the 'glusterfs' firewalld service definition? The solution to this is as simple as embarrassing. I found that the service definition is packaged as part of the glusterfs-server RPM which was still missing on my server. After installing it also this issue was solved.

Eventually the deployment succeeded and the CentOS 8 host was converted into a single node GlusterFS storage cluster:

# gluster volume status
Status of volume: data
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick loki.oasis.home:/gluster_bricks/data/
data                                        49153     0          Y       23806
 
Task Status of Volume data
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: engine
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick loki.oasis.home:/gluster_bricks/engin
e/engine                                    49152     0          Y       23527
 
Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: vmstore
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick loki.oasis.home:/gluster_bricks/vmsto
re/vmstore                                  49154     0          Y       24052
 
Task Status of Volume vmstore
------------------------------------------------------------------------------
There are no active volume tasks

Conclusion

I'm super pleased with the installation experience so far. The new GlusterFS Ansible roles had no issues setting up the bricks and volumes. The Cockpit Web-GUI was easy to use and always clearly communicated what was going on. There is now a supported configuration of a one node hyperconverged oVirt setup which makes me happy too. Kudos to everyone involved with this, great work!

Now let's continue with the oVirt self-hosted engine setup.

  One Response to “Testing oVirt 4.4-alpha on CentOS 8: Part 1 – Setup GlusterFS”

  1. […] report about installing the oVirt 4.4-alpha release on CentOS 8 in a hyperconverged setup. In the first part I was focusing on setting up the GlusterFS storage cluster and now I’m going to describe my […]

 Leave a Reply

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

(required)

(required)