Feb 24 2020

This is the second part of my field report about installing the oVirt 4.4-alpha release on CentOS 8 in a hyperconverged setup. In the first part I focused on setting up the GlusterFS storage cluster; now I'm going to describe my experience with the self-hosted engine installation.

If you are thinking about repeating this installation on your own hardware, please let me remind you: this software is currently in alpha status. This means there are likely still many bugs and rough edges, and even if you manage to install it successfully, there is no guarantee that updates won't break everything again. Please don't try this anywhere close to production systems or data. I won't be able to assist you in any way if things turn out badly.

Cockpit Hosted-Engine Wizard

Before we can start installing the self-hosted engine, we need to install a few more packages:

# dnf install ovirt-engine-appliance vdsm-gluster

Similar to the GlusterFS setup, the hosted engine setup can also be done from the Cockpit Web interface:

The wizard here is also pretty self-explanatory. A few options of the command-line installer (hosted-engine --deploy) are missing in the Web UI, e.g. you cannot customize the name of the libvirt domain, which is called 'HostedEngine' by default. You enter the common details such as hostname, VM resources, some network settings and the credentials for the VM and oVirt, and that's pretty much it:

Before you start deploying the VM there is also a quick summary of the settings, and then an answer file is generated. While the GlusterFS setup created a regular Ansible inventory, the hosted engine setup uses its own INI format. A useful feature is that even when the deployment aborts, it can always be restarted from the Web interface without having to fill in the form again and again. Indeed, I used this to my advantage a lot, because it took me at least 20 attempts before the hosted engine VM was set up successfully.

Troubleshooting hosted-engine issues

Once the VM deployment was running I noticed that the status output in the Cockpit Web interface heavily resembled Ansible output. It seems that a big part of the deployment code in the hosted-engine tool has been re-implemented using the ovirt-ansible-hosted-engine-setup Ansible roles in the background. If you're familiar with Ansible this definitely simplifies troubleshooting and gives a better understanding of what is going on. Unfortunately there is still a layer of hosted-engine code above Ansible, so I couldn't figure out whether it's possible to run a playbook from the shell that would do the same setup.

Obviously it didn't take long for an issue to pop up:

  • The first error was that Ansible couldn't connect to the hosted-engine VM that was freshly created from the oVirt appliance disk image. The error output in the Web interface is rather limited, but here too a log file exists, at a path like /var/log/ovirt-engine/setup/ovirt-engine-setup-20200218154608-rtt3b7.log. In the log file I found:
    2020-02-18 15:32:45,938+0100 DEBUG ansible on_any args localhostTASK: ovirt.hosted_engine_setup : Wait for the local VM kwargs 
    2020-02-18 15:35:52,816+0100 ERROR ansible failed {
        "ansible_host": "localhost",
        "ansible_playbook": "/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml",
        "ansible_result": {
            "_ansible_delegated_vars": {
                "ansible_host": "ovirt.oasis.home"
            },
            "_ansible_no_log": false,
            "changed": false,
            "elapsed": 185,
            "msg": "timed out waiting for ping module test success: Using a SSH password instead of a key is not possible because Host Key checking is enabled and sshpass does not support this.  Please add this host's fingerprint to your know
    n_hosts file to manage this host."
        },
        "ansible_task": "Wait for the local VM",
        "ansible_type": "task",
        "status": "FAILED",
        "task_duration": 187
    }

    A manual SSH login with the root account on the VM was possible after accepting the fingerprint. Maybe this is still a bug or I missed a setting somewhere, but the easiest way to solve this was to create a ~root/.ssh/config file on the hypervisor host with the following content. The hostname is the hosted-engine FQDN:

    Host ovirt.oasis.home
        StrictHostKeyChecking accept-new
    

    Each installation attempt makes sure that the previous host key is deleted from the known_hosts file, so there is no need to worry about changing keys across multiple installation tries. The deployment can simply be restarted by pressing the "Prepare VM" button once again.
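
    If a stale host key ever does need to be removed by hand, standard OpenSSH tooling is enough (same FQDN as above):

    # ssh-keygen -R ovirt.oasis.home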

  • During the next run the connection to the hosted-engine VM succeeded and it nearly completed all of the setup tasks within the VM, but then failed when trying to restart the ovirt-engine-dwhd service:
    2020-02-18 15:48:45,963+0100 INFO otopi.plugins.ovirt_engine_setup.ovirt_engine_dwh.core.service service._closeup:52 Starting dwh service
    2020-02-18 15:48:45,964+0100 DEBUG otopi.plugins.otopi.services.systemd systemd.state:170 starting service ovirt-engine-dwhd
    2020-02-18 15:48:45,965+0100 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:813 execute: ('/usr/bin/systemctl', 'start', 'ovirt-engine-dwhd.service'), executable='None', cwd='None', env=None
    2020-02-18 15:48:46,005+0100 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/usr/bin/systemctl', 'start', 'ovirt-engine-dwhd.service'), rc=1
    2020-02-18 15:48:46,006+0100 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:921 execute-output: ('/usr/bin/systemctl', 'start', 'ovirt-engine-dwhd.service') stdout:
    
    
    2020-02-18 15:48:46,006+0100 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/usr/bin/systemctl', 'start', 'ovirt-engine-dwhd.service') stderr:
    Job for ovirt-engine-dwhd.service failed because the control process exited with error code. See "systemctl status ovirt-engine-dwhd.service" and "journalctl -xe" for details.
    
    2020-02-18 15:48:46,007+0100 DEBUG otopi.context context._executeMethod:145 method exception
    Traceback (most recent call last):
      File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in _executeMethod
        method['method']()
      File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine-dwh/core/service.py", line 55, in _closeup
        state=True,
      File "/usr/share/otopi/plugins/otopi/services/systemd.py", line 181, in state
        service=name,
    RuntimeError: Failed to start service 'ovirt-engine-dwhd'
    2020-02-18 15:48:46,008+0100 ERROR otopi.context context._executeMethod:154 Failed to execute stage 'Closing up': Failed to start service 'ovirt-engine-dwhd'
    2020-02-18 15:48:46,009+0100 DEBUG otopi.context context.dumpEnvironment:765 ENVIRONMENT DUMP - BEGIN
    2020-02-18 15:48:46,010+0100 DEBUG otopi.context context.dumpEnvironment:775 ENV BASE/error=bool:'True'
    2020-02-18 15:48:46,010+0100 DEBUG otopi.context context.dumpEnvironment:775 ENV BASE/exceptionInfo=list:'[(, RuntimeError("Failed to start service 'ovirt-engine-dwhd'",), )]'
    2020-02-18 15:48:46,012+0100 DEBUG otopi.context context.dumpEnvironment:779 ENVIRONMENT DUMP - END
    

    Fortunately I was able to log in to the hosted-engine VM and found the following blunt error:

    -- Unit ovirt-engine-dwhd.service has begun starting up.
    Feb 18 15:48:46 ovirt.oasis.home systemd[30553]: Failed at step EXEC spawning /usr/share/ovirt-engine-dwh/services/ovirt-engine-dwhd/ovirt-engine-dwhd.py: Permission denied
    -- Subject: Process /usr/share/ovirt-engine-dwh/services/ovirt-engine-dwhd/ovirt-engine-dwhd.py could not be executed
    

    Indeed, the referenced script was not marked executable. Fixing it manually and restarting the service showed that this would succeed. But there is one problem: this change is not persisted. On the next deployment run, the hosted-engine VM will be deleted and re-created. While searching for a nicer solution I found that this bug is actually already fixed in the latest release of ovirt-engine-dwh-4.4.0-1.el8.noarch.rpm, but the appliance image (ovirt-engine-appliance-4.4-20200212182535.1.el8.x86_64) still only includes ovirt-engine-dwh-4.4.0-0.0.master.20200206083940.el7.noarch and there is no newer appliance image. That's part of the experience when trying alpha releases, but it's not a blocker. Eventually I found that there is a directory where you can place an Ansible tasks file which will be executed in the hosted-engine VM before the engine setup is run. So I created the file hooks/enginevm_before_engine_setup/yum_update.yml in the /usr/share/ansible/roles/ovirt.hosted_engine_setup/ directory with the following content:

    ---
    - name: Update all packages
      package:
        name: '*'
        state: latest
    

    From then on, each deployment attempt first updated all packages in the VM, including 'ovirt-engine-dwh', before hosted-engine continued to configure and restart the service.

  • The next issue suddenly appeared when I tried to re-run the deployment. The Ansible code failed early with an error that it could not add the routing rules on the hypervisor:
    2020-02-18 16:17:38,330+0100 DEBUG ansible on_any args  kwargs 
    2020-02-18 16:17:38,664+0100 INFO ansible task start {'status': 'OK', 'ansible_type': 'task', 'ansible_playbook': '/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml', 'ansible_task': 'ovirt.hosted_engine_setup : Add IPv4 outbound route rules'}
    2020-02-18 16:17:38,664+0100 DEBUG ansible on_any args TASK: ovirt.hosted_engine_setup : Add IPv4 outbound route rules kwargs is_conditional:False 
    2020-02-18 16:17:38,665+0100 DEBUG ansible on_any args localhostTASK: ovirt.hosted_engine_setup : Add IPv4 outbound route rules kwargs 
    2020-02-18 16:17:39,214+0100 DEBUG var changed: host "localhost" var "result" type "" value: "{
        "changed": true,
        "cmd": [
            "ip",
            "rule",
            "add",
            "from",
            "192.168.222.1/24",
            "priority",
            "101",
            "table",
            "main"
        ],
        "delta": "0:00:00.002805",
        "end": "2020-02-18 16:17:38.875350",
        "failed": true,
        "msg": "non-zero return code",
        "rc": 2,
        "start": "2020-02-18 16:17:38.872545",
        "stderr": "RTNETLINK answers: File exists",
        "stderr_lines": [
            "RTNETLINK answers: File exists"
        ],
        "stdout": "",
        "stdout_lines": []
    }"
    

    So I checked the rules manually and yes, they were already there. I thought that would be an easy case: it must be a simple idempotency issue in the Ansible code. But when looking at the code there was already a condition in place that should prevent this case from happening. Even after multiple attempts to debug it, I couldn't find the reason why this check was failing. Eventually I found GitHub pull request #96 where someone was already refactoring this code with the commit message "Hardening existing ruleset lookup". So I forward-ported the patch to release 1.0.35, which fixed the problem. The PR has been open for more than a year with no indication that it will be merged soon, so I still reported the issue in ovirt-ansible-hosted-engine-setup #289.
    I only found out about ovirt-hosted-engine-cleanup a few hours later; with its help you can easily work around this issue by cleaning up the installation before another retry.
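
    For reference, checking for and cleaning up such a leftover rule manually only needs the ip command; the subnet is the one from the log output above:

    # ip rule list | grep 192.168.222
    # ip rule del from 192.168.222.1/24 priority 101 table main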

  • Another issue, tough to debug but easy to fix, popped up after the hosted-engine VM setup completed and the Ansible role was checking the oVirt events for errors:
    2020-02-19 01:46:53,723+0100 ERROR ansible failed {
        "ansible_host": "localhost",
        "ansible_playbook": "/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml",
        "ansible_result": {
            "_ansible_no_log": false,
            "changed": false,
            "msg": "The host has been set in non_operational status, deployment errors:   code 4035: Gluster command [] failed on server .,    code 10802: VDSM loki.oasis.home command GlusterServersListVDS failed: The method does not exist or is not available: {'method': 'GlusterHost.list'},   fix accordingly and re-deploy."
        },
        "ansible_task": "Fail with error description",
        "ansible_type": "task",
        "status": "FAILED",
        "task_duration": 0
    }
    

    This error does not come from the Ansible code anymore; the engine itself fails to query the GlusterFS status on the hypervisor. This is done via VDSM, a daemon that runs on each oVirt hypervisor and manages the hypervisor configuration and status. Maybe the VDSM log (/var/log/vdsm/vdsm.log) reveals more insight:

    2020-02-19 01:46:45,786+0100 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getCapabilities succeeded in 3.33 seconds (__init__:312)
    2020-02-19 01:46:45,981+0100 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call GlusterHost.list failed (error -32601) in 0.00 seconds (__init__:312)
    

    It seems that regular RPC calls to VDSM succeed and only the GlusterFS query is failing. I tracked down the source code of this implementation and found that there is a CLI client that can be used to run such queries:

    # vdsm-client --gluster-enabled -h
    Traceback (most recent call last):
      File "/usr/lib/python3.6/site-packages/vdsmclient/client.py", line 276, in find_schema
        with_gluster=gluster_enabled)
      File "/usr/lib/python3.6/site-packages/vdsm/api/vdsmapi.py", line 156, in vdsm_api
        return Schema(schema_types, strict_mode, *args, **kwargs)
      File "/usr/lib/python3.6/site-packages/vdsm/api/vdsmapi.py", line 142, in __init__
        with io.open(schema_type.path(), 'rb') as f:
      File "/usr/lib/python3.6/site-packages/vdsm/api/vdsmapi.py", line 95, in path
        ", ".join(potential_paths))
    vdsm.api.vdsmapi.SchemaNotFound: Unable to find API schema file, tried: /usr/lib/python3.6/site-packages/vdsm/api/vdsm-api-gluster.pickle, /usr/lib/python3.6/site-packages/vdsm/api/../rpc/vdsm-api-gluster.pickle

    Ah, that's better. I love such error messages. Thanks to that it was not hard to figure out that I had simply forgotten to install the vdsm-gluster package on the hypervisor.
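
    The fix itself is then just the missing package (already listed at the beginning of this post); presumably VDSM also needs a restart to pick up the additional GlusterFS API schema:

    # dnf install vdsm-gluster
    # systemctl restart vdsmd

    After that, the vdsm-client call from above should print the usual help output instead of a traceback.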

That's it. After that, the deployment completed successfully:

And finally a screenshot of the oVirt 4.4-alpha administration console. Yes, it works:

Conclusion

At the end of the day, most of the issues happened because I was not very familiar with the setup procedure and at the same time refused to follow any setup instructions for an older release. There was one minor bug, the ovirt-engine-dwh restart issue, that was already fixed upstream but hadn't yet made it into the hosted-engine appliance image. Something that is to be expected in an alpha release.

I also quickly set up some VMs to test the basic functionality of oVirt and couldn't find any major issues so far. I guess most people using oVirt are much more experienced with it than me anyway, so there shouldn't be any concerns about trying oVirt 4.4-alpha yourself. To me it was an interesting experience and I'm very happy about the Ansible integration that this project is pushing. It was also nice to use Cockpit, and I believe that's definitely something that makes this product more appealing to set up and use for a wide range of IT professionals. As long as everything can be done via the command line too, I'll be happy.

Feb 23 2020

For a while I had an oVirt server in a hyperconverged setup, which means that the hypervisor was also running a GlusterFS storage server and that the oVirt management virtual machine lived inside the oVirt cluster (self-hosted engine). On top of oVirt I was running an OKD cluster with a containerized GlusterFS cluster that could be used for persistent volumes by the container workload. All of this was running on a single hypervisor with a single SSD, which unfortunately gave up on me after a few years in operation. Recently I stumbled upon the oVirt 4.4-alpha. Next to initial support for running it on CentOS 8, proper support for Ignition, which is used by (Fedora and Red Hat) CoreOS and therefore OpenShift/OKD 4, caught my attention. Why not give it a try and see how far I get? After a few hours of tinkering the installation succeeded:

Now I'm going to describe what was necessary to get there. Not everything I'll mention is brand new. My past experience with setting up such a system is more or less based on the guide Up and Running with oVirt 4.0 and Gluster Storage that I followed a few years ago, so I'm also highlighting a few things that have changed since then.

If you are thinking about repeating this installation on your own hardware, please let me remind you: this software is currently in alpha status. This means there are likely still many bugs and rough edges, and even if you manage to install it successfully, there is no guarantee that updates won't break everything again. Please don't try this anywhere close to production systems or data. I won't be able to assist you in any way if things turn out badly.

I split this field report into two parts, the first one discussing the GlusterFS storage setup and the second one explaining my challenges when setting up the oVirt self-hosted-engine.

Hypervisor disk layout

I was using a minimal install of CentOS 8 on a bare-metal server. Make sure you either have two disks or create a separate partition when installing CentOS so that the GlusterFS storage can live on its own block device. My disk layout looks something like this:

[root@loki ~]# lsblk
NAME                    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                       8:0    0 931.5G  0 disk 
├─sda1                    8:1    0   100M  0 part 
├─sda2                    8:2    0   256M  0 part /boot
├─sda3                    8:3    0    45G  0 part 
│ ├─vg_loki-slash       253:0    0    10G  0 lvm  /
│ ├─vg_loki-swap        253:1    0     4G  0 lvm  [SWAP]
│ ├─vg_loki-var         253:2    0     5G  0 lvm  /var
│ ├─vg_loki-home        253:3    0     2G  0 lvm  /home
│ └─vg_loki-log         253:4    0     2G  0 lvm  /var/log
└─sda4                    8:4    0   500G  0 part

The CentOS installation is placed on an LVM volume group (vg_loki on /dev/sda3) with individual volumes for dedicated mount points, and then there is /dev/sda4, an empty disk partition that will be used by GlusterFS later. If you wonder how to do such a setup with the CentOS 8 installer… I don't know. I tried for a moment to configure this setup in the installer, but eventually I gave up, manually partitioned the disk and created the volume group on the shell, and then used the pre-generated setup in the installer, which perfectly detected what I had done.
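
For reference, the manual part boils down to a few commands on the installer shell. The partition boundaries below are only an illustration matching the layout above; adjust them to your disk:

# parted /dev/sda mkpart primary 356MiB 45GiB    # the future /dev/sda3 for the OS volume group
# parted /dev/sda mkpart primary 45GiB 545GiB    # the future /dev/sda4, left untouched for GlusterFS
# pvcreate /dev/sda3
# vgcreate vg_loki /dev/sda3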

Install software requirements

First you need to enable the oVirt 4.4-alpha package repository by installing the corresponding release package:

# dnf install https://resources.ovirt.org/pub/yum-repo/ovirt-release44-pre.rpm

Recently a lot of effort has been invested into incorporating the oVirt setup into the Cockpit Web interface, and it's now even the recommended installation method for the downstream Red Hat Virtualization (RHV). When I set up my previous hyperconverged oVirt 4.0 this wasn't available, so of course I'm going to try it. To set up Cockpit and the oVirt integration, the following packages need to be installed:

# dnf install cockpit cockpit-ovirt-dashboard glusterfs-server
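
If Cockpit is not already running on the host, enabling its socket is a standard extra step; depending on the firewall configuration, port 9090 may also need to be opened:

# systemctl enable --now cockpit.socket
# firewall-cmd --permanent --add-service=cockpit && firewall-cmd --reload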

After logging into Cockpit, which runs on the hypervisor host on port 9090, there is a dedicated oVirt tab with two entries:

If you continue with the hyperconverged setup, there is now even a dedicated option to install a single-node-only GlusterFS “cluster”!

This was a big positive surprise to me, because years ago the previously used gdeploy tool insisted on a three-node GlusterFS cluster.

Running the GlusterFS wizard

After this revelation the GlusterFS setup is supposedly straightforward. Still, I ran into some issues that I could probably have avoided by carefully reading the installation instructions for oVirt 4.3. Nonetheless I'm quickly going to mention a few points here in case other people struggle with the same issues and search the Web for these error messages:

  • On the first screen I had an error that the setup cannot proceed because "gluster-ansible-roles is not installed on Host":

    However, the related package including the Ansible roles from gluster-ansible was clearly there:

    # rpm -q gluster-ansible-roles
    gluster-ansible-roles-1.0.5-7.el8.noarch
    

    Eventually I found that the sudo rules for my unprivileged user account were not properly picked up by Cockpit, so I restarted the setup using the root account, which then successfully detected the Ansible roles.

  • When configuring the bricks, the "Raid Type" must be changed to "JBOD" and the device name that was reserved for the GlusterFS storage must be entered. Eventually the wizard will create a dedicated LVM volume group for the GlusterFS bricks and, if you wish, an LVM thin-pool volume:
    # lvs gluster_vg_sda4
      LV                               VG              Attr       LSize    Pool                             Origin Data%  Meta%  Move Log Cpy%Sync Convert
      gluster_lv_data                  gluster_vg_sda4 Vwi-aot---  125.00g gluster_thinpool_gluster_vg_sda4        3.35                                   
      gluster_lv_engine                gluster_vg_sda4 -wi-ao----   75.00g                                                                                
      gluster_lv_vmstore               gluster_vg_sda4 Vwi-aot---  300.00g gluster_thinpool_gluster_vg_sda4        0.05                                   
      gluster_thinpool_gluster_vg_sda4 gluster_vg_sda4 twi-aot--- <421.00g                                         1.03   0.84
  • Before the Ansible playbook that sets up the storage is executed, the generated inventory file is displayed. Because I'm very familiar with Ansible, I love this part! It also makes it easier to understand which playbook command to run when troubleshooting something on the command line, where involving the Cockpit Web interface doesn't make sense anymore:
    The "Enable Debug Logging" option is definitely worth enabling, especially if you run this for the first time on your server. It gives you much more insight into what Ansible is actually doing on your hypervisor.
  • When finally running the "Deploy" step, it didn't take long for the playbook to fail with the following error:
    TASK [Check if provided hostnames are valid] ***********************************
    task path: /usr/share/cockpit/ovirt-dashboard/ansible/hc_wizard.yml:29
    fatal: [loki.oasis.home]: FAILED! => {"msg": "The conditional check 'result.results[0]['stdout_lines'] > 0' failed. The error was: Unexpected templating type error occurred on ({% if result.results[0]['stdout_lines'] > 0 %} True {% else %} False {% endif %}): '>' not supported between instances of 'list' and 'int'"}
    

    The involved code has been untouched for a long time, so I reported the issue at RHBZ #1806298. I'm not sure how this could ever have worked, but the fix is trivial. After changing the following lines the task passed successfully:

    --- /usr/share/cockpit/ovirt-dashboard/ansible/hc_wizard.yml.orig       2020-02-18 14:48:33.678471259 +0100
    +++ /usr/share/cockpit/ovirt-dashboard/ansible/hc_wizard.yml    2020-02-18 14:48:55.810456470 +0100
    @@ -30,7 +30,7 @@
           assert:
             that:
               - "result.results[0]['rc'] == 0"
    -          - "result.results[0]['stdout_lines'] > 0"
    +          - "result.results[0]['stdout_lines'] | length > 0"
             fail_msg: "The given hostname is not valid FQDN"
           when: gluster_features_fqdn_check | default(true)

    By the way, the error message can not only be seen in the Web UI but is also written to a log file: /var/log/cockpit/ovirt-dashboard/gluster-deployment.log

  • There was another issue that cost me more effort to track down. The deployment playbook was failing to add the firewalld rules for GlusterFS:
    TASK [gluster.infra/roles/firewall_config : Add/Delete services to firewalld rules] ***
    task path: /etc/ansible/roles/gluster.infra/roles/firewall_config/tasks/main.yml:24
    failed: [loki.oasis.home] (item=glusterfs) => {"ansible_loop_var": "item", "changed": false, "item": "glusterfs", "msg": "ERROR: Exception caught: org.fedoraproject.FirewallD1.Exception: INVALID_SERVICE: 'glusterfs' not among existing services Permanent and Non-Permanent(immediate) operation, Services are defined by port/tcp relationship and named as they are in /etc/services (on most systems)"}

    Indeed firewalld doesn't know about a 'glusterfs' service:

    # rpm -q firewalld
    firewalld-0.7.0-5.el8.noarch
    # firewall-cmd --get-services | grep glusterfs
    #
    

    Is this an old version? Where do I get the 'glusterfs' firewalld service definition from? The solution is as simple as it is embarrassing: the service definition is packaged as part of the glusterfs-server RPM, which was still missing on my server. After installing it, this issue was solved as well.
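
    If you want to double-check where the definition comes from after installing the package, these two commands should do (my assumption being that firewalld needs a reload to pick up new service files):

    # rpm -ql glusterfs-server | grep firewalld
    # firewall-cmd --reload && firewall-cmd --get-services | grep -o glusterfs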

Eventually the deployment succeeded and the CentOS 8 host was converted into a single node GlusterFS storage cluster:

# gluster volume status
Status of volume: data
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick loki.oasis.home:/gluster_bricks/data/
data                                        49153     0          Y       23806
 
Task Status of Volume data
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: engine
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick loki.oasis.home:/gluster_bricks/engin
e/engine                                    49152     0          Y       23527
 
Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: vmstore
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick loki.oasis.home:/gluster_bricks/vmsto
re/vmstore                                  49154     0          Y       24052
 
Task Status of Volume vmstore
------------------------------------------------------------------------------
There are no active volume tasks

Conclusion

I'm super pleased with the installation experience so far. The new GlusterFS Ansible roles had no issues setting up the bricks and volumes. The Cockpit Web-GUI was easy to use and always clearly communicated what was going on. There is now a supported configuration of a one node hyperconverged oVirt setup which makes me happy too. Kudos to everyone involved with this, great work!

Now let's continue with the oVirt self-hosted engine setup.

Dec 20 2016

For a long time I have been using and following the development of the LXC (Linux Containers) project. I feel that it unfortunately never really had the success it deserved, and in recent years new technologies such as Docker and rkt have pretty much redefined the common understanding of a container on their own terms. Nonetheless LXC still claims its niche as a full Linux operating system container solution, especially suited for persistent pet containers, an area where the new players on the market are still figuring out how to implement this properly within their concepts. LXC development hasn't stalled, quite the contrary: they extended the API with an HTTP REST interface (served via the Linux Container Daemon, LXD), implemented support for container live migration, added container image management and much more. This means that there are a lot of reasons why someone, including me, would want to use Linux containers and LXD.

Enable LXD COPR repository
LXD is not officially packaged for Fedora. Therefore I spent the last few weeks creating some community packages via Fedora's COPR build system and repository service. Similar to the better known Ubuntu PPA (Personal Package Archive) system, COPR provides an RPM package repository which can easily be consumed by Fedora users. To use the LXD repository, all you need to do is enable it via dnf:

# dnf copr enable ganto/lxd

Please note that COPR packages are not reviewed by the Fedora package maintainers, therefore you should only install packages from authors you trust. For this reason I also provide a GitHub repository with the RPM spec files, so that everyone can build the RPMs on their own if they feel uncomfortable using the pre-built RPMs from the repository.

Install and start LXD
LXD is split into multiple packages. The important ones are lxd, the Linux Container Daemon, and lxd-client, which provides the LXD client binary called lxc. Install them with:

# dnf install lxd lxd-client

Unfortunately I didn't have time to figure out the correct SELinux labels for LXD yet, therefore you need to disable SELinux prior to starting the daemon (see the commands further below). LXD supports user namespaces to map the root user in a container to an unprivileged user ID on the container host. For this you need to assign a UID/GID range to root on the host:

# echo "root:1000000:65536" >> /etc/subuid
# echo "root:1000000:65536" >> /etc/subgid

If you don't do this, user namespaces won't be used, which is indicated by log messages such as:

lvl=warn msg="Error reading idmap" err="User \"root\" has no subuids."
lvl=warn msg="Only privileged containers will be able to run"

Eventually start LXD with:

# systemctl start lxd.service

LXD configuration
LXD doesn't have a configuration file. Configuration properties must be set and retrieved via client commands; a list of all supported configuration properties can be found in the LXD documentation. Most tutorials suggest initially running lxd init, which generates a basic configuration. However, only a limited set of configuration options is available via this command, so I prefer to set the properties via the LXD client. A normal user account can be used to manage LXD via the client when it is a member of the lxd POSIX group:

# usermod --append --groups lxd myuser

By default LXD stores its images and containers in directories under /var/lib/lxd. Alternative storage back-ends such as LVM, Btrfs or ZFS are available. Here I will show an example of how to use LVM. Similar to the recommended Docker setup on Fedora, it will use LVM thin volumes to store images and containers. First create an LVM thin pool. For this we need some free space in the default volume group; alternatively you can use a second disk with a dedicated volume group. Replace vg00 with the volume group name you want to use:

# lvcreate --size 20G --type thin-pool --name lxd-pool vg00

Now we set this thin pool as storage back-end in LXD:

$ lxc config set storage.lvm_vg_name vg00
$ lxc config set storage.lvm_thinpool_name lxd-pool

For each image that is downloaded, LXD will create a thin volume storing the image. When a new container is instantiated, a new writable snapshot is created, from which you can create an image again or take further snapshots for fast rollback. By default the container file system will be ext4. If you prefer XFS, it can be set with the following command:

$ lxc config set storage.lvm_fstype xfs

For networking, various options are available as well. If you ran lxd init, you may have already created a lxdbr0 network bridge. Otherwise I will show you how to manually create one in case you want a dedicated container bridge, or how to attach LXD to an already existing bridge which is configured through an external DHCP server.

To create a dedicated network bridge where the traffic will be NAT‘ed to the outside, run:

$ lxc network create lxdbr0

This will create a bridge device with the given name and also start up a dedicated instance of dnsmasq which acts as DNS and DHCP server for the container network.

A big advantage of LXD compared to plain LXC is a feature called container profiles. There you can define settings which should be applied to new container instances. In our case, we want containers to use the network bridge created before (or any other network bridge that was created independently). For this it is added to the "default" profile, which is applied by default when creating a new container:

$ lxc network attach-profile lxdbr0 default eth0

Here eth0 is the network device name that will be used inside the container. We could also add multiple network bridges, or create multiple profiles (lxc profile create newprofile) with different network settings.
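
To double-check what ended up in the profile, its configuration can be printed at any time:

$ lxc profile show default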

Create a container
Finally we have the most important pieces together to launch a container. A container is always instantiated from an image. The LXC project provides an image repository with a large number of prebuilt container images, pre-configured under the remote name images:. The images are regular LXC containers created via the upstream lxc-create script using the various distribution templates. To list the available images run:

$ lxc image list images:

Once you have found an image you want to run, it can be started as follows. Of course in my example I will use a Fedora 24 container (unfortunately there are no Fedora 25 images available yet, but I'm working on that too):

$ lxc launch images:fedora/24 my-fedora-container

With the following command you can open a shell session inside the container:

$ lxc exec my-fedora-container /bin/bash
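
The list of containers, their state and their IP addresses can be shown with:

$ lxc list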

I hope this short guide made you curious to try LXD on Fedora. I'm glad to hear feedback via comments or email if you find this guide or my COPR repository useful, or if you have corrections or found some issues.

Further reading
If you want to know more about how to use the individual features of LXD, I can recommend the how-to series of Stéphane Graber, one of the core developers of LXC/LXD.

Dec 01 2015

I recently had the challenge of migrating a physical host running Debian, installed on an ancient 32-bit machine with a software RAID 1, to a virtual machine (VM). The migration should be as simple as possible, so I decided to attach the original disks to the virtualization host and define a new VM which would boot from these disks. This doesn't sound too complicated, but it still included some steps I was not so familiar with, so I'm writing them down here for later reference. As an experiment I tried the migration on both a KVM and a Xen host.

Prepare original host for virtualized environment

Enable virtualized disk drivers
The host I wanted to migrate was running a Debian Jessie i686 installation on bare metal hardware. For booting it uses an initramfs which loads the required disk controller drivers. By default the initramfs only includes the drivers for the current hardware, so the paravirtualization drivers for KVM (virtio-blk) or Xen (xen-blkfront) are missing. This can be changed by adjusting the /etc/initramfs-tools/conf.d/driver-policy file so that not only the modules for the currently used hardware (dep) are included, but most:

# Driver inclusion policy selected during installation
# Note: this setting overrides the value set in the file
# /etc/initramfs-tools/initramfs.conf
MODULES=most

Afterwards the initramfs must be rebuilt by running the following command (adjust the kernel version according to the kernel you are running):

# dpkg-reconfigure linux-image-3.16.0-4-686-pae

Now the initramfs contains all the required modules to successfully boot the host from a KVM virtio or a Xen blkfront disk device. This can be checked with:

# lsinitramfs /boot/initrd.img-3.16.0-4-686-pae | egrep -i "(xen|virtio).*blk.*\.ko"
lib/modules/3.16.0-4-686-pae/kernel/drivers/block/virtio_blk.ko
lib/modules/3.16.0-4-686-pae/kernel/drivers/block/xen-blkback/xen-blkback.ko
lib/modules/3.16.0-4-686-pae/kernel/drivers/block/xen-blkfront.ko

Enable serial console
To easily access the server console from the virtualization host, it's necessary to set up a serial console in the guest, which is configured in /etc/default/grub:

[...]
GRUB_CMDLINE_LINUX="console=tty0 console=ttyS0,115200n8"
GRUB_SERIAL_COMMAND="serial --unit=0 --speed=115200 --word=8 --parity=no --stop=1"
[...]
# Uncomment to disable graphical terminal (grub-pc only)
GRUB_TERMINAL="console serial"

After regenerating the Grub configuration, the boot loader and local console will then also be accessible via virsh console:

# grub-mkconfig -o /boot/grub/grub.cfg
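
Once the VM has been defined (as shown in the following sections), attaching to this serial console is then simply a matter of (myhost being the domain name used in the examples below):

# virsh console myhost
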
Run the host on a KVM hypervisor

KVM is nowadays the de-facto default virtualization solution for running Linux on Linux and is very well integrated in all major distributions and a wide range of management tools. Therefore it’s really simple to boot the disks into a KVM VM:

  • Make sure the disks are not already assembled as an MD device on the KVM host. I did this by uninstalling the mdadm utility.
  • Run the following virt-install command to define and start the VM:
    # virt-install --connect qemu:///system --name myhost --memory 1024 --vcpus 2 --cpu host --os-variant debian7 --arch i686 --import --disk /dev/disk/by-id/ata-WDC_WD10EFRX-68PJCN0_WD-WCC4J4527214 --disk /dev/disk/by-id/ata-WDC_WD10EFRX-68PJCN0_WD-WCC4J4366412 --network bridge=virbr0 --graphics=none

To avoid confusion with the disk enumeration of the KVM host kernel, the disk devices are given as unique device paths instead of e.g. /dev/sdb and /dev/sdc. This clearly identifies the two disk devices, which will then be assembled into a RAID 1 by the guest kernel.

Run the host on a Xen hypervisor

Xen has been a topic on this blog a few times before, as I have been using it since its early days. After a long period of major code refactoring it has also made huge steps forward within the last few years, so that nowadays it's nearly as easy to set up as KVM and again well supported in the large distributions. Because its architecture is a bit different and Linux is not the only supported platform, some things don't work quite the same as under KVM. Let's find out…

Prepare Xen Grub bootloader
Xen virtual machines can be booted in various ways, also depending on whether the guest disk is attached to an emulated disk controller or as a paravirtualized disk. In simple cases, this is done with the help of PyGrub, loading a Grub Legacy grub.conf or a Grub 2 grub.cfg from the guest's /boot partition. As my grub.cfg and kernel are “hidden” in a RAID 1, a fully featured Grub 2 is required. It is perfectly able to load its configuration from advanced disk setups such as RAID arrays, LVM volumes and more. With Xen such a setup is only possible when using a dedicated boot loader disk image, which must be built for the target architecture of the virtual machine. As I want to run a 32-bit virtual machine, an i386-xen Grub 2 flavor needs to be compiled as the basis for the boot loader image:

  • Clone the Grub 2 source code:
    $ git clone git://git.savannah.gnu.org/grub.git
  • Configure and build the source code. I was running the following commands on a Gentoo Xen server without issues. On other distributions you might need to install some development packages first:
    $ cd grub
    $ ./autogen.sh
    $ ./configure --target=i386 --with-platform=xen
    $ make -j3
  • Prepare a grub.cfg for the Grub image. The Grub 2 from the Debian guest host was installed on a mirrored boot disk called /dev/md0 which was mounted to /boot. Therefore the Grub image must be hinted to load the actual configuration from there:
    # grub.cfg for Xen Grub image grub-md0-i386-xen.img
    normal (md/0)/grub/grub.cfg
  • Create the Grub image using the previously compiled Grub modules:
    # grub-mkimage -O i386-xen -c grub.cfg -o /var/lib/libvirt/images/grub-md0-i386-xen.img -d ~user/grub/grub-core ~user/grub/grub-core/*.mod

A more generic guide about generating Xen Grub images can be found at Using Grub 2 as a bootloader for Xen PV guests.

Define the virtual machine
virt-install can also be used to import disks (or disk images) as Xen virtual machines. This works perfectly fine e.g. for importing Atomic Host raw images, but unfortunately the setup here is a bit more complicated: it requires specifying the custom Grub 2 image as the guest kernel, and I couldn't find a corresponding virt-install argument for that (using v1.3.0). Therefore I first created a plain Xen xl configuration file and then converted it to a libvirt XML file. The initial configuration /etc/xen/myhost.cfg looks as follows:

name = "myhost"
kernel = "/var/lib/libvirt/images/grub-md0-i386-xen.img"
memory = 1024
vcpus = 2
vif = [ 'bridge=virbr0' ]
disk = [ '/dev/disk/by-id/ata-WDC_WD10EFRX-68PJCN0_WD-WCC4J4527214,raw,xvda,rw', '/dev/disk/by-id/ata-WDC_WD10EFRX-68PJCN0_WD-WCC4J4366412,raw,xvdb,rw' ]

Eventually this configuration can be converted to a libvirt domain XML and the domain defined with the following commands:

# virsh domxml-from-native --format xen-xl --config /etc/xen/myhost.cfg > /etc/libvirt/libxl/myhost.xml
# virsh --connect xen:/// define /etc/libvirt/libxl/myhost.xml
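
After defining the domain, it can be started and its console attached in the usual way:

# virsh --connect xen:/// start myhost
# virsh --connect xen:/// console myhost
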
Summary

In the end, I successfully converted the bare metal host to a KVM and/or Xen virtual machine. Once again Xen ended up being a bit more troublesome, but no nasty hacks were required and both configurations were quite straightforward. The boot loader setup for Xen could definitely be a bit simpler, but it's still nice to see how well Grub 2 and Xen play together now. It's also pleasing that no virtualization-specific configuration had to be made within the guest installation. The fact that the disk device names changed from /dev/sd[ab] (bare metal) to /dev/vd[ab] (KVM) and /dev/xvd[ab] (Xen) was completely absorbed by the mdadm layer, but even without software RAID the /etc/fstab entries are nowadays generated by the installers in a way that makes such migrations easily possible.

Thanks for reading. If you have some comments, don’t hesitate to write a note below.