Enabling KVM virtualization for Raspberry Pi 2


As I wrote in my previous post, Enabling HYP mode on the Raspberry Pi 2, the newest machine from the Raspberry Pi Foundation features a Cortex-A7 with Virtualization Extensions, but it isn’t possible to make use of this feature out of the box.

In that article I showed that it was possible to start the kernel in HYP mode. Now I’ll cover the rest of the steps needed to enable KVM virtualization and run your first guest OS.

Isolating a core

With KVM on ARM, interrupts are dealt with in a very particular way. If a hardware interrupt is received while a core is running in guest mode, execution is interrupted and a full world switch takes place, returning to the kernel in host mode. Interrupts are then reenabled and the core traps again, this time following the usual route.

Apparently, the BCM2836 (the core of the RPi2) doesn’t like this behavior, and if you try to run a guest on a core that can receive physical interrupts, you’ll find your RPi2 hangs within a few seconds. Without a JTAG or a full development board, debugging this problem is very, very hard.

Luckily, there’s a pretty simple workaround. Using the kernel option isolcpus we can isolate a core, so Linux doesn’t assign any tasks (including IRQ lines) to it.
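
As a rough sketch, the option can be appended to a single-line kernel command-line file with a small helper like the one below. The function name add_kernel_param and the sample command line are made up for illustration, and the demo works on a temporary file rather than the real /boot/cmdline.txt.

```shell
#!/bin/sh
# add_kernel_param FILE PARAM: append PARAM to the single-line kernel
# command line stored in FILE, unless it is already present.
# (Hypothetical helper, not part of any Raspberry Pi tooling.)
add_kernel_param() {
    file=$1
    param=$2
    # Read the first line only; the command line must stay on one line.
    IFS= read -r line < "$file" || true
    case " $line " in
        *" $param "*) return 0 ;;   # already there, nothing to do
    esac
    printf '%s %s\n' "$line" "$param" > "$file"
}

# Demo on a temporary file with a made-up command line.
tmp=$(mktemp)
printf 'console=tty1 root=/dev/mmcblk0p2 rootwait\n' > "$tmp"
add_kernel_param "$tmp" isolcpus=3
add_kernel_param "$tmp" isolcpus=3   # second call changes nothing
cat "$tmp"
```

On the Pi itself the helper would be run as root against /boot/cmdline.txt, ideally after saving a copy of the file.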

In our case, we’re going to isolate core number 3 (counting from zero). This is pretty straightforward: just edit /boot/cmdline.txt and add isolcpus=3 at the end of the line.

A kernel with VGIC emulation for the RPi2

In my previous article about RPi2 emulation, I wrote about the BCM2836 and its lack of a GIC, which makes emulation a bit harder. To work around this issue, I’ve implemented VGIC emulation inside the Linux kernel.

To be able to use both the customized bootloader and this modified version of the kernel, we need to generate a bundle with both components, and the dtb concatenated at the end.

The easy way. Using a prebuilt image

I’ve uploaded a prebuilt image here (md5sum: 356788d260e1a93376c5b8acbb63da13), which contains the bootloader, some padding (to put the zImage at 0x8000), the kernel and the dtb. Simply replace your kernel with it (save a copy first!):

mv /boot/kernel7.img /boot/kernel7.img.bak
cp kernel7.img /boot

Then you’ll need to add the kernel_old=1 option to your config.txt:

echo "kernel_old=1" >> /boot/config.txt

That’s it! On the next boot, Linux should say something like this:

[    0.154131] CPU: All CPU(s) started in HYP mode.
[    0.154158] CPU: Virtualization extensions available.
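
A small sketch for checking this from a script rather than by eye. The function name started_in_hyp is an assumption; it simply looks for the first message shown above in whatever log text it receives on standard input.

```shell
#!/bin/sh
# started_in_hyp: succeed if the kernel log on stdin reports HYP mode.
# (Hypothetical helper; it only matches the message quoted above.)
started_in_hyp() {
    grep -qF 'CPU: All CPU(s) started in HYP mode'
}

# Demo with the two lines from the boot log above.
sample='[    0.154131] CPU: All CPU(s) started in HYP mode.
[    0.154158] CPU: Virtualization extensions available.'

if printf '%s\n' "$sample" | started_in_hyp; then
    echo "HYP mode: yes"
else
    echo "HYP mode: no"
fi
```

On the Pi, the input would come from dmesg (dmesg | started_in_hyp). With KVM built into the kernel, the /dev/kvm device node should also be present.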

Building it yourself (Part I: the bootloader)

If you want to build the image yourself, you’ll need to grab an ARM cross toolchain. I’m using the one from the GCC ARM Embedded project, which works just fine. Then add it to your $PATH, grab the code from my repo, and build it:

export PATH=$PATH:/home/slp/sources/gcc-arm-none-eabi-4_9-2014q4/bin
git clone https://github.com/slp/rpi2-hyp-boot.git
cd rpi2-hyp-boot
make

That should produce a bootblk.bin, which contains the boot code and 32k of padding.

Building it yourself (Part II: kernel and dtb)

You can find a complete guide for building the kernel on a variety of host systems here: Raspberry Pi Kernel Compilation. Read that, then grab the sources from my repo and make sure the following options are enabled:

  * `Patch physical to virtual translations at runtime`
  * `General setup -> Control Group support`
  * `System Type -> Support for Large Physical Address Extension`
  * `Boot options -> Use appended device tree blob to zImage (EXPERIMENTAL)`
  * `Boot options -> Supplement the appended DTB with traditional ATAG information`
  * `Device Drivers -> Block devices -> Loopback device support`
  * `Virtualization`
  * `Virtualization -> Kernel-based Virtual Machine (KVM) support (NEW)`
  * **DISABLE** `Virtualization -> KVM support for Virtual GIC`
  * **ENABLE** `Virtualization -> KVM support for Emulated GIC`
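
To double-check the resulting .config, a sketch like the following can list which options are still missing. The symbol names are the mainline Kconfig symbols believed to sit behind the menu entries above (an assumption worth verifying in menuconfig). The two GIC entries are left out because the symbol for the emulated GIC in the modified tree is not given here. The function name missing_options and the sample .config fragment are made up.

```shell
#!/bin/sh
# missing_options CONFIG SYMBOL...: print each symbol not set to y or m.
# (Hypothetical helper; it only greps the .config file.)
missing_options() {
    config=$1
    shift
    for sym in "$@"; do
        grep -Eq "^CONFIG_${sym}=[ym]\$" "$config" || echo "$sym"
    done
}

# Assumed mapping from the menu entries to Kconfig symbols.
SYMBOLS="ARM_PATCH_PHYS_VIRT CGROUPS ARM_LPAE ARM_APPENDED_DTB ARM_ATAG_DTB_COMPAT BLK_DEV_LOOP VIRTUALIZATION KVM"

# Demo on a small made-up .config fragment.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
CONFIG_ARM_PATCH_PHYS_VIRT=y
CONFIG_CGROUPS=y
CONFIG_ARM_LPAE=y
CONFIG_ARM_APPENDED_DTB=y
# CONFIG_ARM_ATAG_DTB_COMPAT is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_VIRTUALIZATION=y
EOF

# SYMBOLS is left unquoted on purpose so it splits into separate words.
missing_options "$cfg" $SYMBOLS
```

Against a real tree, the first argument would be the .config at the top of the kernel sources.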

Now build both kernel and dtb:

make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- zImage dtbs

Building it yourself (Part III: bundling all pieces)

This is the easy part, just concatenate them:

cat rpi2-hyp-boot/bootblk.bin linux/arch/arm/boot/zImage linux/arch/arm/boot/dts/bcm2709-rpi-2-b.dtb > kernel7.img
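
Since the zImage has to land at 0x8000, a slightly more careful sketch of the same step checks the boot block size before concatenating. It assumes bootblk.bin is meant to be exactly 0x8000 (32768) bytes long; the function name bundle_kernel and the dummy files in the demo are made up.

```shell
#!/bin/sh
# bundle_kernel BOOTBLK ZIMAGE DTB OUT: concatenate the three pieces after
# checking that the boot block is exactly 0x8000 bytes, so that the zImage
# starts at that offset. (Hypothetical helper; the size is an assumption.)
bundle_kernel() {
    boot_size=$(( $(wc -c < "$1") ))
    if [ "$boot_size" -ne $((0x8000)) ]; then
        echo "boot block is $boot_size bytes, expected 32768" >&2
        return 1
    fi
    cat "$1" "$2" "$3" > "$4"
}

# Demo with dummy files standing in for the real build products.
dir=$(mktemp -d)
dd if=/dev/zero of="$dir/bootblk.bin" bs=1024 count=32 2>/dev/null
printf 'KERNEL' > "$dir/zImage"
printf 'DTB' > "$dir/board.dtb"
bundle_kernel "$dir/bootblk.bin" "$dir/zImage" "$dir/board.dtb" "$dir/kernel7.img" &&
    echo "bundle written to $dir/kernel7.img"
```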

Now you can use it in the same way as the prebuilt image provided above.

Patched QEMU installation

To be able to launch our first Guest, we need a recent QEMU, patched to ensure the guest runs on the isolated core. The one that comes with Raspbian is quite old, so you have to build a newer one.

Download the build dependencies:

pi@raspberrypi:~$ sudo apt-get build-dep qemu
pi@raspberrypi:~$ sudo apt-get install libpixman-1-dev

Then, create a directory for the sources, download QEMU 2.2 and uncompress it:

pi@raspberrypi:~$ mkdir srcs
pi@raspberrypi:~$ cd srcs
pi@raspberrypi:~/srcs$ wget http://wiki.qemu-project.org/download/qemu-2.2.0.tar.bz2
pi@raspberrypi:~/srcs$ tar xf qemu-2.2.0.tar.bz2

We need to apply this patch to make sure QEMU runs our guest on the core we isolated with the isolcpus option:

pi@raspberrypi:~/srcs$ cd qemu-2.2.0/
pi@raspberrypi:~/srcs/qemu-2.2.0$ patch -p1 < ~/qemu-cpu-affinity.patch
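
The patch itself is not reproduced here. As a rough way to check its effect once a guest is running, the CPU affinity of each thread of a process can be read from /proc on Linux. The function name thread_cpus is made up, and the demo inspects the current shell rather than QEMU.

```shell
#!/bin/sh
# thread_cpus PID: for every thread of PID, print "TID: allowed CPU list",
# taken from the Cpus_allowed_list field in /proc.
# (Hypothetical helper; Linux only.)
thread_cpus() {
    for t in /proc/"$1"/task/*; do
        cpus=$(sed -n 's/^Cpus_allowed_list:[[:space:]]*//p' "$t/status")
        printf '%s: %s\n' "${t##*/}" "$cpus"
    done
}

# Demo: show the affinity of the current shell.
thread_cpus $$
```

With a guest running, passing the PID of qemu-system-arm instead would presumably show the vCPU thread restricted to core 3.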

Now run the configure script with options for enabling KVM (it is enabled automatically if supported, but this way we ensure we get warned if it’s not going to be built with it) and for building the ARM emulation target only:

pi@raspberrypi:~/srcs/qemu-2.2.0$ ./configure --enable-kvm --target-list=arm-softmmu

Finally, build and install it (by default, the new binaries will reside in /usr/local/bin):

pi@raspberrypi:~/srcs/qemu-2.2.0$ make
pi@raspberrypi:~/srcs/qemu-2.2.0$ sudo make install

A kernel for the Guest

When running QEMU with KVM, the hardware emulated is a Versatile Express A15, one of the reference platforms provided by ARM Holdings. So, for our Guest we need a kernel which supports this board, and the corresponding dtb.

I’ve uploaded prebuilt binaries for both kernel (md5sum: 7c4831e852d6dda2145dd04fe3c2b464) and dtb (md5sum: 249885543f0fcca2ce7a011ef5157e7d).

If you want to build the kernel yourself, you’ll need a vanilla kernel (the one from RPi2’s repo wouldn’t build). In case of doubt, just grab the latest stable from kernel.org. Use the default configuration for Versatile Express A15 (make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- vexpress_defconfig) and make sure you enable these options:

  * `General setup -> Configure standard kernel features (expert users)`
  * `General setup -> open by fhandle syscalls`
  * `Enable the block layer -> Support for large (2TB+) block devices and files`

Then build both kernel and dtb:

make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- zImage dtbs

This will generate linux/arch/arm/boot/zImage and linux/arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb. Copy both files to your Raspberry.

Running our first Guest

In addition to the kernel, we also need a root filesystem with the distribution we want to use for our userland. As we’re going to virtualize an ARMv7 CPU, the best option is using an earmv7hf distribution. In this guide, we’re going to use a minimal OpenSuSE image (JeOS), but you can choose the one you prefer.

Create a directory with the files needed by the Guest:

pi@raspberrypi:~$ mkdir -p ~/kvm-arm/opensuse
pi@raspberrypi:~$ cd ~/kvm-arm/opensuse

Now we create a raw image with OpenSuSE’s userland:

pi@raspberrypi:~/kvm-arm/opensuse$ wget http://download.opensuse.org/ports/armv7hl/factory/images/openSUSE-Factory-ARM-JeOS.armv7-rootfs.armv7l-Current.tbz
pi@raspberrypi:~/kvm-arm/opensuse$ qemu-img create -f raw opensuse-factory.img 1G
Formatting 'opensuse-factory.img', fmt=raw size=1073741824
pi@raspberrypi:~/kvm-arm/opensuse$ sudo losetup /dev/loop0 opensuse-factory.img
pi@raspberrypi:~/kvm-arm/opensuse$ sudo mkfs.ext4 /dev/loop0
mke2fs 1.42.5 (29-Jul-2012)
Discarding device blocks: done                            
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
65536 inodes, 262144 blocks
13107 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks: 
    32768, 98304, 163840, 229376

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

pi@raspberrypi:~/kvm-arm/opensuse$ sudo mount /dev/loop0 /mnt
pi@raspberrypi:~/kvm-arm/opensuse$ sudo tar xf openSUSE-Factory-ARM-JeOS.armv7-rootfs.armv7l-Current.tbz -C /mnt
pi@raspberrypi:~/kvm-arm/opensuse$ sudo umount /mnt
pi@raspberrypi:~/kvm-arm/opensuse$ sudo losetup -d /dev/loop0
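
The steps above assume that /dev/loop0 is free. A slightly more defensive sketch of the same procedure asks losetup for the first free loop device and always cleans up afterwards. The function name make_rootfs_image, the temporary mount point and the argument order are assumptions. The block only defines the function, since running it needs root and the downloaded tarball.

```shell
#!/bin/sh
# make_rootfs_image TARBALL IMAGE SIZE: create a raw image of SIZE, format
# it as ext4 and unpack TARBALL into it.
# (Hypothetical helper wrapping the commands shown above.)
make_rootfs_image() {
    tarball=$1
    image=$2
    size=$3
    qemu-img create -f raw "$image" "$size" || return 1
    mnt=$(mktemp -d) || return 1
    # Let losetup pick the first free loop device instead of assuming loop0.
    loopdev=$(sudo losetup -f --show "$image") || { rmdir "$mnt"; return 1; }
    sudo mkfs.ext4 -q "$loopdev" &&
        sudo mount "$loopdev" "$mnt" &&
        sudo tar xf "$tarball" -C "$mnt"
    status=$?
    # Always try to unmount and detach, even if a step above failed.
    sudo umount "$mnt" 2>/dev/null
    sudo losetup -d "$loopdev"
    rmdir "$mnt"
    return $status
}
```

For the image used in this guide, the call would be make_rootfs_image openSUSE-Factory-ARM-JeOS.armv7-rootfs.armv7l-Current.tbz opensuse-factory.img 1G.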

And finally, launch your first Guest:

pi@raspberrypi:~/kvm-arm/opensuse$ sudo qemu-system-arm -enable-kvm -smp 1 -m 256 -M vexpress-a15 -cpu host -kernel /home/pi/vexpress-zImage -dtb /home/pi/vexpress-v2p-ca15-tc1.dtb -append "root=/dev/vda console=ttyAMA0 rootwait" -drive if=none,file=/home/pi/kvm-arm/opensuse/opensuse-factory.img,id=factory -device virtio-blk-device,drive=factory -net nic,macaddr=02:fd:01:de:ad:34 -net tap -monitor null -serial stdio -nographic
audio: Could not init `oss' audio driver
Booting Linux on physical CPU 0x0
Initializing cgroup subsys cpuset
Linux version 3.19.1 (slp@linux-ni2o) (gcc version 4.5.3 (GCC) ) #1 SMP Wed Mar 18 10:52:22 CET 2015
CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=10c5387d
CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
Machine model: V2P-CA15
Memory policy: Data cache writealloc
PERCPU: Embedded 9 pages/cpu @8fddd000 s7232 r8192 d21440 u36864
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 65024
Kernel command line: root=/dev/vda console=ttyAMA0 rootwait
PID hash table entries: 1024 (order: 0, 4096 bytes)
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 252936K/262144K available (4839K kernel code, 184K rwdata, 1324K rodata, 256K init, 147K bss, 9208K reserved, 0K cma-reserved)
[  OK  ] Reached target Multi-User System.
[  OK  ] Reached target Graphical Interface.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Started Update UTMP about System Runlevel Changes.

Welcome to openSUSE Factory "Tumbleweed" - Kernel 3.19.1 (ttyAMA0).

linux login:
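
The QEMU command line above is long enough that it is easier to tweak when wrapped in a small script. The sketch below uses the same options, grouped by purpose. The function name run_guest and the DRY_RUN variable are made up; by default the command is only printed, not executed.

```shell
#!/bin/sh
# run_guest KERNEL DTB IMAGE: build the QEMU command line used above.
# With DRY_RUN=1 (the default here) the command is printed instead of run.
# (Hypothetical wrapper; the options are the ones from the command above.)
run_guest() {
    kernel=$1
    dtb=$2
    image=$3
    # Option groups, in order: KVM and CPU, machine and memory, kernel and
    # dtb, kernel command line, virtio disk, network, console.
    set -- qemu-system-arm \
        -enable-kvm -cpu host \
        -M vexpress-a15 -smp 1 -m 256 \
        -kernel "$kernel" -dtb "$dtb" \
        -append "root=/dev/vda console=ttyAMA0 rootwait" \
        -drive "if=none,file=$image,id=factory" \
        -device virtio-blk-device,drive=factory \
        -net nic,macaddr=02:fd:01:de:ad:34 -net tap \
        -monitor null -serial stdio -nographic
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        sudo "$@"
    fi
}

# The image path is where the raw image was created earlier in this guide.
run_guest /home/pi/vexpress-zImage \
    /home/pi/vexpress-v2p-ca15-tc1.dtb \
    /home/pi/kvm-arm/opensuse/opensuse-factory.img
```

Setting DRY_RUN=0 in the environment would launch the guest for real.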

Now we can start playing with this Guest. The default credentials are root/linux:

linux login: root
Last login: Tue Mar 17 18:01:41 on ttyAMA0
Have a lot of fun...
linux:~ # cat /proc/cpuinfo
processor   : 0
model name  : ARMv7 Processor rev 5 (v7l)
BogoMIPS    : 38.40
Features    : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm 
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part    : 0xc07
CPU revision    : 5

Hardware    : ARM-Versatile Express
Revision    : 0000
Serial      : 0000000000000000
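
The CPU part field is worth a second look: 0xc07 is the part number of a Cortex-A7, so with -cpu host the guest sees the Pi’s real core even though the emulated board is a Versatile Express A15. A minimal sketch for decoding it follows; the function name cpu_core_name is made up, and only the two part numbers relevant here are mapped.

```shell
#!/bin/sh
# cpu_core_name: read /proc/cpuinfo-style text on stdin and name the core
# from its "CPU part" field.
# (Hypothetical helper; only two ARM part numbers are known to it.)
cpu_core_name() {
    part=$(sed -n 's/^CPU part[[:space:]]*:[[:space:]]*//p' | head -n 1)
    case "$part" in
        0xc07) echo "Cortex-A7" ;;
        0xc0f) echo "Cortex-A15" ;;
        *)     echo "unknown ($part)" ;;
    esac
}

# Demo with two lines taken from the output above.
printf 'CPU implementer : 0x41\nCPU part    : 0xc07\n' | cpu_core_name
```

Inside the guest, the input would be /proc/cpuinfo itself (cpu_core_name < /proc/cpuinfo).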

The end

And that’s it, now you can extract the full potential of your Raspberry Pi 2!

Speaking for myself, I’m going to work on getting MMIO support on NetBSD (without this, the performance is really awful), and then I’ll be able to build a hybrid Linux/NetBSD SD card image, which was my initial motivation for all this work ;-)