Previously on OpenStack Crime Investigation … Two load balancers, running as virtual machines in our OpenStack-based cloud and sharing a keepalived-based highly available IP address, started to flap, switching the IP address back and forth. After ruling out a misconfiguration of keepalived and issues in the virtual network, I finally got the hint that the problem might originate not in the virtual, but in the bare metal world of our cloud. Maybe high IO was causing the gaps between the VRRP keepalive packets.
When I arrived at bare metal host node01, hosting virtual machine loadbalancer01, I was anxious to see the IO statistics. The machine had to be under heavy IO load if the virtual machine's messages were being delayed by up to five seconds.
I switched on my iostat flashlight and saw this:
$ iostat
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              12.00         0.00       144.00          0        144
sdc               0.00         0.00         0.00          0          0
sdb               6.00         0.00        24.00          0         24
sdd               0.00         0.00         0.00          0          0
sde               0.00         0.00         0.00          0          0
sdf              20.00         0.00       118.00          0        118
sdg               0.00         0.00         0.00          0          0
sdi              22.00         0.00       112.00          0        112
sdh               0.00         0.00         0.00          0          0
sdj               0.00         0.00         0.00          0          0
sdk              21.00         0.00        96.50          0         96
sdl               0.00         0.00         0.00          0          0
sdm               9.00         0.00        64.00          0         64
Nothing? Nothing at all? No IO on the disks? Maybe my bigger flashlight, iotop, could help:
$ iotop
Unfortunately, what I saw was too ghastly to show here, so I decided to omit the iotop screenshots. It was pure horror: six qemu processes eating the physical CPUs alive with IO.
So, no disk IO, but extremely high IO caused by qemu. It had to be network IO then. But all performance counters showed almost no network activity. What if this IO wasn't real, but virtual? It could be the virtual network driver! It had to be the virtual network driver.
I checked the OpenStack configuration. It was set to use the para-virtualized network driver vhost_net.
I checked the running qemu processes. They were also configured to use the para-virtualized network driver.
$ ps aux | grep qemu
libvirt+  6875 66.4  8.3 63752992 11063572 ?  Sl   Sep05 4781:47 /usr/bin/qemu-system-x86_64
    -name instance-000000dd -S ... -netdev tap,fd=25,id=hostnet0,vhost=on,vhostfd=27 ...
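For a cross-check on the libvirt level, one can also look at the interface definition in the guest's domain XML; a minimal sketch, using the instance name from the process listing above:

$ virsh dumpxml instance-000000dd | grep -A 5 "<interface"
# a <model type='virtio'/> element on the guest NIC means the para-virtualized
# driver is configured; with vhost active, the generated qemu command line
# additionally carries vhost=on and a vhostfd, as seen above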
I was getting closer! I checked the kernel modules. Kernel module vhost_net was loaded and active.
$ lsmod | grep net
vhost_net              18104  2
vhost                  29009  1 vhost_net
macvtap                18255  1 vhost_net
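To see whether vhost_net is not just loaded but actually serving a guest, one can also look for its kernel worker threads, which carry the PID of the qemu process they belong to; a sketch, not the exact command history of that night:

$ ps -ef | grep '\[vhost'
# kernel threads named [vhost-<qemu-pid>] only exist while a guest NIC
# is really being handled by vhost_net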
I checked the qemu-kvm configuration and froze.
$ cat /etc/default/qemu-kvm
# To disable qemu-kvm's page merging feature, set KSM_ENABLED=0 and
# sudo restart qemu-kvm
KSM_ENABLED=1
SLEEP_MILLISECS=200
# To load the vhost_net module, which in some cases can speed up
# network performance, set VHOST_NET_ENABLED to 1.
VHOST_NET_ENABLED=0

# Set this to 1 if you want hugepages to be available to kvm under
# /run/hugepages/kvm
KVM_HUGEPAGES=0
vhost_net was disabled by default for qemu-kvm. All packets were going through userspace and qemu instead of being handed directly to the kernel, as vhost_net does! That's where the lag was coming from!
I acted immediately to rescue the victims. I made the huge, extremely complicated, full one-byte change on all our compute nodes by changing VHOST_NET_ENABLED=0 to VHOST_NET_ENABLED=1, restarted all virtual machines, and finally, after days of constant screaming in pain, the flapping between the two load balancers stopped.
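For the record, the change itself boils down to something like this on each compute node; a minimal sketch, assuming the stock Ubuntu 14.04 setup, where /etc/default/qemu-kvm is evaluated by the qemu-kvm job (hence the "sudo restart qemu-kvm" hint in the file itself):

$ sudo sed -i 's/VHOST_NET_ENABLED=0/VHOST_NET_ENABLED=1/' /etc/default/qemu-kvm
$ sudo restart qemu-kvm        # re-run the job so it loads the vhost_net module
$ lsmod | grep vhost_net       # verify the module is present
# running guests pick up vhost=on only after their qemu processes are restarted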
I did it! I saved them!
But I couldn't stop there. I wanted to find out who had done this to the poor little load balancers. Who was behind this conspiracy of crippled network latency?
I knew there was only one way to finally catch the guy. I set a trap. I installed a fresh, clean, virgin Ubuntu 14.04 in a virtual machine and then, well, then I waited — for apt-get install qemu-kvm to finish:
$ sudo apt-get install qemu-kvm
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  acl cpu-checker ipxe-qemu libaio1 libasound2 libasound2-data libasyncns0
  libbluetooth3 libboost-system1.54.0 libboost-thread1.54.0 libbrlapi0.6
  libcaca0 libfdt1 libflac8 libjpeg-turbo8 libjpeg8 libnspr4 libnss3
  libnss3-nssdb libogg0 libpulse0 librados2 librbd1 libsdl1.2debian
  libseccomp2 libsndfile1 libspice-server1 libusbredirparser1 libvorbis0a
  libvorbisenc2 libxen-4.4 libxenstore3.0 libyajl2 msr-tools qemu-keymaps
  qemu-system-common qemu-system-x86 qemu-utils seabios sharutils
Suggested packages:
  libasound2-plugins alsa-utils pulseaudio samba vde2 sgabios debootstrap
  bsd-mailx mailx
The following NEW packages will be installed:
  acl cpu-checker ipxe-qemu libaio1 libasound2 libasound2-data libasyncns0
  libbluetooth3 libboost-system1.54.0 libboost-thread1.54.0 libbrlapi0.6
  libcaca0 libfdt1 libflac8 libjpeg-turbo8 libjpeg8 libnspr4 libnss3
  libnss3-nssdb libogg0 libpulse0 librados2 librbd1 libsdl1.2debian
  libseccomp2 libsndfile1 libspice-server1 libusbredirparser1 libvorbis0a
  libvorbisenc2 libxen-4.4 libxenstore3.0 libyajl2 msr-tools qemu-keymaps
  qemu-kvm qemu-system-common qemu-system-x86 qemu-utils seabios sharutils
0 upgraded, 41 newly installed, 0 to remove and 2 not upgraded.
Need to get 3631 kB/8671 kB of archives.
After this operation, 42.0 MB of additional disk space will be used.
Do you want to continue? [Y/n]
...
Setting up qemu-system-x86 (2.0.0+dfsg-2ubuntu1.3) ...
qemu-kvm start/running
Setting up qemu-utils (2.0.0+dfsg-2ubuntu1.3) ...
Processing triggers for ureadahead (0.100.0-16) ...
Setting up qemu-kvm (2.0.0+dfsg-2ubuntu1.3) ...
Processing triggers for libc-bin (2.19-0ubuntu6.3) ...
And then I sprang the trap:
$ cat /etc/default/qemu-kvm
# To disable qemu-kvm's page merging feature, set KSM_ENABLED=0 and
# sudo restart qemu-kvm
KSM_ENABLED=1
SLEEP_MILLISECS=200
# To load the vhost_net module, which in some cases can speed up
# network performance, set VHOST_NET_ENABLED to 1.
VHOST_NET_ENABLED=0

# Set this to 1 if you want hugepages to be available to kvm under
# /run/hugepages/kvm
KVM_HUGEPAGES=0
I could not believe it! It was Ubuntu's own default setting. Ubuntu, the very foundation of our cloud, had decided to turn vhost_net off by default, despite all modern hardware supporting it. Ubuntu was convicted, and I could finally rest.
This is the end of my detective story. I found and arrested the criminal Ubuntu default setting and was able to prevent it from further crippling our virtual network latency.
Please feel free to leave comments and ask questions about the details of my journey. I'm already negotiating to sell the movie rights. But maybe there will be another season of OpenStack Crime Investigation in the future. So stay tuned to the codecentric Blog.