Running a standalone OpenStack Neutron server

One of the great advantages of being an OpenStack developer is the ease with which a development environment can be created. I cannot say enough good things about devstack, a tool that provides a very flexible way of creating development environments for OpenStack.

Devstack is configured through a simple config file (local.conf). Another advantage of running a devstack-based environment is that it hardly needs any special hardware. A VM on your laptop is good enough to bring up an all-in-one OpenStack environment, although more RAM and CPU for your VM will yield better results.

As a developer interested in OpenStack networking (Neutron), my interest lies mostly in the Neutron service, and most of the time many OpenStack services are not really required for my day-to-day work. So I decided to tweak my devstack config file to start only the minimum set of services, just enough to run the networking service, and save a little on my devstack VM's RAM and CPU requirements.

The OpenStack networking service itself depends on a common set of infrastructure services like the database server, RabbitMQ, etc. The following is the local.conf that I used for this purpose.

[[local|localrc]]
ADMIN_PASSWORD=XXXXXXXXX
DATABASE_PASSWORD=$ADMIN_PASSWORD
RABBIT_PASSWORD=$ADMIN_PASSWORD
SERVICE_PASSWORD=$ADMIN_PASSWORD
 
GIT_BASE=https://git.openstack.org
LOGFILE=/opt/stack/logs/stack.sh.log 
#Q_PLUGIN_EXTRA_CONF_PATH += '/etc/neutron/fwaas_driver.ini'
RECLONE=yes
LIBS_FROM_GIT=python-neutronclient
 
disable_all_services

enable_service rabbit
enable_service database
enable_service mysql
enable_service infra
enable_service keystone
enable_service q-svc
enable_service neutron

With the above local.conf, only the networking service and basic infrastructure are started by devstack. Here is the list of windows in my screen session:

0$ shell  1$(L) key  2$(L) key-access  3$(L) q-svc   4$ code*  5-$ log

The last two windows were created manually for browsing the code and looking at my custom logs.
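
Once stack.sh completes, a quick sanity check is to create and list a network with the Neutron client. This is a minimal sketch; it assumes the default devstack credentials file at ~/devstack/openrc (the client itself is pulled from git via LIBS_FROM_GIT in the config above):

source ~/devstack/openrc admin admin
neutron net-create test-net
neutron net-list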


SSH Jump Host and Connection Multiplexing

Jump Hosts

While working with libvirt as my primary hypervisor to launch test VMs, I need a way to connect to the VMs easily over SSH. As libvirt uses a private network and SNAT to connect the VMs to the external world, getting SSH access to the VMs requires port forwarding or DNAT.

I recently came to know about the SSH Jump Host configuration, which uses the SSH ProxyCommand option to tunnel the SSH connection through intermediate hosts. I found it very useful for connecting to my VMs hosted on a private network by libvirt/KVM. Here is the command that I use to connect to the VMs:

ssh -t -o ProxyCommand='ssh hypervisor_user@my-hypervisor1 nc vm1 22' vm_user@vm1
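
The same thing can be put in ~/.ssh/config so that a plain "ssh vm1" works. A minimal sketch, assuming the same host and user names as in the command above (newer OpenSSH versions also offer the ProxyJump option for this):

Host vm1
    User vm_user
    ProxyCommand ssh hypervisor_user@my-hypervisor1 nc %h %p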

What is more amazing is that SSH allows multiple intermediate jump hosts in the path.

Here is another trick taken from the Gentoo wiki. Add the following configuration to your SSH config file at ~/.ssh/config:

Host *+*
   ProxyCommand ssh $(echo %h | sed 's/+[^+]*$//;s/\([^+%%]*\)%%\([^+]*\)$/\2 -l \1/;s/:/ -p /') exec nc -w1 $(echo %h | sed 's/^.*+//;/:/!s/$/ %p/;s/:/ /')

With this config in place, we can specify multiple intermediate jump hosts in the following format:

ssh user1%host1:port1+user2%host2:port2+ host3:port3 -l user3

Connection Multiplexing

Connection multiplexing is a way to optimize the creation of SSH connections between a client and server when frequent requests are made from the client to the server. Instead of creating a new SSH connection for each request and closing it down, which incurs delays, an existing SSH connection can be reused.

# start a master connection and create the control socket (the ControlPath directory must exist)
ssh -M -S ~/.ssh/controlmasters/user1@server1:22 server1
# later sessions reuse the master connection through the same socket
ssh -S ~/.ssh/controlmasters/user1@server1:22 server1

It is easier to set this up in the SSH config file; here is an example:

 

Host Server1
       HostName server1
       ControlPath ~/.ssh/controlmasters/%r@%h:%p
       ControlMaster auto
       ControlPersist 10m
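
With this in place, the first ssh to Server1 becomes the master and later sessions reuse it. A couple of standard commands that are handy for managing the master connection:

mkdir -p ~/.ssh/controlmasters    # the ControlPath directory must exist
ssh -O check Server1              # is a master connection running for this host?
ssh -O exit Server1               # ask the master connection to exit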

 

Ref: https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Multiplexing

 

Test-Driving OSPF on RouterOS – Interoperability

So I wrote about OSPF on RouterOS in my previous post. It was a nice experiment to learn about routing protocols.

I wanted to take it a little further and test the interoperability of RouterOS with other open-source solutions.

This post is an update to the previous one; I will add OSPF neighbor nodes to the setup. I decided to use Quagga, the most talked-about open-source routing protocol suite, and XORP, the eXtensible Open Router Platform.

Updated Setup

The following is the updated setup for the interoperability test. I have added two new Ubuntu nodes as OSPF neighbors:

  • Quagga on Ubuntu
  • XORP on Ubuntu

[Figure: updated test topology with the Quagga and XORP nodes]

Configuration

Quagga

The following configuration was added to the Quagga node.

[Figure: OSPF configuration on the Quagga node]
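
For reference, the configuration in the screenshot is of this general shape. A minimal ospfd.conf sketch; the router-id and subnets below are illustrative placeholders, not the actual values used:

! /etc/quagga/ospfd.conf (illustrative values)
hostname ospfd
password zebra
router ospf
 ospf router-id 10.0.3.1
 network 10.0.12.0/24 area 0.0.0.0
 network 10.30.0.0/24 area 0.0.0.0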

XORP

The XORP node did not advertise any new subnets but received OSPF updates.

[Figure: OSPF configuration on the XORP node]

Results

  • All the nodes could discover their neighbors

[Figure: OSPF neighbor tables on the nodes]

  • All nodes got route updates.

[Figure: routing tables showing OSPF-learned routes]

  • OSPF Traces

[Figure: OSPF protocol traces]

Test-driving OSPF on RouterOS

I came across RouterOS by MikroTik, which provides advanced routing protocol support. What is more amazing is that they provide RouterOS in a virtual form factor called Cloud Hosted Router (CHR) that can be installed on hypervisors like KVM/VirtualBox/VMware.

Please look at the licensing model at http://wiki.mikrotik.com/wiki/Manual:CHR#CHR_Licensing

This is perfect for learning and experimenting at home, so I decided to test OSPF routing with RouterOS.

The Setup

The following diagram describes my network setup. All of these are installed as VMs on my home desktop.

[Figure: network topology for the RouterOS OSPF test]

The footprint of the router VMs is quite small. MikroTik recommends 128 MB RAM and 128 MB of HDD as the minimal hardware requirements. I used virt-manager to set up the test network. Here is a typical VM configuration.

The actual setup, however, needs some hosts on the network to test connectivity after implementing OSPF. To keep things lightweight, I used network namespaces to simulate hosts connected to the routers. Linux bridges were used to connect the routers and the hosts. The following figure shows the final setup.

[Figure: final setup with namespace hosts and Linux bridges]

OSPF Configuration

For testing purposes I restricted my setup to area 0, to which both routers are connected. The following configuration was used on the routers.

Router1

/routing ospf instance
set [ find default=yes ] router-id=10.0.1.1
/ip address
add address=192.168.122.101/24 interface=ether1 network=192.168.122.0
add address=10.0.12.1/24 interface=ether2 network=10.0.12.0
add address=10.0.1.1 interface=loopback network=10.0.1.1
add address=10.10.0.1/24 interface=ether4 network=10.10.0.0
/routing ospf network
add area=backbone network=10.0.12.0/24
add area=backbone network=10.10.0.0/24
/system identity
set name=router1
[admin@router1] >

Router2

/routing ospf instance
set [ find default=yes ] router-id=10.0.2.1
/ip address
add address=192.168.122.102/24 interface=ether1 network=192.168.122.0
add address=10.0.12.2/24 interface=ether3 network=10.0.12.0
add address=10.20.0.1/24 interface=ether4 network=10.20.0.0
add address=10.0.2.1 interface=loopback network=10.0.2.1
/routing ospf network
add area=backbone network=10.0.12.0/24
add area=backbone network=10.20.0.0/24
/system identity
set name=router2
[admin@router2] >

[Figure: router configuration screenshot]

Results

I was able to get OSPF running with RouterOS in no time. Here are the test results.

  • Routing tables on the routers

[Figure: routing tables on the routers]

  • Routing tables on the hosts

[Figure: routing tables on the hosts]

  • Ping tests

[Figure: ping test results]

  • OSPF Traces

[Figure: OSPF protocol traces]

Test driving OpenWRT

Recently I have been looking at tools for managing and monitoring my home network. In my previous post I talked about using a Network Namespace to control the download limit.

Now I wanted to look at more advanced tools for the job. OpenWRT is a Linux-based firmware that supports a lot of networking hardware. I am exploring the possibility of flashing OpenWRT on my backup router at home.

To test OpenWRT I used a KVM image (which can be found here) and started a VM on my desktop. The following diagram shows the network topology.

[Figure: network topology for the OpenWRT test]

A little tweaking is required to make OpenWRT work with libvirtd. The idea is to push the incoming traffic to OpenWRT and apply traffic monitoring/policies there.

Libvirt provides a dnsmasq service which listens on the bridge virbr0 and hands out DHCP addresses to the VMs. It also configures NAT rules for traffic going out of the VMs through virbr0.

  • For this test we will remove the NAT rules on the bridge virbr0. All applications on the desktop will communicate through this bridge to OpenWRT, which will route the traffic to the Internet.
  • I also stopped the odhcpd and dnsmasq servers running on OpenWRT and started a DHCP client on the LAN interface (br-lan) to request an IP from libvirtd (see the sketch below).
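
On the OpenWRT VM, that roughly comes down to the following (a sketch based on a default OpenWRT image; udhcpc is the BusyBox DHCP client that ships with OpenWRT):

/etc/init.d/dnsmasq stop && /etc/init.d/dnsmasq disable
/etc/init.d/odhcpd stop && /etc/init.d/odhcpd disable
udhcpc -i br-lan        # request an address for the LAN bridge from libvirt's dnsmasq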

Once OpenWRT is booted, you can log in to the web interface of the router to configure it.

The following figure shows the networking inside OpenWRT router.

[Figure: networking inside the OpenWRT router]

The routing table on my desktop is as follows:

[Figure: routing table on the desktop]

The routing table on the OpenWRT server is shown below:

[Figure: routing table on the OpenWRT router]

OpenWRT allows installation of extra packages to enhance its functionality. I could find packages like quagga, bird, etc., which will be interesting to explore.

[Figure: OpenWRT package installation page]

It also provides traffic monitoring and classification.

[Figure: OpenWRT traffic monitoring page]

OpenWRT provides firewall configuration using iptables.

[Figure: OpenWRT firewall configuration page]

I will be exploring more of its features before deciding if I will flash it on my backup home router.

Rate Limiting ACT broadband on Ubuntu

ISPs have started to provide high-bandwidth connections, while the FUP (Fair Usage Policy) limit is still not enough (I am using ACT Broadband). If you spend most of your time on YouTube, the download limit gets exhausted rather quickly.

As I use Ubuntu on my desktop, I decided to use TC to throttle my Internet bandwidth and bring some control over my usage. Have a look at my previous posts about rate limiting and traffic shaping on Linux to learn about the usage of TC.

Here is my modest network setup at home.

[Figure: home network setup]

The problem is that TC can only shape traffic going out of an interface (egress), so shaping on the desktop's NIC does not limit the download bandwidth.

The Solution

To get around this problem I introduced a Linux network namespace into the topology. Here is how the topology looks now.

[Figure: topology with the network namespace added]

I use this script to set up the upload/download bandwidth limits.
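
The script itself is linked above; as a rough illustration of the idea (not the actual script), the uplink interface can be moved into a namespace and a TBF attached on each side of a veth pair, so that the download direction also becomes egress traffic somewhere we can shape. All interface names and addresses below are illustrative:

ip netns add wan
ip link set eth0 netns wan                          # move the uplink into the namespace
ip link add veth-host type veth peer name veth-wan
ip link set veth-wan netns wan
ip addr add 192.168.200.2/24 dev veth-host && ip link set veth-host up
ip netns exec wan ip addr add 192.168.200.1/24 dev veth-wan && ip netns exec wan ip link set veth-wan up
ip netns exec wan ip link set eth0 up && ip netns exec wan dhclient eth0
ip netns exec wan sysctl -w net.ipv4.ip_forward=1
ip netns exec wan iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
ip route add default via 192.168.200.1              # desktop traffic now goes through the namespace
# upload limit: egress of the uplink inside the namespace
ip netns exec wan tc qdisc add dev eth0 root tbf rate 1024kbit latency 50ms burst 1540
# download limit: egress of the veth towards the desktop
ip netns exec wan tc qdisc add dev veth-wan root tbf rate 1024kbit latency 50ms burst 1540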

Results

Here are readings before and after applying the throttle

Before

[Figure: bandwidth reading before throttling]

After rate-limiting to 1024Kbps upload and download

[Figure: bandwidth reading after rate limiting]

Implementing Basic Networking Constructs with Linux Namespaces

In this post, I will explain the use of Linux network namespaces to implement basic networking constructs like an L2 switched network and a routed network.

Let's start by looking at the basic commands to create, delete and list network namespaces on Linux.
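
The basic lifecycle looks like this (standard iproute2 commands, run as root):

ip netns add ns1              # create a namespace
ip netns list                 # list the namespaces on the host
ip netns exec ns1 ip link     # run a command inside the namespace
ip netns delete ns1           # delete the namespace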

The next step is to create a LAN. We will use namespaces to simulate two nodes connected to a bridge inside the Linux host, implementing a topology like the one shown below.

[Figure: two namespace hosts connected to a Linux bridge]
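
A minimal sketch of this LAN using standard iproute2 commands; interface names and addresses are illustrative:

ip netns add host1 && ip netns add host2
ip link add br0 type bridge && ip link set br0 up
# one veth pair per host: one end goes into the namespace, the other onto the bridge
ip link add veth1 type veth peer name veth1-br
ip link add veth2 type veth peer name veth2-br
ip link set veth1 netns host1 && ip link set veth2 netns host2
ip link set veth1-br master br0 && ip link set veth1-br up
ip link set veth2-br master br0 && ip link set veth2-br up
ip netns exec host1 ip addr add 192.168.50.1/24 dev veth1
ip netns exec host2 ip addr add 192.168.50.2/24 dev veth2
ip netns exec host1 ip link set veth1 up && ip netns exec host1 ip link set lo up
ip netns exec host2 ip link set veth2 up && ip netns exec host2 ip link set lo up
ip netns exec host1 ping -c 2 192.168.50.2    # the two "hosts" should now reach each other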

Finally, let's see how to simulate a router connecting two LAN segments. We will implement the simplest such topology, with just two nodes connected to a router on different LAN segments.

[Figure: two namespace hosts connected through a router on different LAN segments]
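
And a similar sketch for the routed topology; again, all names and addresses are illustrative:

ip netns add h1 && ip netns add h2 && ip netns add r1
ip link add veth-h1 type veth peer name veth-r1a
ip link add veth-h2 type veth peer name veth-r1b
ip link set veth-h1 netns h1 && ip link set veth-r1a netns r1
ip link set veth-h2 netns h2 && ip link set veth-r1b netns r1
ip netns exec h1 ip addr add 10.1.1.2/24 dev veth-h1 && ip netns exec h1 ip link set veth-h1 up
ip netns exec h2 ip addr add 10.1.2.2/24 dev veth-h2 && ip netns exec h2 ip link set veth-h2 up
ip netns exec r1 ip addr add 10.1.1.1/24 dev veth-r1a && ip netns exec r1 ip link set veth-r1a up
ip netns exec r1 ip addr add 10.1.2.1/24 dev veth-r1b && ip netns exec r1 ip link set veth-r1b up
ip netns exec r1 sysctl -w net.ipv4.ip_forward=1      # let r1 forward between the two segments
ip netns exec h1 ip route add default via 10.1.1.1
ip netns exec h2 ip route add default via 10.1.2.1
ip netns exec h1 ping -c 2 10.1.2.2                   # h1 reaches h2 through the router namespace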

Test driving traffic shaping on Linux

In my last post, I shared a simple setup that does bandwidth limiting on Linux using TBF (Token Bucket Filter). The TBF based approach applies a bandwidth throttle on the NIC as a whole.

The situation in reality might be more complex than what the post described. Normally, users want to control bandwidth based on the type of application generating the traffic.

Let's take a simple example: the user may want their bandwidth to be shared by application traffic as follows:

  • 50% bandwidth available to web traffic
  • 30% available to mail service
  • 20% available for the rest of the applications

Traffic Control on Linux provides ways to achieve this using classful queuing discipline.

In essence, this type of traffic control is achieved by first classifying the traffic into different classes and then applying traffic shaping rules to each of those classes.

The Hierarchical Token Bucket Queuing Discipline

Although Linux provides various classful queuing disciplines, in this post we will look at the Hierarchical Token Bucket (HTB) to solve the bandwidth sharing and control problem.

HTB is essentially a hierarchy of TBFs (the Token Bucket Filter described in the last post) applied to a network interface. HTB works by classifying the traffic and applying a TBF to each individual class of traffic. So to understand HTB we must understand the Token Bucket Filter first.

How Token Bucket Filter works

Let's take a deeper look at how the Token Bucket Filter (TBF) works.

The TBF works by having a bucket of tokens attached to the network interface; each time a packet needs to be passed over the network interface, a token is consumed from the bucket. The kernel produces these tokens at a fixed rate to fill up the bucket.

When the traffic is flowing at a slower pace than the rate of token generation, the bucket will fill up. Once it is full, the bucket rejects any extra tokens generated by the kernel. The tokens accumulated in the bucket allow a burst of traffic (limited by the size of the bucket) to pass over the interface.

When the traffic is flowing at a pace higher than the rate of token generation, a packet must wait until the next token is available in the bucket before being allowed to pass over the network interface.

[Figure: Token Bucket Filter operation]

On the tc command line, the size of the bucket corresponds to the burst parameter, the rate of token generation corresponds to rate, and the latency parameter sets how long a packet can wait in the queue for a token before being dropped.

The HTB queuing discipline

The following figure describes the working of HTB queuing discipline.

[Figure: HTB queuing discipline]

To apply the HTB discipline we have to go through the following steps:

  • Define different classes with their rate limiting attributes
  • Add rules to classify the traffic into the different classes

In this example we will try to implement the same traffic sharing requirement as mentioned in the introduction section. Web traffic will get 50% of the bandwidth while mail gets 30% and 20% is shared by all other traffic.

The following rules define the various classes with the traffic limits

tc qdisc add dev eth0 root handle 1: htb default 30                            # unclassified traffic falls into class 1:30
tc class add dev eth0 parent 1: classid 1:1 htb rate 100kbps ceil 100kbps      # total bandwidth
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 50kbps ceil 100kbps     # web traffic
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 30kbps ceil 100kbps     # mail traffic
tc class add dev eth0 parent 1:1 classid 1:30 htb rate 20kbps ceil 100kbps     # everything else

[Figure: HTB class hierarchy]

Now we must classify the traffic into the right classes based on some match condition. The following rules classify web and mail traffic into classes 1:10 and 1:20. All other traffic is pushed to class 1:30 by default:

tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 match ip dport 80 0xffff flowid 1:10 
tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 match ip dport 25 0xffff flowid 1:20
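
To see whether packets are actually landing in the intended classes, the per-class counters can be checked with a standard tc command:

tc -s class show dev eth0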

Verifying the results

We will use iperf to verify the results of the traffic control changes. Use the following command to start 2 instances of iperf on the server

iperf -s -p <port number> -i 1

For our example, we use the following commands to start the servers

iperf -s -p 25 -i 1
iperf -s -p 80 -i 1

The clients are started with the following commands

iperf -c <server ip> -p <port number> -t <time period to run the test>

For this example we used the following commands to run the test for 60 secs. The server IP of 192.168.90.4 was used.

iperf -c 192.168.90.4 -p 25 -t 60
iperf -c 192.168.90.4 -p 80 -t 60

Here is a snapshot of output from the test.

[Figure: iperf output with HTB shaping applied]

It shows the web traffic to be close to 500kbps while mail traffic is close to 300kbps, the same ratio in which we wanted to shape the traffic. When excess bandwidth is available, HTB splits it in the same ratio that we configured for the classes.

Network Bandwidth Limiting on Linux with TC

On Linux, the traffic queuing discipline attached to a NIC can be used to shape the outgoing bandwidth. By default, Linux uses pfifo_fast as the queuing discipline. Use the following command to verify the setting on your network card:

# tc qdisc show dev eth0
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

Measuring the default bandwidth

I am using the iperf tool to measure the bandwidth between my VirtualBox instance (10.0.2.15) and my desktop (192.168.90.4), which acts as the iperf server.

Start the iperf server with the following command

iperf -s

The client can then connect using the following command

iperf -c <server address>

The following figure shows the bandwidth available with default queue setting

[Figure: iperf bandwidth with the default queuing discipline]

Limiting Traffic with TC

We will use the Token Bucket Filter to throttle the outgoing traffic. The following command sets an egress rate of 1024kbit with a latency of 50ms and a burst size of 1540 bytes:

# tc qdisc add dev eth0 root tbf rate 1024kbit latency 50ms burst 1540

Use the tc qdisc show command to verify the setting

# tc qdisc show dev eth0

The output should now list the tbf qdisc with the configured rate, latency and burst in place of pfifo_fast.

Verifying the result

I measured the bandwidth again to make sure the new queuing configuration is working and sure enough, the result from iperf confirmed it.

[Figure: iperf bandwidth after applying the TBF limit]

Detailed statistics of the queuing discipline can be displayed with tc -s qdisc show dev eth0:

[Figure: detailed statistics of the tbf qdisc]

Impact of different parameters for Token Bucket Filter (TBF)

Decreasing the latency value leads to packet drops; the following figure captures the result after the latency was dropped to 1ms.

[Figure: iperf result after reducing latency to 1ms]

Providing a large burst buffer defeats the rate limiting.

[Figure: iperf results with a large burst buffer]

Both parameters need to be chosen properly to avoid packet loss and spikes in traffic beyond the rate limit.
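
For reference, these are the kinds of commands used to vary the parameters for the tests above (a sketch; the exact values behind the figures may differ):

# reduce the latency (maximum queueing time) to 1ms: excess packets are dropped sooner
tc qdisc change dev eth0 root tbf rate 1024kbit latency 1ms burst 1540
# allow a very large burst: traffic can spike well beyond the configured rate
tc qdisc change dev eth0 root tbf rate 1024kbit latency 50ms burst 1540000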

Test driving CRIU – Live Migrate any process on Linux

CRIU (Checkpoint and Restore In User-space) on Linux enables users to back up and restore any live user-space process. This means that the process state can be frozen in time and stored as image files, and these images can later be used to restore the process.

Some interesting use cases that CRIU can support are:

  • Process persistence across server reboots: even after you have rebooted the server, the image files can be used to restore the process.
  • VMotion like Live migration for Processes: The image files can be copied over to another server and the process can be restored in the new server.

In this post, I will be exploring CRIU to checkpoint and restore a simple webserver process. We will also explore migration of the process across servers.

Let's start by installing the CRIU packages. I am using Ubuntu Vivid, and the package is available in the Ubuntu repository. We can use the following commands to find and install CRIU:

apt-cache search criu
apt-get install criu

The webserver process

For this experiment, I wanted a simple server process with an open network port and some internal state information to verify if CRIU can successfully restore the network connectivity as well as the state of the process.

I will start a simple webserver using this Python script. The webserver keeps count of the requests from clients and maintains an in-memory list of all the previous requests it has served. Create a new directory, change to it, and use wget to download the script as shown below.

chandan@chandan-VirtualBox:~$ mkdir criu
chandan@chandan-VirtualBox:~$ cd criu
chandan@chandan-VirtualBox:~/criu$ wget https://bitbucket.org/api/2.0/snippets/xchandan/jge8x/9464e8e341c4c845aebf3a21e9d20e472baa4c5e/files/server.py

Now start the webserver from this directory by executing the following command

chandan@chandan-VirtualBox:~/criu$ python server.py 8181

Verify that the webserver is running by pointing your browser to http://localhost:8181 and refreshing the page a few times to build up the application's internal state. Every refresh should increase the request number.

You should see output similar to this.

[Figure: webserver output in the browser]

Keep this process running and open another terminal. Use the ps command to find the process ID (PID) of the webserver:

chandan@chandan-VirtualBox:~/criu/dump_dir$ ps aux|grep server.py
chandan 32601 0.0 0.1 40696 12760 pts/18   S+   20:40   0:00 python server.py 8181
chandan 32717 0.0 0.0   9492 2252 pts/1   S+   20:47   0:00 grep --color=auto server.py

We now have the PID of the webserver, which is 32601.

Checkpoint the webserver process

Checkpointing the webserver will freeze its process state and dump this state information into a directory. Make a new directory and change to it.

Now execute the command "criu dump -t <process id> --shell-job" to checkpoint the process. The flag "--shell-job" is required if you want to use CRIU with processes started directly from a shell.

chandan@chandan-VirtualBox:~/criu/dump_dir$ sudo criu dump -t 32601 --shell-job

When the command exits, the directory will contain many new files, which store the state of the webserver process in the form of image files.

The dump command actually kills the webserver process; you can verify the same with the ps and grep command. This can also be verified by trying to browse the webserver address using your browser (which should fail).

[Figure: browser showing the webserver unreachable after the dump]

NOTE: with the CLI option "--leave-stopped", the dump command leaves the process in a stopped state instead of killing it. This way the process can be restored in case a migration fails.

Restoring the process

To restore the process go to the directory where the image files for the process are stored and execute the following command

chandan@chandan-VirtualBox:~/criu/dump_dir$ sudo criu restore --shell-job

This command will not return, as it is now the web server process. Keep this process running and verify that you can open the webserver URL.

You should see output similar to this; the request count should continue from where it was before the checkpoint was made. In this case we continue from Request No: 15, and all the state information is successfully restored, as shown in the screenshot.

[Figure: webserver restored with its previous state]

Restoring after machine reboot

You can now reboot your machine and again try to restore the webserver process. You should be able to restore the process, and it should again continue from the checkpointed request number.

Migrating webserver to another machine

[Figure: migrating the webserver process to another machine]

To migrate the webserver process, we need an exact match of the process's runtime environment on the target machine. This means the working directories and any resources like files, ports, etc. should be present on the target system. This is why process migration with CRIU makes more sense in a container-based environment, where the environment for the process can be closely controlled.

To start the migration, first copy the image files to the target machine

chandan@chandan-VirtualBox:~/criu$ scp -r dump_dir/ chandan@192.168.90.3:

Make sure that the environment for the process is present on the target machine; in my case I had to create the current working directory for the webserver after CRIU failed with an error message.

chandan@chandan-ubuntu15:~/dump_dir$ sudo criu restore --shell-job
 32601: Error (files-reg.c:1024): Can't open file home/chandan/criu on restore: No such file or directory
 32601: Error (files-reg.c:967): Can't open file home/chandan/criu: No such file or directory
 32601: Error (files.c:1070): Can't open cwd
Error (cr-restore.c:1185): 32601 exited, status=1
Error (cr-restore.c:1838): Restoring FAILED.
chandan@chandan-ubuntu15:~/dump_dir$ 
chandan@chandan-ubuntu15:~/dump_dir$ mkdir ~/criu
chandan@chandan-ubuntu15:~/dump_dir$ sudo criu restore --shell-job
192.168.90.2 - - [10/Aug/2015 01:26:47] "GET / HTTP/1.1" 200 -
192.168.90.2 - - [10/Aug/2015 01:26:47] code 404, message File not found
192.168.90.2 - - [10/Aug/2015 01:26:47] "GET /favicon.ico HTTP/1.1" 404 -
192.168.90.2 - - [10/Aug/2015 01:26:47] code 404, message File not found
192.168.90.2 - - [10/Aug/2015 01:26:47] "GET /favicon.ico HTTP/1.1" 404 -

Here is a screenshot of the webserver restored on the remote machine, side by side with the local machine. By looking at the state info for Request No 15, you can see that both processes start with the same internal state and then continue on different paths.

[Figure: webserver restored on the remote machine, side by side with the local one]

Conclusion

In this post we saw how to checkpoint and restore a Linux application. We verified that the application could be restarted on a different server with its internal state restored.

In a future post I will explore using CRIU with containers to provide migration of containers.

Update:

I found this interesting post describing live migration of LXD/LXC containers, and a demo video of the live migration of a container running the game Doom. Here is one more post about running a live migration of a Docker container running Quake.