Test driving App Firewall with IPTables

With more and more applications moving to the cloud, web based applications have become ubiquitous. They are ideal for providing access to applications sitting on the cloud (over HTTP through a standard web browser). This has removed the need to install specialized applications on the client system; all the client needs is a fairly modern browser.

While this is good for reducing load on the client, the job of the firewall has become much tougher.

Traditionally, firewall rules look at the Layer 3 and Layer 4 attributes of a packet to identify a flow and associate it with the application generating the traffic. To a traditional firewall looking at L3/L4 headers, all the traffic between the client and different web apps looks like HTTP communication. Without proper classification of traffic flows the firewall is not able to apply a security policy.

It has now become important to look at the application layer to identify the traffic associated with a web service or web application and enforce effective security and bandwidth allocation policy.

In this blog, I will look at features provided by IPTables that can be used to classify packets by application layer headers and how this can be used to implement security and other network policies.

Looking into the application layer

IPTables is the de facto choice for implementing a firewall on Linux. It provides extensive packet matching, classification, filtering and many more facilities. Like any traditional firewall, the core features of IPTables allow packet matching on Layer 3 and Layer 4 header attributes. These features, as we discussed in the introduction, may not be sufficient to differentiate between traffic from various web apps.

While researching a solution to provide an app firewall on Linux, I came across an IPTables extension called NFQUEUE.

The NFQUEUE extension provides a mechanism to pass a packet to a user-space program, which can run some kind of test on the packet and tell IPTables what action (accept/drop/mark) to perform for the packet.

This gives the IPTables user a lot of flexibility to hook up custom tests for a packet before it is allowed to pass through the firewall.

To understand how NFQUEUE can help classify and filter traffic based on application layer headers, let’s try to implement a web app filter providing URL based access control. In this test we will extract the request URL from the HTTP header, and the filter will allow or deny access based on this URL.
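Before wiring anything into IPTables, the decision itself can be sketched in plain Python 3. This is only an illustration of URL based access control, not the filter used later in this post: the function names, the ALLOWED_PREFIXES policy and the choice to accept payloads that carry no HTTP request line are all assumptions.

```python
ALLOWED_PREFIXES = ("/APP1",)   # hypothetical policy: only APP1 is reachable


def request_uri(payload: bytes):
    """Return the URI from an HTTP/1.x request line, or None if there is none."""
    line, sep, _ = payload.partition(b"\r\n")
    if not sep:
        return None                      # request line missing or incomplete
    parts = line.split(b" ")
    if len(parts) != 3 or not parts[2].startswith(b"HTTP/"):
        return None
    return parts[1].decode("ascii", errors="replace")


def verdict(payload: bytes) -> str:
    """Map a TCP payload to 'accept' or 'drop' based on the request URI."""
    uri = request_uri(payload)
    if uri is None:
        return "accept"                  # assumed: non-HTTP segments pass through
    return "accept" if uri.startswith(ALLOWED_PREFIXES) else "drop"


if __name__ == "__main__":
    print(verdict(b"GET /APP1 HTTP/1.1\r\nHost: example\r\n\r\n"))
    print(verdict(b"GET /APP2 HTTP/1.1\r\nHost: example\r\n\r\n"))
```

Segments without a request line (for example the TCP handshake) have to be let through in some form, otherwise the request that carries the URL never reaches the filter.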

A simple web app

For this experiment, we will use python bottle to deploy two applications. Access will be allowed for the first app (APP1) while access to the second app (APP2) will be denied.

We will use the following code to deploy the sample apps

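from bottle import Bottle

app1 = Bottle()
app2 = Bottle()

# Route paths are assumed; the filter only needs "APP1"/"APP2" in the URL.
@app1.route('/APP1')
def app1_route():
    return 'Access to APP1!\n'

@app2.route('/APP2')
def app2_route():
    return 'Access to APP2!\n'

if __name__ == '__main__':
    # Serve both apps from one server (assumed): merge app2's routes into app1.
    app1.merge(app2)
    app1.run(server='eventlet', host="", port=8081)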

The web application will bind to port 8081 on the local IP address of the host.

NOTE: we need an eventlet based bottle server, else the application hangs after a deny from the app filter (connections are not closed and the next request is not processed).

To access the web apps use the curl commands


Configuring the IPTables NFQUEUE

The next step is to configure IPTables to forward the client traffic accessing the web apps to our user space web-app filter.

The NFQUEUE IPTables extension works by adding a new target to IPTables called NFQUEUE. This target allows IPTables to put the matching packet on a queue. These packets can then be read from this queue by a filter application in user space. The filter application can then perform custom tests and provide a verdict to allow or deny the packet.

The NFQUEUE extension supports queue numbers from 0 to 65535. It also provides fail-safe options, such as what action IPTables should take if no filter is attached to a queue, and load balancing of packets across multiple queues. There is also an entry in the /proc filesystem (/proc/net/netfilter/nfnetlink_queue) that reports per-queue state, including how much of the packet data is copied to user space. A complete list of options can be found in the iptables-extensions man page.

To enable NFQUEUE for the web-app traffic we will add the following rule to IPTables.

iptables -I INPUT -d -p tcp --dport 8081 -j NFQUEUE --queue-num 10 --queue-bypass

The --queue-num option selects the NFQUEUE number to which the packet will be queued. The --queue-bypass option allows the packet to be accepted if no custom filter is attached to queue number 10. Without this option, packets are dropped when no filter is attached to the queue.

Implementing a simple APP filter

With the above IPTables rule the packets destined for our sample web app will be pushed into NFQUEUE number 10.  I am going to use the python bindings for NFQUEUE called nfqueue-bindings to develop the filter. Let’s run a simple print and drop filter.


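# need root privileges

import struct
import sys
import time

from socket import AF_INET, AF_INET6, inet_ntoa

import nfqueue
from dpkt import ip

def cb(i, payload):
    print "python callback called !"
    # Drop every queued packet (this is the "print and drop" filter).
    payload.set_verdict(nfqueue.NF_DROP)
    return 1

def main():
    q = nfqueue.queue()
    print "setting callback"
    q.set_callback(cb)
    print "open"
    q.fast_open(10, AF_INET)   # bind to NFQUEUE number 10
    print "trying to run"
    try:
        q.try_run()            # blocks, calling cb() for each queued packet
    except KeyboardInterrupt, e:
        print "interrupted"
    print "unbind"
    q.unbind(AF_INET)
    print "close"
    q.close()

if __name__ == '__main__':
    main()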

Now we have tested that the packets trying to access our web app pass through an app filter implemented in user space. Next we need to unpack the packet and look at the HTTP header to extract the URL that the user is trying to access. For unpacking the headers we will use the python dpkt library. The following code allows access to APP1 and denies access to APP2.


# need root privileges

import struct
import sys
import time

from socket import AF_INET, AF_INET6, inet_ntoa

import nfqueue
import dpkt
from dpkt import ip

count = 0

def cb(i, payload):
    global count
    count += 1
    data = payload.get_data()

    pkt = ip.IP(data)
    if pkt.p == ip.IP_PROTO_TCP:
        # print "  len %d proto %s src: %s:%s    dst %s:%s " % (
        #        payload.get_length(),
        #        pkt.p, inet_ntoa(pkt.src), pkt.tcp.sport,
        #        inet_ntoa(pkt.dst), pkt.tcp.dport)
        tcp_pkt = pkt.data
        app_pkt = tcp_pkt.data
        try:
            request = dpkt.http.Request(app_pkt)
            if "APP1" in request.uri:
                print "Allowing APP1"
                payload.set_verdict(nfqueue.NF_ACCEPT)
            elif "APP2" in request.uri:
                print "Denying APP2"
                payload.set_verdict(nfqueue.NF_DROP)
            else:
                print "Denying by default"
                payload.set_verdict(nfqueue.NF_DROP)
        except (dpkt.dpkt.NeedData, dpkt.dpkt.UnpackError):
            # No HTTP request in this segment (e.g. TCP handshake or ACK).
            # Accepting it explicitly is an assumption so the connection can proceed.
            payload.set_verdict(nfqueue.NF_ACCEPT)
    else:
        print "  len %d proto %s src: %s    dst %s " % (
               payload.get_length(), pkt.p, inet_ntoa(pkt.src),
               inet_ntoa(pkt.dst))
        payload.set_verdict(nfqueue.NF_ACCEPT)

    return 1

def main():
    q = nfqueue.queue()

    print "setting callback"
    q.set_callback(cb)

    print "open"
    q.fast_open(10, AF_INET)

    print "trying to run"
    try:
        q.try_run()
    except KeyboardInterrupt, e:
        print "interrupted"

    print "%d packets handled" % count

    print "unbind"
    q.unbind(AF_INET)
    print "close"
    q.close()

if __name__ == '__main__':
    main()

Here are the results of the test on the client


The output from the filter on the firewall


What else can be done with App based traffic classification

A firewall is just one use case of advanced packet classification. With the flows identified and associated with different applications, we can apply different routing and forwarding policies. An NFQUEUE based filter can be used to set different firewall marks on the classified packets. The firewall marks can then be used to implement policy based routing in Linux.

IPTables: Matching A GRE packet based on tunnel key

I was trying to figure out a way to match packets with a certain GRE key and take some action. IPTables does not provide a direct solution to this problem, but it has the u32 extension module, which can extract 4 bytes at a given offset from the start of the IP header and match them against a pattern.

So, I decided to give this extension a try.

Prepare the setup

I created a tunnel between two of my VMs and assigned IP addresses to the tunnel interfaces.

On VM1

sudo ip tunnel add tun2 mode gre remote local ttl 255 key 22

sudo ifconfig tun2 up

On VM2

sudo ip tunnel add tun2 mode gre remote local ttl 255 key 22

sudo ifconfig tun2 up

Start with a basic rule

Next, I created an IPTables rule on the receiving system to generate logs for matching packets. You can also create an ACCEPT rule and check the built-in packet counter for the rule.

sudo iptables -I INPUT -p 47 -m limit --limit 20/min -j LOG --log-prefix "IPT GRE" --log-level 4

Now start a ping from one VM to the other


You can keep a watch on the packet counters with the following command

watch "sudo iptables -L -v -n"

The GRE header

Next, a look at the GRE header format, taken from RFC 2890 (https://tools.ietf.org/html/rfc2890). The header contains an optional 32-bit key, which is the field we are interested in.

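 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|C| |K|S| Reserved0       | Ver |         Protocol Type         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Checksum (optional)      |       Reserved1 (Optional)    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Key (optional)                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 Sequence Number (Optional)                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+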

Run the following tcpdump command to capture the packets to a file (my VMs don’t have a GUI)

sudo tcpdump -s 0 -n -i ens3 proto GRE -w dump.pcap

The captured packets can be analyzed using wireshark


Understanding the iptables u32 extension match rule

Basically, the u32 module extracts 4 bytes of data at a given offset from the start of the IP header and matches them against a given hex number or range. Here is an example of a u32 match rule from the man page; it matches packets by total length. The man page describes the format of the rule: you provide an offset, u32 extracts 4 bytes starting at that offset, ANDs them with the mask, and finally compares the result with the hex value or range.


match IP packets with total length >= 256
The IP header contains a total length field in bytes 2-3.

--u32 "0 & 0xFFFF = 0x100:0xFFFF"

read bytes 0-3
AND that with 0xFFFF (giving bytes 2-3), 
and test whether that is in the range [0x100:0xFFFF]

The man page has more details.
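To make the arithmetic concrete, here is a small Python 3 emulation of a single u32 test. It is a sketch for experimenting with offsets and masks on captured bytes, not the kernel implementation; the function name u32_match and the sample header are assumptions.

```python
import struct


def u32_match(packet: bytes, offset: int, mask: int, low: int, high=None) -> bool:
    """Emulate one u32 test of the form '<offset> & <mask> = <low>[:<high>]'."""
    if high is None:
        high = low
    if offset + 4 > len(packet):
        return False                                    # not enough data: no match
    (word,) = struct.unpack_from("!I", packet, offset)  # 4 bytes, network byte order
    return low <= (word & mask) <= high


if __name__ == "__main__":
    # A bare 20-byte IPv4 header whose total length (bytes 2-3) is 0x0154.
    header = bytes([0x45, 0x00, 0x01, 0x54]) + bytes(16)
    # Same test as the man page example: total length >= 256.
    print(u32_match(header, 0, 0xFFFF, 0x100, 0xFFFF))
```

Reading the word in network byte order is what makes the mask 0xFFFF select bytes 2-3 rather than bytes 0-1.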

Craft a match for GRE Key

With a 20-byte IP header and no GRE checksum, the GRE header starts at offset 20 and the GRE key at offset 24, as can be confirmed in wireshark. Offsets in the u32 rule are counted from the start of the IP header (highlighted in the wireshark screenshot).


Based on the example from the man page I crafted the following rule to match the GRE key.

sudo iptables -I INPUT -p 47 -m u32 --u32 "24 & 0xFFFFFFFF = 0x16" -m limit --limit 20/min -j LOG --log-prefix "IPT GRE key 22" --log-level 4

Checking for Key Present Flag

But the key is optional. So, add a match for the Key Present flag (the K bit in the first word of the GRE header, at offset 20).

sudo iptables -I INPUT -p 47 -m u32 --u32 "20 & 0x20000000 = 0x20000000 && 24 & 0xFFFFFFFF = 0x16" -m limit --limit 20/min -j LOG --log-prefix "IPT GRE key 22" --log-level 4

Here is a screen capture of the iptables packet counters

Chain INPUT (policy ACCEPT 294K packets, 78M bytes)
 pkts bytes target prot opt in out source destination
 711 79632 LOG 47 -- * * u32 "0x14&0x20000000=0x20000000&&0x18&0xffffffff=0x16" limit: avg 20/min burst 5 LOG flags 0 level 4 prefix "IPT GRE key 22"

The above rule is simplistic and good enough to get you started, but it has shortcomings, e.g. it assumes a constant IP header length.

The man page has examples of how to handle variable-length headers, fragmentation checks, etc.
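To see where the fixed offsets break down, the following Python 3 sketch locates the key the way a more careful rule would have to: it reads the IP header length from the packet and skips the optional checksum word when the C bit is set. The function name gre_key and the constant names are assumptions; the field layout follows the RFC 2890 header shown earlier.

```python
import struct

GRE_CHECKSUM_PRESENT = 0x8000   # C bit in the first 16 bits of the GRE header
GRE_KEY_PRESENT = 0x2000        # K bit


def gre_key(packet: bytes):
    """Return the GRE key of an IPv4/GRE packet, or None if it carries no key."""
    ihl = (packet[0] & 0x0F) * 4            # IP header length in bytes
    if packet[9] != 47:                     # IP protocol 47 is GRE
        return None
    (flags,) = struct.unpack_from("!H", packet, ihl)
    if not flags & GRE_KEY_PRESENT:
        return None
    key_offset = ihl + 4                    # skip flags/version and protocol type
    if flags & GRE_CHECKSUM_PRESENT:
        key_offset += 4                     # skip checksum and reserved1
    (key,) = struct.unpack_from("!I", packet, key_offset)
    return key


if __name__ == "__main__":
    ip_header = bytearray(20)
    ip_header[0] = 0x45                     # IPv4, header length 5 words
    ip_header[9] = 47                       # protocol GRE
    gre = struct.pack("!HHI", GRE_KEY_PRESENT, 0x0800, 22)
    print(gre_key(bytes(ip_header) + gre))
```

With a 20-byte IP header and no checksum this reduces to offset 24, which is the case the rules above cover.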

Test-driving multiboot on Raspberry Pi – without BerryBoot/Noobs


Recently I got a Raspberry Pi 3 board and wanted to try out various OS options on it. I realized quite quickly that to try a new OS I would need to block copy (dd) the OS image to my SD card every time. I am running short on micro SD cards, and they have limited capacity too.

I do, however, have a bunch of USB sticks lying around unused, so I started wondering whether they could be used instead.

A little research on the Internet came up with two prominent options, BerryBoot and Noobs. Both allow multibooting your Pi board with different OS distros. While this is a good enough solution, I wanted to know how things work internally and whether there was a simple way to achieve multiboot without using any tools (mostly for my own learning).

On the Internet there is a lot of information on how to install and boot Linux from USB sticks for Raspberry Pi. The process is summarized in the following section.

Run Pi with Linux from USB stick

In simple words, booting Linux involves loading the kernel, which initializes the hardware, and then mounting the root filesystem, which has all the user applications. Usually the kernel images are kept in the first partition and this partition is mounted on /boot directory.

Going through the Pi documentation, it looks like Pi boards recognize the SD card as the only boot device. So the trick to run Linux on Pi from USB stick involves installing the kernel images on the SD card while keeping the root file system on the USB stick and providing the information about the root filesystem location to the kernel in the boot command line.

If we look at the space usage, a typical kernel image is only around 10 MB in size. Even with all the data in the /boot directory it stays within 30 MB, while the root file system can be much bigger depending on the user applications and data.

How Boot loading on Pi works

The boot process on Pi expects the SD card to have a FAT32 based first partition. To boot Linux, the kernel image must be present on this partition. For Pi0, Pi1 models the default kernel image file name is kernel.img  and for Pi2 and Pi3 models the default kernel image file is called kernel7.img . So the boot loader will look for the correct kernel image file for your model of Pi.

In addition to the kernel image, there are two configuration files, which are interesting to understand the booting process.

  • cmdline.txt
  • config.txt

The first file, cmdline.txt, holds the command line parameters passed to the kernel at boot. It is the closest equivalent to the kernel command line in grub/syslinux.

Following is an example of content of cmdline.txt

dwc_otg.lpm_enable=0 console=ttyAMA0,115200 kgdboc=ttyAMA0,115200 console=tty1 elevator=deadline root=/dev/mmcblk0p2 rootfstype=ext4 fsck.repair=yes rootwait

The second file is the configuration file config.txt, which is the equivalent of bios settings for the Raspberry Pi SoC. Here is an example file content


The documentation for all the options is available at https://www.raspberrypi.org/documentation/configuration/config-txt.md

Simple non-destructive way to tryout multiple OS on Pi

Now that we have a broad understanding of the booting process and the config files, let's look at how we can use this to try different OS distributions for Pi3 using USB sticks.

If you look at the documentation for options supported by config.txt, you will notice that the kernel file name is configurable using the parameter “kernel”.

So if we combine the USB booting mechanics of Pi we discussed in the previous section with the configurable kernel filename in config.txt, we have a method to have multiple OS on different USB sticks.

And this is how to make it work…


  • Use different USB sticks to hold the root filesystem for different OS distros. Normally all the Pi distros have a two-partition layout, with the first partition being the FAT32 based boot partition and the second usually an ext4 based root filesystem. So if you just dd the OS image on to the USB stick, your root filesystem should be the second partition. Assuming this is the only USB stick attached to the Pi, the root partition should be recognized as /dev/sda2.
  • Format your SD card with a sufficiently big FAT32 based first partition.
  • Store the kernel images for all the OS distros on this partition with different filenames. You can get the kernel images from the first partition of each of the USB sticks. If you attach both the SD card and the USB stick to a Windows machine you should be able to just copy-paste the kernel images to the FAT32 partition on the SD card.
  • Update the cmdline.txt file to point to the root filesystem partition, e.g. root=/dev/sda2
  • Finally, update config.txt to point to the kernel filename of the distro you currently want to boot.

For this scheme to work you need to match the kernel filename for the distro configured in config.txt to the correct USB stick that you have attached to the Pi board.
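The two file edits in the steps above are simple enough to script. The following Python 3 sketch only shows the text manipulation; the function names and the kernel filename in the example are hypothetical, and it assumes a flat config.txt without conditional sections.

```python
def set_root(cmdline: str, root_device: str) -> str:
    """Return cmdline.txt content with its root= parameter pointing at root_device."""
    params, found = [], False
    for param in cmdline.split():
        if param.startswith("root="):
            param, found = "root=" + root_device, True
        params.append(param)
    if not found:
        params.append("root=" + root_device)
    return " ".join(params)          # cmdline.txt must stay a single line


def set_kernel(config: str, kernel_file: str) -> str:
    """Return config.txt content with exactly one kernel=<file> line."""
    lines = [line for line in config.splitlines()
             if not line.strip().startswith("kernel=")]
    lines.append("kernel=" + kernel_file)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(set_root("console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 rootwait",
                   "/dev/sda2"))
    # "kernel7-distroA.img" is a made-up filename for illustration.
    print(set_kernel("gpu_mem=64\nkernel=kernel7.img\n", "kernel7-distroA.img"))
```

Keeping the functions pure (string in, string out) makes it easy to check the result before writing anything back to the SD card.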

NOTE: if you make a mistake, just attach your SD card to another machine and edit config.txt to fix it.

The above process can obviously be scripted to make it more user-friendly, but my purpose for the exercise was to get an understanding of the boot process on Pi and have some fun 🙂

Test-Driving OSPF on RouterOS – Interoperability

So I wrote about OSPF on RouterOS in my previous post. It was a nice experiment to learn about routing protocols.

I wanted to take it a little further and test the interoperability of RouterOS with other open source solutions.

This post is an update to the previous one, and I will add OSPF neighbor nodes to the setup. I decided to use Quagga, the most talked about open-source routing protocol suite, and XORP, the eXtensible Open Router Platform.

Updated Setup

The following is the updated setup for the interoperability test. I have added two new Ubuntu nodes as OSPF neighbors.

  • Quagga on Ubuntu
  • XORP on Ubuntu




The following configuration was added to the Quagga node


The XORP node did not advertise any new subnet but received OSPF updates.



  • All the nodes could discover their neighbors


  • All nodes got route updates.


  • OSPF Traces


Test driving OpenWRT

Recently I have been looking at tools for managing and monitoring my home network. In my previous post I talked about using a Network Namespace to control the download limit.

Now I want to look at more advanced tools for the job. OpenWRT is a Linux based firmware which supports a lot of networking hardware. I am exploring the possibility of flashing OpenWRT on my backup router at home.

To test OpenWRT I used a KVM image (which can be found here) and started a VM on my desktop. The following diagram shows the network topology.


A little tweaking is required to make OpenWRT work with libvirtd. The idea is to push the incoming traffic to OpenWRT and apply traffic monitoring/policy.

Libvirt provides a dnsmasq service which listens on the bridge virbr0 and hands out IP addresses to the VMs over DHCP. It also configures NAT rules for traffic going out of the VMs through virbr0.

  • For this test we will remove the NAT rules on the bridge virbr0. All applications on the desktop will communicate through this bridge to OpenWRT, which will route the traffic to the Internet.
  • I also stopped the odhcpd and dnsmasq servers running on OpenWRT and started a DHCP client on the lan interface (br-lan) to request an IP from libvirtd.

Once OpenWRT is booted you can login to the web interface of the router to configure it.

The following figure shows the networking inside OpenWRT router.


The routing table on my desktop is as follows

The routing table on the OpenWRT server is shown below

OpenWRT allows installation of extra packages to enhance its functionality. I could find packages like quagga, bird, etc. which will be interesting to explore.

It provides traffic monitoring and classification.

OpenWRT provides firewall configuration using iptables.

I will be exploring more of its features before deciding if I will flash it on my backup home router.

Rate Limiting ACT broadband on Ubuntu

ISPs have started to provide high bandwidth connections, while the FUP (Fair Usage Policy) limit is still not enough (I am using ACT Broadband). If you spend most of your time on YouTube, the download limit gets exhausted rather quickly.

As I use Ubuntu for my desktop, I decided to use TC to throttle my Internet bandwidth and get some control over my usage. Have a look at my previous posts about rate limiting and traffic shaping on Linux to learn about the usage of TC.

Here is my modest network setup at home.


The problem is that TC shapes traffic going out of an interface, so shaping on the desktop's NIC throttles uploads but does not limit the download bandwidth.

The Solution

To get around this problem I introduced a Linux network namespace into the topology. Here is how the topology looks now.


I use this script to set up the upload/download bandwidth limit.


Here are readings before and after applying the throttle



After rate-limiting to 1024Kbps upload and download


Visualizing KVM (libvirt) network connections on a Hypervisor

At times I have found the need to visualize the network connectivity of KVM instances on a Hypervisor Host. I normally use libvirt and KVM for launching my VM workloads. In this post we will look at a simple script that can parse the information available with libvirt and the host kernel to plot the network connectivity between the KVM instances. The script can parse Linux Bridge and OVS based switches.

It supports three kinds of output: a GraphViz based topology rendering (requires pygraphviz), a web page built with networkx and d3js and served by a simple web server, or plain JSON describing the network graph.

The source of the script is available here.

The following is a sample output of my hypervisor host.

SVG using d3js

PNG using GraphViz

JSON text

Implementing Basic Networking Constructs with Linux Namespaces

In this post, I will explain the use of Linux network namespaces to implement basic networking constructs like an L2 switched network and a routed network.

Let's start by looking at the basic commands to create, delete and list network namespaces on Linux.

The next step is to create a LAN. We will use namespaces to simulate two nodes connected to a bridge, and so simulate a LAN inside the Linux host. We will implement a topology like the one shown below
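As a minimal sketch of what building such a LAN involves, the following Python 3 helper generates the ip(8) commands for a set of namespaces attached to one bridge. The bridge, namespace and interface names and the 10.0.0.0/24 addresses are assumptions for illustration; the helper only prints the commands, which would need root privileges to run.

```python
def lan_commands(bridge: str, nodes: dict) -> list:
    """Build ip(8) commands that attach one namespace per node to a bridge.

    nodes maps a namespace name to the CIDR address of its interface.
    """
    cmds = [f"ip link add {bridge} type bridge",
            f"ip link set {bridge} up"]
    for ns, addr in nodes.items():
        veth, peer = f"veth-{ns}", f"br-{ns}"    # veth pair: node end, bridge end
        cmds += [
            f"ip netns add {ns}",
            f"ip link add {veth} type veth peer name {peer}",
            f"ip link set {veth} netns {ns}",
            f"ip link set {peer} master {bridge}",
            f"ip link set {peer} up",
            f"ip netns exec {ns} ip addr add {addr} dev {veth}",
            f"ip netns exec {ns} ip link set {veth} up",
        ]
    return cmds


if __name__ == "__main__":
    for cmd in lan_commands("br-test", {"ns1": "10.0.0.1/24", "ns2": "10.0.0.2/24"}):
        print(cmd)
```

Once the commands have been applied, connectivity can be checked from inside a namespace, for example with ip netns exec ns1 ping 10.0.0.2.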


Finally, let's see how to simulate a router connecting two LAN segments. We will implement the simplest topology, with just two nodes connected to a router on different LAN segments


Test driving traffic shaping on Linux

In my last post, I shared a simple setup that does bandwidth limiting on Linux using TBF (Token Bucket Filter). The TBF based approach applies a bandwidth throttle on the NIC as a whole.

The situation in reality might be more complex than what that post described. Normally users like to control bandwidth based on the type of application generating the traffic.

Let's take a simple example: the user may like the bandwidth to be shared between application traffic as follows

  • 50% bandwidth available to web traffic
  • 30% available to mail service
  • 20% available for the rest of the applications

Traffic Control on Linux provides ways to achieve this using classful queuing disciplines.

In essence, this type of traffic control is achieved by first classifying the traffic into different classes and then applying traffic shaping rules to each of those classes.

The Hierarchical Token Bucket Queuing Discipline

Although Linux provides various classful queuing disciplines, in this post we will look at Hierarchical Token Bucket (HTB) to solve the bandwidth sharing and control problem.

HTB is essentially a hierarchy of TBFs (the Token Bucket Filter we described in the last post) applied to a network interface. HTB works by classifying the traffic and applying a TBF to each individual class of traffic. So to understand HTB we must understand the Token Bucket Filter first.

How Token Bucket Filter works

Let's take a deeper look at how the Token Bucket Filter (TBF) works.

The TBF works by having a bucket of tokens attached to the network interface; each time a packet needs to be passed over the network interface, a token is consumed from the bucket. The kernel produces these tokens at a fixed rate to fill up the bucket.

When the traffic is flowing at a slower pace than the rate of token generation, the bucket will fill up. Once full, the bucket rejects all the extra tokens generated by the kernel. The tokens accumulated in the bucket can help in passing a burst of traffic (limited by the size of the bucket) over the interface.

When the traffic is flowing at a pace higher than the rate of token generation, a packet must wait until the next token is available in the bucket before being allowed to pass over the network interface.


In the tc command line, the size of the bucket corresponds to the burst parameter, the rate of token generation corresponds to rate, and the latency parameter sets how long a packet can wait in the queue for a token before being dropped.
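The mechanics can be tried out with a few lines of Python 3. This is a toy model under stated assumptions, not the kernel's TBF: tokens are counted in bytes, time is passed in explicitly, and a packet that finds too few tokens is simply refused instead of being queued for up to the latency. The class and parameter names are made up for the sketch.

```python
class TokenBucket:
    """Toy token bucket: 'rate' bytes of tokens per second, capped at 'burst' bytes."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate
        self.burst = burst
        self.tokens = burst          # start with a full bucket
        self.last = 0.0              # time of the previous update, in seconds

    def allow(self, size: int, now: float) -> bool:
        """Return True if a packet of 'size' bytes may be sent at time 'now'."""
        elapsed = now - self.last
        # Refill at the fixed rate; tokens beyond the bucket size are discarded.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False                 # a real TBF would queue the packet instead


if __name__ == "__main__":
    bucket = TokenBucket(rate=1000, burst=1500)
    print(bucket.allow(1500, now=0.0))   # True: the full bucket absorbs the packet
    print(bucket.allow(1500, now=0.5))   # False: only 500 bytes of tokens so far
    print(bucket.allow(1500, now=1.5))   # True: the bucket has refilled to 1500
```

Raising burst lets larger bursts through at once, while rate alone decides the long-term average.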

The HTB queuing discipline

The following figure describes the working of HTB queuing discipline.


To apply HTB discipline we have to go through the following steps

  • Define different classes with their rate limiting attributes
  • Add rules to classify the traffic in to different classes

In this example we will try to implement the same traffic sharing requirement as mentioned in the introduction section. Web traffic will get 50% of the bandwidth while mail gets 30% and 20% is shared by all other traffic.

The following rules define the various classes with the traffic limits

tc qdisc add dev eth0 root handle 1: htb default 30
tc class add dev eth0 parent 1: classid 1:1 htb rate 100kbps ceil 100kbps
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 50kbps ceil 100kbps
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 30kbps ceil 100kbps
tc class add dev eth0 parent 1:1 classid 1:30 htb rate 20kbps ceil 100kbps
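One detail that is easy to miss in these rules is the unit suffix: in tc, rates ending in "bit" are bits per second while rates ending in "bps" are bytes per second. The Python 3 helper below converts a rate string to bits per second so class rates and iperf readings can be compared. It is a sketch that handles only integer values with SI prefixes; the function name is an assumption.

```python
import re

# SI prefixes accepted by this sketch (IEC prefixes such as "ki" are not handled).
_PREFIX = {"": 1, "k": 10**3, "m": 10**6, "g": 10**9, "t": 10**12}
_RATE = re.compile(r"(\d+)(?:([kmgt]?)(bit|bps))?")


def tc_rate_to_bits(rate: str) -> int:
    """Convert a tc rate such as '100kbps' or '1024kbit' to bits per second."""
    match = _RATE.fullmatch(rate.strip().lower())
    if match is None:
        raise ValueError("unsupported rate: %r" % rate)
    number, prefix, unit = match.groups()
    bits = int(number) * _PREFIX[prefix or ""]
    # "bps" means bytes per second; "bit" or a bare number means bits per second.
    return bits * 8 if unit == "bps" else bits


if __name__ == "__main__":
    for rate in ("100kbps", "50kbps", "1024kbit"):
        print(rate, "=", tc_rate_to_bits(rate), "bit/s")
```

A rate written as 20mbps, for instance, would be far above a 100kbps ceiling, so the suffix is worth double-checking when typing class definitions.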


Now we must classify the traffic into these classes based on some match condition. The following rules classify the web and mail traffic into classes 1:10 and 1:20. All other traffic is pushed to class 1:30 by default.

tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 match ip dport 80 0xffff flowid 1:10 
tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 match ip dport 25 0xffff flowid 1:20

Verifying the results

We will use iperf to verify the results of the traffic control changes. Use the following command to start 2 instances of iperf on the server

iperf -s -p <port number> -i 1

For our example, we use the following commands to start the servers

iperf -s -p 25 -i 1
iperf -s -p 80 -i 1

The clients are started with the following commands

iperf -c <server ip> -p <port number> -t <time period to run the test>

For this example we used the following commands to run the test for 60 secs. The server IP of was used.

iperf -c -p 25 -t 60
iperf -c -p 80 -t 60

Here is a snapshot of output from the test.


It shows the web traffic to be close to 500kbps while the mail traffic is close to 300kbps, the same ratio in which we wanted to shape the traffic. When excess bandwidth is available, HTB splits it in the same ratio that we configured for the classes.

Network Bandwidth Limiting on Linux with TC

On Linux, the queuing discipline attached to a NIC can be used to shape the outgoing bandwidth. By default, Linux uses pfifo_fast as the queuing discipline. Use the following command to verify the setting on your network card

# tc qdisc show dev eth0
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

Measuring the default bandwidth

I am using the iperf tool to measure the bandwidth between my VirtualBox instance and my desktop, which acts as the iperf server.

Start the iperf server with the following command

iperf -s

The client can then connect using the following command

iperf -c <server address>

The following figure shows the bandwidth available with default queue setting


Limiting Traffic with TC

We will use the Token Bucket Filter to throttle the outgoing traffic. The following command sets an egress rate of 1024kbit with a latency of 50ms and a burst size of 1540 bytes

# tc qdisc add dev eth0 root tbf rate 1024kbit latency 50ms burst 1540

Use the tc qdisc show command to verify the setting

# tc qdisc show dev eth0
tc qdisc add dev eth0 root tbf rate 1024kbit latency 50ms burst 1540

Verifying the result

I measured the bandwidth again to make sure the new queuing configuration is working and sure enough, the result from iperf confirmed it.


The following command shows the detailed statistics of the queuing discipline

Impact of different parameters for Token Bucket Filter (TBF)

Decreasing the latency leads to packet drops; the following figure captures the result after the latency was reduced to 1ms.


Providing a big burst buffer defeats the rate limiting


Both parameters need to be chosen properly to avoid packet loss and spikes in traffic beyond the rate limit.