System Monitoring: Glances at Top

System monitoring is a topic that has received a lot of attention lately.

A lot of options already exist in the open, but every time I search for a system monitoring tool I am presented with options suitable for large-scale clusters of machines: capturing metrics over time, showing trends of resource utilization, and requiring a storage system and a moderate-to-heavy installation. These systems are good for Ops teams interested in managing the stable operation of systems and optimizing resource utilization over time.

My requirements, on the other hand, are more on the side of debugging and call for a real-time peek into the monitored system. To be frank, my favourite tool for the job has been the top command, and it is still one of the invaluable tools Linux provides. It gives info on the load average, RAM utilization, and per-process metrics with various selection and sorting options.

My requirements

More often than not I start with a couple of physical machines and run virtual machines on them. A quick and easy monitoring tool is what I need to debug. Until now I would keep a tmux session with windows running the top command over SSH to these physical machines. I am mostly interested in the status of the VMs (KVM processes) running on the system and the overall health of the hypervisor.

One problem with top is that it does not cover network and disk metrics (which are a good-to-have feature). More importantly, although top provides per-process stats, I could not find a way to get alerts when a process (a KVM instance in my case) starts to go high on resources.

Glances: a better top

I recently tried glances and was really impressed with its capabilities. The most important part is the easy installation. Although I created a virtual environment and installed the app within it, that is not a requirement. To install glances all you need to do is install the Python package using pip, and that's it, you are ready to go:

pip install glances
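
If, like me, you would rather keep glances and its dependencies out of the system Python, a virtual environment works just as well. A minimal sketch, with the path being just an example:

# create a virtual environment and install glances into it
python3 -m venv ~/glances-venv
~/glances-venv/bin/pip install glances

# run the standalone view from the virtual environment
~/glances-venv/bin/glances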

It has the same look and feel as top (although not as many selection and sorting options), but what it provides is a complete view of the system in real time, including disk and network stats.

Screenshot 2019-04-15 at 8.20.51 AM

In addition, you can define process-level alerts for resource utilization.

It can run in standalone, client, and server modes. In standalone mode it monitors the status of the local system.

Screenshot 2019-04-15 at 8.17.38 AM

In server mode (with the -s option) it serves statistics to a remote client.
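
A typical pairing looks like this (the IP address below is just a placeholder for your monitored machine):

# on the hypervisor: export stats in server mode
glances -s

# on your workstation: connect to that server
glances -c 192.168.1.10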

Or you can run it in web server mode to monitor the system from a web browser (make sure you have bottle installed for the web server mode to work).

glances -w
Glances Web User Interface started on http://0.0.0.0:61208/

In client mode you can create a list of servers that you need to monitor and start the client in browse mode (with the --browser option). The client presents you with a list of monitored servers with a summary of their status, and you can connect to any of them to get the detailed view.

Screenshot 2019-04-15 at 8.26.48 AM

The client connected to a remote server has the same look and feel as a standalone glances instance.

My current installation

I am monitoring three physical machines with glances. I have glances running in server mode on the physical machines and also in web server mode. (Currently you need to run two instances of glances if you need both the server and web server modes; there is an open request to enable both modes on the same instance of glances.)

I have installed glances in a virtual environment to keep its dependencies separated from the rest of the system.

I am running it under supervisord to make sure the processes are always running and are restarted after a reboot.
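
For reference, here is a minimal sketch of the supervisor configuration I use; the virtual environment path and program names are only examples:

sudo tee /etc/supervisor/conf.d/glances.conf > /dev/null <<'EOF'
[program:glances-server]
command=/opt/glances-venv/bin/glances -s
autostart=true
autorestart=true

[program:glances-web]
command=/opt/glances-venv/bin/glances -w
autostart=true
autorestart=true
EOF

# make supervisord pick up the new programs
sudo supervisorctl reread && sudo supervisorctl update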


Test Driving transmission for multi-site file sync

As the industry moves towards more distributed deployment of services, syncing files across multiple locations is a problem that often needs to be solved. In the world of file syncing, two approaches stand out. One is rsync, a very efficient tool for syncing files. It works great when you have a few remotes that need to be kept in sync, but as soon as the number of remotes grows we hit the problems of bandwidth and load: rsync being a single-source-to-destination syncing mechanism, the source node has to transfer a lot of data to each of the remotes.

filesync (1)

This is the use case where sync based on the bittorrent protocol shines. As bittorrent uses peer-to-peer file distribution, there is no single source that distributes the files. Files are broken down into chunks and each chunk is associated with its hash. To get a copy of the file, a participating peer asks for the chunks of the file from all other peers; any peer that has a chunk can be the source for that transfer. Additionally, the chunks are not transferred sequentially but in randomized order, which increases the chance of finding a requested chunk on a peer other than the original source.

filesync

In this blog I will explore all the components needed to set up bittorrent-based file sync. A lot of these steps can be automated to come up with an efficient multi-site file syncing solution. All the setup is done on a single Ubuntu 16.04 VM. The following diagram shows my test setup.

filesync

How to use a bittorrent network to share your data?

To transfer your data using torrents you will need the following things

  • The full copy of the source files that you want to transfer
  • Torrent file generated for the source files
  • Some way to distribute this torrent file to all your remotes (peers)
  • A torrent tracking server
  • A torrent client

Installing transmission client

Although you can use any torrent client, transmission is one of the popular torrent clients for Linux. It provides a CLI (and an RPC API) which is handy for integration with other projects.

To install transmission on your machine use the following command

apt-get install transmission-cli transmission-common transmission-daemon transmission-gtk transmission-remote-gtk 

This should get transmission installed. To configure the torrent client edit the settings file

/etc/transmission-daemon/settings.json 

You might want to change the download directory, which by default points to

/var/lib/transmission-daemon/downloads/

Make sure that transmission has write access to the new download directory you set.
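
As a hedged example (the /srv/torrents path is arbitrary), changing the download directory looks roughly like this; note that the daemon rewrites settings.json on exit, so stop it before editing:

# stop the daemon so it does not overwrite our edits on exit
sudo systemctl stop transmission-daemon

# create the new download directory and give transmission write access
sudo mkdir -p /srv/torrents
sudo chown debian-transmission:debian-transmission /srv/torrents

# set "download-dir": "/srv/torrents" in /etc/transmission-daemon/settings.json, then
sudo systemctl start transmission-daemon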

Transmission uses a systemd unit to manage the transmission daemon. If you want to make changes to the daemon, follow the standard practice for customizing any systemd service. For my experiment I disabled authentication for RPC calls; to do this, follow the steps below.

cp /lib/systemd/system/transmission-daemon.service     /etc/systemd/system/transmission-daemon.service 

Then edit the unit file in /etc/systemd/system/<service> with your local changes:

diff -Nur /etc/systemd/system/transmission-daemon.service /lib/systemd/system/transmission-daemon.service 
--- /etc/systemd/system/transmission-daemon.service 2019-03-15 22:30:59.736229363 +0530
+++ /lib/systemd/system/transmission-daemon.service 2018-02-06 23:25:40.000000000 +0530
@@ -5,7 +5,7 @@
 [Service]
 User=debian-transmission
 Type=notify
-ExecStart=/usr/bin/transmission-daemon -f --log-error -T
+ExecStart=/usr/bin/transmission-daemon -f --log-error
 ExecStop=/bin/kill -s STOP $MAINPID
 ExecReload=/bin/kill -s HUP $MAINPID

Start the transmission daemon as follows

systemctl start transmission-daemon

And check its status with the following command

systemctl status transmission-daemon

The next step is to create a torrent file for your source files. Before you can create a torrent file you need to provide a tracker URL. The next section will show you how to set up your own tracker server, or you may choose to use a public tracker server.

What is a torrent tracker?

A torrent tracker is a server that tracks all the peers interested in a torrent file distribution and helps peers interested in a file transfer find each other. When a torrent client adds a new torrent file, it looks for the tracker URL embedded in the torrent file, contacts the tracker server, and gets a list of other peers that are participating in the source file distribution. The client can then contact all its peers and start requesting chunks of the source file.

You can use a public torrent tracker server, but if you are using torrents to share private data you will probably want to run a private torrent tracker. For my experiment I tried

https://github.com/chihaya/chihaya.git

To set up the tracker follow the steps at https://github.com/chihaya/chihaya/blob/master/README.md

You will need to install Golang to compile it. Once compiled you should have a binary named chihaya in the top directory of the sources. An example configuration file is provided as part of the code (example_config.yaml).

cp example_config.yaml config.yaml

Then edit it to your liking. As I am running this experiment on my local machine, I changed the http addr to “127.0.0.1:6969”

http:
    addr: "127.0.0.1:6969"

In a real deployment your tracker should be accessible from all the peers, so it must listen on a public interface. The default setting is to listen on “0.0.0.0:6969”.

Once done start the tracker server as follows

./chihaya --config config.yaml 

This will start the server and your tracker url will be http://127.0.0.1:6969/announce or the IP you configured for the http listen address.

How to create a torrent?

To create the torrent file use the following command.

transmission-create <source file|source dir> -t <tracker url>

For me

transmission-create <source file|source dir> -t http://127.0.0.1:6969/announce

The above command will create the torrent file for your source file(s). The torrent file contains the chunk definitions of the source file along with their hashes and the tracker URL. Each time a new peer adds the torrent file, its torrent client sends a request to the tracker to add itself to the list of peers interested in the source file.
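
As a concrete (made-up) example, creating a torrent for the directory /srv/share/data would look like this:

transmission-create -o /srv/share/data.torrent -t http://127.0.0.1:6969/announce /srv/share/data

# optional: inspect the generated torrent (piece size, hashes, tracker url)
transmission-show /srv/share/data.torrent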

How to publish your files with torrents?

To publish the source files you need to add their torrent file (like the one we created in the steps above) to the client. The only difference between the publisher and all the other peers is that the publisher starts out as the peer with the complete source files. Peers with the complete files are known as seeders, while peers with incomplete source files are called leechers. In the beginning the publisher is the only seeder, but as more and more peers get a complete copy of the source files they also become seeders.

To add a torrent file and become a seeder use the following command.

 transmission-remote --add <torrent file> -w <parent dir of complete source files> [ -u <upload bandwidth limit in kb/s>]

The upload bandwidth limit is optional.
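
Continuing the made-up example above, seeding /srv/share/data with a 500 kb/s upload cap would look like:

transmission-remote --add /srv/share/data.torrent -w /srv/share -u 500

# the torrent should appear and move to "Seeding" once the local data is verified
transmission-remote -l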

How to download files?

This should be the most familiar part for everyone. For this blog we will use the transmission client, for the reasons mentioned above. All the peers need to add the torrent file we created above to their transmission client. The torrent file can be hosted on a web server or sent to the peers over mail or any other means. To add the torrent file on the other peers use the following command

transmission-remote --add <torrent file> -w <parent dir where the source file will be saved> [ -u <upload bandwidth limit in kb/s>]

You can use the following commands to list the currently added torrents in transmission and to remove a torrent

transmission-remote -l
transmission-remote -t <torrent_id> -r

Where torrent_id is the first column in the output of transmission-remote -l. You can also use the web or GTK interface if you prefer:

http://localhost:9091/transmission

Or run transmission-gtk to use the GUI

Once you have added the torrent with the transmission-remote --add <torrent file> -w <parent dir where the source file will be saved> command, transmission will verify that the available source file matches the hashes in the torrent file.

Once verification completes, your transmission client will be marked as a seeder. For my test I used another torrent client to download the source files using the torrent file.

In production you would likely use transmission-remote on the peers to add the torrent file and start the download.

Summary

If you need to distribute files among multiple remotes, torrents can be a very efficient mechanism. You relieve the source file server of the syncing load and also benefit from better utilization of the peers' bandwidth, not to mention the increased speed of distribution.

Test Driving Inter Regional VPC peering in AWS

Connect AWS VPCs hosted in different regions.

AWS Virtual Private Cloud (VPC) provides a way to isolate a tenant's cloud infrastructure. To a tenant, a VPC provides a view of their own virtual infrastructure in the cloud that is completely isolated and has its own compute, storage, network connectivity, security settings, etc.

In the physical world, Amazon's data centers are organized into different geographical locations called Regions. Each Region has multiple Availability Zones, which are data centers with independent power and connectivity to protect against natural disasters.
A VPC is associated with a single Region. A VPC may have multiple associated subnets (one per Availability Zone); subnets in a VPC do not span Availability Zones.

What is VPC Peering?

Although as an AWS tenant VPCs provide you with a secure way to isolate and control access to your cloud resources, they also create their own challenges. As your infrastructure grows, you are forced to create multiple VPCs within or across multiple Regions. The question is: how would you now connect your instances deployed in different VPCs?

One way to connect your resources across VPCs is over their Public IPs. This means the traffic has to traverse the internet to travel between your cloud resources hosted in different regions.

Peering provides a solution to this problem by allowing instances in different VPCs to communicate over a peering link. Until recently, VPC peering was only allowed within a single region; AWS now allows creating VPC peering across Regions. With Inter Regional VPC peering the traffic between instances in VPCs in different regions never has to leave the Amazon network.

Note: One thing to keep in mind while peering VPCs is that the peered VPCs must not have overlapping subnet CIDRs. As VPC peering involves route table updates to add routes to remote subnets, any CIDR overlap between the local and remote subnets will not work.

Creating Inter Regional VPC peering

To test Inter Regional VPC peering, you will need to create two custom VPCs in different Regions. I have explored custom VPC and Internet access in my previous post. For this test I am peering custom VPCs created in N. Virginia (CIDR 10.0.1.0/24) and London (CIDR 10.0.2.0/24) Region.

The peering links can be created in the “VPC” dashboard under the “Networking & Content Delivery” heading. To create a peering link navigate to

Networking & Content Delivery > VPC > VPC Peering Connections and click on "Create Peering Connection".

VPC10

To create a peering connection you must know the VPC IDs of the local VPC and the peer VPC in the remote Region. While the local VPC can be chosen from the drop-down menu, you must find the remote VPC ID from the remote Region's VPC console. On completing the above step, a VPC peering request is created with status pending.

VPC11

To activate the peering link, the VPC in the remote Region must accept the peering request.

VPC13

Click “Accept Request” and “Yes, Accept”

VPC14

Once accepted, the status of the peering link changes to Active.

VPC16

The next step is to update the route tables of your VPCs and add routes to the subnets associated with the peered VPC via the VPC peering link interface, which is named pcx-xxxxx…

Note: The route table must be updated on both the local and remote VPC to provide a complete forward and reverse path for the traffic.

VPC17
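
The same flow can be scripted with the AWS CLI; here is a hedged sketch in which all the resource IDs are placeholders (N. Virginia is us-east-1 and London is eu-west-2):

# request the peering from the N. Virginia side
aws ec2 create-vpc-peering-connection --region us-east-1 \
    --vpc-id vpc-aaaa1111 --peer-vpc-id vpc-bbbb2222 --peer-region eu-west-2

# accept it from the London side
aws ec2 accept-vpc-peering-connection --region eu-west-2 \
    --vpc-peering-connection-id pcx-cccc3333

# add routes on both sides so forward and reverse traffic use the peering link
aws ec2 create-route --region us-east-1 --route-table-id rtb-dddd4444 \
    --destination-cidr-block 10.0.2.0/24 --vpc-peering-connection-id pcx-cccc3333
aws ec2 create-route --region eu-west-2 --route-table-id rtb-eeee5555 \
    --destination-cidr-block 10.0.1.0/24 --vpc-peering-connection-id pcx-cccc3333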

And that’s all. Now you should be able to communicate between the peered VPCs. You can try running ping or SSH between the instances on the VPCs. You may have to adjust the Security group rules to make this work.

VPC18

No transitive routing with VPC peering and the problem of manual route table updates

One of the limitations of VPC peering is that it only allows communication across directly connected VPCs. [As of Oct 2018]

As an example, if VPC-A peers with VPC-B and VPC-B peers with VPC-C, traffic is only allowed between VPC-A and VPC-B, and between VPC-B and VPC-C. The peering link is not transitive, which means VPC-A cannot send or receive traffic from VPC-C. To enable this, VPC-A must directly peer with VPC-C.

VPC22

This means that multiple VPCs wanting to communicate with each other must form a full mesh of peering links. Additionally, the route table associated with each VPC needs to be manually updated with the subnet CIDRs of all its peers and the correct peering links. This can be quite tricky to maintain with manual updates.

Third-party routing solutions can be used to enable transitive routing, but that undermines the routing infrastructure provided by AWS.

Recent Updates

AWS now supports accessing load balancers over Inter Regional Peering Links

https://aws.amazon.com/about-aws/whats-new/2018/10/network-load-balancer-now-supports-inter-region-vpc-peering/

Custom VPC and Internet Access in AWS

Create your VPC, launch EC2 instances and get internet access with Public IP.

With a Virtual Private Cloud (VPC), a tenant can create their own cloud-based infrastructure in AWS. While AWS provides a default VPC for a new tenant, there are always use cases that need the creation of a custom VPC.

While exploring custom VPCs, I found that getting my EC2 instances on the custom VPC to connect to the internet was not straightforward and involved a few steps.

The next sections describe the steps needed to give the EC2 instances internet access with a Public IP.

1. Creating the Custom VPC and associated Subnet

To create a custom VPC navigate to Networking & Content Delivery > VPC > VPCs, click on "Create VPC", and provide a name for your custom VPC and an associated CIDR.

VPC1

VPCs are tied to the Region you are logged in to. If you have multiple Availability Zones in this Region you may want to create one Subnet per Availability Zone by further subnetting the CIDR associated with the VPC. In my case I have created only one Subnet for the custom VPC.

VPC2

2. Creating an Internet Gateway for the custom VPC

The next step is to create an internet gateway. The internet gateway connects the VPC to the internet. To create it navigate to Networking & Content Delivery > VPC > Internet Gateways and click on "Create internet gateway".

VPC4

Next, associate this Internet Gateway with your custom VPC using the Actions menu after selecting the newly created Internet Gateway. After this step the Gateway should have an associated VPC.

VPC6

3. Update Routing Table for the VPC to allow traffic to and from Internet

Next we have to update the routing table to add a default route to send and receive traffic from the internet.

To do this navigate to Networking & Content Delivery > VPC > Route Tables, select the route table associated with the custom VPC, click on the "Routes" tab, then click on "Edit" and add a default route pointing to the Internet Gateway. Next click "Save" to update the routing table.

VPC7
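
For reference, a rough AWS CLI equivalent of steps 1-3 (the CIDR and all resource IDs below are placeholders):

# step 1: the VPC and a subnet
aws ec2 create-vpc --cidr-block 10.0.1.0/24
aws ec2 create-subnet --vpc-id vpc-aaaa1111 --cidr-block 10.0.1.0/25

# step 2: the internet gateway, attached to the VPC
aws ec2 create-internet-gateway
aws ec2 attach-internet-gateway --internet-gateway-id igw-bbbb2222 --vpc-id vpc-aaaa1111

# step 3: default route towards the internet gateway
aws ec2 create-route --route-table-id rtb-cccc3333 \
    --destination-cidr-block 0.0.0.0/0 --gateway-id igw-bbbb2222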

4. Launching EC2 instances on the custom VPC and checking connectivity

Finally we are ready to launch our EC2 instance and make it internet accessible using a Public IP.

To do this navigate to Compute > EC2 and click on "Launch Instance".

Follow the usual steps to launch the instance, except in the "Configure Instance" stage select your custom VPC for the network (this will automatically load the associated Subnet) and enable Auto-assign Public IP.

VPC21

You will be asked to create a new Security Group for the VPC (allow SSH access for testing).

Once launched you should be able to SSH to the instance using its Public IP.

VPC9

 

 

Home network traffic analysis with a Raspberry Pi 3 and Ntop

I had a Raspberry Pi lying around for some time without doing any major function, and so was the NetGear switch [1]. So I decided to do a weekend project and implement traffic analysis on my home network.

I have a PPPoE connection to my ISP that terminates on my home router [2]. The router provides both wired and WiFi connectivity. As with most people, I have very few devices that connect to the router over an Ethernet cable; most devices are WiFi capable. This makes traffic monitoring a bit of a problem on the LAN side.

To get around the problem I decided to put the traffic monitor on the WAN side of the router.

The following figure shows the connectivity.

Slide1

Tapping the WAN side with port mirroring

The NetGear GS105E switch provides a port mirroring capability. I used this to mirror the traffic flowing between the router and the ISP connection. The mirrored traffic is passed on to the Raspberry Pi, and all traffic monitoring happens on the Pi.

 

Screenshot from 2018-02-11 01:26:51

Monitoring tools

Once the traffic is available on the mirrored port, I was able to run traffic monitors like wireshark, tshark, and tcpdump on the mirror port to analyze all the traffic between the router and the ISP. These tools give a live view of the packets going through my home network.
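
For example, a quick way to sanity-check that the mirror is working is to capture on the Pi's wired interface (eth0 below is an assumption; use whatever name your Pi gives the Ethernet port):

# show the first 100 mirrored packets, no name resolution
sudo tcpdump -i eth0 -nn -c 100

# or write a capture file for later analysis in wireshark
sudo tcpdump -i eth0 -nn -s 0 -w /tmp/wan.pcap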

To monitor traffic over a longer period I used Ntop [3]. It can aggregate traffic and produce nice analysis summaries. I used the Raspbian [4] image for the Pi, and ntopng can easily be installed from their repository using apt.

Accessing the Monitoring result

As the Gigabit port of the Pi is used to receive the mirrored traffic, the monitoring dashboard is accessed over the wlan0 interface. This keeps the monitored traffic separate from the monitoring traffic.

Refs:

[1] https://www.netgear.com/support/product/GS105Ev2.aspx

[2] https://www.amazon.in/3G-4G-LTE-Router-Multi-WAN/dp/B00N0W4FTM

[3] https://www.ntop.org/products/traffic-analysis/ntop/

[4] https://www.raspberrypi.org/downloads/raspbian/

 

SecureNet: Simulating a Secure Network with Mininet

I have been working with OpenStack (devstack) for a while and I must say it is quite convenient to bring up a test setup using devstack. At times, though, I still feel it is overkill to use devstack for a quick test to verify your understanding of the network, security rules, routing, etc.

This is where Mininet shines. It is very light on resources and extremely fast at getting your topology up and running. It cuts the setup down to the absolute necessities and, needless to say, it has proven invaluable to me while trying out various topologies and tests.

Lacking Security Device simulation

The default Mininet toolbox comes with switch and host nodes. The switch is primarily a controller-based SDN switch like OpenVSwitch or IVS; the primary focus of Mininet has been L2 networks with SDN controllers.

My goal, however, was to test routing and security, much like what is available on OpenStack. I found an example topology in the Mininet code simulating a LinuxRouter: basically, a Host node (a network Namespace) configured to do L3 forwarding.

This gave me the initial idea for implementing security devices with Mininet; after all, the reference network services (routing and firewall) in OpenStack are based on Namespaces.

A Simulated Perimeter Firewall

As IPTables is available within the namespace, I decided to use it to implement a Perimeter Firewall that would inspect the traffic between the Networks. I used a separate chain for my firewall rules and redirected all traffic hitting the FORWARD chain to my custom chain PFW. The last rule in the FORWARD chain drops all traffic.
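
A minimal sketch of that chain layout, run inside the firewall namespace (the chain name PFW comes from above; the subnets are placeholders):

# custom chain holding the firewall rules
iptables -N PFW

# send all forwarded traffic through PFW, drop whatever it does not accept
iptables -A FORWARD -j PFW
iptables -A FORWARD -j DROP

# example allow rule between two networks
iptables -A PFW -s 10.0.0.0/24 -d 20.0.0.0/24 -j ACCEPT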

So now when I started my topology with the perimeter Firewall the ping test confirmed that no traffic was flowing between the networks.

MN1

To allow traffic we need to specifically configure the Firewall to allow packets.

MN3

This took care of securing the Network boundary, but what about traffic flowing within a network? OpenStack supports this with micro-segmentation using Security Groups.

Simulating Micro Segmentation with OpenStack like Security Groups

Well, as I was using OpenVSwitch for my topology in standalone mode, I decided to use the OVS packet filtering capabilities to implement a firewall within the Network itself. Here is a sample OVS rule to do an L3 packet match and filter.

ovs-ofctl add-flow switch1 dl_type=0x0800,nw_src=30.0.0.100,nw_dst=30.0.0.101,actions=drop

This produces the following flow entry (which can be verified with ovs-ofctl dump-flows switch1)

cookie=0x0, duration=9.050s, table=0, n_packets=0, n_bytes=0, idle_age=9, ip,nw_src=30.0.0.100,nw_dst=30.0.0.101 actions=drop

Updated CLI to allow Firewall Rules

All this looks good but it’s a lot of work to manually configure all the rules. And I was thinking, how can this be automated?

I extended the switch node in Mininet to implement a Secure Switch, and while at it I also implemented the Perimeter Firewall as a specialized Host node. My intention in implementing these special nodes was to add the capability of configuring them with security settings. By this time, it was already looking like an interesting Weekend Project 🙂

So I got into it and implemented a set of CLI commands for Mininet to configure the firewall capabilities on the topology (using the security nodes that I introduced). I was calling it the Secure Network.

I wanted to keep things simple to start with and so kept the rule definitions very coarse, namely L3 filtering only (I may look at L4 in the future). So here is the CLI I came up with.

Securenet] secrule add allow [addr1] to [addr2]
Securenet] secrule add deny [addr1] to [addr3]

Here addr1, addr2 and addr3 are Micro Segments (or Security Groups).

Now, to define Firewall rules with Micro Segments (let's call them Security Groups, as this is what I was trying to simulate in the first place) I needed a way to define the Security Groups and associate Hosts with them.

To keep things simple I went for automatic creation of Security Groups when a Host association command is entered. This was done by extending the host commands. Here is an example of creating a Security Group and associating a Host with it.

Securenet] h30 bind sg1

The above command creates a Security Group called sg1 and associates host h30 with it.

To add another Host to the same SG issue the same command with a different Host name

Securenet] h31 bind sg1

This adds h31 to sg1. To view the members of the security group use the sghosts command

Securenet] sghosts sg1
0: h30
1: h31

Reacting to Security Group changes with events

By this time, I was already too excited to stop and was thinking of the interactions between the Firewall rules and Security Group definitions.

As the high-level Firewall rules are composed of Security Groups, the firewall must react to changes in the Security Group definitions. This means a change in a Security Group definition must produce some kind of event that is observable by the Firewall, which must then update its configuration accordingly to keep the firewall rule constraints satisfied.

To do this I extended the Topology and Mininet classes so that any change in an SG definition can trigger a re-evaluation of the Firewall rules.

MN4

Again, to keep things reasonably simple, I went with a rip-and-replace of the Firewall configuration (both IPTables and OVS) and did not bother about the traffic impact (we are in a simulation after all). But this might be an interesting area to explore in the future.

Simulating IPSets

One thing I was missing while defining the Firewall rules was a way to target a set of external IP addresses. As the IP addresses associated with a Security Group definition are derived from its member Hosts, there was no way to define rules based on addresses that are not part of the topology.

So another set of CLI commands was introduced to define groups based on IP addresses manually, without any topology-based constraints.

Securenet] secrule ipg add ipg1 1.1.1.1
Securenet] secrule ipg add ipg2 2.1.1.1
Securenet] secrule ipg add ipg3 2.1.1.2

CLI to view a list of IPGs

Securenet] secrule ipg list
0: ipg2
1: ipg3
2: ipg1

And to view its content use the following

Securenet] secrule ipg show ipg1
0: ipgh-1.1.1.1/32

With IPGs it was possible to have rules with arbitrary addresses.

Conclusion

Here is a run of a simple set of commands on the secure net topology

MN5

The updated IPTables config in the Perimeter Firewall.

MN6

And the security rules in OVS

MN7

This was more of a fun long-weekend project for me. Once I dived into it I realized a number of challenges posed by such an undertaking: rule optimization, minimizing the config changes, targeting the right device to push a security rule to, figuring out duplicate rules and conflicts, etc. Also, as I thought through more use cases I realized that the firewall rule definitions need a language of their own.

But at the end of the weekend, I feel it was mostly a lot of fun.

Test driving App Firewall with IPTables

With more and more applications moving to the cloud, web-based applications have become ubiquitous. They are ideal for providing access to applications sitting in the cloud (over HTTP through a standard web browser). This has removed the need to install specialized applications on the client system; all the client needs is a fairly modern browser.

While this is good for reducing load on the client, the job of the firewall has become much tougher.

Traditionally, firewall rules look at the Layer 3 and Layer 4 attributes of a packet to identify a flow and associate it with the application generating the traffic. To a traditional firewall looking at L3/L4 headers, all the traffic between clients and different web apps looks like HTTP communication. Without proper classification of traffic flows the firewall is not able to apply a security policy.

It has now become important to look at the application layer to identify the traffic associated with a web service or web application and enforce effective security and bandwidth allocation policy.

In this blog, I will look at features provided by IPTables that can be used to classify packets by application-layer headers and how this can be used to implement security and other network policies.

Looking into the application layer

IPTables is the de-facto choice for implementing a firewall on Linux. It provides extensive packet matching, classification, filtering, and many more facilities. Like any traditional firewall, the core features of IPTables allow packet matching on Layer 3 and Layer 4 header attributes. As discussed in the introduction, these features may not be sufficient to differentiate between traffic from various web apps.

While researching a solution to provide an app firewall on Linux, I came across an IPTables extension called NFQUEUE.

The NFQUEUE extension provides a mechanism to pass a packet to a user-space program, which can run some kind of test on the packet and tell IPTables what action (accept/drop/mark) to perform for the packet.

appfw

This gives the IPTables user a lot of flexibility to hook up custom tests for packets before they are allowed to pass through the firewall.

To understand how NFQUEUE can help classify and filter traffic based on application-layer headers, let's implement a web app filter providing URL-based access control. In this test we will extract the request URL from the HTTP header, and the filter will allow or deny access based on this URL.

A simple web app

For this experiment, we will use python bottle to deploy two applications. Access will be allowed to the first app (APP1) while access to the second app (APP2) will be denied.

We will use the following code to deploy the sample Apps

from bottle import Bottle

app1 = Bottle()
app2 = Bottle()

@app1.route('/APP1/')
def app1_route():
    return 'Access to APP1!\n'


@app2.route('/APP2/')
def app2_route():
    return 'Access to APP2!\n'


if __name__ == '__main__':
    app1.merge(app2)
    app1.run(server='eventlet', host="192.168.121.22", port=8081)

The web application will bind to port 8081 and the local IP 192.168.121.22.

NOTE: we need an eventlet-based bottle server, otherwise the application hangs after a deny from the app filter (connections are not closed and the next request is not processed).

To access the web apps use the curl commands

curl http://192.168.121.22:8081/APP1/
curl http://192.168.121.22:8081/APP2/

Configuring the IPTables NFQUEUE

The next step is to configure IPTables to forward the client traffic accessing the web apps to our user space web-app filter.

The NFQUEUE IPTables extension works by adding a new target to IPTables called NFQUEUE. This target allows IPTables to put the matching packet on a queue. These packets can then be read from this queue by a filter application in user space. The filter application can then perform custom tests and provide a verdict to allow or deny the packet.

The NFQUEUE extension provides 65535 different queues. It also provides fail-safe options, such as what action IPTables should take if a queue is created but no filter is attached to it, and load balancing of packets across multiple queues. There are also knobs in the /proc filesystem to control how much of the packet data is copied to user space. A complete list of options can be found in the iptables-extensions man page.

To enable NFQUEUE for the web-app traffic we will add the following rule to IPTables.

iptables -I INPUT -d 192.168.121.22 -p tcp --dport 8081 -j NFQUEUE --queue-num 10 --queue-bypass

The --queue-num option selects the NFQUEUE number to which the packet will be queued. The --queue-bypass option allows the packet to be accepted if no custom filter is attached to queue number 10; without this option, packets are dropped when no filter is attached to the queue.
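
Once a filter is attached, one quick way to check that the queue exists and whether packets are being dropped is the proc interface (one line per bound queue, including drop counters):

cat /proc/net/netfilter/nfnetlink_queue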

Implementing a simple APP filter

With the above IPTables rule, the packets destined for our sample web app will be pushed into NFQUEUE number 10. I am going to use the python bindings for NFQUEUE, called nfqueue-bindings, to develop the filter. Let's start with a simple print-and-drop filter.

#!/usr/bin/python

# need root privileges

import struct
import sys
import time

from socket import AF_INET, AF_INET6, inet_ntoa

import nfqueue
from dpkt import ip


def cb(i, payload):
    print "python callback called !"
    payload.set_verdict(nfqueue.NF_DROP)
    return 1

def main():
    q = nfqueue.queue()
    
    print "setting callback"
    q.set_callback(cb)
    
    print "open"
    q.fast_open(10, AF_INET)
    q.set_queue_maxlen(50000)
    
    print "trying to run"
    try:
        q.try_run()
    except KeyboardInterrupt, e:
        print "interrupted"
    
    print "unbind"
    q.unbind(AF_INET)
    print "close"
    q.close()

if __name__ == '__main__':
    main()

Now we have verified that the packets trying to access our web app are passing through an app filter implemented in user space. Next we need to unpack the packet and look at the HTTP header to extract the URL the user is trying to access. For unpacking the headers we will use the python dpkt library. The following code will allow access to APP1 and deny access to APP2.

#!/usr/bin/python

# need root privileges

import struct
import sys
import time

from socket import AF_INET, AF_INET6, inet_ntoa

import nfqueue
import dpkt
from dpkt import ip

count = 0

def cb(i, payload):
    global count
    
    count += 1
    
    data = payload.get_data()

    pkt = ip.IP(data)
    if pkt.p == ip.IP_PROTO_TCP:
        # print "  len %d proto %s src: %s:%s    dst %s:%s " % (
        #        payload.get_length(),
        #        pkt.p, inet_ntoa(pkt.src), pkt.tcp.sport,
        #        inet_ntoa(pkt.dst), pkt.tcp.dport)
        tcp_pkt = pkt.data
        app_pkt = tcp_pkt.data
        try:
            request = dpkt.http.Request(app_pkt)
            if "APP1" in request.uri:
                print "Allowing APP1"
                payload.set_verdict(nfqueue.NF_ACCEPT)
            elif "APP2":
                print "Denying APP2"
                payload.set_verdict(nfqueue.NF_DROP)
            else:
                print "Denying by default"
                payload.set_verdict(nfqueue.NF_DROP)
        except (dpkt.dpkt.NeedData, dpkt.dpkt.UnpackError):
            pass
    else:
        print "  len %d proto %s src: %s    dst %s " % (
               payload.get_length(), pkt.p, inet_ntoa(pkt.src), 
               inet_ntoa(pkt.dst))


    sys.stdout.flush()
    return 1

def main():
    q = nfqueue.queue()

    print "setting callback"
    q.set_callback(cb)

    print "open"
    q.fast_open(10, AF_INET)

    q.set_queue_maxlen(50000)

    print "trying to run"
    try:
        q.try_run()
    except KeyboardInterrupt, e:
        print "interrupted"

    print "%d packets handled" % count

    print "unbind"
    q.unbind(AF_INET)
    print "close"
    q.close()
    
if __name__ == '__main__':
    main()

Here are the results of the test on the client

result1

The output from the filter on the firewall

result2

What else can be done with app-based traffic classification?

Firewalling is just one use case for advanced packet classification. With flows identified and associated with different applications, we can apply different routing and forwarding policies. An NFQUEUE-based filter can be used to set different firewall marks on the classified packets, and the firewall marks can then be used to implement policy-based routing in Linux.
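
As a hedged sketch of that last idea (the mark value, table number, gateway, and interface are all assumptions), packets marked by the filter could be steered through an alternate uplink like this:

# packets carrying fwmark 0x1 consult routing table 100
ip rule add fwmark 0x1 table 100

# table 100 sends them out via a different gateway/interface
ip route add default via 10.0.0.254 dev eth1 table 100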

IPTables: Matching A GRE packet based on tunnel key

I was trying to figure out a way to match packets with a certain GRE key and take some action. IPTables does not provide a direct solution to this problem, but it has the u32 extension module that can extract 4 bytes of a packet (starting from the IP header) and match them against a pattern.

So, I decided to give a try to this extension.

Prepare the setup

I created a tunnel between two of my VMs and assigned IP addresses to the tunnel interfaces.

On VM1

sudo ip tunnel add tun2 mode gre remote 192.168.122.103 local 192.168.122.134 ttl 255 key 22

sudo ifconfig tun2 6.5.5.1/24 up

On VM2

sudo ip tunnel add tun2 mode gre remote 192.168.122.134 local 192.168.122.103 ttl 255 key 22

sudo ifconfig tun2 6.5.5.2/24 up

Start with a basic rule

Next, I created an IPTables rule on the receiving system to generate logs on packet match, but you can also create an ACCEPT rule and check the built-in packet counter for the rule.

sudo iptables -I INPUT -p 47 -m limit --limit 20/min -j LOG --log-prefix "IPT GRE" --log-level 4

Now start a ping from VM2 to VM1

ping 6.5.5.1

You can keep a watch on the packet counters with the following command

watch "sudo iptables -L -v -n"

The GRE header

Next, a look at the GRE header format (taken from RFC 2890, https://tools.ietf.org/html/rfc2890). The header contains an optional 32-bit key, which is the data of interest to us.

0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|C| |K|S| Reserved0       | Ver |         Protocol Type         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Checksum (optional)      |       Reserved1 (Optional)    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Key (optional)                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 Sequence Number (Optional)                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Run the following tcpdump command to capture the packets (my VMs don't have a GUI)

sudo tcpdump -s 0 -n -i ens3 proto GRE -w dump.pcap

The captured packets can be analyzed using wireshark

untitled1

Understanding the iptables u32 extension match rule

Basically, the u32 module is able to extract 4 bytes of data from the packet at a given offset (counted from the start of the IP header) and match them against a given hex number or range. Here is an example of a u32 match rule from the man page; it matches IP packets with a total length of at least 256 bytes. The man page describes the format of the rule: you provide an offset, u32 extracts 4 bytes from that position, ANDs them with the MASK, and finally compares the result with the HEX value or range.

Example:

match IP packets with total length >= 256
The IP header contains a total length field in bytes 2-3.

--u32 "0 & 0xFFFF = 0x100:0xFFFF"

read bytes 0-3
AND that with 0xFFFF (giving bytes 2-3), 
and test whether that is in the range [0x100:0xFFFF]

The man page has more details.

Craft a match for GRE Key

The IP header is 20 bytes long, so the GRE key starts at byte offset 24, as can be confirmed from the wireshark capture. The u32 offset is counted from the beginning of the IP header (highlighted in the wireshark screenshot).

untitled

Based on the example from the man page I crafted the following rule to match the GRE key.

sudo iptables -I INPUT -p 47 -m u32 --u32 "24 & 0xFFFFFFFF = 0x16" -m limit --limit 20/min -j LOG --log-prefix "IPT GRE key 22" --log-level 4

Checking for Key Present Flag

But the key is optional, so let's also add a match for the Key Present flag.

sudo iptables -I INPUT -p 47 -m u32 --u32 "20 & 0x20000000 = 0x20000000 && 24 & 0xFFFFFFFF = 0x16" -m limit --limit 20/min -j LOG --log-prefix "IPT GRE key 22" --log-level 4

Here is a screen capture of the iptables packet counters

Chain INPUT (policy ACCEPT 294K packets, 78M bytes)
 pkts bytes target prot opt in out source destination
 711 79632 LOG 47 -- * * 0.0.0.0/0 0.0.0.0/0 u32 "0x14&0x20000000=0x20000000&&0x18&0xffffffff=0x16" limit: avg 20/min burst 5 LOG flags
 0 level 4 prefix "IPT GRE key 22"

The above rule is simplistic and good enough to get you started, but it has shortcomings, e.g. it assumes a constant IP header length.

The man page describes examples of how to handle variable-length headers, fragmentation checks, etc.

Test-driving multiboot on Raspberry Pi – without BerryBoot/Noobs

media-20160403.jpg

Recently I got a Raspberry Pi 3 board and wanted to try out various OS options on it. I realized quite quickly that to try a new OS I would need to block-copy (dd) the OS image to my SD card every time. I am running short on micro SD cards, and they have size limits too.

Meanwhile, I have a bunch of USB sticks lying around unused, so I was wondering whether the USB sticks could be put to use.

A little research on the Internet came up with two prominent options: BerryBoot and NOOBS. Both allow multi-booting your Pi board with different OS distros. While this is a good enough solution, I wanted to know how things work internally and whether there was a simple way to achieve multiboot without using any tools (and more so for my own learning).

On the Internet there is a lot of information on how to install and boot Linux from USB sticks for Raspberry Pi. The process is summarized in the following section.

Run Pi with Linux from USB stick

In simple words, booting Linux involves loading the kernel, which initializes the hardware, and then mounting the root filesystem, which has all the user applications. Usually the kernel images are kept in the first partition, and this partition is mounted on the /boot directory.

Going through the Pi documentation, it looks like Pi boards recognize the SD card as the only boot device. So the trick to running Linux on a Pi from a USB stick involves installing the kernel images on the SD card, keeping the root filesystem on the USB stick, and providing the location of the root filesystem to the kernel on the boot command line.

If we look at the space usage, a typical kernel image is only around 10MB in size. With all the data in the /boot directory it is still within 30MB, while the root filesystem can be much bigger depending on the user applications and data.

How Boot loading on Pi works

The boot process on the Pi expects the SD card to have a FAT32-based first partition. To boot Linux, the kernel image must be present on this partition. For the Pi 0 and Pi 1 models the default kernel image file name is kernel.img, and for the Pi 2 and Pi 3 models the default kernel image file is called kernel7.img. The boot loader will look for the correct kernel image file for your model of Pi.

In addition to the kernel image, there are two configuration files that are interesting for understanding the boot process.

  • cmdline.txt
  • config.txt

The first file, cmdline.txt, holds the command line parameters passed to the kernel at boot. This file is the closest equivalent to the grub/syslinux command line.

Following is an example of content of cmdline.txt

dwc_otg.lpm_enable=0 console=ttyAMA0,115200 kgdboc=ttyAMA0,115200 console=tty1 elevator=deadline 
root=/dev/mmcblk0p2 rootfstype=ext4 fsck.repair=yes rootwait

The second file is the configuration file config.txt, which is the equivalent of bios settings for the Raspberry Pi SoC. Here is an example file content

gpu_mem=128 
disable_overscan=1

The documentation for all the options is available at https://www.raspberrypi.org/documentation/configuration/config-txt.md

Simple non-destructive way to tryout multiple OS on Pi

Now that we have a broad understanding of the boot process and the config files, let's look at how we can use this to try different OS distributions for the Pi 3 using USB sticks.

If you look at the documentation for options supported by config.txt, you will notice that the kernel file name is configurable using the parameter “kernel”

So if we combine the USB booting mechanics of Pi we discussed in the previous section with the configurable kernel filename in config.txt we have a method to have multiple OS on different USB sticks.

And this is how to make it work…

Slide1.jpg

  • Use different USB sticks to hold the root filesystem for each OS distro. Normally all the Pi distros have a two-partition layout, with the first partition being the FAT32-based boot partition and the second usually an ext4-based root filesystem. So if you just dd the OS image onto the USB stick, your root filesystem should be the second partition. Assuming this is the only USB stick attached to the Pi, the root partition should be recognized as /dev/sda2.
  • Format your SD card for a sufficiently big FAT32 based first partition
  • Store the kernel images for all the OS distros on this partition with different filenames. You can get the kernel images from the first partition of each of the USB sticks. If you attach both the SD card and the USB stick to a windows machine you should be able to just copy-paste the kernel images to the FAT32 partition on the SD card.
  • Update the cmdline.txt file to point to the root filesystem partition e.g. root=/dev/sda2
  • Finally update config.txt to point to the kernel filename of the distro you want to boot (see the example below).
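
As a concrete (hypothetical) example, with a Raspbian root filesystem on the attached USB stick and its kernel copied to the SD card as kernel7-raspbian.img, the two files would look roughly like this:

config.txt (the kernel filename is whatever you chose while copying it to the SD card):

kernel=kernel7-raspbian.img

cmdline.txt (root filesystem on the USB stick's second partition):

dwc_otg.lpm_enable=0 console=tty1 root=/dev/sda2 rootfstype=ext4 fsck.repair=yes rootwait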

For this scheme to work you need to match the kernel filename for the distro configured in config.txt to the correct USB stick that you have attached to the Pi board.

NOTE: if you make a mistake, just attach your SD card to another machine and edit config.txt to fix it.

The above process can obviously be scripted to make it more user-friendly, but my purpose for the exercise was to get an understanding of the boot process on the Pi and have some fun 🙂

Test-Driving OSPF on RouterOS – Interoperability

So I wrote about OSPF on RouterOS in my previous post. It was a nice experiment to learn about routing protocols.

I wanted to take it a little further and test the interoperability of RouterOS with other open source solutions.

This post is an update to the previous one, and I will add OSPF neighbor nodes to the setup. I decided to use Quagga, the most talked-about open-source routing protocol suite, and XORP, the eXtensible Open Router Platform.

Updated Setup

The following is the updated setup for the interoperability test. I have added two new Ubuntu nodes as OSPF neighbors.

  • Quagga on Ubuntu
  • XORP on Ubuntu

Slide3.jpg

Configuration

Quagga

The following configuration was added to the Quagga node

Screenshot from 2016-03-27 12:33:55.png
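
Since the configuration is only visible in the screenshot above, here is roughly what a minimal ospfd.conf for the Quagga node looks like; the router-id and the advertised subnets below are placeholders, not the values from my setup:

!
router ospf
 ospf router-id 2.2.2.2
 network 10.10.12.0/24 area 0.0.0.0
 network 172.16.2.0/24 area 0.0.0.0
!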

XORP

The XORP node did not advertise any new subnet but received OSPF updates.

XORP_Conf.png

Results

  • All the nodes could discover their neighbors

Screenshot from 2016-03-27 00:03:27.png

  • All nodes got route updates.

Screenshot from 2016-03-27 01:54:34.png

  • OSPF Traces

Screenshot from 2016-03-27 01:57:34.png