Retaking control of your network: Part 2

Computer networks are becoming increasingly complex, with more and more devices connected each day. Network visibility is crucial to keep your traffic flowing smoothly and your transit costs low. In this mini-series I will show you how to set up sFlow sampling on Linux, aggregate the data and finally present it in flashy graphs.

Part 2: Collecting and analyzing sFlow

In this episode we will talk about how to collect, process and analyze sFlow using a tool called pmacct. In addition, we are going to store the results in InfluxDB, a time-series database.

What is pmacct?

According to their homepage, pmacct is a

small set of multi-purpose passive network monitoring tools. It can account, classify, aggregate, replicate and export forwarding-plane data, ie. IPv4 and IPv6 traffic; collect and correlate control-plane data via BGP and BMP; collect and correlate RPKI data; collect infrastructure data via Streaming Telemetry.

We are going to feed pmacct with sFlow, generate per-ASN and per-country traffic level statistics and store the resulting data in InfluxDB.

Installing pmacct

Enough with the preface, let’s get to it.

  # git clone https://github.com/pmacct/pmacct.git --branch 1.7.3
  # cd pmacct/
  # ./autogen.sh 
  # ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var --enable-rabbitmq --enable-mysql --enable-ipv6 --enable-l2 --enable-debug --enable-plabel --enable-64bit --enable-threads --enable-jansson --enable-geoipv2 --enable-bgp-bin
  # make
  # make install

We just downloaded, compiled and installed pmacct. The configuration is stored in /etc/pmacct, and we opted to compile with the GeoIP2 module (for MaxMind’s GeoIP database) as well as BGP and IPv6 support.

Configuring pmacct

We can start by creating a config file in /etc/pmacct/sfacctd.conf with the following content:

debug: false
pidfile: /var/run/sfacctd.pid
! remember to configure logrotate if you use logfile
!logfile : /var/log/sfacct.log

! returns warning messages in case of data loss
! look at CONFIG-KEYS for details
! bufferization of data transfers between core process and active plugins (default 4MB)
plugin_pipe_size: 10240000

! The value has to be <= the size defined by 'plugin_pipe_size' and keeping a ratio < 1:1000 between the two
! Once a buffer is filled, it is delivered to the plugin
plugin_buffer_size: 10240

! automatically renormalizes byte/packet counters based on the
! sampling_rate carried in each sFlow sample
sfacctd_renormalize: true

sql_history_since_epoch: true

networks_file: /etc/pmacct/netmap.txt

plugins: print[print]
! check primitives list in CONFIG-KEYS
aggregate[print]: etype,src_host_country,dst_host_country,proto,src_as,dst_as

geoipv2_file: /etc/pmacct/GeoLite2-Country/GeoLite2-Country.mmdb

print_output_file[print]: /tmp/5m_avg.json
print_output[print]: json
print_history[print]: 5m
print_history_roundoff[print]: m
print_refresh_time[print]: 300
print_trigger_exec[print]: /etc/pmacct/pma2influx.sh

sfacctd_as_new: file
sfacctd_as_new[print]: file
sfacctd_net: file
sfacctd_net[print]: file

networks_no_mask_if_zero: false

snaplen: 700

We told pmacct how we want to aggregate the data, that we want to export the aggregates to a file in /tmp in JSON format every 5 minutes, and that a script has to run each time the data is purged.
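
To make the relationship between the two buffer settings concrete, here is a small sketch that recomputes it from the values in the config above. The shell variable names simply mirror the config keys for readability; nothing here is read by pmacct itself.

```shell
#!/usr/bin/env bash
# Values copied from sfacctd.conf above.
plugin_pipe_size=10240000
plugin_buffer_size=10240

# How many buffers fit into the pipe (integer division).
buffers_in_pipe=$(( plugin_pipe_size / plugin_buffer_size ))
echo "buffers per pipe: ${buffers_in_pipe}"

# The buffer must never be larger than the pipe.
if (( plugin_buffer_size <= plugin_pipe_size )); then
  echo "buffer fits into pipe"
fi
```

With these values the pipe holds exactly 1000 buffers, so the config sits right at the 1:1000 boundary mentioned in the comment. If you change one of the two values, it is worth redoing this calculation.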

Adding ASNs to the data

pmacct has no idea which prefixes belong to which ASN, since sFlow does not carry this information. There are two ways to fix this: we can either feed pmacct a full routing table (the DFZ) over BGP, or specify the prefix-to-ASN mapping in a file. In this guide I do the latter, using a file called /etc/pmacct/netmap.txt with one ASN,prefix pair per line.

Since Internet routing and IP assignments change constantly, we need to update this information automatically. For this I created the following script, which needs to be placed in /etc/pmacct/update_prefixes.sh:

#!/bin/bash

echo "Starting prefix update"

cd /tmp
# Clean old data
rm -f oix-full-snapshot-latest.dat.bz2
rm -f oix-full-snapshot-latest.dat

wget http://archive.routeviews.org/oix-route-views/oix-full-snapshot-latest.dat.bz2
echo "Routeviews downloaded"

bzip2 -d oix-full-snapshot-latest.dat.bz2 
echo "Routeviews unpacked"

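# Skip the 5-line header and the default route, drop AS-set origins ({...}), emit ASN,prefix.
# The default route is compared exactly: grepping for "0.0.0.0" would also drop e.g. 100.0.0.0/16.
awk 'FNR > 5 && $2 != "0.0.0.0/0" { print $(NF-1)","$2 }' /tmp/oix-full-snapshot-latest.dat |
grep -v '{' | uniq > /etc/pmacct/netmap.txt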

echo "Prefix update finished"
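
To see what the conversion step produces, here is a minimal sketch that runs the same kind of awk filter on a hand-written sample. The sample only imitates the layout of the RouteViews snapshot (5 header lines, then one route per line) and uses documentation addresses and ASNs. The function name routeviews_to_netmap is made up for this example.

```shell
#!/usr/bin/env bash
# Convert "show ip bgp"-style RouteViews lines on stdin to "ASN,prefix" lines.
# Skips the 5 header lines, the default route and AS-set origins such as {64502,64503}.
routeviews_to_netmap() {
  awk 'FNR > 5 && $2 != "0.0.0.0/0" && $(NF-1) !~ /[{}]/ { print $(NF-1) "," $2 }' | uniq
}

# Hand-written sample, not real routing data.
sample=$(cat <<'EOF'
BGP table version is 0, local router ID is 198.51.100.254
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  0.0.0.0/0        198.51.100.1             0      0      0 64496 i
*  192.0.2.0/24     198.51.100.1             0      0      0 64496 64500 i
*  192.0.2.0/24     198.51.100.2             0      0      0 64497 64500 i
*  100.0.0.0/16     198.51.100.1             0      0      0 64496 64501 i
*  203.0.113.0/24   198.51.100.1             0      0      0 64496 {64502,64503} i
EOF
)

routeviews_to_netmap <<< "$sample"
```

This prints 64500,192.0.2.0/24 and 64501,100.0.0.0/16. The default route and the AS-set origin are dropped, the duplicate 192.0.2.0/24 entry is collapsed by uniq, and 100.0.0.0/16 survives because the default route is matched exactly rather than by pattern.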

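I like to run it automatically every night at 02:00 using the following entry in a system crontab such as /etc/crontab or a file in /etc/cron.d (the root field names the user the job runs as, which only the system-wide crontab format has):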

0 2 * * * root /etc/pmacct/update_prefixes.sh && systemctl restart pmacct

Adding GeoIP to the data

In order to add country information to the data, we are going to use MaxMind’s GeoLite 2 Country database, which can be downloaded from their site. The full path to the database file itself needs to be /etc/pmacct/GeoLite2-Country/GeoLite2-Country.mmdb. I did not bother automating GeoIP updates since I do not need very precise data and MaxMind doesn’t update this database very often anyway (weekly).

Creating a systemd unit file

Since we want pmacct to be controlled through systemd, we need a unit file in /etc/systemd/system/pmacct.service with the following contents:

[Unit]
Description=Pmacct

[Service]
ExecStart=/usr/sbin/sfacctd -f /etc/pmacct/sfacctd.conf

[Install]
WantedBy=multi-user.target
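
The unit file alone does not start anything. Here is a minimal sketch of loading and starting it, assuming the unit was saved as pmacct.service as described above. The function name start_pmacct is made up for this example, and it has to be run as root.

```shell
#!/usr/bin/env bash
# Reload systemd so it picks up the new unit, then enable and start it.
start_pmacct() {
  if ! command -v systemctl >/dev/null 2>&1; then
    echo "systemctl not found; is this a systemd host?" >&2
    return 1
  fi
  systemctl daemon-reload &&
  systemctl enable --now pmacct.service &&
  systemctl --no-pager status pmacct.service
}
```

Calling start_pmacct once after creating the unit file should leave sfacctd running and enabled at boot. The final status call is only there to show whether the daemon came up cleanly.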

Installing InfluxDB

Let’s take a break from pmacct for a while, since we need to install and configure our database. I suggest following the official instructions on InfluxDB’s website.

Storing data in InfluxDB

At this point, pmacct should run and export aggregated data to the file in /tmp we specified. However, we want to do something useful with the data as well.

The following script takes the raw aggregated data, parses it and finally imports it into InfluxDB.

$ cat /etc/pmacct/pma2influx.sh

#!/usr/bin/env bash

DATABASE='sflow'

# Header for influx import
echo -e "# DML \n# CONTEXT-DATABASE: $DATABASE" > /tmp/pma2influx.txt

# All sfacctd primitives are imported into InfluxDB as tags (under the same names);
# only the byte count is stored as a field value.
# The records go into a measurement named "traffic".
# sfacctd byte counts exclude L2 overhead, so we add it back to line up better with SNMP counters:
# (26 bytes without a VLAN tag, 30 bytes with one) * packet count

cat /tmp/5m_avg.json |
cut -d, -f1-7,10- |
sed 's/{//; s/}//; s/\":/\"=/g; s/"//g; s/\ //g; s/=,/=null,/g;' |
sed 's/,packets=/\ /;s/,bytes=/ /'| awk '{print "traffic,"$1,"bytes="$2*26+$3}' >> /tmp/pma2influx.txt

# This is how we import data into influx
influx -import -path=/tmp/pma2influx.txt
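
To make the text processing in the script above easier to follow, here is a sketch that wraps the same cut/sed/awk pipeline in a function and feeds it a single record. The sample record is illustrative: its field names and their order are assumptions about what pmacct writes for this aggregate, so check them against your own /tmp/5m_avg.json. The function name json_to_line_protocol is made up for this example.

```shell
#!/usr/bin/env bash
# Turn pmacct JSON lines on stdin into InfluxDB line protocol for measurement "traffic".
json_to_line_protocol() {
  # Drop comma-separated fields 8 and 9 (the two timestamps in the sample below).
  cut -d, -f1-7,10- |
  # Strip braces, quotes and spaces; turn "key": value into key=value; fill empty values.
  sed 's/{//; s/}//; s/":/"=/g; s/"//g; s/ //g; s/=,/=null,/g' |
  # Split off the counters so awk sees: tags, packets, bytes.
  sed 's/,packets=/ /; s/,bytes=/ /' |
  # Add 26 bytes of L2 overhead per packet to the byte counter.
  awk '{ print "traffic," $1, "bytes=" $2 * 26 + $3 }'
}

# Illustrative record; field names and order are assumed, not taken from real output.
sample='{"event_type": "purge", "etype": "800", "as_src": 64500, "as_dst": 64501, "country_ip_src": "US", "country_ip_dst": "DE", "ip_proto": "tcp", "stamp_inserted": "2019-01-01 12:00:00", "stamp_updated": "2019-01-01 12:05:01", "packets": 10, "bytes": 1500}'

json_to_line_protocol <<< "$sample"
```

For this record the function prints traffic,event_type=purge,etype=800,as_src=64500,as_dst=64501,country_ip_src=US,country_ip_dst=DE,ip_proto=tcp bytes=1760, that is 1500 bytes plus 10 packets times 26 bytes of overhead. Because cut selects fields by position, the pipeline silently breaks if the list of aggregation primitives changes, so adjust the field ranges whenever you edit the aggregate line in sfacctd.conf.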

Configuring InfluxDB

There’s a small change we need to make in InfluxDB’s config before we can continue. We need to set max-values-per-tag = 0 in the [data] section, which removes the cap on distinct values per tag key (100000 by default); otherwise InfluxDB will refuse new points once we store a lot of flows.
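
A quick way to confirm the setting is active rather than commented out is to grep the config file for it. This is a sketch only: the default path /etc/influxdb/influxdb.conf is an assumption (it is where distribution packages of InfluxDB 1.x usually put the file), and the function name show_max_values_per_tag is made up for this example.

```shell
#!/usr/bin/env bash
# Print the active (not commented-out) max-values-per-tag line from an InfluxDB config file.
# The default path is an assumption; pass your own path as the first argument.
show_max_values_per_tag() {
  local conf="${1:-/etc/influxdb/influxdb.conf}"
  grep -E '^[[:space:]]*max-values-per-tag[[:space:]]*=' "$conf"
}

# Demonstration on a throwaway file that mimics the [data] section.
demo_conf=$(mktemp)
printf '[data]\n  # max-values-per-tag = 100000\n  max-values-per-tag = 0\n' > "$demo_conf"
show_max_values_per_tag "$demo_conf"
rm -f "$demo_conf"
```

On the demonstration file this prints only the uncommented max-values-per-tag = 0 line. Remember that InfluxDB has to be restarted before a config change takes effect.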

 

Finally, I would like to thank the author of afenioux.fr, whose post on a very similar topic helped me get started with pmacct.

The next part is going to explain how to visualize our data in Grafana.
