Fosdem 2016, part 3: enter the netflow
In the last part we used NBAR2 to classify traffic. To do this the router needs to inspect every traffic flow it sees.
Of course, it would be interesting to get this information out of the router and into some logs. This is the ‘how we did this’ post and it is a bit technical. I will discuss the results in a future post.
To export the information the router collects we use netflow. In the past netflow was pretty limited in what you could export, but with Flexible Netflow (FnF) one can select which fields to export. These fields are not limited to the normal TCP/IP tuples: extra information attached by NBAR or AVC can be exported as well. From the performance monitors, for example, you can select the jitter statistics.
There are precious few examples available on how to set this up, which is why I’m going into some detail here.
First a warning, however: netflow exporting using FnF is a feature mostly used by big Cisco Prime or IWAN setups. Only a few people do this by hand, and it can be quite tricky. One problem you will commonly face is that the CLI accepts configurations which the underlying platform doesn’t support. This will seem to work, but it won’t. To find out whether the configuration worked, one needs to look at what the underlying platform, in this example the ASR 1006, did with the configuration commands, using show flow monitor <name> internal. If you see new entries like FNF errors/Invalid argument then you know you did something the platform doesn’t like. To fix it you will need to deconfigure everything and start over.
You may be wondering, why would we want to export the NBAR2 flow information using netflow?
For Fosdem a vexing question has always been: “how many visitors do we have?”. To answer this question we wanted to capture the HTTP user-agent header field. This looks, for example, like “Dalvik/1.6.0 (Linux; U; Android 4.4.2; HTC One mini 2 Build/KOT49H)”, which in this case tells us the user is running Android version 4.4.2 on an HTC One mini 2. As most people will not have more than one smartphone with them, we can use this to estimate the number of visitors.
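As an illustration of what can be pulled out of such a header, here is a minimal Python sketch. It is not the script used at Fosdem; the regular expression and the function name parse_android_user_agent are my own assumptions and only cover Android-style strings like the Dalvik example above.

```python
import re

# Hypothetical pattern: matches "Android <version>; <model> Build/<build>"
# as seen in the Dalvik example. Other user-agent styles will not match.
ANDROID_UA = re.compile(
    r"Android (?P<version>[\d.]+); (?P<model>[^;)]+?) Build/(?P<build>[^;)\s]+)"
)


def parse_android_user_agent(user_agent):
    """Return (android_version, device_model), or None if the string does not match."""
    match = ANDROID_UA.search(user_agent)
    if match is None:
        return None
    return match.group("version"), match.group("model").strip()


if __name__ == "__main__":
    example = "Dalvik/1.6.0 (Linux; U; Android 4.4.2; HTC One mini 2 Build/KOT49H)"
    print(parse_android_user_agent(example))
```

Anything that does not look like an Android user-agent simply returns None, so it can be counted separately.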
Collecting this information ‘per user’ or ‘per client’ would be great; however, netflow does not do this and only gives the information per flow. One way to identify a client could be the IP of the device, but as the IP changes over time we need to collect the relationship between the IP and the MAC address of the client. The MAC address should not change and is unique per client. So we are going to try to correlate everything to this MAC address.
To collect the MAC address to IP relationship we use SNMP, walking the tree beneath 1.3.6.1.2.1.4.35.1.4 / ipNetToPhysicalPhysAddress from RFC 4293. We save this relationship and use it later to associate an IP at a particular time with a MAC address.
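The ‘IP at a particular time’ lookup can be sketched in Python as follows. This is only an illustration under my own assumptions: the class ArpHistory and its methods are hypothetical names, the SNMP polling itself is left out, and observations are assumed to be recorded in time order.

```python
import bisect
from collections import defaultdict


class ArpHistory:
    """Remembers which MAC address each IP mapped to at each polling time."""

    def __init__(self):
        # Per IP: two parallel lists, sorted by time of observation.
        self._times = defaultdict(list)
        self._macs = defaultdict(list)

    def record(self, timestamp, ip, mac):
        """Store one observation; must be called in time order for a given IP."""
        macs = self._macs[ip]
        if macs and macs[-1] == mac:
            return  # mapping unchanged since the last poll, nothing to store
        self._times[ip].append(timestamp)
        macs.append(mac)

    def mac_at(self, ip, timestamp):
        """Return the MAC last seen for ip at or before timestamp, else None."""
        times = self._times.get(ip)
        if not times:
            return None
        index = bisect.bisect_right(times, timestamp) - 1
        if index < 0:
            return None  # the flow predates our first observation of this IP
        return self._macs[ip][index]


if __name__ == "__main__":
    history = ArpHistory()
    # Documentation addresses and made-up MACs, purely for illustration.
    history.record(100, "192.0.2.10", "02:00:00:00:00:01")
    history.record(300, "192.0.2.10", "02:00:00:00:00:02")
    print(history.mac_at("192.0.2.10", 250))
```

Only changes of the mapping are stored, so repeated polls that see the same MAC for an IP cost no extra memory.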
To get the netflow itself we first need to describe what information we want to collect in a “flow record”:
    flow record FR-IPv4
     match ipv4 source address
     match ipv4 destination address
     match transport source-port
     match transport destination-port
     match ipv4 protocol
     collect counter bytes long
     collect counter packets
     collect timestamp sys-uptime first
     collect timestamp sys-uptime last
     collect interface input
     collect application http user-agent
     collect application name
     collect transport tcp flags

    flow record FR-IPv6
     match ipv6 extension map
     match interface output
     match ipv6 protocol
     match ipv6 source address
     match ipv6 destination address
     match ipv6 traffic-class
     match ipv6 flow-label
     match transport source-port
     match transport destination-port
     collect transport tcp flags
     collect interface input
     collect counter bytes long
     collect counter packets
     collect timestamp sys-uptime first
     collect timestamp sys-uptime last
     collect application name
     collect application http user-agent
In this case we gather the standard tuple, interface information, flow statistics and the HTTP user-agent, which is the real information we are after.
Then we need to define an “exporter” to send this flow data to:
    flow exporter FE-FM-v4
     destination <some IP>
     transport udp 9995
     export-protocol ipfix
     template data timeout 30
     option interface-table timeout 30
     option application-table timeout 30
     option metadata-version-table timeout 30
     option sub-application-table timeout 30
     option application-attributes timeout 30
We also need a similar exporter for IPv6 traffic, but crucially one that sends to a different port number, as one cannot export two different records to the same exporter.
From the content of the exporter one can see that we use the IPFIX export protocol and also send the option tables, which are key to decoding the application and interface IDs.
Then we configure a monitor to collect and send the data:
    flow monitor FE-FM-IPv4
     exporter FE-FM-v4
     cache entries 1000000
     statistics packet protocol
     statistics packet size
     record FR-IPv4
Again we have a similar configuration for IPv6 traffic. The cache holds the table of current flows, and its size matters because the monitor we are defining by default only sends the information when a flow is considered ‘done’: after a short timeout following a FIN packet, or after a larger timeout if there is simply no more activity for that flow.
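To make the ageing idea concrete, here is a simplified Python model of when a cached flow would be exported. It is my own sketch and not how the QFP implements it: it only models the inactive and active timeouts, using the 15 and 1800 second values this monitor reports in the show output further down, and it ignores any special handling of FIN packets. The names CachedFlow and expiry_reason are hypothetical.

```python
from dataclasses import dataclass

INACTIVE_TIMEOUT = 15   # seconds without packets before a flow is exported
ACTIVE_TIMEOUT = 1800   # seconds after which a still-busy flow is exported anyway


@dataclass
class CachedFlow:
    first_seen: float  # time of the first packet of the flow
    last_seen: float   # time of the most recent packet of the flow


def expiry_reason(flow, now):
    """Return 'inactive', 'active', or None if the flow stays in the cache."""
    if now - flow.last_seen >= INACTIVE_TIMEOUT:
        return "inactive"
    if now - flow.first_seen >= ACTIVE_TIMEOUT:
        return "active"
    return None


if __name__ == "__main__":
    dns_lookup = CachedFlow(first_seen=0, last_seen=1)
    print(expiry_reason(dns_lookup, now=20))
```

With timeouts like these, short flows such as DNS lookups leave the cache quickly, while a long download is reported in slices of at most half an hour.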
Finally we attach this monitor to the internal interfaces with:
    interface ...
     ip flow monitor FE-FM-IPv4 unicast input
     ip flow monitor FE-FM-IPv4 unicast output
     ipv6 flow monitor FE-FM-IPv6 unicast input
     ipv6 flow monitor FE-FM-IPv6 unicast output
We don’t monitor the outside interface as too much scanning traffic arrives there.
Here are some commands to verify that everything is working:
    asr1k#show flow exporter FE-FM-v4 statistics
    Flow Exporter FE-FM-v4:
      Packet send statistics (last cleared 1w2d ago):
        Successfully sent:         9076680   (11964321569 bytes)
        Reason not given:          1971      (2389176 bytes)
    ...
    asr1k#show flow record FR-IPv6
    flow record FR-IPv6:
      Description:        User defined
      No. of users:       1
      Total field space:  80 bytes
      Fields:
        ...
    asr1k#show flow monitor FE-FM-IPv6
    Flow Monitor FE-FM-IPv6:
      Description:       User defined
      Flow Record:       FR-IPv6
      Flow Exporter:     FE-FM-v6
      Cache:
        Type:              normal (Platform cache)
        Status:            allocated
        Size:              1000000 entries
        Inactive Timeout:  15 secs
        Active Timeout:    1800 secs
        Trans end aging:   off
      Stats:
        protocol distribution
        size distribution
Finally we can look into the current contents of the cache:
    asr1k#show flow monitor FE-FM-IPv4 cache format table
      Cache type:                            Normal (Platform cache)
      Cache size:                            1000000
      Current entries:                       1329
      High Watermark:                        27245
      Flows added:                           34290645
      Flows aged:                            34294826
        - Active timeout   ( 1800 secs)      7907
        - Inactive timeout (   15 secs)      34286919

    IPV4 SRC ADDR    IPV4 DST ADDR    TRNS SRC PORT  TRNS DST PORT  IP PROT  ...
    ===============  ===============  =============  =============  =======  ...
    220.127.116.11   18.104.22.168            47594             53       17  ...
This is the source of the ‘currently we see <N> IPv4 flows’ numbers I reported on Twitter.
Creating this cache and collecting flows consumes memory on the QFP of the router. You should monitor this usage with show platform resources.
This is the data that the exporter sends to its configured destination. Decoding it proved a bit of a struggle and I had to resort to writing a custom decoder in Perl to log the flows. Unfortunately, while the user-agent, IP and time information was recorded correctly, my Perl script had a bug and the packet and byte counters were not recorded properly.
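For readers who want to try decoding themselves, the first step looks roughly like the Python sketch below. It is not my Perl decoder and it stops well short of a full one: it only splits one IPFIX message into its header and sets, following the message layout of RFC 7011 (16-byte header with version 10, then sets with a 4-byte header each). The function names are my own; decoding templates and data records would have to be built on top.

```python
import struct

IPFIX_VERSION = 10
HEADER = struct.Struct("!HHIII")   # version, length, export time, sequence, domain id
SET_HEADER = struct.Struct("!HH")  # set id, set length (including this header)


def split_ipfix_message(datagram):
    """Return (header_dict, [(set_id, payload_bytes), ...]) for one IPFIX message."""
    if len(datagram) < HEADER.size:
        raise ValueError("datagram shorter than an IPFIX header")
    version, length, export_time, sequence, domain_id = HEADER.unpack_from(datagram, 0)
    if version != IPFIX_VERSION:
        raise ValueError(f"not IPFIX: version {version}")
    if length > len(datagram):
        raise ValueError("truncated IPFIX message")

    sets = []
    offset = HEADER.size
    while offset + SET_HEADER.size <= length:
        set_id, set_length = SET_HEADER.unpack_from(datagram, offset)
        if set_length < SET_HEADER.size or offset + set_length > length:
            raise ValueError("malformed set length")
        payload = datagram[offset + SET_HEADER.size:offset + set_length]
        sets.append((set_id, payload))
        offset += set_length

    header = {
        "export_time": export_time,
        "sequence": sequence,
        "observation_domain_id": domain_id,
    }
    return header, sets


def describe_set(set_id):
    """Classify a set id: 2 is a template set, 3 an options template set, 256+ data."""
    if set_id == 2:
        return "template"
    if set_id == 3:
        return "options template"
    if set_id >= 256:
        return "data"
    return "reserved"
```

A data set can only be interpreted once the template with the same id has been received, which is why the exporter resends its templates and option tables periodically.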
In the next part we will learn what these flows teach us about the Fosdem 2016 network….