Monitoring Linux networking state using netlink
Oleg Kutkov
February 14, 2018
At one point in my work, I needed to monitor every change in the Linux networking subsystem: IP addresses being added or deleted, routes changing, and so on.
Probably the best way to do this is Netlink, a socket-based kernel interface. Using Netlink, we can “subscribe” to network-related notifications from the kernel. It’s also possible to send commands to the network stack and change the routing table, interface configuration, and packet filtering. For example, the popular “iproute2” utilities use Netlink to do their job.
The easiest way to access Netlink sockets from userspace is the libnetlink library, which provides many macros, defines, and functions.
The worst part of this library, and of Netlink as a whole, is the lack of good examples.
In this case, a good solution is to read the iproute2 source code to discover the things you are interested in. This article may also serve as a good starting point.
Introduction to Netlink
Netlink is a socket-based Linux kernel interface used for inter-process communication (IPC) both between the kernel and userspace processes and between different userspace processes, in a way similar to Unix domain sockets.
Like Unix domain sockets, and unlike INET sockets, Netlink communication cannot traverse host boundaries.
However, while Unix domain sockets use the file system namespace, Netlink endpoints are addressed by process identifiers (PIDs).
Communication with Netlink uses a separate socket family, AF_NETLINK.
Every Netlink message contains a header, represented by the nlmsghdr structure. The header may be followed by a payload: a special structure or raw data.
Netlink can split big messages into multiple parts. In such a case, every “partial” message has the NLM_F_MULTI flag set, and the sequence is terminated by a message of type NLMSG_DONE.
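As a minimal sketch of how a receiver might walk such a sequence, the function below steps through a buffer with the NLMSG_OK and NLMSG_NEXT macros and stops at the NLMSG_DONE message. The name count_multipart_parts is hypothetical and exists only for this illustration.

```c
#include <linux/netlink.h>

/* Hypothetical helper: counts the data parts of a multipart sequence stored
 * in buf. Returns the number of parts seen before NLMSG_DONE, or -1 if the
 * buffer ends before the terminating NLMSG_DONE message. */
int count_multipart_parts(void *buf, int len)
{
    int parts = 0;
    struct nlmsghdr *h;

    /* NLMSG_NEXT decrements len, so NLMSG_OK always sees the bytes left */
    for (h = (struct nlmsghdr *)buf; NLMSG_OK(h, len); h = NLMSG_NEXT(h, len)) {
        if (h->nlmsg_type == NLMSG_DONE)
            return parts;           /* end of the sequence */
        if (h->nlmsg_flags & NLM_F_MULTI)
            parts++;                /* one more part of the sequence */
    }
    return -1;                      /* NLMSG_DONE has not arrived yet */
}
```

A return value of -1 would mean that the caller has to read more data from the socket before the sequence is complete.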
There are a lot of useful macros that can help us parse Netlink messages.
Everything is defined in the netlink.h and rtnetlink.h header files.
Creating a Netlink socket is pretty standard:
socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE)
where:
AF_NETLINK - the Netlink domain
SOCK_RAW - a raw socket
NETLINK_ROUTE - the required protocol
In particular, NETLINK_ROUTE is used for routing and link information.
All available protocols can be found in the netlink(7) manual page. Here is a list of the most interesting ones:
NETLINK_ROUTE - routing and link information, monitoring and configuration routines
NETLINK_FIREWALL - transfers packets to userspace from the firewall (obsolete, removed in Linux 3.5)
NETLINK_INET_DIAG - information about sockets of various protocol families (called NETLINK_SOCK_DIAG in newer kernels)
NETLINK_NFLOG - Netfilter/iptables ULOG
NETLINK_SELINUX - SELinux event notifications
NETLINK_NETFILTER - communication with the Netfilter subsystem
NETLINK_KOBJECT_UEVENT - kernel messages (uevents) delivered to userspace
NETLINK_USERSOCK - reserved for user-defined protocols
Communication
All communication through a Netlink socket is done with two well-known structures: msghdr and iovec.
struct iovec {
    void *iov_base;             // data buffer
    __kernel_size_t iov_len;    // size of the data
};
This structure contains a pointer to the actual message buffer and the size of that buffer.
struct msghdr {
    void *msg_name;                  // client addr (socket name)
    int msg_namelen;                 // length of the client addr
    struct iovec *msg_iov;           // pointer to the iovec structure with message data
    __kernel_size_t msg_iovlen;      // count of the data blocks
    void *msg_control;               // points to a buffer for other protocol control-related messages or miscellaneous ancillary data
    __kernel_size_t msg_controllen;  // length of the msg_control
    unsigned msg_flags;              // flags on received message
};
struct msghdr can be passed directly to recvmsg and sendmsg, which keeps the number of directly supplied arguments small.
This structure is defined in <sys/socket.h>.
See the recvmsg and sendmsg manual pages for details.
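To make the relationship between the two structures concrete, here is a small sketch of a receive helper. It is not taken from libnetlink; the name recv_datagram is hypothetical. It works on any datagram-style socket, Netlink sockets included, and reports truncation through the MSG_TRUNC bit that recvmsg leaves in msg_flags.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical helper: receives one datagram into buf through msghdr + iovec.
 * Sets *truncated to 1 when the datagram did not fit into buf. */
ssize_t recv_datagram(int fd, void *buf, size_t len, int *truncated)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));   /* no sender address, no control data */
    msg.msg_iov = &iov;             /* one data block ...                  */
    msg.msg_iovlen = 1;             /* ... described by a single iovec     */

    ssize_t n = recvmsg(fd, &msg, 0);
    if (n >= 0 && truncated)
        *truncated = (msg.msg_flags & MSG_TRUNC) != 0;
    return n;
}
```

Zeroing the whole msghdr first matters: an uninitialized msg_control or msg_controllen would make recvmsg write ancillary data to a random address.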
A Netlink message stored in iovec typically contains a Netlink message header (struct nlmsghdr) and the payload attached. The payload can consist of arbitrary data but usually contains a fixed size protocol-specific header followed by a stream of attributes.
struct nlmsghdr {
    __u32 nlmsg_len;    // message size, including this header
    __u16 nlmsg_type;   // message type (see below)
    __u16 nlmsg_flags;  // message flags (see below)
    __u32 nlmsg_seq;    // sequence number
    __u32 nlmsg_pid;    // sender identifier (typically the process id)
};
The following standard message types are defined:
NLMSG_NOOP – No operation, the message must be discarded
NLMSG_ERROR – Error message or ACK; the payload carries an error code, and a code of zero is an acknowledgement
NLMSG_DONE – End of a multipart sequence
NLMSG_OVERRUN – Overrun notification (Error)
Every Netlink protocol is free to define its own message types. Note that message type values below NLMSG_MIN_TYPE (0x10) are reserved and may not be used.
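For the NLMSG_ERROR type from the list above, the payload is a struct nlmsgerr from linux/netlink.h. The following sketch shows one way to extract the code from it; the helper name nlmsg_get_error is hypothetical.

```c
#include <linux/netlink.h>

/* Hypothetical helper: returns 1 and stores the code in *code if h is an
 * NLMSG_ERROR message, 0 otherwise. A code of 0 is an ACK, a negative code
 * is a negated errno value. */
int nlmsg_get_error(struct nlmsghdr *h, int *code)
{
    if (h->nlmsg_type != NLMSG_ERROR)
        return 0;
    if (h->nlmsg_len < NLMSG_LENGTH(sizeof(struct nlmsgerr)))
        return 0;                   /* too short to carry struct nlmsgerr */

    struct nlmsgerr *err = (struct nlmsgerr *)NLMSG_DATA(h);
    *code = err->error;
    return 1;
}
```

A caller would typically pass -code to strerror() to print a readable reason when the code is negative.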
The following standard flags are defined:
NLM_F_REQUEST — Request message
NLM_F_MULTI — Part of the multipart message
NLM_F_ACK — Acknowledge requested
NLM_F_ECHO — Request to echo this request; typical direction is from kernel to user
NLM_F_ROOT — Return based on the root of the tree
NLM_F_MATCH — Return all matching entries
NLM_F_ATOMIC — Is obsolete now, used to request an atomic operation
NLM_F_DUMP — Same as NLM_F_ROOT|NLM_F_MATCH
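As an illustration of how these flags are combined, here is a sketch that fills in a request asking the kernel to dump all network links. The monitoring example in this article only listens and never sends such a request, so treat the structure link_dump_request and the function init_link_dump_request as hypothetical names used for demonstration.

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/rtnetlink.h>

/* Hypothetical request layout: the Netlink header followed by the
 * rtnetlink link header. */
struct link_dump_request {
    struct nlmsghdr hdr;
    struct ifinfomsg ifi;
};

/* Prepares an RTM_GETLINK dump request with the given sequence number. */
void init_link_dump_request(struct link_dump_request *req, unsigned int seq)
{
    memset(req, 0, sizeof(*req));
    req->hdr.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    req->hdr.nlmsg_type = RTM_GETLINK;
    req->hdr.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;  /* request + full dump */
    req->hdr.nlmsg_seq = seq;                           /* echoed in the replies */
    req->ifi.ifi_family = AF_UNSPEC;                    /* all address families */
}
```

The prepared structure could then be sent with send() or sendmsg() on a NETLINK_ROUTE socket; the replies arrive as a multipart sequence.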
Clients, in both user and kernel space, are identified with the sockaddr_nl structure.
struct sockaddr_nl {
    sa_family_t nl_family;      // always AF_NETLINK
    unsigned short nl_pad;      // typically filled with zeros
    pid_t nl_pid;               // client identifier (process id)
    __u32 nl_groups;            // mask for senders/receivers group
};
nl_pid is the unique socket identifier. For kernel sockets this value is always zero. In userspace, the current process ID is typically used. This may cause problems in multithreaded applications if several threads try to create and use Netlink sockets.
To work around this, we can initialize every nl_pid with this construction:
pthread_self() << 16 | getpid()
nl_groups is a bitmask of Netlink groups. It is used when calling bind() on the Netlink socket to “subscribe” to the events of the specified groups.
This is what we are going to use for our current task, network monitoring.
The definitions of all the routing groups can be found in the rtnetlink.h header file.
Here are some of them that we can use in the current situation:
RTMGRP_LINK — notifications about changes in network interface (up/down/added/removed)
RTMGRP_IPV4_IFADDR — notifications about changes in IPv4 addresses (address was added or removed)
RTMGRP_IPV6_IFADDR — same for IPv6
RTMGRP_IPV4_ROUTE — notifications about changes in IPv4 routing table
RTMGRP_IPV6_ROUTE — same for IPv6
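Putting the address structure and the groups together, a subscription could be sketched like this. The function name open_rtnetlink_monitor is hypothetical. Note one deliberate difference from the description above: instead of computing nl_pid by hand, this sketch leaves it at zero, in which case the kernel assigns a unique identifier during bind() (as documented in netlink(7)).

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: opens a NETLINK_ROUTE socket and subscribes it to the
 * given RTMGRP_* mask. Returns the descriptor, or -1 with errno set. */
int open_rtnetlink_monitor(unsigned int groups)
{
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0)
        return -1;

    struct sockaddr_nl addr;
    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_pid = 0;                /* assumption: let the kernel pick the ID */
    addr.nl_groups = groups;        /* multicast groups to listen to */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        int saved = errno;          /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

For example, open_rtnetlink_monitor(RTMGRP_LINK | RTMGRP_IPV4_IFADDR) would return a socket that receives link and IPv4 address notifications.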
Netlink message payload
As I already said, after the header we can find a payload, which may be split into parts. The Netlink headers contain several macros that are extremely helpful for accessing and checking the message payload.
The most useful ones:
NLMSG_DATA - returns a pointer to the message payload
NLMSG_PAYLOAD - returns the length of the message payload
NLMSG_ALIGN - rounds a message length up to the nearest aligned value
NLMSG_LENGTH - takes a payload length and returns the value to store in the nlmsg_len field (payload plus header)
NLMSG_SPACE - returns the number of bytes that a message with the given payload length occupies, including alignment padding
NLMSG_NEXT - returns the next message in the buffer, for example the next part of a multipart message. This macro does not return NULL at the end, so it’s important to check for the NLMSG_DONE message type and the remaining length to avoid buffer overruns.
NLMSG_OK - returns true if the message is not truncated and is safe to parse
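The size-related macros are easier to understand with concrete numbers. The sketch below writes one message with a 5-byte payload into a buffer; the helper name nlmsg_put is hypothetical. With the 16-byte header and 4-byte alignment, NLMSG_LENGTH(5) is 21 and NLMSG_SPACE(5) is 24.

```c
#include <string.h>
#include <linux/netlink.h>

/* Hypothetical helper: writes a Netlink message with the given type and
 * payload into buf. Returns the aligned number of bytes the message
 * occupies, or 0 if buf is too small. */
unsigned int nlmsg_put(void *buf, unsigned int buflen, unsigned short type,
                       const void *payload, unsigned int paylen)
{
    unsigned int space = NLMSG_SPACE(paylen);   /* header + payload + padding */
    if (space > buflen)
        return 0;

    memset(buf, 0, space);                      /* also clears the padding */
    struct nlmsghdr *h = (struct nlmsghdr *)buf;
    h->nlmsg_len = NLMSG_LENGTH(paylen);        /* header + payload, no padding */
    h->nlmsg_type = type;
    memcpy(NLMSG_DATA(h), payload, paylen);     /* payload starts after the header */
    return space;
}
```

The returned value is the offset at which the next message in the same buffer would start, which is exactly the step that NLMSG_NEXT takes on the receiving side.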
Practical usage of Netlink
Okay, I think that’s enough boring theory 🙂
Time to write some code and test the application.
Here is the full source code:
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <net/if.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <linux/rtnetlink.h>

// little helper for parsing message attributes using the netlink macros
void parseRtattr(struct rtattr *tb[], int max, struct rtattr *rta, int len)
{
    memset(tb, 0, sizeof(struct rtattr *) * (max + 1));

    while (RTA_OK(rta, len)) {  // while not end of the attributes
        if (rta->rta_type <= max) {
            tb[rta->rta_type] = rta; // remember attr by its type
        }
        rta = RTA_NEXT(rta, len);    // get next attr
    }
}

int main(void)
{
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);   // create netlink socket

    if (fd < 0) {
        printf("Failed to create netlink socket: %s\n", strerror(errno));
        return 1;
    }

    struct sockaddr_nl local;   // local addr struct
    char buf[8192];             // message buffer
    struct iovec iov;           // message structure
    iov.iov_base = buf;         // set message buffer as io
    iov.iov_len = sizeof(buf);  // set size

    memset(&local, 0, sizeof(local));
    local.nl_family = AF_NETLINK;   // set protocol family
    local.nl_groups = RTMGRP_LINK | RTMGRP_IPV4_IFADDR | RTMGRP_IPV4_ROUTE; // set groups we are interested in
    local.nl_pid = getpid();        // set our id using current process id

    // initialize protocol message header
    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));       // no control data, no stale flags
    msg.msg_name = &local;              // local address
    msg.msg_namelen = sizeof(local);    // address size
    msg.msg_iov = &iov;                 // io vector
    msg.msg_iovlen = 1;                 // io size

    if (bind(fd, (struct sockaddr*)&local, sizeof(local)) < 0) {    // bind socket
        printf("Failed to bind netlink socket: %s\n", strerror(errno));
        close(fd);
        return 1;
    }

    // read and parse all messages from the socket
    while (1) {
        msg.msg_namelen = sizeof(local);    // recvmsg overwrites this field
        ssize_t status = recvmsg(fd, &msg, MSG_DONTWAIT);

        // check status
        if (status < 0) {
            if (errno == EINTR || errno == EAGAIN) {
                usleep(250000);
                continue;
            }
            printf("Failed to read netlink: %s\n", strerror(errno));
            continue;
        }

        if (msg.msg_namelen != sizeof(local)) { // check address length, just in case
            printf("Invalid length of the sender address struct\n");
            continue;
        }

        // message parser
        struct nlmsghdr *h;
        for (h = (struct nlmsghdr*)buf; status >= (ssize_t)sizeof(*h); ) {  // read all message headers
            int len = h->nlmsg_len;
            int l = len - sizeof(*h);

            if ((l < 0) || (len > status)) {
                printf("Invalid message length: %i\n", len);
                break;  // the rest of the buffer cannot be trusted
            }

            // now we can check the message type; each type has its own payload structure
            switch (h->nlmsg_type) {
            case RTM_NEWROUTE:
            case RTM_DELROUTE:  // some changes in the routing table
                printf("Routing table was changed\n");
                break;

            case RTM_NEWLINK:
            case RTM_DELLINK: {
                struct ifinfomsg *ifi = (struct ifinfomsg*)NLMSG_DATA(h);   // network interface info
                struct rtattr *tb[IFLA_MAX + 1];
                const char *ifName = "unknown";

                parseRtattr(tb, IFLA_MAX, IFLA_RTA(ifi), IFLA_PAYLOAD(h));  // attributes only, without headers

                if (tb[IFLA_IFNAME]) {  // validation
                    ifName = (const char*)RTA_DATA(tb[IFLA_IFNAME]);    // get network interface name
                }

                if (h->nlmsg_type == RTM_DELLINK) {
                    printf("Network interface %s was removed\n", ifName);
                } else {
                    const char *ifUpp = (ifi->ifi_flags & IFF_UP) ? "UP" : "DOWN";
                    const char *ifRunn = (ifi->ifi_flags & IFF_RUNNING) ? "RUNNING" : "NOT RUNNING";
                    printf("New network interface %s, state: %s %s\n", ifName, ifUpp, ifRunn);
                }
                break;
            }

            case RTM_NEWADDR:
            case RTM_DELADDR: {
                struct ifaddrmsg *ifa = (struct ifaddrmsg*)NLMSG_DATA(h);   // network address info
                struct rtattr *tba[IFA_MAX + 1];
                char ifAddress[256] = "unknown";    // network addr
                char nameBuf[IF_NAMESIZE];
                const char *ifName = "unknown";

                parseRtattr(tba, IFA_MAX, IFA_RTA(ifa), IFA_PAYLOAD(h));

                if (tba[IFA_LABEL]) {   // interface name attached to the address
                    ifName = (const char*)RTA_DATA(tba[IFA_LABEL]);
                } else if (if_indextoname(ifa->ifa_index, nameBuf)) {
                    ifName = nameBuf;   // fall back to the interface index
                }

                if (tba[IFA_LOCAL]) {
                    inet_ntop(AF_INET, RTA_DATA(tba[IFA_LOCAL]), ifAddress, sizeof(ifAddress)); // get IP addr
                }

                if (h->nlmsg_type == RTM_DELADDR) {
                    printf("Interface %s: address was removed\n", ifName);
                } else {
                    printf("Interface %s: new address was assigned: %s\n", ifName, ifAddress);
                }
                break;
            }
            }

            status -= NLMSG_ALIGN(len); // align offsets by the message length, this is important

            h = (struct nlmsghdr*)((char*)h + NLMSG_ALIGN(len));    // get next message
        }

        usleep(250000); // sleep for a while
    }

    close(fd);  // close socket

    return 0;
}
Compilation is straightforward and needs nothing extra:
gcc netmon.c -o netmon
And run:
./netmon
Now you can play with your network interfaces: unplug and plug back the Ethernet cable, reconnect WiFi, and so on.
The program prints a line for every link, address, and route event it receives.
It’s alive! 🙂
Data processing
In this example, you can find some new structures:
struct ifinfomsg {
    unsigned char ifi_family;   // interface family
    unsigned short ifi_type;    // device type
    int ifi_index;              // interface index
    unsigned int ifi_flags;     // device flags
    unsigned int ifi_change;    // reserved, currently always 0xFFFFFFFF
};
struct ifinfomsg represents a network device and contains some useful fields, like device flags and index.
struct ifaddrmsg {
    unsigned char ifa_family;       // Address type (AF_INET or AF_INET6)
    unsigned char ifa_prefixlen;    // Length of the network mask
    unsigned char ifa_flags;        // Address flags
    unsigned char ifa_scope;        // Address scope
    int ifa_index;                  // Interface index, same as in struct ifinfomsg
};
struct ifaddrmsg represents a network address assigned to the device.
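The ifa_prefixlen field holds the mask as a bit count, for example 24 rather than 255.255.255.0. If the dotted form is needed for IPv4, a conversion could be sketched as follows; prefix_to_netmask is a hypothetical helper and not part of any Netlink header.

```c
#include <stdint.h>
#include <arpa/inet.h>

/* Hypothetical helper: converts an IPv4 prefix length (0..32) into a dotted
 * netmask string. Returns out on success or a null pointer on failure. */
const char *prefix_to_netmask(unsigned char prefixlen, char *out, socklen_t outlen)
{
    if (prefixlen > 32)
        return 0;

    /* a shift by 32 is undefined, so the zero prefix is handled separately */
    uint32_t mask = prefixlen ? (0xFFFFFFFFu << (32 - prefixlen)) : 0;
    struct in_addr addr = { .s_addr = htonl(mask) };    /* network byte order */

    return inet_ntop(AF_INET, &addr, out, outlen);
}
```

With a buffer of INET_ADDRSTRLEN bytes, a prefix length of 24 yields the familiar 255.255.255.0.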
struct rtattr {
    unsigned short rta_len;     // Length of the option
    unsigned short rta_type;    // Type of the option
    /* data */
};
struct rtattr is a helper structure used to store some parameters of the address or network link.
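Attributes follow each other in memory, each one padded to a 4-byte boundary. The sketch below shows both directions with the RTA_* macros from rtnetlink.h: writing an attribute into a buffer and searching for one by type. The helper names rta_append and rta_find are hypothetical.

```c
#include <string.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: appends one attribute at offset *used of buf.
 * Returns 0 on success or -1 if the buffer is too small. */
int rta_append(void *buf, int buflen, int *used, unsigned short type,
               const void *data, unsigned short datalen)
{
    int need = RTA_SPACE(datalen);          /* header + data + padding */
    if (*used + need > buflen)
        return -1;

    struct rtattr *rta = (struct rtattr *)((char *)buf + *used);
    memset(rta, 0, need);
    rta->rta_type = type;
    rta->rta_len = RTA_LENGTH(datalen);     /* header + data, no padding */
    memcpy(RTA_DATA(rta), data, datalen);
    *used += need;
    return 0;
}

/* Hypothetical helper: returns the first attribute of the given type within
 * len bytes of attributes, or NULL if there is none. */
struct rtattr *rta_find(struct rtattr *rta, int len, unsigned short type)
{
    for (; RTA_OK(rta, len); rta = RTA_NEXT(rta, len)) {
        if (rta->rta_type == type)
            return rta;
    }
    return NULL;
}
```

The len argument of rta_find must cover the attributes only, not the Netlink header or the ifinfomsg/ifaddrmsg structure in front of them.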
After the Netlink socket is created successfully, we initialize the sockaddr_nl structure by setting a mask of the groups whose messages we want to receive:
RTMGRP_LINK, RTMGRP_IPV4_IFADDR and RTMGRP_IPV4_ROUTE.
At this point, we also allocate the message structure and a data buffer of 8192 bytes.
After all of this, we can call bind() on the socket, subscribing to the group events.
We receive new messages from the socket in an infinite loop and then parse each message using the Netlink macros.
By checking the nlmsg_type field, we can detect the type of the received message. For an interface or address event, we dig deeper and extract all the interesting data.
All the information is stored as an array of attributes of type struct rtattr.
Using the little helper function parseRtattr, we can parse all the attributes and extract readable information from this array.
struct ifinfomsg *ifi = (struct ifinfomsg*) NLMSG_DATA(h); // where h is netlink message header
parseRtattr(tb, IFLA_MAX, IFLA_RTA(ifi), IFLA_PAYLOAD(h)); // pass the length of the attributes only, not of the whole message
char* ifName = (char*)RTA_DATA(tb[IFLA_IFNAME]); // readable interface name, eth0 for example
You can check the rtnetlink manual page for more information about rtattr arrays and the possible attribute indexes.
I believe the rest of the code in this example is pretty obvious and doesn’t require detailed explanation.
But if you have questions, please ask in the comments.
I hope this article will be helpful.
Additional materials:
https://tools.ietf.org/html/rfc3549
http://man7.org/linux/man-pages/man7/netlink.7.html
http://man7.org/linux/man-pages/man7/rtnetlink.7.html
http://linuxjournal.com/article/7356
Thanks for the article!
There are a few weird undocumented macros taken directly from the Linux kernel. How did you figure out what they are supposed to do? It seems like the only way to work with netlink without libs like libnl is to debug and pull things from iproute2…
Hello! Yep, I spent a lot of time trying to figure out how it is supposed to work.
I dug into the kernel and iproute2 sources, debugging and experimenting.
Hello Oleg,
I would be grateful if you checked this SO question
https://stackoverflow.com/questions/55614270/how-to-asynchronously-check-if-an-ipv6-netwrok-interface-changes-state-from-tent
and if possible provided answers/ideas, etc.
Thanks in advance!!!
NLMSG_DATA(h) is first cast to ifinfomsg and then to ifaddrmsg in your code. Can you explain how it works? I thought we have ifinfomsg in the case of RTM_NEWLINK/RTM_DELLINK and ifaddrmsg in the case of RTM_NEWADDR/RTM_DELADDR.
Hello Oleg,
Another question, dear Oleg :). In parseRtattr(), “h->nlmsg_len” is passed as “len”, but that is the size of the whole netlink message (nlmsghdr + ifinfomsg/ifaddrmsg + rtattrs). This len is then checked in the RTA_OK macro and updated in the RTA_NEXT macro. I think it should be the size of the rtattrs only, so that RTA_OK is valid.
The RTM_DELLINK event is not triggered in any case. Do you know why? I thought it would be triggered when I unplug my cable, but it is not.
Hi Oleg,
Thanks for such an informative blog post on netlink.
Is there a specific post on how to access nested attributes such as IFLA_LINKINFO?