linux | Interpipes

Edited 2023-05-10 to add:
Please check the comments at the bottom for a Debian 'interfaces' example and a netplan example provided by generous visitors.
Netplan surely seems the easiest way to stand up VRFs, which does not particularly come as a surprise to me.
You will still need to configure your services to run inside the VRFs though. I thought it possible that there might be a more elegant way to do this in systemd now, but the existence of this issue against systemd suggests not.
Edited 2025-12-31 to add:
The systemd issue has been marked as closed in the last week by this PR, so it should now be possible to more easily run a service in a VRF from systemd. I don't currently have time to experiment with this, so if you want to comment below on what changes the PR enables and how to use it, that would doubtless be super useful for others.

I found it remarkably difficult to find a simple explanation of this on one page. It is not all that complex to achieve if you understand all of the component parts, but sometimes it is useful to have a complete explanation in one place, so hopefully someone will find this howto useful.

There are a number of reasons to add one or more VRFs (VRF stands for Virtual Routing and Forwarding) to a system. Researching and discussing the *why* of doing this is not in scope for this article; I’m going to assume you know why you’d want to do this.

If you somehow don’t really know what a VRF is beyond suspecting it’s what you want: in essence, each VRF has its own routing table. This allows you to partition a system, in networking terms, to service two or more entirely different networks with their own routing tables (eg: each can have its own default route, and its own routes to what would otherwise be overlapping IP ranges).

NB: It’s important to note that the work you’re doing here can break your existing management access, if you’re already relying on the interface you want to move into the VRF to access the server in the first place. Ensure you can access the server over an interface OTHER than the one you want to move into the VRF – be it over a different NIC or using the local console / IPMI / ILO / DRAC etc.

Example environment

Let’s say you have a Linux box with two interfaces, eth0 and eth1 (even if systemd’s “predictable” naming is more common now).

eth0 carries your production traffic. This has a default gateway to reach the Internet, or whatever production network you have, and its configuration is ultimately irrelevant.

eth1 faces your management network. For demonstration purposes, our IP is 10.0.0.2/24, the default gateway we want to use for management traffic will be 10.0.0.1, and this is the interface you want to be in a separate VRF to completely segment out your management traffic.

All of the commands below are run as root; prepend them with sudo if you prefer to sudo.

How do I create a VRF?

In Linux, a VRF has a name and an associated routing table number. Let’s say we want to create a VRF called Mgmt-VRF using table number 2, and set it “up” to actually enable it. The name and number are up to you; I’ve just chosen 2. The number just must not already be in use, and if you don’t currently have any VRFs then 2 will be fine.

ip link add Mgmt-VRF type vrf table 2
ip link set dev Mgmt-VRF up

Verify your VRF exists

ip vrf show

Which should show you:

Name              Table
-----------------------
Mgmt-VRF             2
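If you would rather check that a table number is unused before picking it, something along these lines may help. This is only a sketch: table_is_free is a hypothetical helper of my own naming (it is not part of iproute2), it reads `ip vrf show` output on stdin, and it also rejects the IDs the kernel reserves (0, 253, 254 and 255). It only knows about VRF tables, so a table used purely by policy routing rules would not be caught.

```shell
# table_is_free N: succeed if table N is neither reserved nor already
# listed in `ip vrf show` output supplied on stdin. Hypothetical helper.
table_is_free() {
    case "$1" in
        0|253|254|255) return 1 ;;   # unspec, default, main, local
    esac
    # Data rows look like "Mgmt-VRF   2". The header and separator rows
    # never match because their second field is not the number we want.
    ! awk -v t="$1" '$2 == t { found = 1 } END { exit !found }'
}

# Example:
#   ip vrf show | table_is_free 2 && echo "table 2 is free"
```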

Add your interface(s) to the new VRF. (This will break your connection if you’re currently using them! Exercise caution!) Here we add eth1 to Mgmt-VRF:

ip link set dev eth1 master Mgmt-VRF
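If you are working over SSH, a quick pre-flight check before running that command can save you a trip to the console. The sketch below assumes your shell has the usual SSH_CONNECTION variable (client IP, client port, server IP, server port) and that `ip -o addr show dev eth1` prints the address in the fourth column; session_uses_iface_addr is a hypothetical helper name.

```shell
# session_uses_iface_addr "$SSH_CONNECTION": succeed if the server-side
# address of the SSH session is one of the addresses listed in
# `ip -o addr show dev IFACE` output supplied on stdin.
session_uses_iface_addr() {
    set -- $1                       # split into: client port server port
    [ -n "${3:-}" ] || return 1     # empty value: not an SSH session
    awk -v ip="$3" '{ split($4, a, "/"); if ((a[1] "") == (ip "")) hit = 1 }
                    END { exit !hit }'
}

# Example:
#   if ip -o addr show dev eth1 | session_uses_iface_addr "${SSH_CONNECTION:-}"; then
#       echo "You are logged in via eth1; moving it will drop this session" >&2
#   fi
```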

You can now add routes to your new VRF. Here we’re adding the default gateway of 10.0.0.1 to the routing table for our new VRF:

ip route add table 2 0.0.0.0/0 via 10.0.0.1

You can then validate that the default route exists in that table:

ip route show table 2

You should see something like:

default via 10.0.0.1 dev eth1
broadcast 10.0.0.0 dev eth1 proto kernel scope link src 10.0.0.2
10.0.0.0/24 dev eth1 proto kernel scope link src 10.0.0.2
local 10.0.0.2 dev eth1 proto kernel scope host src 10.0.0.2
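If you want to check this from a script rather than by eye, one possible approach is to pull the gateway out of that output. default_gw is a hypothetical helper name, and it assumes the "default via GATEWAY ..." line format shown above.

```shell
# default_gw: print the gateway of the first default route found in
# `ip route show table N` output on stdin; fail if there is none.
default_gw() {
    awk '$1 == "default" && $2 == "via" { print $3; ok = 1; exit }
         END { exit !ok }'
}

# Example:
#   gw=$(ip route show table 2 | default_gw) || echo "no default route in table 2" >&2
```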

At this point you could add any more static routes your new VRF might require, and you’re essentially done with configuring the VRF. The interface eth1 now exists in our new VRF.

Okay, how do I use the VRF?

Any tinkering will quickly reveal that your services which were bound to (or accessible over) the IP on eth1 don’t work anymore, at least if they only bind by IP and not by device.

You’ll also notice that when you use ping or traceroute or whatever it’ll run with the default routing table – even if you set the source IP to 10.0.0.2, it won’t work. This is because, like sshd, ping (and bash, and anything else) will run in the context of the default VRF unless you specifically request otherwise. Those processes will use the default routing table and will only have access to listen to IPs that are on interfaces also in that same VRF.

If the processes or services are configured to bind to an interface, however, they will operate in the VRF that the interface belongs to. A good example of a command with native support for binding to interfaces rather than IPs is traceroute:

traceroute -i eth1 8.8.8.8

But if you just want a generic way to execute commands inside a particular VRF, doing so is fairly easy using ip vrf exec. Here is the same traceroute command without the need to specify an interface:

ip vrf exec Mgmt-VRF traceroute 8.8.8.8

If you’re going to be doing a lot of work in a particular VRF, you will probably find it most convenient to start your preferred shell (eg bash) using ip vrf exec as all child processes you start from that shell will also operate from that VRF, then exit the shell once you want to return to the default routing table:

ip vrf exec Mgmt-VRF /bin/bash
# do your work now, eg
traceroute 8.8.8.8
# time to go back to the default routing table
exit
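It is easy to lose track of which shell is inside a VRF. One option, sketched below, is to tag the prompt. This assumes your iproute2 has `ip vrf identify`, which reports the VRF the current process is associated with (and prints nothing in the default VRF); vrf_label is a hypothetical helper name.

```shell
# vrf_label NAME: print "(NAME) " when NAME is non-empty, nothing otherwise.
vrf_label() {
    if [ -n "${1:-}" ]; then
        printf '(%s) ' "$1"
    fi
}

# Possible use in ~/.bashrc:
#   PS1="$(vrf_label "$(ip vrf identify 2>/dev/null)")$PS1"
```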

Great, I can run traceroute. But what about my SERVICES?

For Linux distributions running systemd, shifting services to run inside a VRF is actually relatively straightforward.

systemd calls the processes and services under its purview “units”, and has so-called unit files that describe services and how and when (using dependencies and targets) they should be started.

If you want to run a single instance of a service across all VRFs for some reason this is possible though beyond the scope of this article (look up net.ipv4.tcp_l3mdev_accept and net.ipv4.udp_l3mdev_accept).
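For orientation only, those two settings could be written out as a sysctl.d fragment roughly like this. The helper name and the file name in the comment are hypothetical, and whether you want this behaviour at all is a decision for your environment.

```shell
# l3mdev_accept_conf: print a sysctl.d-style fragment enabling both settings.
l3mdev_accept_conf() {
    printf '%s\n' \
        'net.ipv4.tcp_l3mdev_accept = 1' \
        'net.ipv4.udp_l3mdev_accept = 1'
}

# Example (file name is an assumption):
#   l3mdev_accept_conf > /etc/sysctl.d/90-l3mdev.conf && sysctl --system
```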

Alternatively you might choose to have several copies of the service running, each in a different VRF (make sure they use different sockets/pipes/PID files etc!), which is also beyond the scope of this article. It’s up to you to decide what suits your environment best.

However, if you only want to change your one existing copy of your service to run in a VRF, you just have to specify the new command that systemd executes in a so-called override file.

You should use override files rather than modifying the main unit file because, in general, the distribution-provided package for your service will not ship an override file. Package upgrades therefore shouldn’t collide with your file, which means that your modifications will be preserved. That said, you will have to keep an eye on whether the packaged ExecStart command changes in a breaking way between releases, and update your override to match (check this first if a service you have overridden starts misbehaving after package updates!).

First you need to look in the unit file to get the current command that is executed to start the service:

systemctl cat sshd

You should see something like this (taken from a Debian 10 x64 system):

# /lib/systemd/system/ssh.service
[Unit]
Description=OpenBSD Secure Shell server
Documentation=man:sshd(8) man:sshd_config(5)
After=network.target auditd.service
ConditionPathExists=!/etc/ssh/sshd_not_to_be_run

[Service]
EnvironmentFile=-/etc/default/ssh
ExecStartPre=/usr/sbin/sshd -t
ExecStart=/usr/sbin/sshd -D $SSHD_OPTS
ExecReload=/usr/sbin/sshd -t
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
RestartPreventExitStatus=255
Type=notify
RuntimeDirectory=sshd
RuntimeDirectoryMode=0755

[Install]
WantedBy=multi-user.target
Alias=sshd.service

The key configuration variable here is “ExecStart”. We need to modify ExecStart so that our sshd starts via ip vrf exec. Do so by creating (or opening, if you already have one!) the override file for sshd:

systemctl edit sshd

This will dump you into the default editor – probably nano unless you changed it – with either your existing override file if you have one, or a blank one if you don’t.

Due to the way systemd sanity checks your unit files, you have to deliberately *unset* ExecStart by first setting it to nothing, then specify the new ExecStart which you can see is the default ExecStart entry, but with

/bin/ip vrf exec Mgmt-VRF

prepended to the start. It’s important to specify the full path to the ip binary because, when systemd executes this command, it will more likely than not do so without any PATH variable set, or with a different one from the one your shell environment uses. Being explicit with paths ensures everything works as desired. (This is generally a good habit to get into.)

If you have a blank file, in our example for sshd all you create is the following:

[Service]
ExecStart=
ExecStart=/bin/ip vrf exec Mgmt-VRF /usr/sbin/sshd -D $SSHD_OPTS

If you don’t have a blank file, I expect you know enough about what you’re doing here, but if you do not already unset and reset ExecStart (or don’t have a [Service] section at all) then you can simply follow the above. If you’re already overriding ExecStart then you should prepend your override with the same /bin/ip vrf exec Mgmt-VRF.
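If you have several services to move, typing each override by hand gets tedious. As a rough sketch, the override text can be generated from the unit text itself. vrf_override is a hypothetical helper name; it assumes plain ExecStart= lines (no "-", "@" or similar prefix characters) and that its input is the packaged unit only, without any existing drop-in.

```shell
# vrf_override VRF: read unit text on stdin and print a drop-in that
# clears ExecStart and re-declares each one under `ip vrf exec VRF`.
vrf_override() {
    awk -v vrf="$1" '
        BEGIN { print "[Service]"; print "ExecStart=" }
        /^ExecStart=./ {
            sub(/^ExecStart=/, "")      # keep only the command itself
            print "ExecStart=/bin/ip vrf exec " vrf " " $0
        }'
}

# Example: print the text to paste into `systemctl edit sshd`
#   systemctl cat sshd | vrf_override Mgmt-VRF
```

Review the output before using it; it is a text transformation and knows nothing about what the service actually does.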

Force systemd to reload the unit files, and restart your service:

systemctl daemon-reload
systemctl restart sshd

That should be it: sshd is now running inside your new VRF. With a relatively up-to-date systemd and iproute2, the status output shows that it is running inside that VRF (see the CGroup section, where ip vrf exec has placed the process under vrf/Mgmt-VRF). You can also see that it is using our override file, as non-overridden services will not have a “Drop-In” section:

systemctl status sshd

● ssh.service - OpenBSD Secure Shell server
   Loaded: loaded (/lib/systemd/system/ssh.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/ssh.service.d
           └─override.conf
   Active: active (running) since Wed 2020-08-12 09:38:22 BST; 7h ago
     Docs: man:sshd(8)
           man:sshd_config(5)
 Main PID: 29107 (sshd)
    Tasks: 1 (limit: 4689)
   Memory: 2.8M
   CGroup: /system.slice/ssh.service
           └─vrf
             └─Mgmt-VRF
               └─29107 /usr/sbin/sshd -D

Aug 12 09:38:22 rt3 systemd[1]: Starting OpenBSD Secure Shell server...
Aug 12 09:38:22 rt3 sshd[29107]: Server listening on 10.0.0.2 port 22.
Aug 12 09:38:22 rt3 systemd[1]: Started OpenBSD Secure Shell server.
Aug 12 09:38:50 rt3 sshd[29116]: Accepted password for philb from 192.168.0.2 port 59159 ssh2
Aug 12 09:38:50 rt3 sshd[29116]: pam_unix(sshd:session): session opened for user philb by (uid=0)
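If your systemctl output does not show the VRF, iproute2 can also list the processes associated with a VRF via `ip vrf pids`. The sketch below assumes that output has one process per line with the PID in the first column; pid_in_vrf is a hypothetical helper name.

```shell
# pid_in_vrf PID: succeed if PID appears in the first column of
# `ip vrf pids NAME` output supplied on stdin.
pid_in_vrf() {
    awk -v pid="$1" '$1 == pid { ok = 1 } END { exit !ok }'
}

# Example:
#   ip vrf pids Mgmt-VRF | pid_in_vrf "$(systemctl show -p MainPID --value sshd)" \
#       && echo "sshd is in Mgmt-VRF"
```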

Can’t connect?

If you’ve done all this, restarted your service, systemd confirms it’s running in the VRF, and you still can’t connect to it – make sure your service is not trying to bind to an IP that is on an interface in a different VRF to the one in which you started it. Remember that services can only successfully use local IPs that are in the same VRF, even if they start and give the impression of working.
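For sshd specifically, one way to check this is to compare the ListenAddress lines in sshd_config with the addresses that live in the VRF. This is a sketch under a few assumptions: ListenAddress values are bare IPs (no host:port form), `ip -o addr show vrf Mgmt-VRF` prints the address in the fourth column, and check_listen_addrs is a hypothetical helper name.

```shell
# check_listen_addrs CONFIG: read `ip -o addr show vrf NAME` output on stdin
# and report every ListenAddress in CONFIG that is not among those addresses.
check_listen_addrs() {
    vrf_addrs=$(awk '{ split($4, a, "/"); print a[1] }')
    rc=0
    for addr in $(awk 'tolower($1) == "listenaddress" { print $2 }' "$1"); do
        if ! printf '%s\n' "$vrf_addrs" | grep -qxF -- "$addr"; then
            echo "$addr is not on an interface in the VRF"
            rc=1
        fi
    done
    return "$rc"
}

# Example:
#   ip -o addr show vrf Mgmt-VRF | check_listen_addrs /etc/ssh/sshd_config
```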

Edit: Persisting VRFs between reboots

I actually forgot about this minor detail when I originally wrote this post – but you soon notice when you reboot and your VRFs are missing.

While I am aware there are probably half a dozen ways to skin this cat, some of which likely involve learning how to use systemd-networkd, using systemd to simply execute a bash script at the correct time is by far the quickest solution requiring the least amount of explanation.

First, create a bash script that contains the commands you need to start your VRFs. /sbin/vrf.sh will do; using the above VRF configuration as an example, it contains:

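#!/bin/bash
# Create the VRF and bring it up
ip link add Mgmt-VRF type vrf table 2
ip link set dev Mgmt-VRF up
# Move eth1 into the VRF before adding routes, in the same order as the manual steps
ip link set dev eth1 master Mgmt-VRF
ip route add table 2 0.0.0.0/0 via 10.0.0.1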

As this is a script that will get executed as root on system start, make sure this file is owned by, and only writeable by, root! (chmod 700 is fine)
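If you expect to re-run the script by hand (for example after editing it), a re-runnable variant may be more comfortable, since `ip link add` fails when the VRF already exists. The following is only a sketch: ensure_vrf, ensure_default_route and the RUN variable are hypothetical names of mine, and setting RUN=echo prints the commands instead of executing them.

```shell
#!/bin/bash
# Set RUN=echo to print the commands rather than run them (dry run).
RUN=${RUN:-}

# ensure_vrf NAME TABLE: create the VRF only if it is missing, then set it up.
ensure_vrf() {
    ip link show dev "$1" >/dev/null 2>&1 ||
        $RUN ip link add "$1" type vrf table "$2" || return 1
    $RUN ip link set dev "$1" up
}

# ensure_default_route TABLE GATEWAY: "replace" adds the route or overwrites
# an existing one, so a second run does not fail with "File exists".
ensure_default_route() {
    $RUN ip route replace table "$1" default via "$2"
}

# Example, mirroring the script above:
#   ensure_vrf Mgmt-VRF 2
#   $RUN ip link set dev eth1 master Mgmt-VRF
#   ensure_default_route 2 10.0.0.1
```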

Then create a systemd service that runs this script at the correct time – first you need a service file – in my instance, I created /etc/systemd/system/vrf.service – containing:

[Unit]
Description=VRF creation
Before=network-pre.target
Wants=network-pre.target

[Service]
Type=oneshot
ExecStart=/sbin/vrf.sh

[Install]
WantedBy=multi-user.target

Then enable the service:

systemctl enable vrf

You should see something like:

Created symlink /etc/systemd/system/multi-user.target.wants/vrf.service → /etc/systemd/system/vrf.service.

Your VRF(s) should now exist at the correct time during boot for the network services (eg sshd) that need to attach to them.

Installing Nagios3 on Debian Wheezy

It’s pretty straightforward to install Nagios on a Debian system, but if you want to be able to use the web interface to control the Nagios process, a little more work is required.

Starting with a blank slate (apt/dpkg will ensure any required prerequisites will be installed):

# apt-get install nagios3 apache2-suexec

You’ll be asked to set a password for the nagiosadmin user for the web interface.

Enable check_external_commands in Nagios so that you can mute alarms, make comments, restart the Nagios process etc from the web interface (pretty much invaluable, but be aware of the inherent risks in enabling the ability to influence the process from “outside”):

# sed -i -e 's/check_external_commands=0/check_external_commands=1/' /etc/nagios3/nagios.cfg
# /etc/init.d/nagios3 restart
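The sed expression only matches the exact text check_external_commands=0, so if your nagios.cfg is formatted differently it will silently change nothing. A small check like the sketch below can confirm the setting took; external_commands_enabled is a hypothetical helper name.

```shell
# external_commands_enabled FILE: succeed if FILE sets
# check_external_commands to 1 (optional whitespace allowed).
external_commands_enabled() {
    grep -Eq '^[[:space:]]*check_external_commands[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"
}

# Example:
#   external_commands_enabled /etc/nagios3/nagios.cfg || echo "still disabled" >&2
```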

Edit the nagios3 apache2 config include to make the web interface scripts run as the nagios user, so that the web interface can write to the Nagios command pipe, by inserting the following at the top of /etc/nagios3/apache2.conf:

User nagios
Group nagios

Restart Apache:

# /etc/init.d/apache2 restart

And you’re pretty much done! You can go to http://YOUR_HOST_NAME/nagios3/ and log in with the nagiosadmin password you set when prompted at the start of this process.

Now you can get started with creating host and service configuration files in /etc/nagios3/conf.d/ to monitor your servers/network/etc.
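As a starting point, a minimal host definition could be generated like this. It is a sketch only: nagios_host is a hypothetical helper, the host name and address in the example are placeholders, and it assumes a generic-host template already exists in your configuration (the Debian package normally ships one, but do check).

```shell
# nagios_host NAME ADDRESS: print a minimal host object definition that
# inherits from a template called generic-host (assumed to exist).
nagios_host() {
    cat <<EOF
define host{
        use        generic-host
        host_name  $1
        alias      $1
        address    $2
}
EOF
}

# Example (placeholder name and address):
#   nagios_host web01 192.0.2.10 > /etc/nagios3/conf.d/web01.cfg
```

Running nagios3 -v /etc/nagios3/nagios.cfg before restarting should tell you whether the new file parses.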

Creating a VRF and Running Services Inside it on Linux

Tips for Configuring Nagios3 Efficiently – part 1

The LCHost Debian package mirror

Installing Nagios3 on Debian Wheezy

Reset virtualhost / domain file permissions plesk 9.x (and possibly 10.x?) linux
