Efficiently Evading ISP Ingress Filters with Linux Tunneling and Policy Routing

Many people with cable modem or DSL service use a Linux system as a router between their home LANs and those services. The router can be configured as a security firewall, and it can allow all their home computers to share a single IP address on the cable or DSL service.

Even when multiple IP addresses are available from the cable or DSL service provider, the ability to share a single address is often desirable. Service providers usually charge for extra IP addresses, and those addresses may be dynamically assigned and are not guaranteed to be on the same IP subnet. This can significantly complicate intra-home networking. (Consider what happens if your addresses are dynamically allocated with short timeouts and the service goes down for an extended period.)

By far the simplest and most popular way to share a single IP address is to configure your home router as a NAT (network address translator) and assign non-global addresses (e.g., in the range 10.x.x.x or 192.168.x.x) to the network interfaces on your local network. But NAT imposes some important limitations. Any servers running on the LAN behind the router cannot be accessed from the outside Internet, and certain transport and application protocols are incompatible with NAT even if they are initiated locally. For many people these limitations are not serious, so the simplicity of NAT makes it the way to go. If NAT works for you, you can stop reading now; all this complexity is not for you.

An alternative to NAT is tunneling. This is a more complex approach that requires the use of a tunnel endpoint machine on a remote network plus address space belonging to that remote network. But tunneling provides some significant advantages, such as complete transparency to all transport and application protocols and the ability to run externally-accessible servers behind the router on the home network.

Here is a simplified diagram of my home network. My home router is homer.ka9q.ampr.org ("homer" = "home router", get it?), running Linux 2.2.13. It has two Ethernet interfaces: eth0 is attached to my house LAN, and eth1 is attached directly (with a point-to-point cable) to a Motorola cable modem on the San Diego Road Runner network. No other computers connect directly to the cable modem. The machine tunnel.qualcomm.com sits on the Qualcomm DMZ and also runs Linux 2.2.13. The routers adjacent to tunnel route the address block 199.106.106.0/24 to it (the Internet as a whole routes this block to Qualcomm). Routing entries on tunnel "carve up" this block into subnets for various Qualcomm cable modem users. Each subnet is routed through the Linux "tunl0" pseudo-interface to the IP address assigned to that user by his cable provider.

[insert diagram here]

RR's DHCP server has assigned me 204.210.37.194, so the relevant routing entry on tunnel.qualcomm.com looks like this:

Destination      Gateway           Genmask            Flags    Iface
199.106.106.0    204.210.37.194    255.255.255.240    UG       tunl0

Tunnel will take any packet destined for the range 199.106.106.0-15, encapsulate it in an IP packet with its own address as the source and 204.210.37.194 as the destination address, and route it back out to the Internet. There it will find its way to homer via CERFnet and Road Runner's network.
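
The correspondence between the netmask 255.255.255.240 and the range 199.106.106.0-15 can be checked with a little shell arithmetic. This is only an illustrative sketch: the function names mask_to_prefix and prefix_size are my own, and the code assumes a contiguous netmask.

```shell
#!/bin/sh
# Convert a dotted-quad netmask to a prefix length (assumes a contiguous mask).
mask_to_prefix() (
    bits=0
    IFS=.            # split the argument on dots (local to this subshell)
    set -- $1
    for octet in "$@"; do
        case $octet in
            255) bits=$((bits + 8)) ;;
            254) bits=$((bits + 7)) ;;
            252) bits=$((bits + 6)) ;;
            248) bits=$((bits + 5)) ;;
            240) bits=$((bits + 4)) ;;
            224) bits=$((bits + 3)) ;;
            192) bits=$((bits + 2)) ;;
            128) bits=$((bits + 1)) ;;
            0)   ;;
            *)   echo "invalid netmask octet: $octet" >&2; exit 1 ;;
        esac
    done
    echo "$bits"
)

# Number of addresses covered by a prefix of the given length.
prefix_size() {
    echo $((1 << (32 - $1)))
}

prefix=$(mask_to_prefix 255.255.255.240)
echo "199.106.106.0/$prefix covers $(prefix_size "$prefix") addresses"
```

The same subnet is written as 199.106.106.0/28 in the 'ip' commands later in this article.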

These exact situations also arise with Mobile IP; the tunnel machine is essentially nothing more than a Mobile IP "home agent". In fact, this would be a good application for Mobile IP, although at present the tunnels are maintained manually.

Tunneling can present some obscure problems. In theory, tunneling is needed only to carry packets from tunnel.qualcomm.com to homer; packets outbound from my home network could be sent directly to their destinations regardless of whether they originate on homer or on one of my other machines, such as bart, maggie and marge. But Road Runner, like many ISPs, blocks packets arriving from its customers that carry source addresses other than those assigned to the customer, and such "alien" addresses include those of the user machines on my home network. Unfortunately, this filtering is widely practiced because it is misguidedly perceived as a "security" feature.

Such "source address ingress filters" can easily be evaded by also tunneling in the outbound direction from homer, at the cost of less-than-optimum routing and a little more load on tunnel and its local network. (The ease with which this can be done shows why ingress filtering is such a misguided security feature.) The routing table on homer might look like this:

Destination      Gateway        Genmask            Flags    Iface
199.106.106.0    0.0.0.0        255.255.255.240    U        eth0
192.35.156.12    204.210.37.1   255.255.255.255    UGH      eth1
default          192.35.156.12  0.0.0.0            UG       tunl0

Here the home LAN subnet is 199.106.106.0, netmask 255.255.255.240; 192.35.156.12 is the IP address of tunnel.qualcomm.com, and 204.210.37.1 is the IP address of Road Runner's first-hop router. Interface 'eth0' is on the home LAN, while 'eth1' is connected directly to the cable modem. Interface 'tunl0' is a Linux tunnel pseudodevice whose source address is set to the one assigned to 'eth1', i.e., the IP address assigned by the cable modem system. (This setting is essential so that encapsulated packets will make it through the ingress filter.)
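
For readers who prefer the 'ip' command, the table above could probably be built along the following lines. Treat this as a sketch under assumptions rather than my actual configuration: the helper run, the function legacy_setup and the variable names are invented for illustration, and the two tunl0 lines are my guess at the 'ip' way of giving the tunnel device the Road Runner address. By default the script only prints the commands; nothing is changed unless APPLY=1 is set.

```shell
#!/bin/sh
# Print each command by default; set APPLY=1 (as root) to actually run it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

RR_ADDR=204.210.37.194     # address assigned by Road Runner (dynamic)
RR_GW=204.210.37.1         # Road Runner's first-hop router
TUNNEL=192.35.156.12       # tunnel.qualcomm.com
HOME_NET=199.106.106.0/28  # home LAN subnet

legacy_setup() {
    # Assumption: tunl0 already exists (the ipip driver is loaded).
    run ip link set tunl0 up
    run ip addr add "$RR_ADDR/32" dev tunl0
    # The three routing entries from the table above.
    run ip route add "$HOME_NET" dev eth0
    run ip route add "$TUNNEL" via "$RR_GW" dev eth1
    run ip route add default via "$TUNNEL" dev tunl0 onlink
}

legacy_setup
```

Keeping the addresses in variables makes it easier to re-run the setup when Road Runner hands out a different address.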

This configuration works pretty well. Outbound traffic from homer or any other local machine is automatically encapsulated in a packet with the Road Runner-assigned source IP address, thus evading the ingress filter. When the encapsulated packet reaches tunnel, the encapsulation is stripped off and the packet is forwarded normally to its destination from there. The host-specific route to tunnel is required to avoid the "encapsulation loop" that would otherwise occur when the encapsulated packet is re-routed for transmission.

But this configuration has some drawbacks. First, every outbound packet is sent through tunnel. This includes packets originated by homer, which owns the RR-assigned IP address; they do not need to be tunneled to evade the filter. This results in unnecessarily nonoptimum routing for traffic originated by homer (as opposed to the necessarily nonoptimum routing for traffic originated by one of my other machines).

Second, machines other than homer on my home network cannot talk directly to (as opposed to through) tunnel.qualcomm.com. Those packets match homer's host-specific routing entry for tunnel.qualcomm.com, are routed to the cable modem without encapsulation, and are blocked because of their "alien" source addresses. Third, tunnel will be unable to talk to (as opposed to through) homer unless it uses homer's RR address; none of homer's fixed IP addresses can be used. Downstream packets to all of homer's addresses will arrive normally, but because homer must use the selected IP address in the source field of its reply packets, all but the RR address will be blocked by Road Runner. And because homer's RR-assigned address is dynamic and can change at any time, it is generally impractical to require tunnel's clients to use it.

These may seem like obscure problems not worthy of fixing, and ordinarily they are. But suppose tunnel provides additional services such as a web proxy cache and SMTP mail relaying for homer. Then they can become significant nuisances.

Linux Policy Routing

What homer needs is a way to make routing decisions that depend on a packet's source address, as well as its destination address. In particular, we want to relay a packet through tunnel if and only if its source address is "alien" to Road Runner. Otherwise we can just forward it to the first-hop Road Runner router without encapsulation.

The Linux 2.2 kernel supports policy routing, which is just the hook we need. This is still a fairly new feature, and it is not yet documented very well. But I've figured it out, and hope that this will be useful to others.

The straightforward approach to policy routing would augment the existing routing table with source address fields and other information on which you'd like to base a routing decision. But this is not how Linux does it. Instead, you get a new, separate "rule table" with templates for specified patterns in the IP source and destination address fields, the IP type of service field and the interface name. One action that can be taken when a packet matches a rule template is the discarding of the packet, with or without ICMP notification to the sender. This gives the rule table a simple firewalling capability. Another is to translate the address; this is one way Linux can serve as a NAT.

Yet another action available is to select one of several routing tables to forward the packet. By default, the "main" routing table is used, but up to 255 other tables can be specified in this implementation. These are conventional routing tables, keyed only on destination address. But because we can select an entirely different table on the basis of a packet's source address, we can influence the routing decision by putting different entries in the various routing tables.

To manipulate these extra tables, you need the 'ip' command. It is included in the Debian iproute package, or you can build it from the current source distribution.

Setting up the policy rules

Here is how I set things up. First, we establish a rule that will match any packet with a source address on the local subnet:

ip rule add from 199.106.106.0/28 pref 1 table 1

This rule says that for any packet from an address on the local subnet (including homer's own address), we want to use special routing table number 1 rather than the default (main). The "pref 1" field says that in the event that multiple rules match the packet, we want this rule to be used unless it also matches the "local" rule (which applies only to packets addressed to the local machine).
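
To make the effect of this rule concrete, here is a small shell model of the table selection. It is only a sketch of the decision, not of the kernel's code: the function names ip_to_int, in_subnet and select_table are my own, and the model ignores the "local" rule that the kernel consults first.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.            # split the argument on dots (local to this subshell)
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if address $1 lies inside network $2 with prefix length $3.
in_subnet() (
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $((addr & mask)) -eq $((net & mask)) ]
)

# Model of the rule list: the "pref 1" rule sends local-subnet sources to
# table 1; every other source falls through to the main table.
select_table() {
    if in_subnet "$1" 199.106.106.0 28; then
        echo 1
    else
        echo main
    fi
}

select_table 199.106.106.5     # a host on the home LAN
select_table 204.210.37.194    # homer's Road Runner address
```

The first call selects table 1 and the second selects main, which is exactly the split we are after.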

Now we have to set up the two routing tables. Here is the "main" routing table, the one that will be used for all packets except those from the 199.106.106.0/28 net:

ip route add 199.106.106.0/28 dev eth0
ip route add default dev eth1 via 204.210.37.1 src 204.210.37.194 onlink

The first entry handles packets destined for the local network. The second entry establishes a default route that points at Road Runner's first-hop router (remember, the cable modem is plugged into eth1). The 'src 204.210.37.194' term says that when a task on homer leaves the source address unspecified when it opens a network connection to a destination that matches this entry, this is the source address to use (i.e., the address assigned by Road Runner). The 'onlink' field forces the kernel to bypass its normal check that the specified next hop router is directly reachable. (Depending on the other routing entries, this flag may not be necessary, but it causes no harm.)

Now take a look at routing table 1, the table that will be used when routing packets from addresses on the 199.106.106.0/28 network:

ip route add 199.106.106.0/28 dev eth0 table 1
ip route add default dev tunl0 via 192.35.156.12 onlink table 1

That's it! Packets being relayed to Road Runner with source addresses belonging to the local network will be routed to the tunnel pseudo-interface, where they are encapsulated in another IP packet. The destination address on the outer (newly added) IP header will be tunnel.qualcomm.com, and the source address will be that assigned by Road Runner, as specified by the src field on the default entry in the main table.

Now when this encapsulated packet is again presented to the Linux routing machinery, it will appear to have originated from homer. Thus the main routing table is used, which specifies (through the default route entry) that the packet should be forwarded directly to Road Runner's first hop router. Thus we avoid an encapsulation loop.
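
The two passes through the routing machinery can be traced with a toy model. This is a simplified illustration, not a simulation of the kernel: table_for, default_route and trace are names I made up, only the default routes are modelled, and 198.51.100.7 is a documentation address standing in for some remote host.

```shell
#!/bin/sh
RR_ADDR=204.210.37.194   # address assigned by Road Runner
RR_GW=204.210.37.1       # Road Runner's first-hop router
TUNNEL=192.35.156.12     # tunnel.qualcomm.com

# Table choice by source address. The patterns match last octets 0-15,
# which is the range covered by 199.106.106.0/28.
table_for() {
    case $1 in
        199.106.106.[0-9]|199.106.106.1[0-5]) echo 1 ;;
        *) echo main ;;
    esac
}

# Outcome of the default route in each table.
default_route() {
    case $1 in
        1)    echo "tunl0 via $TUNNEL" ;;
        main) echo "eth1 via $RR_GW" ;;
    esac
}

# Print one line per routing pass for a packet from $1 to $2.
trace() {
    src=$1
    dst=$2
    table=$(table_for "$src")
    echo "src=$src dst=$dst table=$table -> $(default_route "$table")"
    if [ "$table" = "1" ]; then
        # tunl0 wraps the packet: the outer source is the RR address and the
        # outer destination is the tunnel, so the second pass uses "main".
        trace "$RR_ADDR" "$TUNNEL"
    fi
}

trace 199.106.106.5 198.51.100.7
```

The trace shows two passes for a packet from a LAN host (table 1, then main) and would show a single pass for a packet that homer sends from its Road Runner address.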

If your routing configuration is more complex, you will need to ensure that every entry is duplicated in both routing tables. In fact, the tables should be identical except for the default entry. The "main" table specifies that outbound packets are to be sent directly to the first hop Road Runner router, while table 1 says they are to be encapsulated to the remote tunnel.
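
One way to keep the two tables in step might be to generate table 1's entries from the main table. The sketch below reads route lines in the format printed by 'ip route show table main' and prints an 'ip route add ... table 1' command for every entry except the default. The function name mirror_routes is my own, and the 10.1.2.0/24 route in the sample input is invented purely for illustration.

```shell
#!/bin/sh
# Read route lines on stdin and print the commands that would copy every
# non-default route into table 1.
mirror_routes() {
    while read -r route; do
        case $route in
            default*|"") continue ;;   # skip the default route and blank lines
        esac
        echo "ip route add $route table 1"
    done
}

# Sample input (hypothetical routes, for illustration only).
mirror_routes <<'EOF'
199.106.106.0/28 dev eth0
10.1.2.0/24 via 199.106.106.2 dev eth0
default via 204.210.37.1 dev eth1 src 204.210.37.194 onlink
EOF
```

On a live system the input would come from 'ip route show table main' instead of the sample text, and the printed commands should be reviewed before they are run, since the default route for table 1 still has to be added by hand.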

I am interested in any comments on this article, particularly any corrections that may be required by my imperfect understanding of the Linux policy routing mechanism.

Phil Karn
23 Dec 1999